mersenneforum.org  

Old 2012-05-21, 15:01   #452
bcp19
 
Oct 2011

7×97 Posts

I like the V5UserID item, with something like that it seems an easy step to be able to either have the spider send results like P95 does, or possibly even incorporate it into the program.
Old 2012-05-21, 15:27   #453
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2×67×73 Posts

Quote:
Originally Posted by bcp19 View Post
I like the V5UserID item, with something like that it seems an easy step to be able to either have the spider send results like P95 does, or possibly even incorporate it into the program.
If I understand what Bdot has done here, the information will be in the results string itself. Thus, the current submission spider will be sending the data to PrimeNet for it to use when it's ready for it.

This also means GPU72 will be able to be extended to determine which computer sent the results as well.
Old 2012-05-21, 15:36   #454
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2·67·73 Posts

Quote:
Originally Posted by TheJudger View Post
I have to check but personally I'm not really a fan of file locking... too many failures in the past...
Personally I would really like to see WORKTODO.ADD functionality added to both programs. This is not mutually exclusive of file locking, but in my opinion is a safer way to add work to a running system.

One creator/writer; one reader/deleter. The spider wakes up and checks "worktodo.txt" to see if any more work is needed. If not, it goes back to sleep. If more work is needed, it next checks to see if "worktodo.add" already exists. If it does, it again goes back to sleep. If it doesn't, it attempts to get new work and places it into a file like "worktodo.adt".

The last step is to move (rename) "worktodo.adt" to "worktodo.add" and go back to sleep. No race conditions; no locks.

For people like Dubslow who like to order new work with old, file locking is very useful. For most people, worktodo.add functionality is fine and sane.
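
The scheme above can be sketched roughly as follows. This is only an illustration in Python, not code from the spider, mfaktc or mfakto: the function names, the `fetch_work` callable and the `min_lines` threshold are hypothetical, and the reader side (append to "worktodo.txt", then delete "worktodo.add") is an assumption about how a consuming program might behave.

```python
import os
from pathlib import Path


def spider_pass(workdir, fetch_work, min_lines=1):
    """One wake-up of the writer (spider). Returns True if worktodo.add was published.

    fetch_work is a hypothetical callable returning a list of assignment lines.
    min_lines is a hypothetical threshold for "more work is needed".
    """
    workdir = Path(workdir)
    worktodo = workdir / "worktodo.txt"
    add_file = workdir / "worktodo.add"
    tmp_file = workdir / "worktodo.adt"

    # Step 1: is more work needed? Count non-empty lines still queued.
    pending = 0
    if worktodo.exists():
        pending = sum(1 for line in worktodo.read_text().splitlines() if line.strip())
    if pending >= min_lines:
        return False  # enough work queued, go back to sleep

    # Step 2: the previous batch has not been picked up yet.
    if add_file.exists():
        return False

    # Step 3: fetch new work and write it under the temporary name.
    lines = fetch_work()
    if not lines:
        return False
    tmp_file.write_text("".join(line + "\n" for line in lines))

    # Step 4: publish with a rename, so the reader never sees a half-written file.
    os.rename(tmp_file, add_file)
    return True


def consume_add_file(workdir):
    """Reader/deleter side (assumed behaviour). Returns the number of lines merged."""
    workdir = Path(workdir)
    add_file = workdir / "worktodo.add"
    if not add_file.exists():
        return 0
    new_lines = [line for line in add_file.read_text().splitlines() if line.strip()]
    with open(workdir / "worktodo.txt", "a") as out:
        for line in new_lines:
            out.write(line + "\n")
    add_file.unlink()  # only the reader ever deletes worktodo.add
    return len(new_lines)
```

The point of the temporary name is that "worktodo.add" only ever appears complete: the writer creates it with a single rename and never touches it again, and only the reader removes it.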
Old 2012-05-21, 20:31   #455
Bdot
 
Nov 2010
Germany

3·199 Posts

Quote:
Originally Posted by TheJudger View Post
Let us extend this to SievePrimesMin + SievePrimesMax in mfakt?.ini:
SIEVE_PRIMES_MIN <= SievePrimesMin < SievePrimesMax <= SIEVE_PRIMES_MAX
With SIEVE_PRIMES_M[IN|AX] hardcoded and fixed, and SievePrimesM[in|ax] user-tunable in mfakt?.ini. (Something that I have on my todo list for 0.19.)
Yes, that's how it's implemented in mfakto (SievePrimesMax already came in version 0.10). Currently SIEVE_PRIMES_MIN=256 and SIEVE_PRIMES_MAX=1000000 (the latter is possible because mfakto always uses 4620 classes; with only 420 classes this could overflow the 24 bits per FC offset).
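
As a rough illustration of the constraint quoted above (a Python sketch, not the actual mfaktc/mfakto code, which is C): the two constants use the values mentioned for mfakto, while the function names and the clamping helper are hypothetical.

```python
# Compile-time limits, using the values quoted for mfakto in this post.
SIEVE_PRIMES_MIN = 256
SIEVE_PRIMES_MAX = 1_000_000


def check_sieve_primes_range(user_min, user_max):
    """Validate the user-tunable range (hypothetical helper).

    Enforces SIEVE_PRIMES_MIN <= SievePrimesMin < SievePrimesMax <= SIEVE_PRIMES_MAX.
    """
    if not (SIEVE_PRIMES_MIN <= user_min < user_max <= SIEVE_PRIMES_MAX):
        raise ValueError(
            f"need {SIEVE_PRIMES_MIN} <= SievePrimesMin ({user_min}) "
            f"< SievePrimesMax ({user_max}) <= {SIEVE_PRIMES_MAX}"
        )
    return user_min, user_max


def clamp_sieve_primes(value, user_min, user_max):
    """Keep a requested SievePrimes value inside the validated range
    (assumed behaviour, for illustration only)."""
    return max(user_min, min(value, user_max))
```

For example, `check_sieve_primes_range(5000, 200000)` passes, while a lower bound of 100 is rejected because it is below the hardcoded minimum.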

Quote:
Originally Posted by TheJudger View Post
I have to look at this, fancy stuff!
I guess this GitHub diff should be close to what you want to look at.

Quote:
Originally Posted by TheJudger View Post
Yes, we are talking together, usually via PM in German (which is easier for both of us I guess). It is a good idea to have both, mfaktc and mfakto, similar/identical in places where it is doable. Of course this is not the case for the GPU code and CUDA/OpenCL specific stuff. And it is no secret that my focus is on the performance while I tend to ignore the "useless stuff" like a user interface.

Oliver
That's all right if you allow others to take care of it

Quote:
Originally Posted by bcp19 View Post
I like the V5UserID item, with something like that it seems an easy step to be able to either have the spider send results like P95 does, or possibly even incorporate it into the program.
Yes, the automatic primenet/gpu72 integration would be nice, and having the IDs will certainly help. But this was the smallest part
Quote:
Originally Posted by chalsall View Post
If I understand what Bdot has done here, the information will be in the results string itself. Thus, the current submission spider will be sending the data to PrimeNet for it to use when it's ready for it.

This also means GPU72 will be able to be extended to determine which computer sent the results as well.
Yes, the UID can now be part of the results line. However, as long as we use primenet's manual submit page, the UID is ignored there. So far, only mersenne-aries can make use of it ...

Quote:
Originally Posted by chalsall View Post
Personally I would really like to see WORKTODO.ADD functionality added to both programs. This is not mutually exclusive of file locking, but in my opinion is a safer way to add work to a running system.
You shall have it with the next version, I promise
Old 2012-05-24, 22:10   #456
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

7221₁₀ Posts

Quote:
Originally Posted by Bdot View Post
As CUDA code is not as separated from the C-code as OpenCL, merging may also be challenging in some cases.
Just taking an initial look at it, TheJudger does a very good job of keeping them separate.
Code:
bill@Gravemind:~/mfaktc-0.18/src∰∂ ls
checkpoint.c     mfaktc.c         selftest-data.c   tf_72bit.h       tf_debug.h
checkpoint.h     my_intrinsics.h  sieve.c           tf_96bit.cu      timer.c
compatibility.h  my_types.h       sieve.h           tf_96bit.h       timer.h
Makefile         params.h         signal_handler.c  tf_barrett96.cu  timeval.h
Makefile.win     read_config.c    signal_handler.h  tf_barrett96.h
Makefile.win32   read_config.h    tf_72bit.cu       tf_common.cu
Old 2012-05-25, 12:20   #457
Bdot
 
Nov 2010
Germany

1001010101₂ Posts

Quote:
Originally Posted by Dubslow View Post
Quote:
Originally Posted by Bdot
As CUDA code is not as separated from the C-code as OpenCL, merging may also be challenging in some cases.
Just taking an initial look at it, TheJudger does a very good job of keeping them separate.
I perfectly agree with you on that! It seems my note was easy to misunderstand ...

I was referring to a conceptual difference between OpenCL and CUDA:

In CUDA, you compile the device code right into your binary, enabling shared header files, for instance. In the .cu files you can (and usually will) have CPU code and GPU code mixed at the function level.

Using OpenCL, you usually provide the GPU source code to the GPU-compiler at runtime of the binary. If you want to share header files between the binary and the GPU code, you need to ship them, for example.

For this reason I needed a different source file structure for mfakto, making merges between mfakto and mfaktc more difficult. This is what I wanted to say, no more, no less.
Old 2012-05-25, 17:26   #458
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

16065₈ Posts

Quote:
Originally Posted by Bdot View Post
Using OpenCL, you usually provide the GPU source code to the GPU-compiler at runtime of the binary. If you want to share header files between the binary and the GPU code, you need to ship them, for example.
Doesn't that rather defeat the purpose of compiling?
Old 2012-05-25, 23:12   #459
Bdot
 
Nov 2010
Germany

3·199 Posts

Quote:
Originally Posted by Dubslow View Post
Doesn't that rather defeat the purpose of compiling?
You compile and link the stuff that runs on the CPU and drives the GPU. However, as OpenCL's claim is to run on a wide variety of devices, it is impractical to have pre-compiled device-code for all possible platforms. Rather, the device vendors ship the compiler in their drivers, and the GPU-code is compiled at runtime. During the build of mfakto, the OpenCL files are not touched. You can easily modify them before starting mfakto, and your changes will be compiled and executed. An approach somewhere between Java and shell scripts.
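
A small sketch of what "shipping" the GPU sources means for the host program, assuming the kernel file and its shared headers sit in one directory next to the binary. It is written in Python for brevity and stops short of calling OpenCL: the function name and the file names in the usage note are hypothetical. The only OpenCL detail relied on is that the runtime build step accepts an options string with `-I dir` for header lookup.

```python
from pathlib import Path


def load_kernel_source(kernel_dir, kernel_file, shared_headers=()):
    """Read GPU source at program start and prepare build options (illustrative).

    Returns (source_text, build_options). The caller would hand both to the
    OpenCL runtime compiler; that call is deliberately left out of this sketch.
    """
    kernel_dir = Path(kernel_dir)

    # Headers shared between host code and GPU code must be present at runtime,
    # because the GPU side is compiled only when the program starts.
    missing = [name for name in shared_headers if not (kernel_dir / name).is_file()]
    if missing:
        raise FileNotFoundError("shipped headers missing: " + ", ".join(missing))

    # The kernel file is read fresh on every start, so edits made before
    # launching the program are picked up without rebuilding the host binary.
    source = (kernel_dir / kernel_file).read_text()

    # Tell the runtime compiler where to look for #include "..." files.
    build_options = f"-I {kernel_dir}"
    return source, build_options
```

A call such as `load_kernel_source("kernels", "kernels.cl", ["common.h"])` (placeholder names) would fail early if a header was not shipped, instead of failing later inside the driver's compiler.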
Old 2012-05-25, 23:18   #460
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

16065₈ Posts

Quote:
Originally Posted by Bdot View Post
However, as OpenCL's claim is to run on a wide variety of devices, it is impractical to have pre-compiled device-code for all possible platforms.
Ah, okay, so whereas nVidia knows exactly which cards are CUDA-capable and what they can each do (and it's the only driver provider), OpenCL is (in theory) supposed to be agnostic of whatever device it's running on, which potentially includes a lot more than AMD GPUs, up to and including regular old CPUs. Makes sense. (Still, the compilers can't be capable of too much optimization, otherwise you'd have to wait five minutes between when you start the program and when it starts running, especially for more complex code.)
Old 2012-05-26, 04:59   #461
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

3×3,221 Posts

Quote:
Originally Posted by Dubslow View Post
...which potentially includes a lot more than AMD GPUs...
Which includes - certainly, not potentially - the NV GPUs too, they are all OpenCL-able, at least at theoretical level...
Old 2012-05-26, 06:49   #462
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1110000110101₂ Posts

Quote:
Originally Posted by LaurV View Post
at least at theoretical level...
http://www.mersenneforum.org/showpos...&postcount=336
Quote:
Originally Posted by Bdot View Post
BTW, testing mfakto on Nvidia turns out to be way more effort than it might be worth. Nvidia's OpenCL compiler is buggy and not yet complete. I had to remove all printf's even though they were in inactive #ifdefs. And once that was done, the compiler crashes.
Code:
Error in processing command line: Don't understand command line argument "-O3"!
Code:
(0) Error: call to external function printf is not supported
Code:
Select device - Get device info - Compiling kernels .Stack dump:
0.      Running pass 'Function Pass Manager' on module ''.
1.      Running pass 'Combine redundant instructions' on function '@mfakto_cl_barrett79'

mfakto-nv.exe has stopped working

"No plan survives contact with the enemy."

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.