mersenneforum.org

llrCUDA Version 3.8.7 is released! (https://www.mersenneforum.org/showthread.php?t=28170)

Jean Penné 2022-10-20 18:00

llrCUDA Version 3.8.7 is released!
 
Hi All,

Today I uploaded the new version 3.8.7 of llrCUDA to my personal site, jpenne.free.fr

What is new in this version:

I added two new ABC formats, principally to help PRP searchers:
- k*b^n+c format with k, b, c fixed, for example: ABC 22*17^n+13
- (k*b^n+c)/d format with k, b, c, d fixed, for example: ABC (16^n+619)/5

Bug fixes:
- In Proth or LLR tests, even values of k yielded a false result...
These bugs are now fixed.
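
To make the two new forms concrete, here is a minimal Python sketch that evaluates (k*b^n+c)/d and applies a base-3 Fermat PRP test to the result. It is only an illustration of the arithmetic, not llrCUDA code: the function names `candidate` and `is_fermat_prp` are hypothetical, and the choice of base 3 is an assumption.

```python
def candidate(k, b, n, c, d=1):
    """Return (k*b^n + c) / d, or None when d does not divide the numerator."""
    numerator = k * b**n + c
    if numerator % d != 0:
        return None
    return numerator // d


def is_fermat_prp(N, a=3):
    """Fermat test: N is a base-a probable prime if a^(N-1) == 1 (mod N)."""
    if N <= a:
        raise ValueError("N must be larger than the base a")
    return pow(a, N - 1, N) == 1


# The two example forms from the post, for a few small exponents.
for n in range(1, 6):
    for label, N in (
        (f"22*17^{n}+13", candidate(22, 17, n, 13)),
        (f"(16^{n}+619)/5", candidate(1, 16, n, 619, 5)),
    ):
        if N is not None:
            print(label, "is PRP" if is_fermat_prp(N) else "is composite")
```

A Fermat test can only prove compositeness; a "PRP" answer means the number passed for this base and still needs a primality proof.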
May 2022: The range of available FFT lengths has been extended using
SSE2 Woltman tables. This allowed this version to successfully test
M82589933 in less than 8 days!

Here are some of the previous updates:

- The maximum number of bits per input double word, which previously defaulted to 35.0, now defaults to 37.0.
Moreover, the user can change it with the -oMAXBPD=xx.x option.
Decreasing this value can be useful if too many round-off errors occur...

- In the previous version 3.8.4, one call to the free() function was missing in the Gerbicz
error checking code; this caused a significant memory leak...
This issue is now fixed!
Not many new features, but some improvements related to reliability and speed.

Please let me know if you have any problem running the binary on Linux and/or building it on your system.

I wish you much success in prime hunting!
Best Regards,
Jean

Jean Penné 2022-10-21 12:48

Fixing a cause of round-off errors.
 
Hi,

While testing k*2^n+c or (k*2^n+c)/d numbers with this new version, I saw many round-off errors...
This was due to an underestimate of the number of bits per word in the double-word array form of the number when abs(c) > 1.
This drawback is fixed today.

Best Regards,
Jean

pepi37 2022-10-22 13:48

I downloaded the new llrCUDA to try it on Linux, since it can be used for PRP searching. The Linux box has the latest NVIDIA drivers installed, and all GPUs work; they are running a GFN search.


[CODE]root@OMICRON:~/LLR# ./llrCUDA
./llrCUDA: error while loading shared libraries: libcufft.so.8.0: cannot open shared object file: No such file or directory[/CODE]


So can you build a static app? I tried to compile it, but got many errors, since the first line in the makefile points to a directory that does not exist in my installation.
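
The error above comes from the dynamic loader: cuFFT is part of the CUDA toolkit rather than the NVIDIA driver, so having up-to-date drivers does not provide libcufft.so.8.0. As a rough diagnostic sketch, the following Python checks whether the loader can open a given shared library. The helper name `can_load` is hypothetical; the library name is copied from the error message.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


# Library name taken from the error message in the post.
for name in ("libcufft.so.8.0",):
    print(name, "found" if can_load(name) else "missing")
```

If the library is reported missing, the binary needs either a matching CUDA toolkit on the loader's search path or a build that does not depend on the shared library.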

Jean Penné 2022-10-23 12:10

Linking with current directory
 
[QUOTE=pepi37;616270]I downloaded the new llrCUDA to try it on Linux, since it can be used for PRP searching. The Linux box has the latest NVIDIA drivers installed, and all GPUs work; they are running a GFN search.


[CODE]root@OMICRON:~/LLR# ./llrCUDA
./llrCUDA: error while loading shared libraries: libcufft.so.8.0: cannot open shared object file: No such file or directory[/CODE]


So can you build a static app? I tried to compile it, but got many errors, since the first line in the makefile points to a directory that does not exist in my installation.[/QUOTE]

I did not succeed in building a static application, but I could cheat by uploading an llrCUDA binary together with the libcudart and libcufft libraries, to be placed in your current directory. This results in a voluminous compressed file (117 MB), but I hope it will resolve your problem...
Please let me know how it goes.

Best Regards,

Jean

pepi37 2022-10-23 14:22

[QUOTE=Jean Penné;616317]I did not succeed in building a static application, but I could cheat by uploading an llrCUDA binary together with the libcudart and libcufft libraries, to be placed in your current directory. This results in a voluminous compressed file (117 MB), but I hope it will resolve your problem...
Please let me know how it goes.

Best Regards,

Jean[/QUOTE]
I am waiting for the link, and of course I will inform you of the result.
Best regards

pepi37 2022-10-25 17:03

I downloaded your copy with the .so files, but since my system is on the latest drivers, CUDA 8 did not work. I then had to download the latest CUDA toolkit (11.8) and recompile everything. I got a few warnings when compiling llrCUDA, but in the end it appears to work.



[CODE]root@THETA:~/llrcuda# ./llrCUDA -d -q"7567567*2^67679-1"
Starting Fermat PRP test of 7567567*2^67679-1
Using complex zero-padded rational base DWT, FFT length = 8704, a = 3
7567567*2^67679-1 is not prime. RES64: 2CC12983FF83092B. Time : 52.935 sec.[/CODE]



You need to add an option for selecting the device (a switch like -d1, -d2...), since I can only use one device on a multi-device box.
It looks too slow on any base other than 2.
Maybe it is slower because of those errors while compiling... but I remember that in the past llrCUDA was never as fast as CPU LLR.
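
Until such a switch exists, one general workaround for CUDA programs is the `CUDA_VISIBLE_DEVICES` environment variable, which the CUDA runtime reads to decide which GPUs a process may see. The sketch below launches a command with a single GPU exposed. The helper names are hypothetical, and whether this suits the multi-device box described above is an assumption not confirmed in this thread.

```python
import os
import subprocess


def env_for_gpu(device_index, base_env=None):
    """Copy the environment and expose only one GPU to CUDA programs."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return env


def run_on_gpu(device_index, command):
    """Run `command` (a list of arguments) with only one GPU visible."""
    return subprocess.run(command, env=env_for_gpu(device_index), check=False)


# Example (assumes ./llrCUDA exists in the working directory):
# run_on_gpu(1, ["./llrCUDA", "-d", "-q7567567*2^67679-1"])
```

Inside the launched process the selected GPU appears as device 0, so the program itself needs no device option.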

Jean Penné 2022-10-27 10:35

Static binary for llrCUDA 3.8.7
 
Hi,

Today I succeeded in building a static binary for llrCUDA 3.8.7 by correctly updating the Makefile.
So I uploaded the new binaries and the source directory accordingly.

Best Regards,

Jean

sweety439 2022-10-28 04:10

[QUOTE=pepi37;616474]

[CODE]root@THETA:~/llrcuda# ./llrCUDA -d -q"7567567*2^67679-1"
Starting Fermat PRP test of 7567567*2^67679-1
Using complex zero-padded rational base DWT, FFT length = 8704, a = 3
7567567*2^67679-1 is not prime. RES64: 2CC12983FF83092B. Time : 52.935 sec.[/CODE][/QUOTE]

7567567*2^67679-1 is divisible by 5. Trial factoring (or sieving) shows immediately that it is composite, so this number does not need LLR or PFGW.
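
The divisibility claim can be checked without ever building the full number, by reducing b^n modulo each trial divisor. A minimal sketch follows. The function name `smallest_factor` and the default limit are illustrative choices, not anything taken from LLR or PFGW.

```python
def smallest_factor(k, b, n, c, limit=10_000):
    """Smallest divisor p with 2 <= p < limit of k*b^n + c, or None.

    The smallest divisor greater than 1 of any integer is prime, so no
    separate prime sieve is needed. A result equal to the number itself
    would mean the number is that prime.
    """
    for p in range(2, limit):
        # pow(b, n, p) computes b^n mod p by fast modular exponentiation.
        if (k * pow(b, n, p) + c) % p == 0:
            return p
    return None


print(smallest_factor(7567567, 2, 67679, -1))  # → 5
```

By hand: 7567567 is 2 mod 5 and 2^67679 is 3 mod 5 (since 67679 is 3 mod 4), so the product is 1 mod 5 and subtracting 1 gives 0.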

pepi37 2022-10-28 11:45

[QUOTE=sweety439;616679]7567567*2^67679-1 is divisible by 5. Trial factoring (or sieving) shows immediately that it is composite, so this number does not need LLR or PFGW.[/QUOTE]
I don't care whether that candidate has a factor or not; I just wrote it to test speed.

sweety439 2022-10-28 13:42

[QUOTE=pepi37;616696]I don't care whether that candidate has a factor or not; I just wrote it to test speed.[/QUOTE]

For numbers with small prime factors (i.e. prime factors < 10^4), LLR and PFGW will return composite in a very short time (<0.01 second).

pepi37 2022-10-29 06:39

[QUOTE=sweety439;616707]For numbers with small prime factors (i.e. prime factors < 10^4), LLR and PFGW will return composite in a very short time (<0.01 second).[/QUOTE]
Once again, I need to see the time that llrCUDA needs, to compare it with llr...

Jean Penné 2022-10-29 16:27

Can you use the static binary?
 
[QUOTE=pepi37;616757]Once again, I need to see the time that llrCUDA needs, to compare it with llr...[/QUOTE]

I finally succeeded in building a static binary, but I don't know whether it works on your system...
Would you let me know? Thank you in advance!

Also, please note that llrCUDA can be faster than CPU LLR only for base two and large numbers (at least 1M decimal digits).
For smaller numbers, GPU parallelism does not give enough of an advantage...
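
To put the 1M-digit threshold in perspective, here is a small sketch that estimates the decimal length of k*b^n+c from logarithms and applies the rule of thumb above. The function names are hypothetical, and the estimate ignores c, which is safe whenever |c| is tiny next to k*b^n.

```python
import math


def decimal_digits(k, b, n):
    """Estimated number of decimal digits of k*b^n + c for small |c|."""
    return math.floor(math.log10(k) + n * math.log10(b)) + 1


def gpu_may_help(k, b, n, min_digits=1_000_000):
    """Rule of thumb from the post: base two and at least 1M digits."""
    return b == 2 and decimal_digits(k, b, n) >= min_digits


print(decimal_digits(7567567, 2, 67679))  # speed-test candidate from this thread
print(decimal_digits(1, 2, 82589933))     # M82589933
```

The speed-test candidate used earlier in the thread has only about twenty thousand digits, far below the threshold, while M82589933 has more than 24 million.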

Regards,

Jean

pepi37 2022-10-29 20:40

[QUOTE=Jean Penné;616789]I finally succeeded in building a static binary, but I don't know whether it works on your system...
Would you let me know? Thank you in advance!

Also, please note that llrCUDA can be faster than CPU LLR only for base two and large numbers (at least 1M decimal digits).
For smaller numbers, GPU parallelism does not give enough of an advantage...

Regards,

Jean[/QUOTE]
Of course I will inform you. I am on a trip now and will be home in a few days.

Jean Penné 2022-12-03 07:14

[QUOTE=pepi37;616801]Of course I will inform you. I am on a trip now and will be home in a few days.[/QUOTE]

Have you now tested the static binary? I hope it is OK for you!

Regards,
Jean


All times are UTC.