mersenneforum.org


Jean Penné 2022-10-20 18:03

llrCUDA Version 3.8.7 is released!
 
Hi All,

Today I uploaded the new version 3.8.7 of llrCUDA to my personal site, jpenne.free.fr

What is new in this version:

I added two new ABC formats, principally to help PRP searchers.
- k*b^n+c format with k, b, c fixed, for example : ABC 22*17^n+13
- (k*b^n+c)/d format with k, b, c, d fixed, for example : ABC (16^n+619)/5
- In Proth or LLR tests, even values of k yielded a false result...
These bugs are now fixed.
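
As a rough illustration of the two new fixed-parameter forms (this is not llrCUDA's parser or test code), the Python sketch below builds k*b^n+c and (k*b^n+c)/d for a few exponents and applies a plain base-3 Fermat PRP test. The function names and the choice of base 3 are assumptions made for the example only.

```python
def candidate(k, b, n, c, d=1):
    """Return (k*b^n + c)/d as an int, or None when d does not divide it."""
    value = k * b**n + c
    if value % d != 0:
        return None
    return value // d


def is_fermat_prp(N, base=3):
    """Plain Fermat probable-prime test: base^(N-1) == 1 (mod N)."""
    if N <= 3:
        return N in (2, 3)
    return pow(base, N - 1, N) == 1


# ABC 22*17^n+13 : k, b, c fixed, only n varies
for n in range(1, 6):
    N = candidate(22, 17, n, 13)
    print(f"22*17^{n}+13", is_fermat_prp(N))

# ABC (16^n+619)/5 : here k = 1, and d = 5 divides 16^n+619 for every n,
# because 16^n is 1 mod 5 and 619 is 4 mod 5
for n in range(1, 6):
    N = candidate(1, 16, n, 619, 5)
    print(f"(16^{n}+619)/5", is_fermat_prp(N))
```

A Fermat test only reports "probably prime"; a PRP found this way still needs a separate primality proof.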
May 2022: The range of available FFT lengths has been extended using
SSE2 Woltman tables. This allowed this version to successfully test
M82589933 in less than 8 days!

Here are some of the previous updates:

- The maximum number of bits per input double word, which used to default to 35.0, now defaults to 37.0.
Moreover, the user can change it with the -oMAXBPD=xx.x option.
It can be useful to decrease this value if too many round-off errors occur...

- In the previous version 3.8.4, one call to the free() function was missing in the Gerbicz
error checking code; this caused a significant memory leak...
This issue is now fixed here!
Not many new features, but some improvements related to reliability and speed.

Please let me know if you have any problem running the binary on Linux and/or building it on your system.

I wish you many successes in prime hunting!
Best Regards,
Jean

Jean Penné 2022-10-21 12:50

Fixing a cause of round-off errors
 
Hi,

While testing k*2^n+c or (k*2^n+c)/d numbers using this new version, I saw many round-off errors...
This was due to an underestimate of the number of bits per word in the double-word array form of the number when abs(c) > 1.
This drawback is fixed today.

Best Regards,
Jean

Jean Penné 2022-10-27 11:59

Static binary for llrCUDA 3.8.7
 
Hi,

Today I succeeded in building a static binary for llrCUDA 3.8.7 by correctly updating the Makefile.
So I uploaded the new binaries and the source directory accordingly.

Best Regards,

Jean

Honza 2022-12-03 19:59

Please, Windows build, anyone?

Jean Penné 2022-12-04 13:27

[QUOTE=Honza;618923]Please, Windows build, anyone?[/QUOTE]

Not yet, sorry...

Jean

henryzz 2022-12-04 22:19

[QUOTE=Honza;618923]Please, Windows build, anyone?[/QUOTE]

It should work in WSL2.

Honza 2022-12-05 08:07

[QUOTE=henryzz;618993]It should work in WSL2.[/QUOTE]

What distribution would you recommend?

Mark Rose 2022-12-05 10:00

[QUOTE=Honza;619019]What distribution would you recommend?[/QUOTE]

WSL2 is basically a Linux distribution inside of Windows. You need to install it.

henryzz 2022-12-05 10:57

[QUOTE=Honza;619019]What distribution would you recommend?[/QUOTE]

I have only used the Ubuntu version of WSL/WSL2. It was one of the first available. I suspect that support is the best for it.
I have run other CUDA apps (and possibly a version of this) on it.

Honza 2022-12-05 10:58

[QUOTE=Mark Rose;619023]WSL2 is basically a Linux distribution inside of Windows. You need to install it.[/QUOTE]

I did install it.
[url]https://www.c-sharpcorner.com/article/how-to-install-windows-subsystem-for-linux-wsl2-on-windows-11/[/url]

After a restart, it explicitly says that you need to install a specific distro.
(When trying to start WSL, it gives the same error on the command line.)

I know it's a mess with all the different distros, libraries, etc.
My idea was to use statically linked libs.

But which distro should I pick before I mess up my Windows system?

A Windows binary would be much easier.
I don't know if there is a performance penalty because it would run in WSL, or what other limitations there are, but I am still willing to try.

Mark Rose 2022-12-05 20:29

I would pick Ubuntu. Ubuntu and Microsoft have been working together on the development of WSL.

I haven't touched Windows in about a decade though. Others with more experience may chime in.

Honza 2022-12-06 20:30

I've managed to install WSL / Ubuntu.


Running test.
./sllrCUDA -d -q"6171*2^1658919+1"

After a minute or so, it printed the actual start of a test.
Starting Proth prime test of 6171*2^1658919+1
Using complex irrational base DWT, FFT length = 229376, a = 7
Iteration: 1 / 1658930 [0.00%].

After another minute or so, it finally got going, with an ETA of about 10 minutes (an AMD 5950X CPU finishes in ~3 minutes).

About halfway through, it caught an error.

Starting Proth prime test of 6171*2^1658919+1
Using complex irrational base DWT, FFT length = 229376, a = 7
Gerbicz error check passed at iteration 1000013.

Honza 2022-12-06 20:55

I did another test, same error after couple of minutes

(ETA on RTX 2080 about 35 minutes, test finished on AMD 3950X in 12 minutes)

root@Honza:/mnt/d/_ubuntu/llrcuda387slinux64# ./sllrCUDA -d -q"2511*2^3349104+1"
Starting Proth prime test of 2511*2^3349104+1
Using complex irrational base DWT, FFT length = 458752, a = 5
Gerbicz error check passed at iteration 1000012.

slandrum 2022-12-06 21:58

[QUOTE=Honza;619139]I did another test, same error after couple of minutes

(ETA on RTX 2080 about 35 minutes, test finished on AMD 3950X in 12 minutes)

root@Honza:/mnt/d/_ubuntu/llrcuda387slinux64# ./sllrCUDA -d -q"2511*2^3349104+1"
Starting Proth prime test of 2511*2^3349104+1
Using complex irrational base DWT, FFT length = 458752, a = 5
Gerbicz error check passed at iteration 1000012.[/QUOTE]

I don't see any error. The last line said that it ran a test looking for errors, and the test passed - meaning it didn't find any errors. If it had detected an error, the test would have failed.
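
For anyone unfamiliar with it, a Gerbicz check can be sketched roughly as below. This is a minimal Python illustration of the general idea for repeated squaring, not llrCUDA's code; the function name, the block length L, and the inject_error_at hook are assumptions made for the example.

```python
def squarings_with_gerbicz_check(a, n, N, L, inject_error_at=None):
    """Compute a^(2^n) mod N with n squarings, verifying every L squarings.

    Returns (x, ok). inject_error_at is a hypothetical test hook that
    corrupts x after that many squarings to simulate a hardware fault.
    """
    x0 = a % N
    x = x0
    d = x0  # checksum: running product of x at iterations 0, L, 2L, ...
    for i in range(1, n + 1):
        x = x * x % N
        if i == inject_error_at:
            x = (x + 1) % N  # simulated corrupted value
        if i % L == 0:
            d_prev, d = d, d * x % N
            # Invariant when all squarings were correct:
            #   d_new == x0 * d_prev^(2^L)  (mod N)
            if d != x0 * pow(d_prev, 1 << L, N) % N:
                return x, False  # check failed: an error was detected
    return x, True  # every check passed


N = 2**31 - 1
print(squarings_with_gerbicz_check(3, 100, N, 10))
print(squarings_with_gerbicz_check(3, 100, N, 10, inject_error_at=10))
```

In this simplified form the verification costs as much as the block it verifies; as far as I know, real programs update the checksum every L iterations but verify it much less often, so the overhead stays small. A "passed" message therefore just means the saved checksum agreed and the run can continue.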

Honza 2022-12-07 07:51

[QUOTE=slandrum;619142]I don't see any error. The last line said that it ran a test looking for errors, and the test passed - meaning it didn't find any errors. If it had detected an error, the test would have failed.[/QUOTE]

My bad, wrong interpretation.
What confused me again is that it stays there for a while before printing anything else.
Those pauses between stages are unusual compared to CPU LLR, and I expected the GPU version to be faster or at least not to have (longer) pauses.

./sllrcuda -d -q"1095*2^330702+1"
Starting Proth prime test of 1095*2^330702+1
Using complex irrational base DWT, FFT length = 40960, a = 13
1095*2^330702+1 is prime! (99555 decimal digits) Time : 177.281 sec.

OK, the Linux version can run on WSL.
I don't have a way to compare its performance to a native Windows binary.
It's considerably slower and less energy efficient compared to the CPU.
(I haven't tried giant candidates like SoB.)
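
For reference, the Proth test reported in the logs above can be sketched in a few lines of Python. This only illustrates Proth's theorem on small numbers and is not how llrCUDA is implemented; the function name is an assumption, and moving factors of 2 out of an even k is my own normalization, not necessarily the fix mentioned in the release notes.

```python
def proth_test(k, n, a):
    """Proth's theorem: for N = k*2^n + 1 with k odd and k < 2^n,
    N is prime if a^((N-1)/2) == -1 (mod N) for some base a."""
    if k <= 0 or n <= 0:
        raise ValueError("k and n must be positive")
    # Move factors of 2 from an even k into the exponent.
    while k % 2 == 0:
        k //= 2
        n += 1
    if k >= 1 << n:
        raise ValueError("Proth's theorem needs k < 2^n")
    N = k * (1 << n) + 1
    return pow(a, (N - 1) // 2, N) == N - 1


print(proth_test(3, 2, 2))   # N = 13
print(proth_test(6, 1, 2))   # even k, same N = 13 after normalization
print(proth_test(3, 3, 2))   # N = 25
```

A True result proves N prime. A False result is only conclusive when a is a quadratic non-residue of N, which is presumably why the program reports the value of a it selected.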


All times are UTC.