mersenneforum.org > Great Internet Mersenne Prime Search > Hardware > GPU Computing > GpuOwl

Old 2020-09-24, 12:23   #2476
preda
"Mihai Preda"
Apr 2015

530₁₆ Posts

Quote:
Originally Posted by Prime95
And another proof failure http://mersenne.org/M108979987
I'll look into hardening the proof generation over the following days.
Old 2020-09-25, 02:33   #2477
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada

2²×1,117 Posts

Why does it ignore
-log 50000
in config.txt?

I still get progress updates every 10000.

Code:
-user petrw1 -cpu colab -device 0 -log 50000 -maxAlloc 4000
Old 2020-09-25, 03:44   #2478
preda
"Mihai Preda"
Apr 2015

2⁴×83 Posts

Quote:
Originally Posted by petrw1
Why does it ignore
-log 50000
in config.txt?

I still get progress updates every 10000.

Code:
-user petrw1 -cpu colab -device 0 -log 50000 -maxAlloc 4000
For what task? P-1 is special, and is being reworked anyway.
Old 2020-09-25, 06:15   #2479
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada

2²·1,117 Posts

Quote:
Originally Posted by preda
For what task? P-1 is special, and is being reworked anyway.
Yes, P-1. Thanks.
Old 2020-09-25, 13:06   #2480
Xyzzy
"Mike"
Aug 2002

2×5×11×71 Posts

Quote:
Originally Posted by Xyzzy
Code:
2020-06-05 17:13:16 gfx1012-0 OpenCL compilation in 3.10 s
2020-06-05 17:13:17 Radeon Pro W5500-0 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2020-06-05 17:13:21 Radeon Pro W5500-0 77936867 OK      800   0.00%; 2982 us/it; ETA 2d 16:34; 1579c241dc63eca6 (check 1.27s)
2020-06-05 17:23:18 Radeon Pro W5500-0 77936867 OK   200000   0.26%; 2991 us/it; ETA 2d 16:35; f0b04b45b0855bd2 (check 1.28s)
2020-06-05 17:33:15 Radeon Pro W5500-0 77936867 OK   400000   0.51%; 2979 us/it; ETA 2d 16:10; c03f94396a5aa29e (check 1.27s)
2020-06-05 17:43:17 Radeon Pro W5500-0 77936867 OK   600000   0.77%; 3004 us/it; ETA 2d 16:32; b9decd65ca71b629 (check 1.28s)

2020-09-04 13:24:28 GeForce GTX 1080 Ti-0 OpenCL compilation in 2.02 s
2020-09-04 13:24:29 GeForce GTX 1080 Ti-0 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2020-09-04 13:24:32 GeForce GTX 1080 Ti-0 77936867 OK      800   0.00%; 2481 us/it; ETA 2d 05:43; 1579c241dc63eca6 (check 1.04s)
2020-09-04 13:32:54 GeForce GTX 1080 Ti-0 77936867 OK   200000   0.26%; 2514 us/it; ETA 2d 06:18; f0b04b45b0855bd2 (check 1.04s)
2020-09-04 13:41:12 GeForce GTX 1080 Ti-0 77936867 OK   400000   0.51%; 2483 us/it; ETA 2d 05:29; c03f94396a5aa29e (check 1.05s)
2020-09-04 13:49:27 GeForce GTX 1080 Ti-0 77936867 OK   600000   0.77%; 2473 us/it; ETA 2d 05:07; b9decd65ca71b629 (check 1.06s)

2020-09-04 17:42:56 GeForce GTX 980 Ti-0 OpenCL compilation in 1.83 s
2020-09-04 17:42:58 GeForce GTX 980 Ti-0 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2020-09-04 17:43:04 GeForce GTX 980 Ti-0 77936867 OK      800   0.00%; 4221 us/it; ETA 3d 19:23; 1579c241dc63eca6 (check 1.73s)
2020-09-04 17:57:13 GeForce GTX 980 Ti-0 77936867 OK   200000   0.26%; 4258 us/it; ETA 3d 19:56; f0b04b45b0855bd2 (check 1.75s)
2020-09-04 18:11:28 GeForce GTX 980 Ti-0 77936867 OK   400000   0.51%; 4263 us/it; ETA 3d 19:49; c03f94396a5aa29e (check 1.75s)
2020-09-04 18:25:42 GeForce GTX 980 Ti-0 77936867 OK   600000   0.77%; 4262 us/it; ETA 3d 19:34; b9decd65ca71b629 (check 1.75s)
2060 Super
Code:
2020-09-25 01:11:43 GeForce RTX 2060 SUPER-0 OpenCL compilation in 1.35 s
2020-09-25 01:11:45 GeForce RTX 2060 SUPER-0 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2020-09-25 01:11:50 GeForce RTX 2060 SUPER-0 77936867 OK      800   0.00%; 3966 us/it; ETA 3d 13:51; 1579c241dc63eca6 (check 1.65s)
2020-09-25 01:34:08 GeForce RTX 2060 SUPER-0 77936867 OK   200000   0.26%; 5188 us/it; ETA 4d 16:01; f0b04b45b0855bd2 (check 2.18s)
2020-09-25 01:51:15 GeForce RTX 2060 SUPER-0 77936867 OK   400000   0.51%; 5123 us/it; ETA 4d 14:21; c03f94396a5aa29e (check 2.12s)
2020-09-25 02:08:30 GeForce RTX 2060 SUPER-0 77936867 OK   600000   0.77%; 5164 us/it; ETA 4d 14:56; b9decd65ca71b629 (check 2.13s)
With a heavy load like FurMark this GPU draws 175W, but with gpuowl it uses only 70W. We tried running two instances and it didn't change. Are we doing something wrong?
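The ETA column in the logs above can be roughly reproduced from the per-iteration time. This is a minimal Python sketch, assuming a PRP test needs about as many iterations as the exponent and that the ETA is simply the remaining iterations times us/it. gpuowl's actual formula may differ, and `eta` is a hypothetical helper name, not part of gpuowl.

```python
def eta(exponent: int, iteration: int, us_per_it: float) -> str:
    """Estimate the remaining run time as 'Nd HH:MM'.

    Assumption: the test runs for about `exponent` iterations in total.
    """
    seconds = int((exponent - iteration) * us_per_it / 1_000_000)
    days, rest = divmod(seconds, 86_400)
    hours, rest = divmod(rest, 3_600)
    return f"{days}d {hours:02d}:{rest // 60:02d}"


EXPONENT = 77936867

# us/it values copied from the iteration-600000 lines of the logs above.
timings = [
    ("Radeon Pro W5500", 3004),
    ("GTX 1080 Ti", 2473),
    ("GTX 980 Ti", 4262),
    ("RTX 2060 SUPER", 5164),
]

for name, us in timings:
    print(f"{name:>17}: {us} us/it, about {eta(EXPONENT, 600_000, us)} remaining")
```

Comparing the us/it values directly (5164 versus 2473) shows the RTX 2060 SUPER taking roughly twice as long per iteration as the GTX 1080 Ti on this exponent.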
Attached image: power.png
Old 2020-09-25, 13:21   #2481
Viliam Furik
Jul 2018
Martin, Slovakia

257 Posts

Quote:
Originally Posted by Prime95
The second proof failure is here: 108979853

I do not know if Bruno Victal is a forum member and can comment on what may have happened.
The residue matched. My certificate has not been verified yet, so my guess is that it was either an error in the CERT-generation process of the original run (most likely) or, if my CERT turns out to be bad as well, some weird bug in both programs (highly unlikely).
Old 2020-09-25, 14:17   #2482
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²·7·13² Posts

Quote:
Originally Posted by Xyzzy
2060 Super
Code:
2020-09-25 01:11:43 GeForce RTX 2060 SUPER-0 OpenCL compilation in 1.35 s
2020-09-25 01:11:45 GeForce RTX 2060 SUPER-0 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2020-09-25 01:11:50 GeForce RTX 2060 SUPER-0 77936867 OK      800   0.00%; 3966 us/it; ETA 3d 13:51; 1579c241dc63eca6 (check 1.65s)
2020-09-25 01:34:08 GeForce RTX 2060 SUPER-0 77936867 OK   200000   0.26%; 5188 us/it; ETA 4d 16:01; f0b04b45b0855bd2 (check 2.18s)
2020-09-25 01:51:15 GeForce RTX 2060 SUPER-0 77936867 OK   400000   0.51%; 5123 us/it; ETA 4d 14:21; c03f94396a5aa29e (check 2.12s)
2020-09-25 02:08:30 GeForce RTX 2060 SUPER-0 77936867 OK   600000   0.77%; 5164 us/it; ETA 4d 14:56; b9decd65ca71b629 (check 2.13s)
With a heavy load like FurMark this GPU draws 175W, but with gpuowl it uses only 70W. We tried running two instances and it didn't change. Are we doing something wrong?
RTX20xx is much more productively used in TF than in any DP-related computation (LL, PRP, P-1) because of its extremely high SP/DP ratio.

77936867 is a composite exponent (77936867 = 1447 × 53861), so M(77936867) has known factors, which means there's little point in making that PRP run.
I haven't tried much DP on RTX20xx, because the extreme SP/DP ratio would make it something of a waste of the GPU's capability. I do run DP often on GTX10xx, though, and see power differences there depending on whether the run is TF (high power, in one case more than the system can handle) or DP-dominant (less power; it runs on a system that can't handle the TF power load of a GTX1080). The GTX10xx SP/DP ratio is large but not as extreme as for RTX20xx, so it seems plausible that the SP/DP power difference could be more significant on the RTX20xx GPUs.
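The factorization of the exponent mentioned above, and the reason a composite exponent gives known factors, can be checked directly: if p = a·b, then 2^a − 1 divides 2^p − 1. A minimal Python sketch follows; the helper name `mersenne_divides` is only for illustration.

```python
def mersenne_divides(a: int, p: int) -> bool:
    """Return True if 2**a - 1 divides 2**p - 1."""
    m_a = (1 << a) - 1
    # 2**p ≡ 1 (mod m_a) is equivalent to m_a dividing 2**p - 1.
    return pow(2, p, m_a) == 1


p = 77936867
a, b = 1447, 53861

# The exponent really is the product of the two quoted factors.
assert a * b == p

# Both M(1447) and M(53861) divide M(77936867).
print(mersenne_divides(a, p))
print(mersenne_divides(b, p))
```

Modular exponentiation keeps the numbers small, so the check never has to build the 77.9-million-bit Mersenne number itself.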
Old 2020-09-25, 17:51   #2483
Neutron3529
Dec 2018
China

2³·5 Posts

Quote:
Originally Posted by preda
I recently understood how to implement a "Fast Galois Transform" (FGT) which is simply complex arithmetic with integers modulo some number M.

I had hope that this integer-only transform may be faster on the GPU because it does not use double-precision floating point (which is slow on commodity GPUs). So I had fun and implemented FGT modulo M(31)=2^31-1 and modulo M(61)=2^61-1. Unfortunately the hoped performance gain was not there, but it was a very cool exercise nevertheless.

Anyway, now it's possible to select among these 4 transforms:
-fft DP : the old double precision floating point
-fft SP : single precision FP
-fft M61 : FGT(M61)
-fft M31 : FGT(M31)

Of these, SP is very fast but also useless at 2M FFT-size and up (it may prove useful for something at lower FFT sizes).

M31 has about 5 bits-per-word usable at 4M FFT size. It's not much use by itself, but can be tested.

M61 has deeper word bits than DP. So it can be used for real work. Unfortunately it's also slower than DP. Part of the slowness may be from poor compiler optimizations and that aspect may improve in the future, hopefully.


I updated the savefile format to save "compacted" bits now. That means that it's possible to change the FFT size (among 2M, 4M, 8M) or the FFT kind (e.g. switching between DP and M61) in the middle of a test, and everything should work fine (assuming enough usable bits are available; otherwise the "Gerbicz check" which is used with all the transforms will catch it).

It's nice that adding the FGTs was done with very little additional code compared to DP-FFT-only -- most of the code is common.

Also, a dynamic step of the Gerbicz verification is implemented, which starts with a very small step of 2K iterations at program start (allowing the user to see that the program functions correctly) and ramps up towards 500K if no errors are encountered, or back down if errors are detected.

Anyway, if anybody wants to play with pure-integer convolutions on the GPU (for the limited FFT sizes of 2M/4M/8M), the code is here: https://github.com/preda/gpuowl
Is it possible to test SP in gpuowl now?
I will soon buy an NVIDIA RTX 3090 and want to test whether the 3090 generates good results.
I want to test DP, SP and int32 (maybe using M31). Is it possible to perform such a test using gpuowl?


anyway, thanks for your great program!
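For readers wondering what "complex arithmetic with integers modulo some number M" in the quoted post looks like, here is a minimal Python sketch for M(31). It is an illustration only and not gpuowl's OpenCL code; the helper names `gmul`, `gpow`, and `root_of_unity` are made up for this example.

```python
Q = 31
P = (1 << Q) - 1  # M(31) = 2^31 - 1, a prime with P % 4 == 3


def gmul(x, y):
    """Multiply a + b*i by c + d*i with i*i = -1, coordinates reduced mod P."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def gpow(x, e):
    """Raise x to the non-negative integer power e by square-and-multiply."""
    result = (1, 0)
    while e:
        if e & 1:
            result = gmul(result, x)
        x = gmul(x, x)
        e >>= 1
    return result


def root_of_unity(n):
    """Find an element of exact order n, for n a power of two, 2 <= n <= 2**32."""
    group_order = P * P - 1  # size of the multiplicative group of GF(P^2)
    assert n >= 2 and group_order % n == 0
    for a in range(2, 100):
        w = gpow((a, 1), group_order // n)
        # w**n == 1 always holds; the order is exactly n if w**(n/2) != 1.
        if gpow(w, n // 2) != (1, 0):
            return w
    raise ValueError("no suitable element found")


w = root_of_unity(8)
print(w, gpow(w, 8))
```

Because M(31)² − 1 = 2^32·(2^30 − 1), roots of unity of every power-of-two order up to 2^32 exist in this arithmetic, which is what a power-of-two-length transform needs. Real implementations would replace `% P` with shift-and-add reduction, which the Mersenne form of the modulus makes cheap.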
Old 2020-09-25, 18:38   #2484
Xyzzy
"Mike"
Aug 2002

2·5·11·71 Posts

Quote:
Originally Posted by kriesel
77936867 is a composite exponent (77936867 = 1447 × 53861), so M(77936867) has known factors, which means there's little point in making that PRP run.
We just use that exponent for benchmarking purposes. As you mentioned, the RTX 2060 SUPER does much better with trial factoring.
Code:
Sep 25 13:32 | 3664  79.6% |  0.381   1m15s |   2150.15    75061    n.a.%
Sep 25 13:32 | 3669  79.7% |  0.375   1m13s |   2184.55    75061    n.a.%
Sep 25 13:32 | 3676  79.8% |  0.376   1m13s |   2178.74    75061    n.a.%
Sep 25 13:32 | 3681  79.9% |  0.374   1m12s |   2190.39    75061    n.a.%
Sep 25 13:32 | 3684  80.0% |  0.376   1m12s |   2178.74    75061    n.a.%
Sep 25 13:32 | 3685  80.1% |  0.376   1m12s |   2178.74    75061    n.a.%
Sep 25 13:32 | 3696  80.2% |  0.380   1m12s |   2155.81    75061    n.a.%
Sep 25 13:32 | 3697  80.3% |  0.378   1m11s |   2167.21    75061    n.a.%
Sep 25 13:32 | 3709  80.4% |  0.375   1m11s |   2184.55    75061    n.a.%
Sep 25 13:32 | 3712  80.5% |  0.376   1m10s |   2178.74    75061    n.a.%
Old 2020-09-25, 18:39   #2485
Xyzzy
"Mike"
Aug 2002

2×5×11×71 Posts

Quote:
Originally Posted by kriesel
Congratulations, you can apparently run mfakto on the hd630 because the Intel OpenCL is working. (But it does not have DP and OpenCL2.0, which gpuowl requires.)
Somehow we got gpuowl to work on our integrated graphics. (It runs so slowly that it isn't worth doing, but it is neat that it works at all!)
Code:
2020-09-25 13:26:07 Intel(R) UHD Graphics 630-1 OpenCL compilation in 6.71 s
2020-09-25 13:26:21 Intel(R) UHD Graphics 630-1 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2020-09-25 13:26:56 Intel(R) UHD Graphics 630-1 77936867 OK      800   0.00%; 29530 us/it; ETA 26d 15:17; 1579c241dc63eca6 (check 11.88s)
2020-09-25 15:07:23 Intel(R) UHD Graphics 630-1 77936867 OK   200000   0.26%; 30198 us/it; ETA 27d 04:05; f0b04b45b0855bd2 (check 11.97s)
Old 2020-09-25, 18:43   #2486
Mark Rose
"/X\(‘-‘)/X\"
Jan 2013

2×31×47 Posts

Quote:
Originally Posted by Neutron3529
Is it possible to test SP in gpuowl now?
I will soon buy an NVIDIA RTX 3090 and want to test whether the 3090 generates good results.
I want to test DP, SP and int32 (maybe using M31). Is it possible to perform such a test using gpuowl?


anyway, thanks for your great program!
The RTX 3090 is not much better than the 3080. Unless you want the absolute best performance in a single card, the 3080 offers vastly better performance for the money.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.