mersenneforum.org  

Old 2019-07-31, 05:41   #1299
SELROC
 

37·197 Posts

Quote:
Originally Posted by maxzor View Post
Hello and thank you for the program.
How much of it depends on CPU performance?
Will it be significantly slower running on a Radeon VII with a Pentium II, i5 2500 or R7 1800X (or 3600)?
I am about to compile on Linux soon.
I have an 1800X, with a Radeon VII set up for gpuOwl and an Nvidia 1050 Ti for the lesser stuff; any experience in balancing load between two GPUs is appreciated!

Hello, gpuowl is independent of CPU performance, because the computation is done on the GPU the whole time.
To compile on Linux you need a few libraries preinstalled: libgmp-dev, and probably g++-8, depending on your Linux distribution.


Any findings on how to run gpuowl on Nvidia are welcome.
Old 2019-08-11, 07:13   #1300
xx005fs
 
"Eric"
Jan 2018
USA

2²×53 Posts
R9 390x and R9 280x performance?

Hi all, a friend of mine recently upgraded his computer and is selling me his old R9 280x for a cheap price. Does anyone have performance numbers for gpuowl on the 280x at the current wavefront? Also, is buying a used R9 390x worth it for gpuowl? Benchmarks at the current wavefront would be welcome. Thanks
Old 2019-08-11, 08:59   #1301
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

3·457 Posts

Quote:
Originally Posted by xx005fs View Post
Hi all, a friend of mine recently upgraded his computer and is selling me his old R9 280x for a cheap price. Does anyone have performance numbers for gpuowl on the 280x at the current wavefront? Also, is buying a used R9 390x worth it for gpuowl? Benchmarks at the current wavefront would be welcome. Thanks
I never had an R9 280x, so I don't know its performance numbers. On Linux, ROCm may not support the 280x because it is too old, but amdgpu-pro should(?) support it. I wouldn't recommend buying an R9 390x for gpuowl: the best card to buy is the Radeon VII, which is expensive but makes up for its price in power cost savings, IMO.
Old 2019-08-11, 15:32   #1302
xx005fs
 
"Eric"
Jan 2018
USA

11010100₂ Posts

Quote:
Originally Posted by preda View Post
I never had an R9 280x, so I don't know its performance numbers. On Linux, ROCm may not support the 280x because it is too old, but amdgpu-pro should(?) support it. I wouldn't recommend buying an R9 390x for gpuowl: the best card to buy is the Radeon VII, which is expensive but makes up for its price in power cost savings, IMO.
Thanks for the suggestions. The LL benchmark values on mersenne.ca seem to have been run on clLucas and are vastly different from gpuowl speeds (besides the Radeon VII run), and I see the R9 390x surprisingly placed above the Vega 56, so I just wanted confirmation of the performance hierarchy. Also, it seems that Radeon VIIs are out of stock nearly everywhere and the price has gone up from $600 to $700, which is definitely out of my current budget range. I guess I'll just wait until AMD or Nvidia releases a true "professional" GPU with a high double-precision to single-precision ratio.
Old 2019-08-11, 17:11   #1303
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by xx005fs View Post
Also, it seems that Radeon VIIs are out of stock nearly everywhere and the price has gone up from $600 to $700, which is definitely out of my current budget range. I guess I'll just wait until AMD or Nvidia releases a true "professional" GPU with a high double-precision to single-precision ratio.
If you think the Radeon VII is expensive, check out the prices of the current NVIDIA Tesla or Quadro or AMD MI product lines, which have the higher DP capability of a pro GPU. I once found a used MI25 for around $3000.
Old 2019-09-03, 13:47   #1304
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

3·457 Posts
P-1 bug in GpuOwl

Hi, I just realized that P-1 stage2 in GpuOwl has been broken since v6.5-51-gefc3c9f.
P-1 should be fixed starting with v6.6 (pending more validation).
If somebody had the bad luck of doing P-1 with an affected version, the stage2 part of the "no factor" results is not valid (stage1 is good, though). Any factor-found results are good, too.

Independently, I recently changed the memory allocation in stage2, which was a problem in the past (reported by Ken); this was how I realized there's a bug.

While this shows a lack of testing on my part, it's also a reminder to self-validate: please do a couple of P-1 runs on known results (which can be found in the folder test-pm1 in the GpuOwl source) before starting serious P-1 work.
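
For anyone new to the method, a toy version of P-1 stage 1 shows what a check against a known result exercises. This is a plain-Python sketch and not GpuOwl's implementation: the function names are made up for illustration, and the known case is Cole's factorization 2^67 - 1 = 193707721 × 761838257287, where 193707721 - 1 = 2^3 · 3^3 · 5 · 67 · 2677, so B1 = 2677 is enough for stage 1 to pick up that factor.

```python
from math import gcd, isqrt

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytes(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def pm1_stage1(p, b1, base=3):
    """Toy P-1 stage 1 on N = 2**p - 1; returns gcd(base**E - 1, N)."""
    n = (1 << p) - 1
    # Every factor q of 2**p - 1 has q - 1 divisible by 2p, so 2p goes into E.
    x = pow(base, 2 * p, n)
    for q in primes_up_to(b1):
        qk = q
        while qk * q <= b1:  # largest power of q not exceeding B1
            qk *= q
        x = pow(x, qk, n)
    return gcd(x - 1, n)

# Known case: the result should be a multiple of 193707721.
g = pm1_stage1(67, 2677)
print(g, g % 193707721 == 0)
```

A real run differs in every practical respect (FFT multiplication on the GPU, much larger bounds, a separate stage 2), so this only illustrates the idea of comparing against a factor that is already known.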
Old 2019-09-03, 21:35   #1305
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by preda View Post
Hi, I just realized that P-1 stage2 in GpuOwl has been broken since v6.5-51-gefc3c9f.
P-1 should be fixed starting with v6.6 (pending more validation).
If somebody had the bad luck of doing P-1 with an affected version, the stage2 part of the "no factor" results is not valid (stage1 is good, though). Any factor-found results are good, too.

Independently, I recently changed the memory allocation in stage2, which was a problem in the past (reported by Ken); this was how I realized there's a bug.

While this shows a lack of testing on my part, it's also a reminder to self-validate: please do a couple of P-1 runs on known results (which can be found in the folder test-pm1 in the GpuOwl source) before starting serious P-1 work.
Great; progress I've been waiting for.
What sort of validation do you have in mind?
We all should validate each specific installation combination and, for most GPU applications, also benchmark and tune for the specific system, GPU model, planned work type and exponent range.

I've made a successful 4M-exponent P-1 run in gpuowl on Win7 and an RX480, and a 50M-exponent one, using -use NO_ASM on gpuowl-win v6.6-5-g667954b, although ORIG_X2 might have been better.
Bounds, and therefore runtime, were excessive on the 4M run. It found the product of all 3 known factors for it.
Code:
{"exponent":"4444091", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-03 20:26:34 UTC", "fft-length":229376, "B1":500000, "B2":15000000, "factors":["1809798096458971047321927127"]}
{"exponent":"50001781", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-03 21:12:48 UTC", "fft-length":2883584, "B1":95000, "B2":4100000, "factors":["4392938042637898431087689"]}
Do you have any guidance on what is feasible on a GPU versus the installed GPU RAM amount (2GB, 4GB, 8GB, etc.)?
Interestingly, GPU-Z v2.24.0 showed 0% GPU load throughout stage 2, but showed 100% load during stage 1 and during v6.5 or v3.8 PRP3. GPU-Z shows other oddities on this system when accessed by RDP, which is almost always.
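
Result lines like the two above can be sanity-checked without gpuowl: f divides 2^p - 1 exactly when 2^p ≡ 1 (mod f). A minimal Python sketch follows; the helper names `divides_mersenne` and `check_result_line` are illustrative and not part of gpuowl, and only the "exponent" and "factors" fields are taken from the JSON shown above.

```python
import json

def divides_mersenne(p, f):
    """Return True if f divides 2**p - 1, i.e. 2**p == 1 (mod f)."""
    return f > 1 and pow(2, p, f) == 1

def check_result_line(line):
    """Map each factor reported in a gpuowl-style JSON result line to True/False."""
    result = json.loads(line)
    p = int(result["exponent"])
    return {f: divides_mersenne(p, int(f)) for f in result.get("factors", [])}

# The two result lines from this post, trimmed to the fields used here.
lines = [
    '{"exponent":"4444091", "factors":["1809798096458971047321927127"]}',
    '{"exponent":"50001781", "factors":["4392938042637898431087689"]}',
]
for line in lines:
    print(check_result_line(line))
```

The three-argument `pow` does modular exponentiation, so the check stays fast even for 100M-range exponents. A composite "factor", such as the product of several known factors, passes the same test as long as every prime in it divides 2^p - 1.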

Last fiddled with by kriesel on 2019-09-03 at 21:37
Old 2019-09-03, 22:15   #1306
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1010100101011₂ Posts
P-1 bounds in GpuOwl

Please consider, rather than the fixed B1 bound and derived B2 bound, making the default similar to the GPUto72 bounds or a fit to them. See https://www.mersenneforum.org/showpo...7&postcount=23
Old 2019-09-03, 22:25   #1307
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts

Quote:
Originally Posted by kriesel View Post
I've made a successful 4M-exponent P-1 run in gpuowl on Win7 and an RX480, and a 50M-exponent one, using -use NO_ASM on gpuowl-win v6.6-5-g667954b, although ORIG_X2 might have been better.
Code:
{"exponent":"4444091", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-03 20:26:34 UTC", "fft-length":229376, "B1":500000, "B2":15000000, "factors":["1809798096458971047321927127"]}
{"exponent":"50001781", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-03 21:12:48 UTC", "fft-length":2883584, "B1":95000, "B2":4100000, "factors":["4392938042637898431087689"]}
The news is not as good on NVIDIA. This was an attempt on a GTX 1080 Ti.
Code:
>gpuowl-win -device 0 -use ORIG_X2 -B1 6000 -B2 2100000 -pm1 51558151
2019-09-03 17:16:17 gpuowl v6.6-5-g667954b
2019-09-03 17:16:17 Note: no config.txt file found
2019-09-03 17:16:17 config: -device 0 -use ORIG_X2 -B1 6000 -B2 2100000 -pm1 51558151
2019-09-03 17:16:17 51558151 FFT 2816K: Width 8x8, Height 256x8, Middle 11; 17.88 bits/word
2019-09-03 17:16:17 using short carry kernels
2019-09-03 17:16:17 OpenCL args "-DEXP=51558151u -DWIDTH=64u -DSMALL_HEIGHT=2048u -DMIDDLE=11u -DWEIGHT_STEP=0x8.b1cf5f16b2fap-3 -DIWEIGHT_STEP=0xe.b8c9efc21a378p-4 -DWEIGHT_BIGSTEP=0x8.b95c1e3ea8bd8p-3 -DIWEIGHT_BIGSTEP=0xe.ac0c6e7dd2438p-4 -DORIG_X2=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-09-03 17:16:24

2019-09-03 17:16:24 OpenCL compilation in 6895 ms
2019-09-03 17:16:25 51558151 P-1 starting stage1
2019-09-03 17:18:01 Exception 9gpu_error: INVALID_VALUE clGetDeviceInfo(id, what, bufSize, buf, NULL) at clwrap.cpp:98 getInfo
2019-09-03 17:18:01 Bye

>gpuowl-win -device 0 -use ORIG_X2 -B1 2000 -B2 8000000 -pm1 100000081
2019-09-03 17:18:01 gpuowl v6.6-5-g667954b
2019-09-03 17:18:01 Note: no config.txt file found
2019-09-03 17:18:01 config: -device 0 -use ORIG_X2 -B1 2000 -B2 8000000 -pm1 100000081
2019-09-03 17:18:01 100000081 FFT 5632K: Width 256x4, Height 64x4, Middle 11; 17.34 bits/word
2019-09-03 17:18:01 using short carry kernels
2019-09-03 17:18:02 OpenCL args "-DEXP=100000081u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xc.a5067a8c5cb2p-3 -DIWEIGHT_STEP=0xa.1f74af2719fap-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DORIG_X2=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-09-03 17:18:05

2019-09-03 17:18:05 OpenCL compilation in 3541 ms
2019-09-03 17:18:07 100000081 P-1 starting stage1
2019-09-03 17:20:57 Exception 9gpu_error: INVALID_VALUE clGetDeviceInfo(id, what, bufSize, buf, NULL) at clwrap.cpp:98 getInfo
2019-09-03 17:20:57 Bye
Old 2019-09-04, 00:02   #1308
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts
cmd line options; p-1 save files

I note there are -user and -cpu command line options, but no -aid.
Is there a save file provision for P-1 runs so that they can be stopped and later continued, or is a run in progress lost if it is halted?

Last fiddled with by kriesel on 2019-09-04 at 00:05
Old 2019-09-04, 01:08   #1309
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts
gpuowl reports both target bounds with stage 1 factor found

See 100002337. The factor was found in stage 1, but both the B1 and B2 targets were included in the report. When a factor is found in stage 1 and stage 2 is not fully performed, it is customary to report both B1 and B2 as the B1 value. That way, the recorded bounds reflect the factoring limits actually completed, and processing credit is computed properly rather than overestimated.
Code:
2019-09-03 19:29:07 100002337     1170000 97.69%; 4859 us/sq; ETA 0d 00:02; a2c219ce3eac5f47
2019-09-03 19:29:55 100002337     1180000 98.52%; 4848 us/sq; ETA 0d 00:01; f8dd78f4927d7326
2019-09-03 19:30:44 100002337     1190000 99.36%; 4863 us/sq; ETA 0d 00:01; df1df66b8d7a209c
2019-09-03 19:31:21 P-1 stage2 using 160 buffers of 44.0 MB each
2019-09-03 19:31:22 P-1 (B1=830000, B2=17430000, D=30030): primes 1050980, expanded 1071560, doubles 177259 (left 703338), singles 696462, total 873721 (83%)
2019-09-03 19:31:22 100002337 P-1 stage2: 553 blocks starting at block 28 (873721 selected)
2019-09-03 19:35:48 Round 1 of 18: init 4.62 s; 5.37 ms/mul; 48637 muls
2019-09-03 19:35:48 100002337 P-1 stage1 GCD: 2393819567666978656303937
2019-09-03 19:35:48 {"exponent":"100002337", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-04 00:35:48 UTC", "fft-length":5767168, "B1":830000, "B2":17430000, "factors":["2393819567666978656303937"]}
2019-09-03 19:35:48 Bye
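
The reporting convention described above could be sketched like this. It is a hypothetical helper, not GpuOwl code: the function name and the `stage2_completed` flag are assumptions made for illustration, and only the JSON field names and the bounds come from the log above.

```python
import json

def reported_bounds(b1, b2, stage2_completed):
    """Return the (B1, B2) pair that reflects the work actually completed."""
    # If stage 2 did not run to completion, only B1 was really covered.
    return (b1, b2) if stage2_completed else (b1, b1)

# Bounds from the run above, with the factor treated as a stage 1 find.
b1, b2 = reported_bounds(830000, 17430000, stage2_completed=False)
record = {
    "exponent": "100002337",
    "worktype": "PM1",
    "status": "F",
    "B1": b1,
    "B2": b2,
    "factors": ["2393819567666978656303937"],
}
print(json.dumps(record))
```

Keeping the decision in one small function means the same rule applies whether the result is logged, written to a results file, or submitted.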

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.