[QUOTE=maxzor;522179]Hello and thank you for the program.
How much of it depends on CPU performance? Will it be significantly slower running on a Radeon VII with a Pentium II, i5 2500 or R7 1800X (or 3600)? I am about to compile on Linux soon. I have a 1800X, and set up a Radeon VII for gpuOwl and an Nvidia 1050 Ti for the lesser stuff; any experience in balancing load between two GPUs is appreciated![/QUOTE] Hello, gpuowl does not depend on CPU performance, because the computation is done on the GPU all the time. To compile on Linux you need a few libraries preinstalled: libgmp-dev, and probably g++-8 depending on your Linux distribution. Any findings on how to run gpuowl on Nvidia are welcome. |
R9 390x and R9 280x performance?
Hi all, a friend of mine recently upgraded his computer and is selling me his old R9 280x for a cheap price. Does anyone have performance numbers for gpuowl on the 280x at the current wavefront? Also, is buying a used R9 390x worth it for gpuowl? Benchmarks at the current wavefront would be welcome. Thanks
|
[QUOTE=xx005fs;523498]Hi all, a friend of mine recently upgraded his computer and is selling me his old R9 280x for a cheap price. Does anyone have performance numbers for gpuowl on the 280x at the current wavefront? Also, is buying a used R9 390x worth it for gpuowl? Benchmarks at the current wavefront would be welcome. Thanks[/QUOTE]
I haven't had an R9 280x, so I don't know the performance numbers. On Linux, ROCm may not support the 280x because it is too old, but amdgpu-pro should(?) support it. I wouldn't recommend buying an R9 390x for gpuowl: the best thing to buy is a Radeon VII, which is expensive but makes up for the price in power cost savings, IMO. |
[QUOTE=preda;523499]I haven't had an R9 280x, so I don't know the performance numbers. On Linux, ROCm may not support the 280x because it is too old, but amdgpu-pro should(?) support it. I wouldn't recommend buying an R9 390x for gpuowl: the best thing to buy is a Radeon VII, which is expensive but makes up for the price in power cost savings, IMO.[/QUOTE]
Thanks for the suggestions. The LL benchmark values on mersenne.ca seem to have been run with clLucas and are vastly different from gpuowl speeds (apart from the Radeon VII run), and I see the R9 390x surprisingly placed above the Vega 56, so I just wanted confirmation of the performance hierarchy. Also, it seems that Radeon VIIs are out of stock nearly everywhere and the price has gone up from $600 to $700, which is definitely out of my current budget range. I guess I'll just wait until AMD or Nvidia releases a true "professional" GPU with a high double-precision to single-precision ratio. |
[QUOTE=xx005fs;523522]Also, it seems that Radeon VIIs are out of stock nearly everywhere and the price has gone up from $600 to $700, which is definitely out of my current budget range. I guess I'll just wait until AMD or Nvidia releases a true "professional" GPU with a high double-precision to single-precision ratio.[/QUOTE]If you think the Radeon VII is expensive, check out the prices of the current NVIDIA Tesla or Quadro or AMD MI product lines, which have the higher DP capability of a pro GPU. I found an MI25 once, used, for around $3000.
|
P-1 bug in GpuOwl
Hi, I just realized that P-1 stage2 in GpuOwl has been broken since v6.5-51-gefc3c9f.
P-1 should be fixed starting with v6.6 (pending more validation). If somebody had the bad luck of doing P-1 with an affected version, the stage2 part of the "no factor" results is not valid (stage1 is good, though). Any factor-found results are good, too. Independently, I recently changed the memory allocation in stage2, which was a problem in the past (reported by Ken); this was how I realized there was a bug. While this shows a lack of testing on my part, it is also a reminder to self-validate: please do a couple of P-1 runs on known results (they can be found in the test-pm1 folder in the GpuOwl source) before starting serious P-1 work. |
[QUOTE=preda;525076]Hi, I just realized that P-1 stage2 in GpuOwl has been broken since v6.5-51-gefc3c9f.
P-1 should be fixed starting with v6.6 (pending more validation). If somebody had the bad luck of doing P-1 with an affected version, the stage2 part of the "no factor" results is not valid (stage1 is good, though). Any factor-found results are good, too. Independently, I recently changed the memory allocation in stage2, which was a problem in the past (reported by Ken); this was how I realized there was a bug. While this shows a lack of testing on my part, it is also a reminder to self-validate: please do a couple of P-1 runs on known results (they can be found in the test-pm1 folder in the GpuOwl source) before starting serious P-1 work.[/QUOTE] Great; progress I've been waiting for. What sort of validation do you have in mind? We should all validate each specific installation combination and, for most GPU applications, also benchmark and tune for the specific system, GPU model, planned work type and exponent range. I've made a successful 4M exponent P-1 run in gpuowl on Win7 and RX480 and a 50M p using -use NO_ASM on gpuowl-win v6.6-5-667954b, although ORIG_X2 might have been better. Bounds and therefore runtime were excessive on the 4M. It found the product of all 3 known factors for it.
[CODE]{"exponent":"4444091", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-03 20:26:34 UTC", "fft-length":229376, "B1":500000, "B2":15000000, "factors":["1809798096458971047321927127"]}
{"exponent":"50001781", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-03 21:12:48 UTC", "fft-length":2883584, "B1":95000, "B2":4100000, "factors":["4392938042637898431087689"]}
[/CODE]Do you have any guidance on what is feasible on a GPU for a given amount of installed GPU RAM (2GB, 4GB, 8GB, etc.)? Interestingly, GPU-Z v2.24.0 indicated 0% GPU load throughout stage 2, but showed 100% load during stage 1 and during v6.5 or v3.8 PRP3.
GPU-Z shows other oddities on this system when accessed by RDP, which is almost always. |
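One cheap way to self-validate a factor-found result like the ones above: a number f divides 2^p - 1 exactly when 2^p leaves remainder 1 modulo f, and that holds for a composite product of factors as well. A minimal Python sketch follows; the function name is mine and is not part of gpuowl.

```python
def divides_mersenne(p: int, f: int) -> bool:
    """Return True if f divides 2**p - 1 (f may be prime or composite)."""
    # 2**p - 1 is divisible by f exactly when 2**p leaves remainder 1 mod f.
    return f > 1 and pow(2, p, f) == 1


# (exponent, reported factor) pairs copied from the result lines above.
reported = [
    (4444091, 1809798096458971047321927127),
    (50001781, 4392938042637898431087689),
]

for p, f in reported:
    verdict = "divides" if divides_mersenne(p, f) else "does NOT divide"
    print(f"{f} {verdict} 2^{p} - 1")
```

The three-argument pow does modular exponentiation, so this runs instantly even for 100M-range exponents.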
P-1 bounds in GpuOwl
Please consider making the default bounds similar to the GPUto72 bounds, or a fit to them, rather than a fixed B1 bound and a derived B2 bound. See [url]https://www.mersenneforum.org/showpost.php?p=522257&postcount=23[/url]
|
[QUOTE=kriesel;525105]I've made a successful 4M exponent P-1 run in gpuowl on Win7 and RX480 and a 50M p using -use NO_ASM on gpuowl-win v6.6-5-667954b, although ORIG_X2 might have been better.
[CODE]{"exponent":"4444091", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-03 20:26:34 UTC", "fft-length":229376, "B1":500000, "B2":15000000, "factors":["1809798096458971047321927127"]}
{"exponent":"50001781", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-03 21:12:48 UTC", "fft-length":2883584, "B1":95000, "B2":4100000, "factors":["4392938042637898431087689"]}
[/CODE][/QUOTE]The news is not as good on NVIDIA. This was an attempt on a GTX 1080 Ti.
[CODE]>gpuowl-win -device 0 -use ORIG_X2 -B1 6000 -B2 2100000 -pm1 51558151
2019-09-03 17:16:17 gpuowl v6.6-5-g667954b
2019-09-03 17:16:17 Note: no config.txt file found
2019-09-03 17:16:17 config: -device 0 -use ORIG_X2 -B1 6000 -B2 2100000 -pm1 51558151
2019-09-03 17:16:17 51558151 FFT 2816K: Width 8x8, Height 256x8, Middle 11; 17.88 bits/word
2019-09-03 17:16:17 using short carry kernels
2019-09-03 17:16:17 OpenCL args "-DEXP=51558151u -DWIDTH=64u -DSMALL_HEIGHT=2048u -DMIDDLE=11u -DWEIGHT_STEP=0x8.b1cf5f16b2fap-3 -DIWEIGHT_STEP=0xe.b8c9efc21a378p-4 -DWEIGHT_BIGSTEP=0x8.b95c1e3ea8bd8p-3 -DIWEIGHT_BIGSTEP=0xe.ac0c6e7dd2438p-4 -DORIG_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-09-03 17:16:24
2019-09-03 17:16:24 OpenCL compilation in 6895 ms
2019-09-03 17:16:25 51558151 P-1 starting stage1
2019-09-03 17:18:01 Exception 9gpu_error: INVALID_VALUE clGetDeviceInfo(id, what, bufSize, buf, NULL) at clwrap.cpp:98 getInfo
2019-09-03 17:18:01 Bye

>gpuowl-win -device 0 -use ORIG_X2 -B1 2000 -B2 8000000 -pm1 100000081
2019-09-03 17:18:01 gpuowl v6.6-5-g667954b
2019-09-03 17:18:01 Note: no config.txt file found
2019-09-03 17:18:01 config: -device 0 -use ORIG_X2 -B1 2000 -B2 8000000 -pm1 100000081
2019-09-03 17:18:01 100000081 FFT 5632K: Width 256x4, Height 64x4, Middle 11; 17.34 bits/word
2019-09-03 17:18:01 using short carry kernels
2019-09-03 17:18:02 OpenCL args "-DEXP=100000081u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xc.a5067a8c5cb2p-3 -DIWEIGHT_STEP=0xa.1f74af2719fap-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DORIG_X2=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-09-03 17:18:05
2019-09-03 17:18:05 OpenCL compilation in 3541 ms
2019-09-03 17:18:07 100000081 P-1 starting stage1
2019-09-03 17:20:57 Exception 9gpu_error: INVALID_VALUE clGetDeviceInfo(id, what, bufSize, buf, NULL) at clwrap.cpp:98 getInfo
2019-09-03 17:20:57 Bye[/CODE] |
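For anyone reading the log above: the "bits/word" figure looks like it is simply the exponent divided by the FFT length in words, with "2816K" meaning 2816 × 1024 words. That is my reading of the printed numbers, not something taken from the gpuowl source. A quick Python check (the function name is hypothetical):

```python
def bits_per_word(exponent: int, fft_k: int) -> float:
    """Average number of exponent bits carried per FFT word.

    fft_k is the FFT size as printed in the log, in units of 1024 words
    (assumption: "2816K" means 2816 * 1024 words).
    """
    return exponent / (fft_k * 1024)


# (exponent, FFT size in K) pairs taken from the two runs in the log.
for exponent, fft_k in [(51558151, 2816), (100000081, 5632)]:
    print(f"{exponent} FFT {fft_k}K: {bits_per_word(exponent, fft_k):.2f} bits/word")
```

If the assumption holds, the printed values should agree with the 17.88 and 17.34 bits/word shown in the two runs.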
cmd line options; p-1 save files
I note there is -user and -cpu but no -aid command line option.
Is there a save file provision for P-1 runs so that they can be stopped and later continued, or is a run in progress lost if it is halted? |
gpuowl reports both target bounds with stage 1 factor found
See [M]100002337[/M]. The factor was found in stage 1, yet both target bounds, B1 and B2, were included in the report. When a factor is found in stage 1 and stage 2 is not fully performed, it is customary to report B2 equal to B1. That way, the recorded bounds reflect the factoring limits actually completed, and processing credit is computed properly rather than overestimated.
[CODE]2019-09-03 19:29:07 100002337 1170000 97.69%; 4859 us/sq; ETA 0d 00:02; a2c219ce3eac5f47
2019-09-03 19:29:55 100002337 1180000 98.52%; 4848 us/sq; ETA 0d 00:01; f8dd78f4927d7326
2019-09-03 19:30:44 100002337 1190000 99.36%; 4863 us/sq; ETA 0d 00:01; df1df66b8d7a209c
2019-09-03 19:31:21 P-1 stage2 using 160 buffers of 44.0 MB each
2019-09-03 19:31:22 P-1 (B1=830000, B2=17430000, D=30030): primes 1050980, expanded 1071560, doubles 177259 (left 703338), singles 696462, total 873721 (83%)
2019-09-03 19:31:22 100002337 P-1 stage2: 553 blocks starting at block 28 (873721 selected)
2019-09-03 19:35:48 Round 1 of 18: init 4.62 s; 5.37 ms/mul; 48637 muls
2019-09-03 19:35:48 100002337 P-1 [B]stage1 GCD[/B]: 2393819567666978656303937
2019-09-03 19:35:48 {"exponent":"100002337", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.6-5-g667954b"}, "timestamp":"2019-09-04 00:35:48 UTC", "fft-length":5767168, "B1":830000, "B2":[B]17430000[/B], "factors":["2393819567666978656303937"]}
2019-09-03 19:35:48 Bye[/CODE] |
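A minimal sketch of the reporting convention described above, in Python: if the factor came out of the stage 1 GCD, clamp the reported B2 down to B1 before emitting the result. The function and the found_in_stage1 flag are hypothetical and are not gpuowl's actual code; only the JSON field names come from the result line in the log.

```python
import json


def reported_result(result: dict, found_in_stage1: bool) -> dict:
    """Return a copy of a P-1 result whose bounds reflect completed work."""
    adjusted = dict(result)
    # Stage 2 was not completed, so only B1 was actually covered.
    if found_in_stage1 and adjusted.get("status") == "F":
        adjusted["B2"] = adjusted["B1"]
    return adjusted


# Fields copied from the result line in the log (program and timestamp omitted).
line = (
    '{"exponent":"100002337", "worktype":"PM1", "status":"F", '
    '"fft-length":5767168, "B1":830000, "B2":17430000, '
    '"factors":["2393819567666978656303937"]}'
)

print(json.dumps(reported_result(json.loads(line), found_in_stage1=True)))
```

Results with no factor, or with a factor found in stage 2, pass through unchanged, so the full B2 is still credited when stage 2 actually ran to completion.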