#3532
Feb 2022
7 Posts
Quote:
I'm running my chip stock, without PBO. PBO is explicitly set to DISABLED in the BIOS, so it's running stock power limits through the chip. So basically this is the out-of-the-box performance, no overclocking. (See screenshots)
#3533
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1E90₁₆ Posts
So, same exponent:
prime95, 5950X out of the box, ~106 W, 21.6 hours, 2.29 kWh
Gpuowl, Radeon VII, ~200 W, 8.98 hours, 1.80 kWh; raised to 250 W, 8.52 hours, 2.13 kWh
You may be able to improve your 5950X power cost per result with some tuning.
Last fiddled with by kriesel on 2022-03-30 at 17:05
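(For anyone checking the arithmetic: those kWh figures are just average power times wall-clock run time. A minimal Python sketch, using only the wattages and hours quoted above:)
Code:
# Energy per PRP result ~= average power (W) x run time (h) / 1000.
# The figures are the ones quoted in the post above.
runs = [
    ("prime95, 5950X stock",      106, 21.6),
    ("gpuowl, Radeon VII ~200 W", 200, 8.98),
    ("gpuowl, Radeon VII ~250 W", 250, 8.52),
]

for name, watts, hours in runs:
    print(f"{name}: {watts * hours / 1000:.2f} kWh per result")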
#3534
Feb 2022
7₈ Posts
And here is my RTX 2080. I don't think it's the same test though, because I'm pretty sure that AMD CPU > Nvidia GPU at PRP.
Can you give me the worktodo.txt line for GpuOwl that you think is the correct one to benchmark your GPU against my RTX 2080? I used PRP=N/A,1,2,65005679,-1,65,2. TBH I'm not sure of all the positional inputs; as far as I know it's PRP=AID,?,?,Exponent,?,MaxBit,?. Some enlightenment would be appreciated.
Code:
PS D:\GIMPS\gpuowl-v7.2-91-g9c22195> .\gpuowl-win.exe
20220330 12:57:00 GpuOwl VERSION v7.2-91-g9c22195
20220330 12:57:00 GpuOwl VERSION v7.2-91-g9c22195
20220330 12:57:00 config: -cpu OPS71-GPU_RTX2080
20220330 12:57:00 config: -user nullcure
20220330 12:57:00 config:
20220330 12:57:00 device 0, unique id ''
20220330 12:57:00 OPS71-GPU_RTX2080 65005679 FFT: 3.50M 1K:7:256 (17.71 bpw)
20220330 12:57:00 OPS71-GPU_RTX2080 65005679 OpenCL args "-DEXP=65005679u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=7u -DWEIGHT_STEP=0.22040343673494417 -DIWEIGHT_STEP=-0.18059883322240508 -DIWEIGHTS={0,-0.32858172788351608,-0.098395007736230292,-0.3946459339626639,-0.18710843792504778,-0.45420975197356606,-0.267092909464129,-0.01582557530097212,} -DFWEIGHTS={0,0.48938454839446288,0.10913316649808907,0.65192580029407698,0.23017638102607779,0.83220569370005959,0.3644294248386134,0.01608005136468748,} -cl-std=CL2.0 -cl-finite-math-only "
20220330 12:57:04 OPS71-GPU_RTX2080 65005679
20220330 12:57:04 OPS71-GPU_RTX2080 65005679 OpenCL compilation in 3.59 s
20220330 12:57:04 OPS71-GPU_RTX2080 65005679 maxAlloc: 0.0 GB
20220330 12:57:04 OPS71-GPU_RTX2080 65005679 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
20220330 12:57:04 OPS71-GPU_RTX2080 65005679 P1(3.5M) 5049336 bits
20220330 12:57:05 OPS71-GPU_RTX2080 65005679 PRP starting from beginning
20220330 12:57:05 OPS71-GPU_RTX2080 65005679 Acquired memory lock 'memlock-0'
20220330 12:57:05 OPS71-GPU_RTX2080 65005679 P1(3.5M) using 204 buffers
20220330 12:57:15 OPS71-GPU_RTX2080 65005679 OK 0 on-load: blockSize 400, 0000000000000003
20220330 12:57:15 OPS71-GPU_RTX2080 65005679 validating proof residues for power 8
20220330 12:57:15 OPS71-GPU_RTX2080 65005679 Proof using power 8
20220330 12:57:45 OPS71-GPU_RTX2080 65005679 10000 b70c81dfa6ea9eed 3008
20220330 12:58:16 OPS71-GPU_RTX2080 65005679 20000 1805210eafbd98b6 3050
20220330 12:58:46 OPS71-GPU_RTX2080 65005679 30000 1308d69bec04ec33 3029
20220330 12:59:17 OPS71-GPU_RTX2080 65005679 40000 0ef3ad7aa0f100a8 3061
20220330 12:59:47 OPS71-GPU_RTX2080 65005679 50000 592b24ce299b8ab7 3005
20220330 13:00:17 OPS71-GPU_RTX2080 65005679 60000 e43185ab59a07b48 3022
20220330 13:00:47 OPS71-GPU_RTX2080 65005679 70000 8fc4923f37774afe 3011
20220330 13:01:17 OPS71-GPU_RTX2080 65005679 80000 a3dccd3f52ab7805 3001
20220330 13:01:47 OPS71-GPU_RTX2080 65005679 90000 439df49da0e0a8e5 2979
20220330 13:02:16 OPS71-GPU_RTX2080 65005679 100000 1b2a0189d2a3249d 2974
20220330 13:02:46 OPS71-GPU_RTX2080 65005679 110000 5a597fce90bb9501 2985
20220330 13:03:16 OPS71-GPU_RTX2080 65005679 120000 b73ac48ccfdee32a 2996
20220330 13:03:46 OPS71-GPU_RTX2080 65005679 130000 3a742e8e45958c31 2986
20220330 13:04:16 OPS71-GPU_RTX2080 65005679 140000 731143da7eec3832 2974
20220330 13:04:46 OPS71-GPU_RTX2080 65005679 150000 7d920954585b88b4 2979
20220330 13:05:15 OPS71-GPU_RTX2080 65005679 160000 15d917ccac4a9db2 2976
20220330 13:05:45 OPS71-GPU_RTX2080 65005679 170000 0755ad595f35430c 2978
20220330 13:06:15 OPS71-GPU_RTX2080 65005679 180000 ec7e8dcc75f7321f 2982
20220330 13:06:45 OPS71-GPU_RTX2080 65005679 190000 07abab06c10bb4c9 2976
20220330 13:07:17 OPS71-GPU_RTX2080 65005679 OK 200000 0.31% 2b12a6ddb4071298 2977 us/it + check 1.11s + save 1.65s; ETA 2d 05:35 | P1(3.5M) 4.0% ETA 04:01 45a22c5565cbc38a
20220330 13:07:28 OPS71-GPU_RTX2080 65005679 Stopping, please wait..
20220330 13:07:31 OPS71-GPU_RTX2080 65005679 OK 203600 0.31% 6c427f59576a1693 2990 us/it + check 1.12s + save 1.68s; ETA 2d 05:49 | P1(3.5M) 4.0% ETA 04:01 ebbc2ba8ee984b9f
20220330 13:07:31 OPS71-GPU_RTX2080 65005679 P1(3.5M) releasing 204 buffers
20220330 13:07:31 OPS71-GPU_RTX2080 65005679 Released memory lock 'memlock-0'
20220330 13:07:31 OPS71-GPU_RTX2080 Exiting because "stop requested"
20220330 13:07:31 OPS71-GPU_RTX2080 Bye
PS D:\GIMPS\gpuowl-v7.2-91-g9c22195>
#3535
Jun 2003
1010101011000₂ Posts
It is doing a P-1 test because you told it that doing P-1 could potentially save two primality tests (the last "2" in your work line).
Use PRP=N/A,1,2,65005679,-1,65,0.
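(For the earlier question about the positional inputs: as far as I understand the worktodo format, the candidate tested is k·b^n + c and the last two fields control the pre-PRP factoring work. A hedged sketch of how that line reads; the field names below are descriptive labels, not official ones.)
Code:
# Field-by-field reading of the PRP= worktodo line above (labels are descriptive, not official).
# The tested candidate is k*b^n + c, i.e. 1*2^65005679 - 1 here.
line = "PRP=N/A,1,2,65005679,-1,65,0"

names = [
    "assignment_id",     # "N/A" when the exponent was not reserved through PrimeNet
    "k", "b", "n", "c",  # candidate = k*b^n + c
    "how_far_factored",  # trial factoring already done, in bits
    "tests_saved",       # primality tests a found P-1 factor would save;
]                        # 0 skips P-1, while the earlier 2 made gpuowl run P-1 first

print(dict(zip(names, line.split("=", 1)[1].split(","))))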
#3536
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴×3×163 Posts
Quote:
See #2 of the best practices reference post; for worktodo entry formats, see that reference post as well. Gpuowl v7.x, which you used, runs P-1 stage 1 and PRP simultaneously at the beginning unless you explicitly tell it not to include P-1. Note there are TWO ETA values per line in parts of your abbreviated run: the first is for the PRP, the second for the P-1 stage 1.
Pure PRP, as in gpuowl v6.11-380, would be a better comparison, since the P-1 provisions in v7.x introduce some overhead. It won't change the conclusion that the RTX 2080 is not competitively fast at PRP. The big benchmark result pages at mersenne.ca are useful guides.
Last fiddled with by kriesel on 2022-03-30 at 18:08
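(To make the two-ETA point concrete, here is a toy parse of one of the v7.x progress lines quoted earlier; everything before the "|" reports the PRP, everything after it the concurrent P-1 stage 1. Illustration only, not part of gpuowl.)
Code:
# Toy parse of a gpuowl v7.x progress line quoted earlier in this thread:
# the part before "|" reports the PRP, the part after it the P-1 stage 1.
line = ("20220330 13:07:17 OPS71-GPU_RTX2080 65005679 OK 200000 0.31% "
        "2b12a6ddb4071298 2977 us/it + check 1.11s + save 1.65s; "
        "ETA 2d 05:35 | P1(3.5M) 4.0% ETA 04:01 45a22c5565cbc38a")

prp_part, p1_part = (s.strip() for s in line.split("|"))
print("PRP ETA:", prp_part.split("ETA")[-1].strip())    # -> 2d 05:35
print("P-1 ETA:", p1_part.split("ETA")[-1].split()[0])  # -> 04:01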
#3537
Feb 2022
7 Posts
Quote:
Here is the correct output. Code:
PS D:\GIMPS\gpuowl-v7.2-91-g9c22195> .\gpuowl-win.exe
20220330 14:39:16 GpuOwl VERSION v7.2-91-g9c22195
20220330 14:39:16 GpuOwl VERSION v7.2-91-g9c22195
20220330 14:39:16 config: -cpu OPS71-GPU_RTX2080
20220330 14:39:16 config: -user nullcure
20220330 14:39:16 config:
20220330 14:39:16 device 0, unique id ''
20220330 14:39:16 OPS71-GPU_RTX2080 65005679 FFT: 3.50M 1K:7:256 (17.71 bpw)
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 OpenCL args "-DEXP=65005679u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=7u -DWEIGHT_STEP=0.22040343673494417 -DIWEIGHT_STEP=-0.18059883322240508 -DIWEIGHTS={0,-0.32858172788351608,-0.098395007736230292,-0.3946459339626639,-0.18710843792504778,-0.45420975197356606,-0.267092909464129,-0.01582557530097212,} -DFWEIGHTS={0,0.48938454839446288,0.10913316649808907,0.65192580029407698,0.23017638102607779,0.83220569370005959,0.3644294248386134,0.01608005136468748,} -cl-std=CL2.0 -cl-finite-math-only "
20220330 14:39:17 OPS71-GPU_RTX2080 65005679
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 OpenCL compilation in 0.04 s
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 maxAlloc: 0.0 GB
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 P1(0) 0 bits
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 PRP starting from beginning
20220330 14:39:18 OPS71-GPU_RTX2080 65005679 OK 0 on-load: blockSize 400, 0000000000000003
20220330 14:39:18 OPS71-GPU_RTX2080 65005679 validating proof residues for power 8
20220330 14:39:18 OPS71-GPU_RTX2080 65005679 Proof using power 8
20220330 14:39:22 OPS71-GPU_RTX2080 65005679 OK 800 0.00% a3310bd156143cd4 2677 us/it + check 1.12s + save 0.13s; ETA 2d 00:21
20220330 14:39:46 OPS71-GPU_RTX2080 65005679 10000 b70c81dfa6ea9eed 2672
20220330 14:40:31 OPS71-GPU_RTX2080 65005679 20000 1805210eafbd98b6 4441
20220330 14:40:58 OPS71-GPU_RTX2080 65005679 30000 1308d69bec04ec33 2674
20220330 14:41:25 OPS71-GPU_RTX2080 65005679 40000 0ef3ad7aa0f100a8 2696
20220330 14:41:52 OPS71-GPU_RTX2080 65005679 50000 592b24ce299b8ab7 2704
20220330 14:42:19 OPS71-GPU_RTX2080 65005679 60000 e43185ab59a07b48 2712
20220330 14:42:46 OPS71-GPU_RTX2080 65005679 70000 8fc4923f37774afe 2688
20220330 14:43:12 OPS71-GPU_RTX2080 65005679 80000 a3dccd3f52ab7805 2687
20220330 14:43:39 OPS71-GPU_RTX2080 65005679 90000 439df49da0e0a8e5 2697
20220330 14:44:06 OPS71-GPU_RTX2080 65005679 100000 1b2a0189d2a3249d 2664
20220330 14:44:33 OPS71-GPU_RTX2080 65005679 110000 5a597fce90bb9501 2675
20220330 14:45:00 OPS71-GPU_RTX2080 65005679 120000 b73ac48ccfdee32a 2686
20220330 14:45:26 OPS71-GPU_RTX2080 65005679 130000 3a742e8e45958c31 2674
20220330 14:45:55 OPS71-GPU_RTX2080 65005679 140000 731143da7eec3832 2884
20220330 14:46:24 OPS71-GPU_RTX2080 65005679 150000 7d920954585b88b4 2825
20220330 14:46:52 OPS71-GPU_RTX2080 65005679 160000 15d917ccac4a9db2 2855
20220330 14:47:21 OPS71-GPU_RTX2080 65005679 170000 0755ad595f35430c 2872
20220330 14:47:50 OPS71-GPU_RTX2080 65005679 180000 ec7e8dcc75f7321f 2934
20220330 14:48:19 OPS71-GPU_RTX2080 65005679 190000 07abab06c10bb4c9 2869
20220330 14:48:48 OPS71-GPU_RTX2080 65005679 OK 200000 0.31% 2b12a6ddb4071298 2823 us/it + check 1.11s + save 0.14s; ETA 2d 02:49
Is this the direct difference between AMD's stream processors and Nvidia's CUDA cores? I was thinking along the lines of SIMD vs MIMD architectures.

• Single Instruction, Multiple Data (SIMD): A single operation (task) executes simultaneously on multiple elements of data. The number of elements in a SIMD operation can vary from a small number, such as the 4 to 16 elements in short vector instructions, to thousands, as in streaming vector processors. SIMD processors are also known as array processors, since they consist of an array of functional units with a shared controller.

• Multiple Instruction, Multiple Data (MIMD): Separate instruction streams, each with its own flow of control, operate on separate data. This characterizes the use of multiple cores in a single processor, multiple processors in a single computer, and multiple computers in a cluster. When multiple processors using different architectures are present in the same computer system, we say it is a heterogeneous computer. An example would be a host processor and a co-processor with different instruction sets.

(Reference: https://www.sciencedirect.com/topics...on-single-data)
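(A loose Python sketch of those two styles, purely as an illustration — this is not how prime95 or gpuowl are implemented. A single vectorized operation over an array stands in for the SIMD flavor; independent worker processes stand in for the MIMD flavor. Assumes numpy is installed.)
Code:
# Loose illustration of the SIMD vs MIMD distinction described above.
import numpy as np
from multiprocessing import Pool

def independent_task(seed):
    # MIMD flavor: each worker runs its own instruction stream on its own data.
    rng = np.random.default_rng(seed)
    return float(rng.standard_normal(1_000_000).sum())

if __name__ == "__main__":
    # SIMD flavor: one operation applied to every element of the array at once.
    x = np.arange(1_000_000, dtype=np.float64)
    y = 2.0 * x + 1.0

    with Pool(processes=4) as pool:
        sums = pool.map(independent_task, range(4))

    print(y[:3], sums)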
#3538
"Curtis"
Feb 2005
Riverside, CA
2·2,927 Posts
Quote:
Nvidia decided to make consumer GPUs relatively slow at scientific computing; perhaps that protects their Tesla-brand scientific GPU market.
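(A back-of-the-envelope illustration of that point, assuming the commonly cited FP64:FP32 ratios of roughly 1:32 for consumer Turing and 1:4 for the Radeon VII, and ballpark FP32 figures of about 10 and 13.4 TFLOPS respectively — rough published numbers, not measurements from this thread. Gpuowl's transforms lean on double precision, so FP64 throughput is the figure that matters.)
Code:
# Back-of-the-envelope FP64 comparison; the FP32 figures and FP64:FP32 ratios are
# rough published ballpark numbers (assumptions), not measurements from this thread.
cards = {
    "RTX 2080 (consumer Turing)": (10.0, 1 / 32),  # ~FP32 TFLOPS, FP64:FP32 ratio
    "Radeon VII (Vega 20)":       (13.4, 1 / 4),
}

for name, (fp32_tflops, ratio) in cards.items():
    print(f"{name}: ~{fp32_tflops * ratio:.2f} TFLOPS FP64")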
#3539
Feb 2022
7₁₀ Posts
Quote:
On a happier note, I just installed a 240 mm liquid cooling solution. Temps are down to 49-55 °C from 64-74 °C.
I think I will invest in an AMD GPU :-)
#3540
Mar 2017
Halifax, NS
3·17 Posts
I also run a 5950X, though I have PBO enabled and tuned a fair amount. I have it working on two exponents at once, using 8 cores each. In the 62M range, I'm getting about 1.550 ms/iteration.
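(Roughly what that works out to, given that a PRP test does about one squaring per bit of the exponent, so ~62M iterations per test here — a small sketch of the arithmetic using the figures just quoted:)
Code:
# Rough throughput implied by ~1.550 ms/iteration on two simultaneous workers,
# with ~62M iterations per PRP test (about one squaring per exponent bit).
ms_per_iter = 1.550
iterations = 62_000_000
workers = 2

hours_per_test = iterations * ms_per_iter / 1000 / 3600
print(f"~{hours_per_test:.1f} h per test, "
      f"~{hours_per_test / workers:.1f} h per result across {workers} workers")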
#3541
Mar 2018
59 Posts
Why does this program finish Factor=4288182653,74,75 in 2 minutes but take 48 minutes to finish Factor=123570253,74,75 on a GTX 1650?
Last fiddled with by kotenok2000 on 2022-04-21 at 20:45
#3542
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
2²×2,767 Posts
Because there are more candidates to test for 123570253 than for 4288182653. Factors of Mersenne numbers must be of the form 2kp+1, where p is the exponent being checked and k is a multiplier. The larger the p, the fewer values k can take while still giving a potential factor between 74 and 75 bits.
Edit: For k=1 with 4288182653 we get 32.9977 bits, so nothing under 39 needs to get tested. For 4288182653 the k values that correspond to 74 and 75 bits are roughly 2,200,000,000,000 and 4,410,000,000,000; for 123570253 they are roughly 76,400,000,000,000 and 153,000,000,000,000. So you can see many more k values need to be checked for the same bit level.
Last fiddled with by Uncwilly on 2022-04-21 at 22:53
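(A minimal sketch of that arithmetic: with candidate factors of the form q = 2kp + 1, the k values reaching bit level b run up to about 2^b / (2p), which reproduces the rough figures above.)
Code:
# Candidate factors of M(p) = 2^p - 1 have the form q = 2*k*p + 1, so the k values
# between bit levels 74 and 75 run from about 2^74/(2p) to 2^75/(2p).
def k_range(p, low_bits=74, high_bits=75):
    return 2**low_bits // (2 * p), 2**high_bits // (2 * p)

for p in (4288182653, 123570253):
    k_lo, k_hi = k_range(p)
    print(f"p={p}: k from ~{k_lo:.3g} to ~{k_hi:.3g} "
          f"(~{k_hi - k_lo:.3g} k values before sieving)")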