mersenneforum.org  

2022-03-30, 16:48   #3532
nullcure
Feb 2022
7 Posts

Quote:
Originally Posted by kriesel View Post
Try running a DC using PRP, or LLDC, on your 5950x. Can you beat 9 hours for ~65M exponent? 8.5 hours?


https://www.mersenne.org/report_expo...exp_hi=&full=1
Well sir, here's your answer.

I'm running my chip stock, without PBO. PBO is explicitly set to DISABLED in the BIOS, and the chip is running at stock power limits.

So basically this is out-of-the-box (OOB) performance, as is, with no overclocking.

(See screenshots)
Attached Thumbnails: 65005679_1.png, 65005679_2.png, 65005679_3-Stock-Fused.png, 65005679_4-Stock-Fused-Validation.png

2022-03-30, 17:03   #3533
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1E90₁₆ Posts

So, same exponent:
prime95, 5950X out of the box, ~106 W, 21.6 hours, 2.29 kWh
Gpuowl, Radeon VII, ~200 W, 8.98 hours, 1.80 kWh; at ~250 W, 8.52 hours, 2.13 kWh
You may be able to improve your 5950X power cost per result with some tuning.
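
The energy figures above are just average power multiplied by run time. A minimal Python sketch of that arithmetic, using only the wattages and hours quoted in this post (the function name `energy_kwh` is illustrative, not from any tool):

```python
def energy_kwh(avg_watts: float, hours: float) -> float:
    """Energy consumed by one run, in kilowatt-hours."""
    return avg_watts * hours / 1000.0


# (label, average watts, hours) as quoted in the post above
runs = [
    ("prime95, 5950X stock", 106, 21.6),
    ("Gpuowl, Radeon VII at ~200 W", 200, 8.98),
    ("Gpuowl, Radeon VII at ~250 W", 250, 8.52),
]

for label, watts, hours in runs:
    print(f"{label}: {energy_kwh(watts, hours):.2f} kWh")
```

The wattages are approximate, so the results are only as good as the power readings behind them.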

Last fiddled with by kriesel on 2022-03-30 at 17:05

2022-03-30, 17:15   #3534
nullcure
Feb 2022
7₈ Posts

And here is my RTX 2080. I don't think it's the same test, though, because I'm pretty sure that AMD CPU > Nvidia GPU at PRP.

Can you give me the worktodo.txt line for GpuOwl that you think is the right one for benchmarking my RTX 2080 against your GPU?

I used

PRP=N/A,1,2,65005679,-1,65,2

TBH I'm not sure of all the positional inputs.

I know this much: PRP=AID,?,?,Exponent,?,MaxBit,?

Some enlightenment would be appreciated.

Code:
PS D:\GIMPS\gpuowl-v7.2-91-g9c22195> .\gpuowl-win.exe
20220330 12:57:00 GpuOwl VERSION v7.2-91-g9c22195
20220330 12:57:00 GpuOwl VERSION v7.2-91-g9c22195
20220330 12:57:00 config: -cpu OPS71-GPU_RTX2080
20220330 12:57:00 config: -user nullcure
20220330 12:57:00 config:
20220330 12:57:00 device 0, unique id ''
20220330 12:57:00 OPS71-GPU_RTX2080 65005679 FFT: 3.50M 1K:7:256 (17.71 bpw)
20220330 12:57:00 OPS71-GPU_RTX2080 65005679 OpenCL args "-DEXP=65005679u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=7u -DWEIGHT_STEP=0.22040343673494417 -DIWEIGHT_STEP=-0.18059883322240508 -DIWEIGHTS={0,-0.32858172788351608,-0.098395007736230292,-0.3946459339626639,-0.18710843792504778,-0.45420975197356606,-0.267092909464129,-0.01582557530097212,} -DFWEIGHTS={0,0.48938454839446288,0.10913316649808907,0.65192580029407698,0.23017638102607779,0.83220569370005959,0.3644294248386134,0.01608005136468748,}  -cl-std=CL2.0 -cl-finite-math-only "
20220330 12:57:04 OPS71-GPU_RTX2080 65005679

20220330 12:57:04 OPS71-GPU_RTX2080 65005679 OpenCL compilation in 3.59 s
20220330 12:57:04 OPS71-GPU_RTX2080 65005679 maxAlloc: 0.0 GB
20220330 12:57:04 OPS71-GPU_RTX2080 65005679 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
20220330 12:57:04 OPS71-GPU_RTX2080 65005679 P1(3.5M) 5049336 bits
20220330 12:57:05 OPS71-GPU_RTX2080 65005679 PRP starting from beginning
20220330 12:57:05 OPS71-GPU_RTX2080 65005679 Acquired memory lock 'memlock-0'
20220330 12:57:05 OPS71-GPU_RTX2080 65005679 P1(3.5M) using 204 buffers
20220330 12:57:15 OPS71-GPU_RTX2080 65005679 OK         0 on-load: blockSize 400, 0000000000000003
20220330 12:57:15 OPS71-GPU_RTX2080 65005679 validating proof residues for power 8
20220330 12:57:15 OPS71-GPU_RTX2080 65005679 Proof using power 8
20220330 12:57:45 OPS71-GPU_RTX2080 65005679     10000 b70c81dfa6ea9eed 3008
20220330 12:58:16 OPS71-GPU_RTX2080 65005679     20000 1805210eafbd98b6 3050
20220330 12:58:46 OPS71-GPU_RTX2080 65005679     30000 1308d69bec04ec33 3029
20220330 12:59:17 OPS71-GPU_RTX2080 65005679     40000 0ef3ad7aa0f100a8 3061
20220330 12:59:47 OPS71-GPU_RTX2080 65005679     50000 592b24ce299b8ab7 3005
20220330 13:00:17 OPS71-GPU_RTX2080 65005679     60000 e43185ab59a07b48 3022
20220330 13:00:47 OPS71-GPU_RTX2080 65005679     70000 8fc4923f37774afe 3011
20220330 13:01:17 OPS71-GPU_RTX2080 65005679     80000 a3dccd3f52ab7805 3001
20220330 13:01:47 OPS71-GPU_RTX2080 65005679     90000 439df49da0e0a8e5 2979
20220330 13:02:16 OPS71-GPU_RTX2080 65005679    100000 1b2a0189d2a3249d 2974
20220330 13:02:46 OPS71-GPU_RTX2080 65005679    110000 5a597fce90bb9501 2985
20220330 13:03:16 OPS71-GPU_RTX2080 65005679    120000 b73ac48ccfdee32a 2996
20220330 13:03:46 OPS71-GPU_RTX2080 65005679    130000 3a742e8e45958c31 2986
20220330 13:04:16 OPS71-GPU_RTX2080 65005679    140000 731143da7eec3832 2974
20220330 13:04:46 OPS71-GPU_RTX2080 65005679    150000 7d920954585b88b4 2979
20220330 13:05:15 OPS71-GPU_RTX2080 65005679    160000 15d917ccac4a9db2 2976
20220330 13:05:45 OPS71-GPU_RTX2080 65005679    170000 0755ad595f35430c 2978
20220330 13:06:15 OPS71-GPU_RTX2080 65005679    180000 ec7e8dcc75f7321f 2982
20220330 13:06:45 OPS71-GPU_RTX2080 65005679    190000 07abab06c10bb4c9 2976
20220330 13:07:17 OPS71-GPU_RTX2080 65005679 OK    200000   0.31% 2b12a6ddb4071298 2977 us/it + check 1.11s + save 1.65s; ETA 2d 05:35 | P1(3.5M) 4.0% ETA 04:01 45a22c5565cbc38a
20220330 13:07:28 OPS71-GPU_RTX2080 65005679 Stopping, please wait..
20220330 13:07:31 OPS71-GPU_RTX2080 65005679 OK    203600   0.31% 6c427f59576a1693 2990 us/it + check 1.12s + save 1.68s; ETA 2d 05:49 | P1(3.5M) 4.0% ETA 04:01 ebbc2ba8ee984b9f
20220330 13:07:31 OPS71-GPU_RTX2080 65005679 P1(3.5M) releasing 204 buffers
20220330 13:07:31 OPS71-GPU_RTX2080 65005679 Released memory lock 'memlock-0'
20220330 13:07:31 OPS71-GPU_RTX2080 Exiting because "stop requested"
20220330 13:07:31 OPS71-GPU_RTX2080 Bye
PS D:\GIMPS\gpuowl-v7.2-91-g9c22195>

2022-03-30, 17:25   #3535
axn
Jun 2003
1010101011000₂ Posts

It is doing a P-1 test because you told it that doing P-1 would potentially save two tests (the last "2" in your work line).

Use PRP=N/A,1,2,65005679,-1,65,0.
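
To make the positional fields concrete, here is a small Python sketch that splits such a work line. The field names `k`, `b`, `n`, `c` (for the number k·b^n+c), `how_far_factored`, and `tests_saved` are an assumption based on the usual Prime95-style layout. The thread itself only confirms the exponent, the bit level, and the final "tests saved" field. The parser is illustrative and is not part of GpuOwl.

```python
from typing import NamedTuple


class PrpEntry(NamedTuple):
    aid: str               # assignment ID, "N/A" when there is none
    k: int                 # assumed: multiplier in k*b^n+c
    b: int                 # assumed: base in k*b^n+c
    n: int                 # the exponent
    c: int                 # assumed: constant in k*b^n+c
    how_far_factored: int  # bit level already trial factored to
    tests_saved: int       # 0 asks for no P-1, per the post above


def parse_prp_line(line: str) -> PrpEntry:
    """Parse a seven-field 'PRP=...' worktodo line (hypothetical helper)."""
    keyword, _, rest = line.strip().partition("=")
    if keyword != "PRP":
        raise ValueError(f"not a PRP line: {line!r}")
    fields = rest.split(",")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    k, b, n, c, bits, saved = (int(x) for x in fields[1:])
    return PrpEntry(fields[0], k, b, n, c, bits, saved)


entry = parse_prp_line("PRP=N/A,1,2,65005679,-1,65,2")
print(entry.n, entry.how_far_factored, entry.tests_saved)
```

With the original line, `tests_saved` comes out as 2, which is what triggered the P-1 work. The corrected line sets it to 0.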

2022-03-30, 18:04   #3536
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴×3×163 Posts

Quote:
Originally Posted by nullcure View Post
And here is my RTX 2080 I don't think it's the same test though because I'm pretty sure that AMD CPU > Nvidia GPU at PRP

...

PRP=N/A,1,2,65005679,-1,65,2

TBH I'm not sure of all the positional inputs,
The RTX 2080 is stellar at TF and relatively slow at everything else, as are most recent consumer NVIDIA GPUs.
See #2 of the best practices reference post.
For worktodo entry formats, see that same reference post.

Gpuowl v7.x, which you used, runs P-1 stage 1 simultaneously with the start of the PRP unless you explicitly tell it not to include P-1. Note that some lines of your abbreviated run show TWO ETA values: the first is for the PRP, the second is for P-1 stage 1. Pure PRP, as in gpuowl V6.11-380, would be a better comparison, since the P-1 provisions in v7.x introduce some overhead. That won't change the conclusion that RTX 2080 PRP is not competitively fast. The big benchmark result pages at mersenne.ca are useful guides.
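
As a rough cross-check of the PRP ETA in such a log line, the remaining time can be estimated from the reported microseconds per iteration. This Python sketch assumes a PRP test of exponent p needs about p iterations and ignores check and save overhead. The helper name `prp_eta` is made up for illustration.

```python
def prp_eta(exponent: int, iterations_done: int, us_per_iteration: float) -> str:
    """Estimate remaining PRP time as 'Dd HH:MM' (assumes ~exponent iterations)."""
    remaining = exponent - iterations_done
    seconds = int(remaining * us_per_iteration / 1_000_000)
    days, rest = divmod(seconds, 86_400)
    hours, rest = divmod(rest, 3_600)
    return f"{days}d {hours:02d}:{rest // 60:02d}"


# Values from the log line at iteration 200000 in the quoted run (2977 us/it)
print(prp_eta(65005679, 200_000, 2977))  # 2d 05:35
```

That agrees with the first ETA printed on that log line, which supports reading it as the PRP estimate.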

Last fiddled with by kriesel on 2022-03-30 at 18:08

2022-03-30, 18:57   #3537
nullcure
Feb 2022
7 Posts

Quote:
Originally Posted by axn View Post
It is doing a P-1 test, because you told that doing P-1 would potentially save two tests (the last "2" in your work line).

Use PRP=N/A,1,2,65005679,-1,65,0.


Here is the correct output.

Code:
PS D:\GIMPS\gpuowl-v7.2-91-g9c22195> .\gpuowl-win.exe
20220330 14:39:16 GpuOwl VERSION v7.2-91-g9c22195
20220330 14:39:16 GpuOwl VERSION v7.2-91-g9c22195
20220330 14:39:16 config: -cpu OPS71-GPU_RTX2080
20220330 14:39:16 config: -user nullcure
20220330 14:39:16 config:
20220330 14:39:16 device 0, unique id ''
20220330 14:39:16 OPS71-GPU_RTX2080 65005679 FFT: 3.50M 1K:7:256 (17.71 bpw)
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 OpenCL args "-DEXP=65005679u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=7u -DWEIGHT_STEP=0.22040343673494417 -DIWEIGHT_STEP=-0.18059883322240508 -DIWEIGHTS={0,-0.32858172788351608,-0.098395007736230292,-0.3946459339626639,-0.18710843792504778,-0.45420975197356606,-0.267092909464129,-0.01582557530097212,} -DFWEIGHTS={0,0.48938454839446288,0.10913316649808907,0.65192580029407698,0.23017638102607779,0.83220569370005959,0.3644294248386134,0.01608005136468748,}  -cl-std=CL2.0 -cl-finite-math-only "
20220330 14:39:17 OPS71-GPU_RTX2080 65005679

20220330 14:39:17 OPS71-GPU_RTX2080 65005679 OpenCL compilation in 0.04 s
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 maxAlloc: 0.0 GB
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 P1(0) 0 bits
20220330 14:39:17 OPS71-GPU_RTX2080 65005679 PRP starting from beginning
20220330 14:39:18 OPS71-GPU_RTX2080 65005679 OK         0 on-load: blockSize 400, 0000000000000003
20220330 14:39:18 OPS71-GPU_RTX2080 65005679 validating proof residues for power 8
20220330 14:39:18 OPS71-GPU_RTX2080 65005679 Proof using power 8
20220330 14:39:22 OPS71-GPU_RTX2080 65005679 OK       800   0.00% a3310bd156143cd4 2677 us/it + check 1.12s + save 0.13s; ETA 2d 00:21
20220330 14:39:46 OPS71-GPU_RTX2080 65005679     10000 b70c81dfa6ea9eed 2672
20220330 14:40:31 OPS71-GPU_RTX2080 65005679     20000 1805210eafbd98b6 4441
20220330 14:40:58 OPS71-GPU_RTX2080 65005679     30000 1308d69bec04ec33 2674
20220330 14:41:25 OPS71-GPU_RTX2080 65005679     40000 0ef3ad7aa0f100a8 2696
20220330 14:41:52 OPS71-GPU_RTX2080 65005679     50000 592b24ce299b8ab7 2704
20220330 14:42:19 OPS71-GPU_RTX2080 65005679     60000 e43185ab59a07b48 2712
20220330 14:42:46 OPS71-GPU_RTX2080 65005679     70000 8fc4923f37774afe 2688
20220330 14:43:12 OPS71-GPU_RTX2080 65005679     80000 a3dccd3f52ab7805 2687
20220330 14:43:39 OPS71-GPU_RTX2080 65005679     90000 439df49da0e0a8e5 2697
20220330 14:44:06 OPS71-GPU_RTX2080 65005679    100000 1b2a0189d2a3249d 2664
20220330 14:44:33 OPS71-GPU_RTX2080 65005679    110000 5a597fce90bb9501 2675
20220330 14:45:00 OPS71-GPU_RTX2080 65005679    120000 b73ac48ccfdee32a 2686
20220330 14:45:26 OPS71-GPU_RTX2080 65005679    130000 3a742e8e45958c31 2674
20220330 14:45:55 OPS71-GPU_RTX2080 65005679    140000 731143da7eec3832 2884
20220330 14:46:24 OPS71-GPU_RTX2080 65005679    150000 7d920954585b88b4 2825
20220330 14:46:52 OPS71-GPU_RTX2080 65005679    160000 15d917ccac4a9db2 2855
20220330 14:47:21 OPS71-GPU_RTX2080 65005679    170000 0755ad595f35430c 2872
20220330 14:47:50 OPS71-GPU_RTX2080 65005679    180000 ec7e8dcc75f7321f 2934
20220330 14:48:19 OPS71-GPU_RTX2080 65005679    190000 07abab06c10bb4c9 2869
20220330 14:48:48 OPS71-GPU_RTX2080 65005679 OK    200000   0.31% 2b12a6ddb4071298 2823 us/it + check 1.11s + save 0.14s; ETA 2d 02:49

Is this the direct difference of AMD stream cores vs NVIDIA's CUDA cores?

I was thinking along the lines of SIMD and MIMD architectures.

********************************

• Single Instruction, Multiple Data (SIMD): A single operation (task) executes simultaneously on multiple elements of data. The number of elements in a SIMD operation can vary from a small number, such as the 4 to 16 elements in short vector instructions, to thousands, as in streaming vector processors. SIMD processors are also known as array processors, since they consist of an array of functional units with a shared controller.


• Multiple Instruction, Multiple Data (MIMD): Separate instruction streams, each with its own flow of control, operate on separate data. This characterizes the use of multiple cores in a single processor, multiple processors in a single computer, and multiple computers in a cluster. When multiple processors using different architectures are present in the same computer system, we say it is a heterogeneous computer. An example would be a host processor and a co-processor with different instruction sets.

(Reference https://www.sciencedirect.com/topics...on-single-data)

**********************************

2022-03-30, 20:21   #3538
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
2·2,927 Posts

Quote:
Originally Posted by nullcure View Post
Is this the direct difference of AMD stream cores vs NVIDIA's CUDA cores?
It's a direct result of nvidia's choice to cripple consumer GPUs for double-precision computation. Trial factoring does not use DP calculations, while prp-testing does; so each GPU's relative performance on TF vs prp-testing is tied to the ratio of DP performance to SP performance.

Nvidia decided to make consumer GPUs relatively slow at scientific computing; perhaps that protects their Tesla-brand scientific GPU market.

2022-04-01, 18:57   #3539
nullcure
Feb 2022
7₁₀ Posts

Quote:
Originally Posted by VBCurtis View Post
It's a direct result of nvidia's choice to cripple consumer GPUs for double-precision computation. Trial factoring does not use DP calculations, while prp-testing does; so each GPU's relative performance on TF vs prp-testing is tied to the ratio of DP performance to SP performance.

Nvidia decided to make consumer GPUs relatively slow at scientific computing; perhaps that protects their Tesla-brand scientific GPU market.
Thank you everyone for the feedback.

On a happier note, I just installed a 240 mm liquid cooling solution. Temperatures are down to 49-55 °C from 64-74 °C.

I think I will invest in an AMD GPU :-)

2022-04-12, 14:49   #3540
ZacHFX
Mar 2017
Halifax, NS
3·17 Posts

Quote:
Originally Posted by nullcure View Post
Well sir here's your answer.

I'm running my chip stock without PBO PBO is explicitly set to DISABLED in BIOS. running stock power limits through the chip.

So basically this is the OOB as is performance no overclocking.

(See screenshots)
I also run a 5950X, though I have PBO enabled and tuned a fair amount. I have it working on two exponents at once, using 8 cores each. In the 62M range, I'm getting about 1.550 ms/iteration.

2022-04-21, 20:43   #3541
kotenok2000
Mar 2018
59 Posts

Why does this program finish Factor=4288182653,74,75 in 2 minutes but take 48 minutes to finish Factor=123570253,74,75 on a GTX 1650?

Last fiddled with by kotenok2000 on 2022-04-21 at 20:45

2022-04-21, 22:37   #3542
Uncwilly
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
2²×2,767 Posts

Because there are more candidates to test for 123570253 than for 4288182653. Factors of Mersenne numbers must be of the form 2kp+1, where p is the exponent being checked and k is a multiplier. The larger p is, the fewer values k can take while still giving a potential factor between 74 and 75 bits.

Edit:
For 4288182653, k=1 gives 32.9977 bits. So nothing under 39 needs to get tested.

For 4288182653, the k values that correspond to 74 and 75 bits are (roughly)
2200000000000 and 4410000000000
For 123570253, they are (roughly)
76400000000000 and 153000000000000

So you can see that many more k values need to be checked for the same bit level.
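
The k bounds above can be reproduced with a few lines of Python. This is only a sketch of the counting argument: it counts every k whose candidate 2kp+1 falls in the bit range, whereas a real trial-factoring program sieves out most of those k before testing. The helper name `k_range` is illustrative.

```python
import math


def k_range(p: int, lo_bits: int, hi_bits: int) -> tuple[int, int]:
    """Smallest and largest k with 2**lo_bits <= 2*k*p + 1 < 2**hi_bits."""
    k_min = -(-(2**lo_bits - 1) // (2 * p))  # ceiling division
    k_max = (2**hi_bits - 2) // (2 * p)      # floor division
    return k_min, k_max


for p in (4288182653, 123570253):
    k_min, k_max = k_range(p, 74, 75)
    print(f"p={p}: k from {k_min:.3e} to {k_max:.3e}, "
          f"{k_max - k_min + 1:.3e} raw candidates")

# Size in bits of the smallest possible factor (k=1) for the larger exponent
print(f"{math.log2(2 * 4288182653 + 1):.4f} bits")
```

The raw candidate count scales as 1/p for a fixed bit range, so the smaller exponent has roughly 35 times as many k values to cover here.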

Last fiddled with by Uncwilly on 2022-04-21 at 22:53
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.