mersenneforum.org  

2022-11-15, 15:44   #12
moebius
Jul 2009
Germany

2A0₁₆ Posts

Quote:
Originally Posted by yuki0831
both version7.~ can run correctly now.
Can I use it instead?
You've already figured it out, but please post another short benchmark of -prp 77936867 with version 7.x, this time without errors. Thanks!
2022-11-15, 15:55   #13
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

11²·61 Posts

Quote:
Originally Posted by yuki0831
v6.11-382 for P-1
2022-11-15 23:58:18 config: -maxAlloc 10000MB
2022-11-15 23:58:18 config: -proof 12
2022-11-15 23:58:18 config: -device 0
2022-11-15 23:58:18 config: -nospin
2022-11-15 23:58:18 config: -log 10000
2022-11-15 23:58:18 config: 4K:12:512
2022-11-15 23:58:18 config: block 10000
2022-11-15 23:58:18 device 0, unique id ''
2022-11-15 23:58:18 NVIDIA GeForce RTX 4090-0 831199679 FFT: 48M 4K:12:512 (16.51 bpw)
2022-11-15 23:58:18 NVIDIA GeForce RTX 4090-0 Expected maximum carry32: 543E0000
2022-11-15 23:58:22 NVIDIA GeForce RTX 4090-0 OpenCL args "-DEXP=831199679u -DWIDTH=4096u -DSMALL_HEIGHT=512u -DMIDDLE=12u -DPM1=1 -DCARRYM64=1 -DWEIGHT_STEP_MINUS_1=0xc.cdbf9b185b77p-5 -DIWEIGHT_STEP_MINUS_1=-0x9.250e27efb0c9p-5 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2022-11-15 23:58:22 NVIDIA GeForce RTX 4090-0

2022-11-15 23:58:22 NVIDIA GeForce RTX 4090-0 OpenCL compilation in 0.02 s
2022-11-15 23:58:23 NVIDIA GeForce RTX 4090-0 831199679 P1 B1=4500000, B2=220000000; 6492506 bits; starting at 0
2022-11-15 23:59:58 NVIDIA GeForce RTX 4090-0 831199679 P1 10000 0.15%; 9439 us/it; ETA 0d 17:00; 0000000000000000

continue crunching now.

I use ver6 for stand-alone P-1 task . v6.11-382
I use ver7 for PRP gpuowl-v7.2-112-gd6ad1e0-dirty
Sometimes using mfaktc-0.2.1.win.cuda11.2 2042 for TF

Do you mind if I use different ver for searching prime?
Using different versions for different purposes is OK. (It may even be recommended, for efficiency. But all first-time PRP tests should be done on proof-capable versions, with near-optimal proof power whenever space allows.) All P-1 should be done on proven-reliable hardware and software combinations.

The quoted P-1 calculation above has failed badly and early. There is no point in continuing that P-1 attempt. In stage 1, P-1 computes powers of 3, which should never give a zero residue. The res64 of 0000000000000000 at the end of the last quoted log line (highlighted in red in the original post) shows the calculation has already failed.
Also, please do not run large exponents until you have sorted out how to run reliably. P-1 has fewer error checks than PRP. If PRP fails as rapidly as you have demonstrated, so will P-1.
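
A quick way to catch this kind of failure early is to scan the log for progress lines whose res64 is all zeros. The following is only a minimal sketch: the regular expression is an assumption based on the single P1 progress line quoted above, not on any documented gpuowl log format, and the function name is made up for illustration.

```python
import re

# Assumed line shape, inferred only from the progress line quoted in this thread:
# "... <exponent> P1 <iteration> <percent>%; <n> us/it; ETA ...; <16 hex digits>"
PROGRESS = re.compile(
    r"(?P<exponent>\d+) P1\s+(?P<iteration>\d+)\s+[\d.]+%;.*;"
    r"\s*(?P<res64>[0-9a-fA-F]{16})\s*$"
)

def zero_res64_lines(lines):
    """Return (exponent, iteration) for each progress line whose res64 is zero."""
    hits = []
    for line in lines:
        m = PROGRESS.search(line)
        if m and int(m.group("res64"), 16) == 0:
            hits.append((int(m.group("exponent")), int(m.group("iteration"))))
    return hits

log = [
    "2022-11-15 23:58:23 NVIDIA GeForce RTX 4090-0 831199679 P1 B1=4500000, "
    "B2=220000000; 6492506 bits; starting at 0",
    "2022-11-15 23:59:58 NVIDIA GeForce RTX 4090-0 831199679 P1 10000 0.15%; "
    "9439 us/it; ETA 0d 17:00; 0000000000000000",
]
print(zero_res64_lines(log))  # [(831199679, 10000)]
```

Any hit means the run should be stopped and the hardware or settings investigated, rather than left to crunch for the remaining hours.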

When we were young children, we did not run before crawling, then walking. Start small and proceed carefully and methodically. Generally, for a GPU, first get it to run PRP reliably in the same version you would use for P-1. Then verify with P-1 on a small exponent that you can find a known factor. Then do so again on a medium exponent. Ideally, do so once more with a test exponent with a known factor, near in value to where you plan to run new work, so that it defaults to the same fft length as the new work. See https://www.mersenneforum.org/showpo...8&postcount=31 for some to try to reproduce. Only after it's proven reliable, proceed to new work. Some GPUs simply are not capable, or with time become incapable, of operating at sufficient reliability.
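
To make the "find a known factor" idea concrete, here is a small CPU-only sketch of P-1 stage 1 in plain Python. It is not how gpuowl or any GIMPS program implements P-1; the function names are made up for illustration. The example uses M67, whose known factor 193707721 has 193707721 − 1 = 2³·3³·5·67·2677, so any B1 of at least 2677 must expose it in stage 1.

```python
from math import gcd

def primes_up_to(n):
    """Simple sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def pminus1_stage1(p, b1):
    """Stage 1 of P-1 on M_p = 2^p - 1 with base 3; returns gcd(3^E - 1, M_p)."""
    m = (1 << p) - 1
    # Every prime factor q of M_p satisfies q = 1 (mod 2p), so 2p goes into E for free.
    x = pow(3, 2 * p, m)
    for ell in primes_up_to(b1):
        power = ell
        while power * ell <= b1:  # largest power of ell not exceeding B1
            power *= ell
        x = pow(x, power, m)
    return gcd(x - 1, m)

g = pminus1_stage1(67, 3000)
print(g, g % 193707721 == 0)
```

If a hardware and software combination cannot reproduce a factor that the bounds guarantee, as in this toy case, it should not be trusted with new work.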

And M888888887 PRP & proof is under way by me on a Radeon VII. You will not likely finish before me. All the remaining near repdigits <1G are estimated to be finished in January or February.
Please do not launch computations without an assignment.
Please begin reading the reference info, in the recommended starting sequence. It is large and will take some time to get through enough to become an effective participant.

You might find CUDALucas -memtest or mfaktc self test helpful in sorting out what is making your GPU so error-prone. Its error rate is STARTLINGLY HIGH. Please address that first. (Or it may be defective hardware. Is it too late to return it?) Then prove it out with some ordinary small runs. Large runs need highly reliable hardware & software combinations. Do not attempt large runs on hardware/software combinations until they are proven reliable on small runs, then medium runs.

And again, the best use of an RTX 4090 would be TF where it is needed. It's designed for far higher performance in low precision computation (SP) which suits TF.
Even if it were perfectly reliable in DP (and it is far from that), its power cost (and purchase cost) would be higher than for a Radeon VII for the same work done. Using a 4090 for other than TF is a little like using a hammer where a screwdriver is needed.

If you are determined to use that GPU for PRP and P-1 despite it being relatively unsuited both by design and from initial experimental experience, consider simultaneous P-1/PRP in v7.2. The P-1 may still fail to find a factor that should be found for the bounds used, but v7.2's coding provides more error detection for P-1 stage 1 work in progress than any v6.x, by a combination of using GEC-protected powers of 3 from the PRP computations and using the Jacobi check for their multiplication together, IIRC. v7.2 also provides a much lower incremental cost for P-1 stage 1, IF you are also going to run the PRP for the same exponent when no factor is found.
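
For readers unfamiliar with the GEC (Gerbicz error check), the toy below illustrates the general idea on a small modulus. It is a sketch under stated assumptions and not gpuowl's code: the function name, the `corrupt_at` hook, and the choice to verify at every block boundary are all made up for illustration (real programs verify far less often to keep the overhead small), and the Jacobi check is not shown. The invariant used is that, with u_0 = 3, u_{i+1} = u_i² and d_t the product of u at block boundaries, an error-free run satisfies d_t ≡ u_0 · d_{t−1}^(2^L) (mod n) for block length L.

```python
def prp_with_gec(n, iterations, block, corrupt_at=None):
    """Repeated squaring of 3 mod n with a Gerbicz-style check every `block` steps.

    Returns (final_value, ok). `corrupt_at` is a hypothetical test hook that adds 1
    to the running value at that iteration to imitate a hardware error.
    """
    u0 = 3 % n
    u = u0
    d = u0  # running product of u at block boundaries, starting with u_0
    for i in range(1, iterations + 1):
        u = u * u % n
        if i == corrupt_at:
            u = (u + 1) % n  # simulated error
        if i % block == 0:
            d_prev, d = d, d * u % n
            # Error-free invariant: d_t == u_0 * d_{t-1}^(2^block) (mod n)
            if d != u0 * pow(d_prev, 1 << block, n) % n:
                return u, False
    return u, True

n = (1 << 61) - 1  # M61, a small Mersenne prime used only as a toy modulus
print(prp_with_gec(n, 1000, 100)[1])                  # clean run
print(prp_with_gec(n, 1000, 100, corrupt_at=250)[1])  # run with an injected error
```

The point of the design is that the check value is derived from earlier, already verified data, so a corrupted squaring chain disagrees with it at the next check.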

Or, someone else could run the P-1 for you by mutual agreement. There's a forum thread for that.

Last fiddled with by kriesel on 2022-11-15 at 16:51

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.