mersenneforum.org  

Old 2022-11-15, 06:07   #1
yuki0831
 
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref

2²×3 Posts
Can't run P-1 stage 2

Hello everyone.
I am running P-1 factoring with CUDAPm1-20190323-CUDA10.1,
but it cannot run stage 2.

Here is the log (earlier lines omitted):
"Using up to 22680M GPU memory.
Using B1 = 5377742 from savefile.
Continuing stage 1 from a partial result of M831199679 fft length = 48384K, iteration = 7750001
M831199679, 0x8c3440e5483d968d, n = 48384K, CUDAPm1 v0.22
Stage 1 complete, estimated total time = 30:44:06
Starting stage 1 gcd."
=== Stage 1 (B1) finished with no factor found, then it went on to stage 2 ===

CUDAPm1 v0.22
------- DEVICE 0 -------
name NVIDIA GeForce RTX 4090
Compatibility 8.9
clockRate (MHz) 2520
memClockRate (MHz) 10501
totalGlobalMem 25756565504
totalConstMem 65536
l2CacheSize 75497472
sharedMemPerBlock 49152
regsPerBlock 65536
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 1536
multiProcessorCount 128
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment 512
deviceOverlap 1

No NVIDIAGeForceRTX4090_fft.txt file found. Using default fft lengths.
For optimal fft selection, please run
./CUDAPm1 -cufftbench 1 8192 r
for some small r, 0 < r < 6 e.g.
CUDA reports 23006M of 24563M GPU memory free.
Using threads: norm1 512, mult 32, norm2 256.
No stage 2 checkpoint.
Using up to 4158M GPU memory.
Using B1 = 5377742 from savefile.
Continuing stage 2 from a partial result of M831199679 fft length = 48384K
Starting stage 2.
Using b1 = 5377742, b2 = 161332260, d = 210, e = 2, nrp = 1

Then the P-1 program shut down.

The worktodo.txt entry is:
Pminus1=1,2,831199679,-1,5377742,161332260,87



This runs on an RTX 4090 with an i7-7700. The GPU, CPU, and memory are not overclocked.
Desktop RAM: 48 GB

Deleting the save file didn't solve the problem.
Changing UnusedMem also didn't solve the problem.

I can't sieve such a big exponent :(
What can I do next?


-------------------
Note that I'm Japanese. I don't understand English very well. I will study it harder.
Attached Files
File Type: ini CUDAPm1.ini (4.4 KB, 9 views)
Old 2022-11-15, 08:21   #2
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1BBF₁₆ Posts

Quote:
Originally Posted by yuki0831 View Post
Hello everyone.
I am running P-1 factoring with CUDAPm1-20190323-CUDA10.1,
but it cannot run stage 2.
...
CUDAPm1 v0.22
------- DEVICE 0 -------
name NVIDIA GeForce RTX 4090
Compatibility 8.9
...
No NVIDIAGeForceRTX4090_fft.txt file found. Using default fft lengths.
For optimal fft selection, please run
./CUDAPm1 -cufftbench 1 8192 r
for some small r, 0 < r < 6 e.g.

CUDA reports 23006M of 24563M GPU memory free.
Using threads: norm1 512, mult 32, norm2 256.
No stage 2 checkpoint.
Using up to 4158M GPU memory.
Using B1 = 5377742 from savefile.
Continuing stage 2 from a partial result of M831199679 fft length = 48384K
Starting stage 2.
Using b1 = 5377742, b2 = 161332260, d = 210, e = 2, nrp = 1

...
I'm not surprised at all that CUDAPm1 is failing on such a high exponent. That exponent is about twice the highest I've been able to run in CUDAPm1 stage 2 on any GPU model I've tried it upon.
Despite its age, CUDAPm1 is considered alpha software. There are numerous bugs.
See https://www.mersenneforum.org/showthread.php?t=23389
especially http://www.mersenneforum.org/showpos...65&postcount=7.
Try smaller exponents, but first, make the effort to benchmark CUDAPm1 at least for the relevant fft lengths for your GPU and intended work.
Or, better, use the 4090 for TF, on exponents which need more, which would be much more productive. The 4090 has an extreme SP/DP performance ratio; strong SP for TF, comparatively slow DP performance for PRP or P-1.
Or, use gpuowl V6.11-380 for P-1 on the GPU, which is much more reliable and probably much more efficient.
(You'll need to start over, since CUDAPm1 and gpuowl save files are not compatible.)
When doing so, specify bounds matching the mersenne.ca GPU72 row at https://www.mersenne.ca/exponent/831199679
B1=4500000,B2=220000000;Pfactor=0,1,2,831199679,-1,87,1
or the equivalent for a smaller exponent.
Or use prime95 v30.8 or mlucas on the CPU.
Thank you for communicating in English. For Japanese I'd need Google translate.
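
For readers unfamiliar with worktodo entries, here is a minimal Python sketch that unpacks the Pminus1 line from the first post. It assumes the usual prime95-style field order (k, b, n, c for the number k·b^n+c, then B1, B2, and an optional trial-factoring bit level). The names `Pminus1Entry` and `parse_pminus1` are made up for illustration and are not part of CUDAPm1, gpuowl, or prime95.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pminus1Entry:
    k: int
    b: int
    n: int
    c: int
    b1: int
    b2: int
    tf_bits: Optional[int]  # trial-factoring depth in bits, if the line gives one


def parse_pminus1(line: str) -> Pminus1Entry:
    """Parse 'Pminus1=k,b,n,c,B1,B2[,tf_bits]' (assumed field order)."""
    key, sep, value = line.strip().partition("=")
    if key != "Pminus1" or not sep:
        raise ValueError(f"not a Pminus1 line: {line!r}")
    fields = value.split(",")
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(fields)}")
    k, b, n, c, b1, b2 = (int(f) for f in fields[:6])
    tf_bits = int(fields[6]) if len(fields) > 6 else None
    return Pminus1Entry(k, b, n, c, b1, b2, tf_bits)


entry = parse_pminus1("Pminus1=1,2,831199679,-1,5377742,161332260,87")
print(entry)
print(f"B2 / B1 = {entry.b2 / entry.b1:.1f}")
```

For this entry B2 is exactly 30 times B1, which is a quick sanity check that the two bounds were read in the right order.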
Old 2022-11-15, 09:20   #3
LaurV
Romulan Interpreter
 
"name field"
Jun 2011
Thailand

3·5·683 Posts

Quote:
Originally Posted by kriesel View Post
but first, make the effort to benchmark CUDAPm1 at least for the relevant fft lengths for your GPU and intended work.
+1. @yuki, please benchmark that card for both TF and PRP and send results to James, as described here or here.

Last fiddled with by LaurV on 2022-11-15 at 09:21
Old 2022-11-15, 10:04   #4
yuki0831
 
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref

2²·3 Posts
P-1 probability calculator says so / how far to TF exponent 831M? / benchmark later

Hello, everyone

quote
When doing so, specify bounds matching the mersenne.ca GPU72 row at https://www.mersenne.ca/exponent/831199679
B1=4500000,B2=220000000;Pfactor=0,1,2,831199679,-1,87,1
or the equivalent for a smaller exponent.
quote end


https://www.mersenne.ca/prob.php
The P-1 probability calculator there has a "Calculate bounds from desired probability or effort" form (right-hand menu). I entered:
exponent 831199679
Probability 4%
pre-P-1 TF 85 (actually 87, but the form offers no such choice)
"B1=B2" and a fixed B1 were left unset
The calculator says:

Seeking 4.000000% chance of factor for exponent M831199679
M831199679, factored to 85.000 bits, with B1=5377742 and B2=161332260
Probability = 4.000000%
Should take about 874.766220 GHz-days (using FFT size 46,080K)
Recommended RAM allocation: (Prime95 v30.7 and earlier)min=7GB; small=22GB; low=39GB; mid=108GB; good=160GB; great=177GB; huge=350GB;
Pminus1=1,2,831199679,-1,5377742,161332260,85
Pfactor=1,2,831199679,-1,85,2

So that is where the B1 and B2 I used for P-1 came from. Are they strange?

-------------------------------------------------------------------------------
You say to do more factoring on that exponent.
For M831.1M, TF from 87 to 88 bits takes 4 days to crunch.
The GPU72 row says TF up to 84 bits.
To what bit level should I TF such an exponent on the RTX 4090?

Gpuowl runs like this:

2022-11-14 13:28:33 NVIDIA GeForce RTX 4090-0 831199679 OK 1340000 0.16%; 9388 us/it; ETA 90d 04:12; bd0a75175c134213 (check 4.56s)
2022-11-14 13:30:11 NVIDIA GeForce RTX 4090-0 831199679 OK 1350000 0.16%; 9388 us/it; ETA 90d 04:10; e2b60ba16ea7fe56 (check 4.57s)
2022-11-14 13:31:50 NVIDIA GeForce RTX 4090-0 831199679 OK 1360000 0.16%; 9389 us/it; ETA 90d 04:10; 47006a1fafd2a44f (check 4.59s)
2022-11-14 13:33:28 NVIDIA GeForce RTX 4090-0 831199679 OK 1370000 0.16%; 9388 us/it; ETA 90d 04:06; 2a84df7a9b03db3c (check 4.58s)
2022-11-14 13:35:07 NVIDIA GeForce RTX 4090-0 831199679 OK 1380000 0.17%; 9426 us/it; ETA 90d 12:44; 30a216173c25bc2b (check 4.70s)
Like that: an ETA of 90 days.
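
As a rough cross-check of that ETA, here is a small Python sketch that multiplies the remaining iterations by the logged time per iteration. It assumes the run needs about as many iterations as the exponent, and it ignores the periodic check and save time, so it will not match the program's own ETA to the minute. The helper names are hypothetical.

```python
def eta_seconds(total_iters: int, done_iters: int, us_per_iter: float) -> float:
    """Remaining run time in seconds, from microseconds per iteration."""
    return (total_iters - done_iters) * us_per_iter / 1e6


def format_eta(seconds: float) -> str:
    """Format seconds as 'Dd HH:MM', similar to the ETA field in the log."""
    minutes = int(seconds // 60)
    days, rem = divmod(minutes, 24 * 60)
    hours, mins = divmod(rem, 60)
    return f"{days}d {hours:02d}:{mins:02d}"


# Values taken from the first log line above: iteration 1340000 at 9388 us/it.
prp_eta = eta_seconds(total_iters=831199679, done_iters=1340000, us_per_iter=9388)
print("PRP ETA estimate:", format_eta(prp_eta))
```

The same arithmetic explains why a P-1 stage 1 of a few million iterations finishes in well under a day at a similar speed per iteration.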

Kriesel suggested running P-1 with gpuowl, so I did.

2022-11-15 17:47:31 gpuowl v6.11-147-g3b8b00e
2022-11-15 17:47:31 config:
2022-11-15 17:47:31 config: -device 0
2022-11-15 17:47:31 config: -log 50000
2022-11-15 17:47:31 config: -nospin
2022-11-15 17:47:31 config: -maxAlloc 20000MB
2022-11-15 17:47:31 device 0, unique id ''
2022-11-15 17:47:31 NVIDIA GeForce RTX 4090-0 831199679 FFT 49152K: Width 256x4, Height 256x8, Middle 12; 16.51 bits/word
2022-11-15 17:47:36 NVIDIA GeForce RTX 4090-0 OpenCL args "-DEXP=831199679u -DWIDTH=1024u -DSMALL_HEIGHT=2048u -DMIDDLE=12u -DWEIGHT_STEP=0xb.336fe6c616dd8p-3 -DIWEIGHT_STEP=0xb.6d78ec08279b8p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=1 -cl-fast-relaxed-math -cl-std=CL2.0"
2022-11-15 17:47:36 NVIDIA GeForce RTX 4090-0

2022-11-15 17:47:36 NVIDIA GeForce RTX 4090-0 OpenCL compilation in 0.08 s
2022-11-15 17:47:37 NVIDIA GeForce RTX 4090-0 831199679 P1 B1=4500000, B2=220000000; 6492506 bits; starting at 0
2022-11-15 17:49:18 NVIDIA GeForce RTX 4090-0 831199679 P1 10000 0.15%; 10059 us/it; ETA 0d 18:07; e143ba364abb1776
2022-11-15 17:50:57 NVIDIA GeForce RTX 4090-0 831199679 P1 20000 0.31%; 9922 us/it; ETA 0d 17:50; c088e4f7e89079a0
2022-11-15 17:52:38 NVIDIA GeForce RTX 4090-0 831199679 P1 30000 0.46%; 10079 us/it; ETA 0d 18:06; 3575e4be9489d98f
2022-11-15 17:52:39 NVIDIA GeForce RTX 4090-0 saved
2022-11-15 17:54:20 NVIDIA GeForce RTX 4090-0 831199679 P1 40000 0.62%; 10213 us/it; ETA 0d 18:18; 19bb0d493499a5a3
2022-11-15 17:56:00 NVIDIA GeForce RTX 4090-0 831199679 P1 50000 0.77%; 10022 us/it; ETA 0d 17:56; afc1262fa37b51cb
2022-11-15 17:57:40 NVIDIA GeForce RTX 4090-0 saved
2022-11-15 17:57:43 NVIDIA GeForce RTX 4090-0 831199679 P1 60000 0.92%; 10238 us/it; ETA 0d 18:18; e370fec1b482457f

Like that. It is much faster than I thought. I hope this task will finish.

TBP is roughly 230-250 W. GPU load is 100%, but power consumption is 50%.



I will send the benchmark to him.

----------------------------------------------------------------------------
I also want to mention noise during TF on the GPU, though I think it is off-topic.
When crunching TF on the RTX 4090, coil noise occurs.
That noise annoys me. It is not fan noise.
Old 2022-11-15, 10:19   #5
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7,103 Posts

Quote:
Originally Posted by yuki0831 View Post
You say to do more factoring on that exponent.
For M831.1M, TF from 87 to 88 bits takes 4 days to crunch.
The GPU72 row says TF up to 84 bits.
To what bit level should I TF such an exponent on the RTX 4090?
No, none. M831199679 has had too much TF done already. Any more on it would be wasted time.
There are other exponents that need TF and the 4090 is excellent at TF, relatively much slower at anything that needs FP64.
(PRP, P-1, LL are better done on the CPU, or a GPU such as Radeon VII or most NVIDIA Tesla models.)
https://www.techpowerup.com/gpu-spec...rtx-4090.c3889
"FP32 (float) performance 82.58 TFLOPS
FP64 (double) performance 1,290 GFLOPS (1:64)" so PRP, P-1 etc is not effective use of the 4090.
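
The "(1:64)" in that spec line can be checked directly from the two quoted throughput figures. A tiny Python sketch of the arithmetic (the variable names are only illustrative):

```python
# Figures quoted above from the TechPowerUp spec page.
fp32_gflops = 82.58 * 1000   # 82.58 TFLOPS expressed in GFLOPS
fp64_gflops = 1290.0         # 1,290 GFLOPS

ratio = fp32_gflops / fp64_gflops
print(f"FP32:FP64 throughput ratio is about {ratio:.1f}:1")
```

Since PRP, P-1 and LL rely on double precision, only the smaller of the two numbers matters for them, while TF can use the larger one.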


Quote:
Originally Posted by yuki0831 View Post
Kriesel suggested running P-1 with gpuowl, so I did.

2022-11-15 17:47:31 gpuowl v6.11-147-g3b8b00e
...
Why such an old version? Why not v6.11-382? See http://www.mersenneforum.org/showpos...35&postcount=2

Last fiddled with by kriesel on 2022-11-15 at 10:41
Old 2022-11-15, 11:29   #6
moebius
 
Jul 2009
Germany

1203₈ Posts

Quote:
Originally Posted by yuki0831 View Post
I will send the benchmark to him.
Please also run a PRP benchmark with the exponent 77936867 for this gpuOwl-specific list, using (as kriesel said) v6.11-382 or higher.

gpuOwl benchmarks online (new link)

It is customary to post the results in this thread

https://www.mersenneforum.org/showth...787#post617787

Last fiddled with by moebius on 2022-11-15 at 11:44
moebius is offline   Reply With Quote
Old 2022-11-15, 13:05   #7
yuki0831
 
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref

2²×3 Posts
Can't run the 831M exponent with v6.11-382-g98ff9c7-dirty

After downloading that file, I can't run this program. It reports errors
and then exits.
The error occurs with proof power 12, and also with power 8.



log.txt
2022-11-15 21:54:03 config: -maxAlloc 20000MB
2022-11-15 21:54:03 config: -fft 48M
2022-11-15 21:54:03 config: -proof 12
2022-11-15 21:54:03 device 0, unique id ''
2022-11-15 21:54:03 NVIDIA GeForce RTX 4090-0 831199679 FFT: 48M 4K:12:512 (16.51 bpw)
2022-11-15 21:54:03 NVIDIA GeForce RTX 4090-0 Expected maximum carry32: 543E0000
2022-11-15 21:54:08 NVIDIA GeForce RTX 4090-0 OpenCL args "-DEXP=831199679u -DWIDTH=4096u -DSMALL_HEIGHT=512u -DMIDDLE=12u -DPM1=0 -DWEIGHT_STEP_MINUS_1=0xc.cdbf9b185b77p-5 -DIWEIGHT_STEP_MINUS_1=-0x9.250e27efb0c9p-5 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2022-11-15 21:54:08 NVIDIA GeForce RTX 4090-0

2022-11-15 21:54:08 NVIDIA GeForce RTX 4090-0 OpenCL compilation in 0.02 s
2022-11-15 21:54:13 NVIDIA GeForce RTX 4090-0 831199679 OK 0 loaded: blockSize 400, 0000000000000003
2022-11-15 21:54:13 NVIDIA GeForce RTX 4090-0 validating proof residues for power 12
2022-11-15 21:54:13 NVIDIA GeForce RTX 4090-0 Proof using power 12
2022-11-15 21:54:25 NVIDIA GeForce RTX 4090-0 831199679 EE 800 0.00%; 9394 us/it; ETA 90d 08:59; 0000000000000000 (check 4.40s)
2022-11-15 21:54:29 NVIDIA GeForce RTX 4090-0 831199679 OK 0 loaded: blockSize 400, 0000000000000003
2022-11-15 21:54:41 NVIDIA GeForce RTX 4090-0 831199679 EE 800 0.00%; 9349 us/it; ETA 89d 22:38; 0000000000000000 (check 4.37s) 1 errors
2022-11-15 21:54:46 NVIDIA GeForce RTX 4090-0 831199679 OK 0 loaded: blockSize 400, 0000000000000003
2022-11-15 21:54:58 NVIDIA GeForce RTX 4090-0 831199679 EE 800 0.00%; 9357 us/it; ETA 90d 00:27; 0000000000000000 (check 4.37s) 2 errors
2022-11-15 21:54:58 NVIDIA GeForce RTX 4090-0 3 sequential errors, will stop.
2022-11-15 21:54:58 NVIDIA GeForce RTX 4090-0 Exiting because "too many errors"
2022-11-15 21:54:58 NVIDIA GeForce RTX 4090-0 Bye

Old 2022-11-15, 14:05   #8
yuki0831
 
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref

2²×3 Posts
Also can't run 77936867 with the new version

2022-11-15 23:03:19 config: -maxAlloc 10000MB
2022-11-15 23:03:19 config: -proof 12
2022-11-15 23:03:19 config: -device 0
2022-11-15 23:03:19 config: -nospin
2022-11-15 23:03:19 config: -log 10000
2022-11-15 23:03:19 device 0, unique id ''
2022-11-15 23:03:19 NVIDIA GeForce RTX 4090-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2022-11-15 23:03:19 NVIDIA GeForce RTX 4090-0 Expected maximum carry32: 583B0000
2022-11-15 23:03:20 NVIDIA GeForce RTX 4090-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2022-11-15 23:03:20 NVIDIA GeForce RTX 4090-0

2022-11-15 23:03:20 NVIDIA GeForce RTX 4090-0 OpenCL compilation in 0.02 s
2022-11-15 23:03:20 NVIDIA GeForce RTX 4090-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2022-11-15 23:03:20 NVIDIA GeForce RTX 4090-0 validating proof residues for power 12
2022-11-15 23:03:20 NVIDIA GeForce RTX 4090-0 Proof using power 12
2022-11-15 23:03:21 NVIDIA GeForce RTX 4090-0 77936867 EE 800 0.00%; 753 us/it; ETA 0d 16:18; 0000000000000000 (check 0.35s)
2022-11-15 23:03:21 NVIDIA GeForce RTX 4090-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2022-11-15 23:03:22 NVIDIA GeForce RTX 4090-0 77936867 EE 800 0.00%; 750 us/it; ETA 0d 16:14; 0000000000000000 (check 0.35s) 1 errors
2022-11-15 23:03:23 NVIDIA GeForce RTX 4090-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2022-11-15 23:03:24 NVIDIA GeForce RTX 4090-0 77936867 EE 800 0.00%; 746 us/it; ETA 0d 16:10; 0000000000000000 (check 0.35s) 2 errors
2022-11-15 23:03:24 NVIDIA GeForce RTX 4090-0 3 sequential errors, will stop.
2022-11-15 23:03:24 NVIDIA GeForce RTX 4090-0 Exiting because "too many errors"
2022-11-15 23:03:24 NVIDIA GeForce RTX 4090-0 Bye



Neither M831 nor M77 can run on the latest gpuowl version.
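
To make failures like these easier to spot in a long log, here is a minimal Python sketch that pulls out the check lines. It assumes that "OK" and "EE" mark a passed and a failed error check, and that the 16 hex digits on each line are the residue. The regular expression is fitted only to the v6.11 lines pasted above and may need changes for other versions. `LOG_RE` and `scan_log` are hypothetical names.

```python
import re

# exponent, status (OK or EE), iteration, then the first 16-hex-digit residue.
LOG_RE = re.compile(
    r"(?P<exponent>\d+) (?P<status>OK|EE)\s+(?P<iteration>\d+)\b"
    r".*?\b(?P<res64>[0-9a-f]{16})\b"
)


def scan_log(lines):
    """Return (exponent, status, iteration, res64) for every check line found."""
    results = []
    for line in lines:
        m = LOG_RE.search(line)
        if m:
            results.append(
                (int(m["exponent"]), m["status"], int(m["iteration"]), m["res64"])
            )
    return results


sample = [
    "2022-11-15 21:54:13 NVIDIA GeForce RTX 4090-0 831199679 OK 0 loaded: blockSize 400, 0000000000000003",
    "2022-11-15 21:54:25 NVIDIA GeForce RTX 4090-0 831199679 EE 800 0.00%; 9394 us/it; ETA 90d 08:59; 0000000000000000 (check 4.40s)",
    "2022-11-15 21:54:41 NVIDIA GeForce RTX 4090-0 831199679 EE 800 0.00%; 9349 us/it; ETA 89d 22:38; 0000000000000000 (check 4.37s) 1 errors",
]

results = scan_log(sample)
failed = [r for r in results if r[1] == "EE"]
zero_residue = [r for r in results if set(r[3]) == {"0"}]
print(f"{len(failed)} failed checks, {len(zero_residue)} all-zero residues")
```

An all-zero residue right after the first block suggests the computation itself is producing garbage, not an occasional memory glitch, though that reading is only an inference from these few lines.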
Old 2022-11-15, 14:26   #9
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1BBF₁₆ Posts

Other downloadable versions are listed at https://www.mersenneforum.org/showpo...39&postcount=4 as part of the large reference info collection. Try v6.11-318, the oldest I could build that supports PRP proof generation.

Perhaps that particular GPU uses less than perfectly reliable memory, that is producing errors. If so it might be resolvable by lowering or limiting memory clock rate. (Some Radeon VIIs contain Samsung memory and do not run reliably at rated clock rate, for example, and require ~13% lower to improve reliability.)
Quote:
Originally Posted by yuki0831 View Post
Neither M831 nor M77 can run on the latest gpuowl version.
I think you meant M831M & M77M there.

Last fiddled with by kriesel on 2022-11-15 at 14:27
Old 2022-11-15, 14:53   #10
yuki0831
 
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref

C₁₆ Posts

GpuOwl-v7.2-2-ga135d8d


2022-11-15 23:22:24 GpuOwl VERSION v7.2-2-ga135d8d
2022-11-15 23:22:24 Note: not found 'config.txt'
2022-11-15 23:22:24 device 0, unique id ''
2022-11-15 23:22:24 NVIDIA GeForce RTX 4090-0 888888887 FFT: 48M 4K:12:512 (17.66 bpw)
2022-11-15 23:22:27 NVIDIA GeForce RTX 4090-0 888888887 OpenCL args "-DEXP=888888887u -DWIDTH=4096u -DSMALL_HEIGHT=512u -DMIDDLE=12u -DCARRY64=1 -DCARRYM64=1 -DMM_CHAIN=3u -DMM2_CHAIN=3u -DMAX_ACCURACY=1 -DULTRA_TRIG=1 -DWEIGHT_STEP_MINUS_1=0x8.7c83084423358p-5 -DIWEIGHT_STEP_MINUS_1=-0xd.6a42b2e25641p-6 -cl-std=CL2.0 -cl-finite-math-only "
2022-11-15 23:22:27 NVIDIA GeForce RTX 4090-0 888888887

2022-11-15 23:22:27 NVIDIA GeForce RTX 4090-0 888888887 OpenCL compilation in 0.02 s
2022-11-15 23:22:27 NVIDIA GeForce RTX 4090-0 888888887 maxAlloc: 0.0 GB
2022-11-15 23:22:27 NVIDIA GeForce RTX 4090-0 888888887 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
2022-11-15 23:22:27 NVIDIA GeForce RTX 4090-0 888888887 P1(0) 0 bits
2022-11-15 23:22:27 NVIDIA GeForce RTX 4090-0 888888887 PRP starting from beginning
2022-11-15 23:22:32 NVIDIA GeForce RTX 4090-0 888888887 OK 0 on-load: blockSize 400, 0000000000000003
2022-11-15 23:22:32 NVIDIA GeForce RTX 4090-0 888888887 validating proof residues for power 8
2022-11-15 23:22:32 NVIDIA GeForce RTX 4090-0 888888887 Proof using power 8
2022-11-15 23:22:47 NVIDIA GeForce RTX 4090-0 888888887 OK 800 0.00% 4aaba35497171881 10518 us/it + check 4.98s + save 1.20s; ETA 108d 05:05
2022-11-15 23:24:23 NVIDIA GeForce RTX 4090-0 888888887 10000 0.00% b6dd1397ccbf3843 10505 us/it


gpuowl-v7.2-112-gd6ad1e0-dirty
20221115 23:32:18 GpuOwl VERSION v7.2-112-gd6ad1e0-dirty
20221115 23:32:18 Note: not found 'config.txt'
20221115 23:32:18 device 0, unique id ''
20221115 23:32:18 NVIDIA GeForce RTX 4090-0 worktodo.txt : line "PRP=888888887" does not end with a newline
20221115 23:32:18 NVIDIA GeForce RTX 4090-0 Exiting because "lines must end with newline"
20221115 23:32:18 NVIDIA GeForce RTX 4090-0 Bye
20221115 23:32:43 GpuOwl VERSION v7.2-112-gd6ad1e0-dirty
20221115 23:32:43 Note: not found 'config.txt'
20221115 23:32:43 device 0, unique id ''
20221115 23:32:43 NVIDIA GeForce RTX 4090-0 888888887 FFT: 48M 4K:12:512 (17.66 bpw)
20221115 23:32:44 NVIDIA GeForce RTX 4090-0 888888887 OpenCL args "-DEXP=888888887u -DWIDTH=4096u -DSMALL_HEIGHT=512u -DMIDDLE=12u -DCARRY64=1 -DMM_CHAIN=3u -DMM2_CHAIN=3u -DWEIGHT_STEP=0.26519919981465162 -DIWEIGHT_STEP=-0.20961062878754794 -DIWEIGHTS={0,-0.37528464187438465,-0.21946144264396836,-0.024771151220951103,-0.39075956048056493,-0.23879628128201763,-0.048928692509090981,-0.40585114753781953,} -DFWEIGHTS={0,0.6007290152116348,0.28116669007020512,0.025400347059014595,0.64138808774544498,0.31370876863843733,0.051445871748747582,0.68307991483271158,} -cl-std=CL2.0 -cl-finite-math-only "
20221115 23:32:47 NVIDIA GeForce RTX 4090-0 888888887

20221115 23:32:47 NVIDIA GeForce RTX 4090-0 888888887 OpenCL compilation in 2.67 s
20221115 23:32:47 NVIDIA GeForce RTX 4090-0 888888887 maxAlloc: 0.0 GB
20221115 23:32:47 NVIDIA GeForce RTX 4090-0 888888887 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
20221115 23:32:47 NVIDIA GeForce RTX 4090-0 888888887 P1(0) 0 bits
20221115 23:32:47 NVIDIA GeForce RTX 4090-0 888888887 PRP starting from beginning
20221115 23:32:52 NVIDIA GeForce RTX 4090-0 888888887 OK 0 on-load: blockSize 400, 0000000000000003
20221115 23:32:52 NVIDIA GeForce RTX 4090-0 888888887 validating proof residues for power 8
20221115 23:32:52 NVIDIA GeForce RTX 4090-0 888888887 Proof using power 8
20221115 23:33:06 NVIDIA GeForce RTX 4090-0 888888887 OK 800 0.00% 4aaba35497171881 9746 us/it + check 4.66s + save 1.20s; ETA 100d 06:31
20221115 23:34:36 NVIDIA GeForce RTX 4090-0 888888887 10000 b6dd1397ccbf3843 9821
20221115 23:36:15 NVIDIA GeForce RTX 4090-0 888888887 20000 0cbe7b6e26a8c3ef 9871
20221115 23:37:54 NVIDIA GeForce RTX 4090-0 888888887 30000 74b16c6f1795f798 9920
20221115 23:39:33 NVIDIA GeForce RTX 4090-0 888888887 40000 df3001f19129200e 9872

Both v7.x versions run correctly now.

Can I use them instead?

v6.11-382 still reports errors, even with the VRAM clock lowered.
Old 2022-11-15, 15:07   #11
yuki0831
 
"Yuki@karoushi"
Feb 2020
Japan, Chiba pref

1100₂ Posts

v6.11-382 for P-1
2022-11-15 23:58:18 config: -maxAlloc 10000MB
2022-11-15 23:58:18 config: -proof 12
2022-11-15 23:58:18 config: -device 0
2022-11-15 23:58:18 config: -nospin
2022-11-15 23:58:18 config: -log 10000
2022-11-15 23:58:18 config: 4K:12:512
2022-11-15 23:58:18 config: block 10000
2022-11-15 23:58:18 device 0, unique id ''
2022-11-15 23:58:18 NVIDIA GeForce RTX 4090-0 831199679 FFT: 48M 4K:12:512 (16.51 bpw)
2022-11-15 23:58:18 NVIDIA GeForce RTX 4090-0 Expected maximum carry32: 543E0000
2022-11-15 23:58:22 NVIDIA GeForce RTX 4090-0 OpenCL args "-DEXP=831199679u -DWIDTH=4096u -DSMALL_HEIGHT=512u -DMIDDLE=12u -DPM1=1 -DCARRYM64=1 -DWEIGHT_STEP_MINUS_1=0xc.cdbf9b185b77p-5 -DIWEIGHT_STEP_MINUS_1=-0x9.250e27efb0c9p-5 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2022-11-15 23:58:22 NVIDIA GeForce RTX 4090-0

2022-11-15 23:58:22 NVIDIA GeForce RTX 4090-0 OpenCL compilation in 0.02 s
2022-11-15 23:58:23 NVIDIA GeForce RTX 4090-0 831199679 P1 B1=4500000, B2=220000000; 6492506 bits; starting at 0
2022-11-15 23:59:58 NVIDIA GeForce RTX 4090-0 831199679 P1 10000 0.15%; 9439 us/it; ETA 0d 17:00; 0000000000000000

It is still crunching now.

I use v6 (v6.11-382) for stand-alone P-1 tasks.
I use v7 (gpuowl-v7.2-112-gd6ad1e0-dirty) for PRP.
Sometimes I use mfaktc-0.2.1.win.cuda11.2 2042 for TF.

Do you mind if I use different versions when searching for primes?
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
