mersenneforum.org  

Old 2023-02-25, 00:46   #859
SethTro
 
"Seth"
Apr 2019

2×3×83 Posts

Thanks for the quick double check, weird that this is happening only on my machine :shrug: computers.

I restarted the machine with a lower memory speed. After initially thinking that worked, I discovered I had set SumInputsErrorCheck=0 in prime.txt; the SUMOUT error still reproduces.
Old 2023-02-25, 02:49   #860
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

17·487 Posts

Quote:
Originally Posted by SethTro View Post
I'm running a small ECM assignment for 76441 and getting reproducible SUMOUT errors on a Ryzen 3900x only with FMA3.
This caused prime95 to get stuck in a loop failing, restarting from backup, failing, ...
Investigating...
Old 2023-02-25, 04:23   #861
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

17×487 Posts

For now, do not use "SumInputsErrorCheck=1" with ECM.
Old 2023-03-05, 21:54   #862
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7821₁₀ Posts
In write_gwnum, unexpected gwtogiant failure, retcode -1

Prime95 v30.8b15, on Win7/ dual xeon E5645 ECC ram system:

One worker is having serious issues (other is fine). Some excerpts:
comm window
Code:
[Comm thread Feb 26 16:41:25] Sending expected completion date for M333043493: Jun 19 2023
[Comm thread Feb 26 16:41:25] Sending expected completion date for M74218931: Feb 28 2023
[Comm thread Feb 26 16:41:25] Done communicating with server.
[Main thread Feb 27 10:52:54] In write_gwnum, unexpected gwtogiant failure, retcode -1
[Main thread Feb 27 11:22:54] In write_gwnum, unexpected gwtogiant failure, retcode -1
[Main thread Feb 27 11:52:54] In write_gwnum, unexpected gwtogiant failure, retcode -1
[Main thread Feb 27 12:22:54] In write_gwnum, unexpected gwtogiant failure, retcode -1
(It would be good if the comm window message identified the worker generating the error. This matters more as the number of workers running in parallel grows.)

worker 1 window:
Code:
[Mon Feb 27 10:52:54 2023]
In write_gwnum, unexpected gwtogiant failure, retcode -1
Error writing intermediate file: p333043493
Errno: 34, Result too large
DOSerrno: 2
[Mon Feb 27 11:22:54 2023]
In write_gwnum, unexpected gwtogiant failure, retcode -1
Error writing intermediate file: p333043493
...
(continues for about a week, without successfully writing a save file.)
Stop worker, rename p333043493 to wasp333043493,
resume worker from p333043493.bu, cross fingers...
Jacobi check passed during the restart.
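
For context on the Jacobi check mentioned above, here is a toy sketch of the idea on small exponents. It relies on a known property of the Lucas-Lehmer sequence started at 4: for odd p, the Jacobi symbol of (s_i - 2) modulo 2^p - 1 is -1 for every i >= 1, so any other value signals a corrupted residue. The function names are made up for illustration, and this is not a description of how prime95 implements the check internally. A passing check does not prove the residue is good (a random corruption is expected to slip through roughly half the time), which would be consistent with a save file passing the check and still misbehaving later.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a:
        while a % 2 == 0:              # pull out factors of 2
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                    # quadratic reciprocity step
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def ll_test_with_jacobi(p):
    """Toy Lucas-Lehmer test of 2**p - 1 for an odd prime p.

    Returns (is_prime, checks_passed), where checks_passed is False if any
    iteration produced a residue whose Jacobi symbol was not -1.
    """
    m = (1 << p) - 1
    s = 4
    checks_passed = True
    for _ in range(p - 2):
        s = (s * s - 2) % m
        if jacobi(s - 2, m) != -1:     # hypothetical stand-in for the real check
            checks_passed = False
    return s == 0, checks_passed


print(ll_test_with_jacobi(13))
print(ll_test_with_jacobi(11))
```

Running the check on every iteration is only affordable at toy sizes; at 100M digits a single check took several minutes in the logs in this thread.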

in Directory of C:\Users\ken\Documents\prime95x64
Code:
02/27/2023  10:22 AM        41,630,508 p333043493.bu
02/27/2023  09:52 AM        41,630,508 p333043493.bu2
02/27/2023  09:28 AM        41,630,508 p333043493.bu3
02/26/2023  09:28 PM        41,630,508 p333043493.bu4
02/27/2023  07:26 PM        41,630,508 wasp333043493
Nuts, the problem reoccurred after restarting from the .bu file.
The same problem occurred after restarting from .bu2:
Code:
[Mar 5 13:02:13] Worker starting
[Mar 5 13:02:13] Setting affinity to run worker on CPU core #1
[Mar 5 13:02:15] Setting affinity to run helper thread 1 on CPU core #2
[Mar 5 13:02:15] Setting affinity to run helper thread 2 on CPU core #3
[Mar 5 13:02:15] Setting affinity to run helper thread 3 on CPU core #4
[Mar 5 13:02:15] Setting affinity to run helper thread 4 on CPU core #5
[Mar 5 13:02:15] Setting affinity to run helper thread 5 on CPU core #6
[Mar 5 13:02:15] Trying backup intermediate file: p333043493.bu2
[Mar 5 13:02:19] Running Jacobi error check.  Passed.  Time: 354.439 sec.
[Mar 5 13:08:12] Resuming primality test of M333043493 using FFT length 18M, Pass1=4K, Pass2=4608, clm=4, 6 threads
[Mar 5 13:08:12] Iteration: 214768550 / 333043493 [64.48%].
[Mar 5 13:11:54] Iteration: 214770000 / 333043493 [64.48%], ms/iter: 153.382, ETA: 209d 23:10
[Mar 5 13:37:45] Iteration: 214780000 / 333043493 [64.49%], ms/iter: 154.785, ETA: 211d 20:49
[Mar 5 14:02:13] Error writing intermediate file: p333043493
[Mar 5 14:02:13] Errno: 34, Result too large
[Mar 5 14:02:13] DOSerrno: 2
[Mar 5 14:03:23] Iteration: 214790000 / 333043493 [64.49%], ms/iter: 153.811, ETA: 210d 12:24
[Mar 5 14:29:01] Iteration: 214800000 / 333043493 [64.49%], ms/iter: 153.839, ETA: 210d 12:54
[Mar 5 14:32:13] Error writing intermediate file: p333043493
[Mar 5 14:54:35] Iteration: 214810000 / 333043493 [64.49%], ms/iter: 153.362, ETA: 209d 20:49
[Mar 5 15:02:13] Error writing intermediate file: p333043493
[Mar 5 15:15:37] Error writing intermediate file: p333043493
[Mar 5 15:15:37] Stopping primality test of M333043493 at iteration 214818176 [64.50%]
[Mar 5 15:15:37] Worker stopped.
Note, hard drive has 155 GB free space; other worker is having no issues.
Temperatures of cpu cores and ram sticks look ok to me; all 78C or lower for max readings.
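
A small sketch for pulling the failed save attempts out of a worker log like the one above, to see how regularly they occur. The line format is taken from the excerpt; the function names and the assumed year are illustrative only, not anything prime95 provides.

```python
import re
from datetime import datetime

# Matches lines such as "[Mar 5 14:02:13] Error writing intermediate file: p333043493"
LINE_RE = re.compile(r"^\[(\w{3} +\d{1,2} \d{2}:\d{2}:\d{2})\] (.*)$")


def write_error_times(lines, year=2023):
    """Return datetimes of every 'Error writing intermediate file' line.

    The log carries no year, so one is assumed (hypothetical parameter).
    """
    times = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group(2).startswith("Error writing intermediate file"):
            stamp = " ".join(m.group(1).split())   # normalize spacing
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def intervals_seconds(times):
    """Seconds between consecutive events."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


sample = [
    "[Mar 5 14:02:13] Error writing intermediate file: p333043493",
    "[Mar 5 14:03:23] Iteration: 214790000 / 333043493 [64.49%], ms/iter: 153.811, ETA: 210d 12:24",
    "[Mar 5 14:32:13] Error writing intermediate file: p333043493",
    "[Mar 5 15:02:13] Error writing intermediate file: p333043493",
    "[Mar 5 15:15:37] Error writing intermediate file: p333043493",
]
print(intervals_seconds(write_error_times(sample)))
```

On the excerpt the failures land 30 minutes apart, which looks like every scheduled save failing, plus one more attempt when the worker was stopped.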

Trying a start from .bu3 now. I also have a .bu4, but it is not looking promising after all 3 of the later files failed.
Copies saved of p333043493, and the .bu3, .bu4.
If those don't work on that hardware either, I could try copying some over to another system/version/cpu-instruction-set & retry one more time.

Last fiddled with by kriesel on 2023-03-05 at 22:00
Old 2023-03-07, 13:36   #863
Andrew Usher
 
Dec 2022

3×13² Posts

I can't deny this is a serious issue, but if other attempts fail, it seems you should be able to complete the test (on some machine) by not writing any more save files. That can't be any worse than abandoning it, and it's what was happening, as you say, for a week before you restarted. (It would be nice if prime95 workers could be paused rather than stopped, retaining all current state in memory.)
Old 2023-03-07, 22:11   #864
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3²·11·79 Posts

Re post 862: Maybe George would like to weigh in on what is happening there & here, or receive & examine a save file.
The .bu4 file on the original (SSE2) system (oldest available save file from the run) also resulted in an iteration loop.
And a separate continuation attempt from the .bu4 file, on different hardware, Xeon Phi 7250 (AVX512) & ECC MCDRAM, Windows 10 Pro, prime95 v30.8b14 worker 2 of 2 also looped:
Quote:
[Mar 7 11:41:03] Worker starting
[Mar 7 11:41:03] Setting affinity to run worker on CPU core #35
[Mar 7 11:41:13] Setting affinity to run helper thread 1 on CPU core #36
...
[Mar 7 11:41:13] Setting affinity to run helper thread 33 on CPU core #68
[Mar 7 11:41:13] Trying backup intermediate file: p333043493.bu4
[Mar 7 11:41:19] Running Jacobi error check. Passed. Time: 1365.880 sec.
[Mar 7 12:04:02] Resuming primality test of M333043493 using AVX-512 FFT length 18M, Pass1=1152, Pass2=16K, clm=1, 34 threads
[Mar 7 12:04:02] Iteration: 214758750 / 333043493 [64.48%].
[Mar 7 12:04:18] Iteration: 214760000 / 333043493 [64.48%], ms/iter: 12.674, ETA: 17d 08:25
[Mar 7 12:06:25] Iteration: 214770000 / 333043493 [64.48%], ms/iter: 12.693, ETA: 17d 09:01
[Mar 7 12:08:31] Iteration: 214780000 / 333043493 [64.49%], ms/iter: 12.679, ETA: 17d 08:31
[Mar 7 12:10:39] Iteration: 214790000 / 333043493 [64.49%], ms/iter: 12.714, ETA: 17d 09:37
[Mar 7 12:10:57] Error writing intermediate file: p333043493
[Mar 7 12:10:57] Errno: 34, Result too large
[Mar 7 12:10:58] DOSerrno: 2
[Mar 7 12:12:45] Iteration: 214800000 / 333043493 [64.49%], ms/iter: 12.653, ETA: 17d 07:35
[Mar 7 12:14:52] Iteration: 214810000 / 333043493 [64.49%], ms/iter: 12.685, ETA: 17d 08:36
[Mar 7 12:16:59] Iteration: 214820000 / 333043493 [64.50%], ms/iter: 12.687, ETA: 17d 08:38
[Mar 7 12:19:06] Iteration: 214830000 / 333043493 [64.50%], ms/iter: 12.684, ETA: 17d 08:31
[Mar 7 12:21:13] Iteration: 214840000 / 333043493 [64.50%], ms/iter: 12.703, ETA: 17d 09:05
[Mar 7 12:23:21] Iteration: 214850000 / 333043493 [64.51%], ms/iter: 12.770, ETA: 17d 11:15
[Mar 7 12:25:28] Iteration: 214860000 / 333043493 [64.51%], ms/iter: 12.692, ETA: 17d 08:40
[Mar 7 12:27:35] Iteration: 214870000 / 333043493 [64.51%], ms/iter: 12.694, ETA: 17d 08:41
[Mar 7 12:29:42] Iteration: 214880000 / 333043493 [64.52%], ms/iter: 12.689, ETA: 17d 08:30
[Mar 7 12:31:49] Iteration: 214890000 / 333043493 [64.52%], ms/iter: 12.707, ETA: 17d 09:03
[Mar 7 12:33:57] Iteration: 214900000 / 333043493 [64.52%], ms/iter: 12.790, ETA: 17d 11:44
[Mar 7 12:36:04] Iteration: 214910000 / 333043493 [64.52%], ms/iter: 12.730, ETA: 17d 09:44
[Mar 7 12:38:11] Iteration: 214920000 / 333043493 [64.53%], ms/iter: 12.702, ETA: 17d 08:47
[Mar 7 12:40:19] Iteration: 214930000 / 333043493 [64.53%], ms/iter: 12.719, ETA: 17d 09:18
[Mar 7 12:40:57] Error writing intermediate file: p333043493
[Mar 7 12:42:26] Iteration: 214940000 / 333043493 [64.53%], ms/iter: 12.678, ETA: 17d 07:56
[Mar 7 12:44:33] Iteration: 214950000 / 333043493 [64.54%], ms/iter: 12.695, ETA: 17d 08:26
[Mar 7 12:46:40] Iteration: 214960000 / 333043493 [64.54%], ms/iter: 12.694, ETA: 17d 08:22
[Mar 7 12:48:47] Iteration: 214970000 / 333043493 [64.54%], ms/iter: 12.787, ETA: 17d 11:24
[Mar 7 12:50:55] Iteration: 214980000 / 333043493 [64.55%], ms/iter: 12.756, ETA: 17d 10:19
[Mar 7 12:53:03] Iteration: 214990000 / 333043493 [64.55%], ms/iter: 12.766, ETA: 17d 10:37
[Mar 7 12:55:10] Iteration: 215000000 / 333043493 [64.55%], ms/iter: 12.696, ETA: 17d 08:18
[Mar 7 12:55:10] ERROR: Invalid FFT data. Restarting from last save file.
[Mar 7 12:55:10] Possible hardware failure, consult readme.txt file.
[Mar 7 12:55:10] Continuing from last save file.
[Mar 7 12:55:10] Waiting five minutes before restarting.
[Mar 7 13:00:48] Setting affinity to run helper thread 2 on CPU core #37
...
[Mar 7 13:00:48] Setting affinity to run helper thread 33 on CPU core #68
[Mar 7 13:00:48] Trying backup intermediate file: p333043493.bu4
[Mar 7 13:00:55] Running Jacobi error check. Passed. Time: 1365.445 sec.
[Mar 7 13:23:37] Resuming primality test of M333043493 using AVX-512 FFT length 18M, Pass1=1152, Pass2=16K, clm=1, 34 threads
[Mar 7 13:23:37] Iteration: 214758750 / 333043493 [64.48%].
[Mar 7 13:23:37] Possible hardware errors have occurred during the test!
[Mar 7 13:23:37] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:23:37] Confidence in final result is excellent.
[Mar 7 13:23:56] Iteration: 214760000 / 333043493 [64.48%], ms/iter: 12.745, ETA: 17d 10:45
[Mar 7 13:23:56] Possible hardware errors have occurred during the test!
[Mar 7 13:23:56] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:23:56] Confidence in final result is excellent.
[Mar 7 13:26:04] Iteration: 214770000 / 333043493 [64.48%], ms/iter: 12.709, ETA: 17d 09:32
[Mar 7 13:26:04] Possible hardware errors have occurred during the test!
[Mar 7 13:26:04] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:26:04] Confidence in final result is excellent.
[Mar 7 13:28:11] Iteration: 214780000 / 333043493 [64.49%], ms/iter: 12.713, ETA: 17d 09:37
[Mar 7 13:28:11] Possible hardware errors have occurred during the test!
[Mar 7 13:28:11] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:28:11] Confidence in final result is excellent.
[Mar 7 13:30:18] Iteration: 214790000 / 333043493 [64.49%], ms/iter: 12.721, ETA: 17d 09:50
[Mar 7 13:30:18] Possible hardware errors have occurred during the test!
[Mar 7 13:30:18] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:30:18] Confidence in final result is excellent.
[Mar 7 13:32:25] Iteration: 214800000 / 333043493 [64.49%], ms/iter: 12.729, ETA: 17d 10:05
[Mar 7 13:32:25] Possible hardware errors have occurred during the test!
[Mar 7 13:32:25] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:32:25] Confidence in final result is excellent.
[Mar 7 13:34:33] Iteration: 214810000 / 333043493 [64.49%], ms/iter: 12.767, ETA: 17d 11:17
[Mar 7 13:34:33] Possible hardware errors have occurred during the test!
[Mar 7 13:34:33] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:34:33] Confidence in final result is excellent.
[Mar 7 13:36:40] Iteration: 214820000 / 333043493 [64.50%], ms/iter: 12.717, ETA: 17d 09:36
[Mar 7 13:36:40] Possible hardware errors have occurred during the test!
[Mar 7 13:36:40] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:36:40] Confidence in final result is excellent.
[Mar 7 13:38:47] Iteration: 214830000 / 333043493 [64.50%], ms/iter: 12.705, ETA: 17d 09:12
[Mar 7 13:38:47] Possible hardware errors have occurred during the test!
[Mar 7 13:38:47] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:38:47] Confidence in final result is excellent.
[Mar 7 13:40:55] Iteration: 214840000 / 333043493 [64.50%], ms/iter: 12.718, ETA: 17d 09:34
[Mar 7 13:40:55] Possible hardware errors have occurred during the test!
[Mar 7 13:40:55] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:40:55] Confidence in final result is excellent.
[Mar 7 13:40:57] Error writing intermediate file: p333043493
[Mar 7 13:40:57] Errno: 34, Result too large
[Mar 7 13:40:57] DOSerrno: 2
[Mar 7 13:43:02] Iteration: 214850000 / 333043493 [64.51%], ms/iter: 12.747, ETA: 17d 10:30
[Mar 7 13:43:02] Possible hardware errors have occurred during the test!
[Mar 7 13:43:02] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:43:02] Confidence in final result is excellent.
[Mar 7 13:45:10] Iteration: 214860000 / 333043493 [64.51%], ms/iter: 12.746, ETA: 17d 10:25
[Mar 7 13:45:10] Possible hardware errors have occurred during the test!
[Mar 7 13:45:10] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:45:10] Confidence in final result is excellent.
[Mar 7 13:47:17] Iteration: 214870000 / 333043493 [64.51%], ms/iter: 12.742, ETA: 17d 10:15
[Mar 7 13:47:17] Possible hardware errors have occurred during the test!
[Mar 7 13:47:17] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:47:17] Confidence in final result is excellent.
[Mar 7 13:49:25] Iteration: 214880000 / 333043493 [64.52%], ms/iter: 12.763, ETA: 17d 10:55
[Mar 7 13:49:25] Possible hardware errors have occurred during the test!
[Mar 7 13:49:25] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:49:25] Confidence in final result is excellent.
[Mar 7 13:51:33] Iteration: 214890000 / 333043493 [64.52%], ms/iter: 12.759, ETA: 17d 10:45
[Mar 7 13:51:33] Possible hardware errors have occurred during the test!
[Mar 7 13:51:33] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:51:33] Confidence in final result is excellent.
[Mar 7 13:53:40] Iteration: 214900000 / 333043493 [64.52%], ms/iter: 12.739, ETA: 17d 10:04
[Mar 7 13:53:40] Possible hardware errors have occurred during the test!
[Mar 7 13:53:40] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:53:40] Confidence in final result is excellent.
[Mar 7 13:55:48] Iteration: 214910000 / 333043493 [64.52%], ms/iter: 12.742, ETA: 17d 10:06
[Mar 7 13:55:48] Possible hardware errors have occurred during the test!
[Mar 7 13:55:48] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:55:48] Confidence in final result is excellent.
[Mar 7 13:57:56] Iteration: 214920000 / 333043493 [64.53%], ms/iter: 12.809, ETA: 17d 12:18
[Mar 7 13:57:56] Possible hardware errors have occurred during the test!
[Mar 7 13:57:56] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 13:57:56] Confidence in final result is excellent.
[Mar 7 14:00:26] Iteration: 214930000 / 333043493 [64.53%], ms/iter: 14.980, ETA: 20d 11:29
[Mar 7 14:00:26] Possible hardware errors have occurred during the test!
[Mar 7 14:00:26] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 14:00:26] Confidence in final result is excellent.
[Mar 7 14:02:33] Iteration: 214940000 / 333043493 [64.53%], ms/iter: 12.778, ETA: 17d 11:12
[Mar 7 14:02:33] Possible hardware errors have occurred during the test!
[Mar 7 14:02:33] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 14:02:33] Confidence in final result is excellent.
[Mar 7 14:04:56] Iteration: 214950000 / 333043493 [64.54%], ms/iter: 14.303, ETA: 19d 13:11
[Mar 7 14:04:56] Possible hardware errors have occurred during the test!
[Mar 7 14:04:56] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 14:04:56] Confidence in final result is excellent.
[Mar 7 14:07:23] Iteration: 214960000 / 333043493 [64.54%], ms/iter: 14.683, ETA: 20d 01:36
[Mar 7 14:07:23] Possible hardware errors have occurred during the test!
[Mar 7 14:07:23] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 14:07:23] Confidence in final result is excellent.
[Mar 7 14:09:47] Iteration: 214970000 / 333043493 [64.54%], ms/iter: 14.334, ETA: 19d 14:07
[Mar 7 14:09:47] Possible hardware errors have occurred during the test!
[Mar 7 14:09:47] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 14:09:47] Confidence in final result is excellent.
[Mar 7 14:10:57] Error writing intermediate file: p333043493
[Mar 7 14:11:54] Iteration: 214980000 / 333043493 [64.55%], ms/iter: 12.739, ETA: 17d 09:46
[Mar 7 14:11:54] Possible hardware errors have occurred during the test!
[Mar 7 14:11:54] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 14:11:54] Confidence in final result is excellent.
[Mar 7 14:14:02] Iteration: 214990000 / 333043493 [64.55%], ms/iter: 12.750, ETA: 17d 10:06
[Mar 7 14:14:02] Possible hardware errors have occurred during the test!
[Mar 7 14:14:02] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 14:14:02] Confidence in final result is excellent.
[Mar 7 14:16:09] Iteration: 215000000 / 333043493 [64.55%], ms/iter: 12.741, ETA: 17d 09:45
[Mar 7 14:16:09] Possible hardware errors have occurred during the test!
[Mar 7 14:16:09] 1 ILLEGAL SUMOUT/bad FFT data.
[Mar 7 14:16:09] Confidence in final result is excellent.
[Mar 7 14:16:09] ERROR: Invalid FFT data. Restarting from last save file.
[Mar 7 14:16:09] Possible hardware failure, consult readme.txt file.
[Mar 7 14:16:09] Continuing from last save file.
[Mar 7 14:16:09] Waiting five minutes before restarting.
[Mar 7 14:21:48] Setting affinity to run helper thread 2 on CPU core #37
...
[Mar 7 14:21:48] Setting affinity to run helper thread 33 on CPU core #68
[Mar 7 14:21:54] Running Jacobi error check.
(after another iteration loop not detailed above, I aborted the attempt and saved the resulting files, since it is looping too)
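
The loop in the log above shows up as the iteration counter jumping backwards. A minimal sketch for spotting that automatically, assuming the "Iteration: N / P" format seen in the excerpt; `find_rollbacks` is a made-up helper name.

```python
import re

ITER_RE = re.compile(r"Iteration: (\d+) / (\d+)")


def find_rollbacks(lines):
    """Return (from_iteration, to_iteration, iterations_lost) for each backward jump."""
    rollbacks = []
    last = None
    for line in lines:
        m = ITER_RE.search(line)
        if not m:
            continue
        current = int(m.group(1))
        if last is not None and current < last:
            rollbacks.append((last, current, last - current))
        last = current
    return rollbacks


sample = [
    "[Mar 7 12:55:10] Iteration: 215000000 / 333043493 [64.55%], ms/iter: 12.696, ETA: 17d 08:18",
    "[Mar 7 12:55:10] ERROR: Invalid FFT data. Restarting from last save file.",
    "[Mar 7 13:23:37] Iteration: 214758750 / 333043493 [64.48%].",
    "[Mar 7 13:23:56] Iteration: 214760000 / 333043493 [64.48%], ms/iter: 12.745, ETA: 17d 10:45",
]
print(find_rollbacks(sample))
```

Each restart here falls back to the same save-file iteration, 214758750, because no newer save file could be written in between.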

Lessons learned:
100Mdigit LL is hard.
It's harder on slow hardware.
All the software error checks and ECC ram don't solve everything.
If on prime95/mprime, configure the following at the outset, for more restart possibilities and for comparison between runs (four backup files are not deep enough):
set InterimFiles=50000000 or similar in prime.txt
set InterimResidues=10000000 or similar in prime.txt
Some errors cannot be recovered from with the .bu, .bu2, ... backup files, and they are not apparent until too late.
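
A rough estimate of what the two settings suggested in the list above would produce for this exponent. This assumes InterimFiles=N keeps an extra save file every N iterations and InterimResidues=N reports a residue every N iterations, and that each save file is about the size of those listed earlier in the thread; `interim_plan` is a hypothetical helper, not part of prime95.

```python
def interim_plan(exponent, interim_files, interim_residues, save_file_bytes):
    """Estimate interim save files, interim residues and disk use for one LL run.

    Assumes roughly `exponent` iterations in total, as the progress lines suggest.
    """
    n_files = exponent // interim_files
    n_residues = exponent // interim_residues
    return {
        "interim_files": n_files,
        "interim_residues": n_residues,
        "disk_bytes": n_files * save_file_bytes,
    }


plan = interim_plan(
    exponent=333043493,
    interim_files=50_000_000,
    interim_residues=10_000_000,
    save_file_bytes=41_630_508,   # size of the p333043493.bu files listed earlier
)
print(plan)
```

Under those assumptions the extra restart points cost about a quarter of a gigabyte of disk, which is small next to the weeks of computation they could save.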

Parallel LL runs have been started from scratch; one on prime95 v30.8b14 on the xeon phi 7250, and another in gpuowl v6.11-380 on a highly reliable Radeon VII. The GPU is indicating about 3.4 times the speed of the prime95 worker.
Gpuowl outputs interim residues every 10k; the prime95 worker is set currently for 1M.

Last fiddled with by kriesel on 2023-03-07 at 22:20
Old 2023-03-08, 11:28   #865
S485122
 
"Jacob"
Sep 2006
Brussels, Belgium

3642₈ Posts

Quote:
Originally Posted by kriesel View Post
...
Error writing intermediate file: p333043493
[Mar 7 12:10:57] Errno: 34, Result too large
[Mar 7 12:10:58] DOSerrno: 2
...
"Errno: 34, Result too large" suggests some overflow. Since it happens at write time, it concerns either the file being written to or some value to be written.
According to DOS Error Numbers, "DOSerrno: 2" would be "File not found".
What is the size of the pXXXXXXXXX.* files?
What is the filesystem of the place the program writes to? Could it be a filesize limitation?
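
The errno value quoted above can be looked up with Python's standard `errno` module, as a quick sanity check. Errno 34 is ERANGE on the common platforms; the message text comes from the platform's C library, so it will read "Result too large" on some systems and differently on others. Treating "DOSerrno" as a separate operating-system error code, rather than a second errno, is an assumption here.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, platform message) for a C-runtime errno value."""
    return errno.errorcode.get(code, "unknown"), os.strerror(code)


name, message = describe_errno(34)
print(name, "-", message)
```

ERANGE is normally set by numeric routines whose result does not fit, not by file I/O, which fits the reading that a value being converted overflowed rather than the disk write itself failing.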
Also, only the first of the recurring "1 ILLEGAL SUMOUT/bad FFT data." messages corresponds to an error actually occurring. From undoc.txt:
Quote:
"SumInputsErrorCheck=0 or 1" controls "SUM(INPUTS) != SUM(OUTPUTS)" error checks available in
pre-AVX FFTs (which makes it pretty much obsolete).
... the "count of errors during this test" message is output with every screen update.
Old 2023-03-08, 13:17   #866
Andrew Usher
 
Dec 2022

773₈ Posts

As his post already indicated, the file sizes are just over 40MB (essentially one residue), he has written many before, and he has plenty of free disk space. So it would seem impossible that file size or disk space has anything to do with it, though a hard disk error seems possible, as the trouble apparently started with writing and reading save files.

Otherwise the options seem to be a genuine hardware error, a rare bug in prime95/gwnum, or both.

A 'file not found' error, if that is what it is, would seem to implicate prime95 or the OS, but when accompanying another error I wouldn't trust that to be reported sensibly.
Old 2023-03-08, 14:43   #867
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7821₁₀ Posts

Quote:
Originally Posted by S485122 View Post
What is the size of the pXXXXXXXXX.* files ?
What is the filesystem of the place the program writes to ? Could it be a filesize limitation ?
The sizes of the files that get written are normal and matching, ~40MB each. I have no info on what it attempts and fails to do. There's an 18 GB p577215631.residues file in the same prime95 folder, on an NTFS drive with 166 GB free space. It successfully writes new files of 40MB after the iteration resets to earlier than it has reached. So it looks very unlikely to be a filesize limitation affecting expected file sizes, or even considerably larger ones.
333043493 / 8 /1024/1024 = 39.7 MiB.
(FAT32 is not for boot disks, or data I care about, when there are better alternatives, IMO. I use it reluctantly on USB memory sticks, where it sometimes prevents storing big files.)
It's occurred with the same data input, on two different systems, so unlikely to be a drive hardware error.
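
The size estimate above can be reproduced, and compared against the directory listing posted earlier, with a few lines. The helper name is illustrative; reading the small difference as header or checksum overhead is a guess, not something documented here.

```python
def residue_bytes(exponent):
    """Bytes needed to hold one residue modulo 2**exponent - 1 (one bit per exponent bit)."""
    return (exponent + 7) // 8


exponent = 333043493
observed_file_bytes = 41_630_508          # p333043493.bu from the directory listing

raw = residue_bytes(exponent)
print(f"{raw} bytes = {raw / 1024 / 1024:.1f} MiB")
print(f"overhead beyond the raw residue: {observed_file_bytes - raw} bytes")
```

The save files are only a few dozen bytes larger than one raw residue, so nothing about their size is unusual.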

Last fiddled with by kriesel on 2023-03-08 at 14:48
Old 2023-03-08, 15:32   #868
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

17·487 Posts

Ken, have you tried forcing use of a 20M FFT length to get past the trouble iterations?
Old 2023-03-08, 15:44   #869
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7821₁₀ Posts

Quote:
Originally Posted by Prime95 View Post
have you tried forcing use of a 20M FFT length to get past the trouble iterations?
No. How? I looked for a way to do that and must have missed it (in v30.8b15 readme, undoc, whatsnew). I was also contemplating giving v30.11b1 a try on the 7250 xeon phi.

Last fiddled with by kriesel on 2023-03-08 at 15:58
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
