mersenneforum.org  

Old 2022-01-18, 15:55   #331
LaurV
Romulan Interpreter
 
 
"name field"
Jun 2011
Thailand

2×17×293 Posts
Default

Quote:
Originally Posted by LaurV View Post
30.8, stage 1 found a factor for 7105589, but the file was still in the folder together with the other stage 1 files and was moved to where stage 2 runs. Then, when running stage 2, P95 crashes as soon as it reaches that exponent and cannot be restarted past it. After the first restart and crash, the file is renamed to extension "write"; after the second restart, to extension "bad1"; the third restart says "bad file assignment skipped for now".

After deleting the file, stage 2 can continue correctly with the other files.
Problem repeated with 7122781, so it is for sure a P95 bug. After deleting the file, it resumed working normally. It wasted some time, as I was not in front of the computer, so it had to wait for me to come back from work to restart it. On the other hand, the bug is minor, as it is not every day that you find a 105-bit factor in stage 1, AND want to keep the file to run stage 2 on it...
Old 2022-01-18, 16:40   #332
petrw1
1976 Toyota Corona years forever!
 
 
"Wayne"
Nov 2006
Saskatchewan, Canada

3×11×157 Posts
Default

Quote:
Originally Posted by Xyzzy View Post
[Work thread Jan 16 10:18] Using 229377MB of memory. D: 90090, 8640x34699 polynomial multiplication.

That RAM would do it... With that much RAM you could get away with a smaller B1 and still find plenty of factors.
B1=1M might do it, but it's up to you.
Old 2022-01-19, 21:01   #333
SethTro
 
 
"Seth"
Apr 2019

1B0₁₆ Posts
Default

After a benchmark, the other threads didn't restart

I was running 1 worker with 8 threads and htop was showing 800% CPU usage. This morning it's showing 1 thread at 100% CPU usage.


Code:
[Work thread Jan 18 23:36] Setting affinity to run worker on CPU core #2
[Work thread Jan 18 23:36]
[Work thread Jan 18 23:36] P-1 on M7508981 with B1=1000000000, B2=500000000000
[Work thread Jan 18 23:36] Setting affinity to run helper thread 1 on CPU core #3
[Work thread Jan 18 23:36] Setting affinity to run helper thread 2 on CPU core #4
[Work thread Jan 18 23:36] Setting affinity to run helper thread 3 on CPU core #5
[Work thread Jan 18 23:36] Using AVX FFT length 384K, Pass1=384, Pass2=1K, clm=2, 8 threads
[Work thread Jan 18 23:36] Setting affinity to run helper thread 7 on CPU core #9
[Work thread Jan 18 23:36] Setting affinity to run helper thread 6 on CPU core #8
[Work thread Jan 18 23:36] Setting affinity to run helper thread 4 on CPU core #6
[Work thread Jan 18 23:36] Setting affinity to run helper thread 5 on CPU core #7
[Work thread Jan 18 23:37] M7508981 stage 1 is 97.07% complete.
[Work thread Jan 19 00:16] M7508981 stage 1 is 97.37% complete. Time: 2368.379 sec.
...
[Work thread Jan 19 05:32] M7508981 stage 1 is 99.77% complete. Time: 2370.455 sec.
[Main thread Jan 19 05:36] Benchmarking multiple workers to tune FFT selection.
[Work thread Jan 19 05:36] Worker stopped while running needed benchmarks.
[Main thread Jan 19 05:36] Timing 384K FFT, 8 cores, 1 worker.  Average times:  0.35 ms.  Total throughput: 2876.95 iter/sec.
[Main thread Jan 19 05:37] Timing 384K FFT, 8 cores, 1 worker.  Average times:  0.43 ms.  Total throughput: 2305.95 iter/sec.
...
[Main thread Jan 19 05:39] Timing 384K FFT, 8 cores, 1 worker.  Average times:  0.41 ms.  Total throughput: 2429.03 iter/sec.
[Main thread Jan 19 05:39]
[Main thread Jan 19 05:39] Throughput benchmark complete.
[Work thread Jan 19 05:39] Benchmarks complete, restarting worker.
[Work thread Jan 19 05:39]
[Work thread Jan 19 05:39] P-1 on M7508981 with B1=1000000000, B2=500000000000
[Work thread Jan 19 05:39] Setting affinity to run helper thread 2 on CPU core #4
[Work thread Jan 19 05:39] Setting affinity to run helper thread 1 on CPU core #3
[Work thread Jan 19 05:39] Setting affinity to run helper thread 3 on CPU core #5
[Work thread Jan 19 05:39] Setting affinity to run helper thread 6 on CPU core #8
[Work thread Jan 19 05:39] Using AVX FFT length 384K, Pass1=384, Pass2=1K, clm=2, 8 threads
[Work thread Jan 19 05:39] Setting affinity to run helper thread 7 on CPU core #9
[Work thread Jan 19 05:39] Setting affinity to run helper thread 4 on CPU core #6
[Work thread Jan 19 05:39] Setting affinity to run helper thread 5 on CPU core #7
[Work thread Jan 19 05:39] M7508981 stage 1 is 99.80% complete.
I saw this 7 hours later, and by then it had finished the last 0.2%. When I tried to stop the worker it wouldn't stop, and I was forced to kill -9 mprime.
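A minimal sketch of how one could spot this kind of slowdown from the log, assuming the "Time:" value on a progress line is the elapsed time since the previous progress line (that reading is an assumption, not something the log states). The function name `seconds_per_percent` is hypothetical, and lines elided with "..." in the log above cannot be compared this way.

```python
import re

# Matches progress lines in the format shown in the log above.
# The "Time:" part is optional because the first line after a (re)start lacks it.
PROGRESS = re.compile(
    r"stage 1 is (\d+(?:\.\d+)?)% complete\.(?: Time: (\d+(?:\.\d+)?) sec\.)?"
)

def seconds_per_percent(log_lines):
    """Return (percent, seconds needed per 1% of progress) for each timed line.

    Assumes "Time:" is the elapsed time since the previous progress line.
    """
    rates = []
    previous = None
    for line in log_lines:
        match = PROGRESS.search(line)
        if not match:
            continue
        percent = float(match.group(1))
        timed = match.group(2) is not None
        if previous is not None and timed and percent > previous:
            rates.append((percent, float(match.group(2)) / (percent - previous)))
        previous = percent
    return rates
```

A sudden jump in the seconds-per-percent figure between adjacent reports would suggest that helper threads are no longer doing work.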
Old 2022-01-20, 00:20   #334
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

2⁴×17×29 Posts
Default

Quote:
Originally Posted by LaurV View Post
Problem repeated with 7122781, so it is for sure a P95 bug.
Next time this happens can you send me the offending save file? Thanks.
Old 2022-01-20, 07:51   #335
LaurV
Romulan Interpreter
 
 
"name field"
Jun 2011
Thailand

2·17·293 Posts
Default

Quote:
Originally Posted by Prime95 View Post
Next time this happens can you send me the offending save file? Thanks.
Sure. I have to watch for when a stage 1 factor is found (stage 1 runs one or two days before stage 2) and grab the file, as I don't know whether the change to ".write" and ".bad1" is just a rename or whether the content is changed too. Will do that.
Old 2022-01-20, 09:58   #336
LaurV
Romulan Interpreter
 
 
"name field"
Jun 2011
Thailand

23352₈ Posts
Default

Ha!


At exactly the same time I was posting the above, I decided to have a remote look at that system (long live anydesk!) and found P95 crashed with the same issue, this time for M7128227, whose file was already in ".write" status, i.e. too late to catch the error. So I decided to reconstruct it by doing stage 1 again.

I sent you the two files by email (quite large, even zipped), but you can reproduce it more easily, and faster than you can read my email, hehe. Just add to your worktodo a line like "Pminus1=N/A,1,2,7128227,-1,1993731,1993731", or choose any exponent with a P-1 factor discoverable in stage 1. If you choose the same exponent, you will get an "m7128227" file identical to the one I sent you, and you will find the correct factor.

At this point, a normal guy would discard the file and look for another exponent to play with, so no harm would ever be done.

But if we are hunting for large factors or working for the "twok" project, and want to run stage 2 on ALL the exponents/files kept from stage 1, we have no way to know which of them found factors unless we go through them by hand one by one, so we just run stage 2 on all of them.***

So, put into your worktodo file a line like "Pminus1=N/A,1,2,7128227,-1,1993731,1993731,97" (to force stage 2) and run P95 again. P95 will crash: it renames the file to "m7128227.write" and forcefully exits from memory. As I correctly guessed, the content of the file is modified too: some bytes (which look like a residue) are changed to "00000002".

That is the second file I sent you. You can do a binary compare of the two and see the modified bytes.
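A minimal sketch of such a binary comparison, for anyone who wants to check their own save files. The helper names `differing_offsets` and `compare_files` are hypothetical, and nothing here depends on the internal layout of the save file.

```python
def differing_offsets(data_a, data_b):
    """Return the byte offsets at which two byte strings differ.

    Only the overlapping prefix is compared; a length mismatch is
    reported separately by compare_files().
    """
    shared = min(len(data_a), len(data_b))
    return [i for i in range(shared) if data_a[i] != data_b[i]]

def compare_files(path_a, path_b):
    """Read both files fully and return (differing offsets, size difference)."""
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        data_a, data_b = file_a.read(), file_b.read()
    return differing_offsets(data_a, data_b), len(data_a) - len(data_b)
```

For example, `compare_files("m7128227", "m7128227.write")` would list the offsets where the file saved after stage 1 and the file left behind by the crash disagree.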

So, when you come home in the evening, you swear two times for the lost time and restart P95; "m7128227.write" is renamed to "m7128227.bad1" and the job continues.** Another restart is needed later, to delete the ".bad1" file (by hand) and edit the worktodo manually to get rid of the naughty line (which is skipped every time as long as the bad1 file is there; otherwise, if you just delete the file, stage 1 will run again and again).

By the way, if you modify the worktodo to something like "Pminus1=N/A,1,2,7128227,-1,1993761,1993761,97" (so the new P-1 does a bit of stage 1 too, before going directly to stage 2), then it works normally.
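A minimal sketch that splits the Pminus1 lines quoted above into named fields. The field names are an assumed reading (an assignment ID, then k, b, n, c for the number k·b^n+c, then B1 and B2); the thread only says that the trailing "97" is there to force stage 2, so it is kept as an uninterpreted extra field. The function name `parse_pminus1` is hypothetical.

```python
def parse_pminus1(line):
    """Split a 'Pminus1=...' worktodo line into a dict of fields.

    Field names are an assumption for illustration; anything after B2
    is returned untouched in 'extra'.
    """
    keyword, separator, rest = line.strip().partition("=")
    if keyword != "Pminus1" or not separator:
        raise ValueError("not a Pminus1 line: %r" % line)
    fields = rest.split(",")
    if len(fields) < 7:
        raise ValueError("expected at least 7 fields, got %d" % len(fields))
    return {
        "assignment_id": fields[0],   # "N/A" in the lines quoted above
        "k": int(fields[1]),
        "b": int(fields[2]),
        "n": int(fields[3]),
        "c": int(fields[4]),
        "B1": int(fields[5]),
        "B2": int(fields[6]),
        "extra": fields[7:],          # e.g. ["97"], used above to force stage 2
    }
```

With this, the crashing case from the post is the one where `B1 == B2` matches the bounds of the existing save file and `extra` is non-empty, while the working variant uses a slightly larger B1 so that a little stage 1 work is done first.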

_____________
Edits:
** to clarify, when the file is renamed from ".write" to ".bad1", its content is not changed.
*** therefore one feature request or fast/dirty solution, would be to modify the behavior of the KeepPminus1SaveFiles switch, to have 0, 1, 2, etc, to keep the files if stage 1 or 2, or if a factor was/was not found.

Last fiddled with by LaurV on 2022-01-20 at 10:17
Old 2022-01-20, 23:02   #337
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013

2³·3²·41 Posts
Default

Quote:
Originally Posted by Zhangrc View Post
See some of my previous posts for more information.
This CPU is AMD R7 4800H, which has 8 cores and 16 threads. Prime95 is set to run 1 worker at core 2,3,4,5 (not hyperthreaded). However, it seems that Prime95 is switching between threads of each core.
For comparison, this is my CPU usage when running LLDC (M63759613, 1 worker using core 1,2,3,4, not hyperthreaded).
Switching between hyperthreads of a core should have basically zero performance impact. All of the CPU caches are shared. So a bug, perhaps, but an inconsequential one.
Old 2022-01-20, 23:19   #338
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

1111011010000₂ Posts
Default

Quote:
Originally Posted by Mark Rose View Post
Switching between hyperthreads of a core should have basically zero performance impact. All of the CPU caches are shared. So a bug, perhaps, but an inconsequential one.
Indeed prime95 assigns affinity to cores. That is, on a hyperthreaded CPU the first prime95 thread is assigned to logical cores 1 and 2 (assuming one-based numbering of logical cores).
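A minimal sketch of the mapping described above, assuming two hyperthreads per physical core and that sibling hyperthreads are numbered adjacently (both are assumptions; the actual numbering depends on the OS and on what hwloc reports). The function name is hypothetical.

```python
def logical_cores_for_thread(thread_index, threads_per_core=2):
    """Map a 1-based prime95 thread index to its 1-based logical core numbers.

    Assumes sibling hyperthreads are numbered adjacently, so thread 1
    gets logical cores 1 and 2, thread 2 gets 3 and 4, and so on.
    """
    if thread_index < 1:
        raise ValueError("thread_index is 1-based")
    first = (thread_index - 1) * threads_per_core + 1
    return list(range(first, first + threads_per_core))
```

Under these assumptions, a thread that hops between its two logical cores stays on the same physical core, which is why the switching is expected to be harmless.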
Old 2022-01-21, 01:51   #339
Zhangrc
 
"University student"
May 2021
Beijing, China

2×53 Posts
Default

Quote:
Originally Posted by Mark Rose View Post
Switching between hyperthreads of a core should have basically zero performance impact
But on my machine, it does have a performance impact (because switching requires extra computation, which shows up as CPU usage of the "system" process).
Usually it switches once every 8 seconds. However, at the beginning of stage 2, I see it switching at intervals of less than a second. That would be a huge performance impact.
I'm pretty sure that if I disabled hyperthreading, the problem would resolve itself. However, I didn't see such an option in my BIOS.

I spotted another (minor) bug:
If stage 2 is stopped in the middle and restarted, the percentage quickly goes to 100%, which doesn't reflect the actual progress.

Last fiddled with by Zhangrc on 2022-01-21 at 02:22
Old 2022-01-21, 03:35   #340
LaurV
Romulan Interpreter
 
 
"name field"
Jun 2011
Thailand

2×17×293 Posts
Default

Quote:
Originally Posted by LaurV View Post
one feature request or fast/dirty solution, would be to modify the behavior of the KeepPminus1SaveFiles switch, to have 0, 1, 2, etc, to keep the files if stage 1 or 2, or if a factor was/was not found.
Re-reading, that was just blah-blah-blah that didn't make any sense.

What I intended to say: setting the switch to 0 or 1 deletes, respectively keeps, all the files, as is the case now; but add an option "2" to delete the file only if a factor is found. That would eliminate the issue (without solving it, but with the advantage of giving us more flexibility to play with). Of course, the correct solution would be to find why P95 crashes and make it handle the situation correctly, by deleting the assignment from worktodo, or continuing stage 2, whatever; but my real issue is not the crash, it is the wasted time.
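A minimal sketch of the proposed decision rule, to make the request concrete. Values 0 and 1 follow the current behavior as described in this post; value 2 is only the proposal here, not an existing prime95 option, and the function name `should_keep_save_file` is hypothetical.

```python
def should_keep_save_file(setting, factor_found):
    """Decide whether a P-1 save file is kept after the run.

    0: delete all files (current behavior, per the post)
    1: keep all files (current behavior, per the post)
    2: PROPOSED, not implemented: delete only if a factor was found
    """
    if setting == 0:
        return False
    if setting == 1:
        return True
    if setting == 2:
        return not factor_found
    raise ValueError("unsupported setting: %r" % (setting,))
```

With setting 2, a save file whose stage 1 already found a factor would never reach the stage 2 folder, so the crash described earlier would not be triggered.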

Last fiddled with by LaurV on 2022-01-21 at 03:37
Old 2022-01-21, 05:53   #341
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

2⁴×17×29 Posts
Default

30.8 build 9

Only new feature is P-1 now works for non-Mersennes.
Linked with latest hwloc library -- maybe Windows will behave better (I'm not real hopeful).
A few bugs fixed: stage 2 % complete, LaurV/Anton save file issue.

Windows 64-bit: https://mersenne.org/ftp_root/gimps/p95v308b9.win64.zip
Linux 64-bit: https://mersenne.org/ftp_root/gimps/...linux64.tar.gz

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
