mersenneforum.org  

2022-10-02, 20:03   #749
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1BC2₁₆ Posts

Quote:
Originally Posted by kriesel
Two outstanding issues as far as I know (haven't tested v30.8b17 yet)

1) Observed on Windows 7 Pro x64, dual Xeon E5-2670, prime95 V30.8b15, using start /Node 0 or 1, /affinity 0x5555, running two instances, intended as one each side of the QPI;
when a worker window assigns cores, the following message is produced repeatedly, with a variety of hex values consisting of 3 or c at various offsets:
Error setting affinity to cpuset 0x000000c0: No error
(refer to attachment of https://mersenneforum.org/showpost.p...&postcount=731)
Hex 3 is binary 0011 and hex c is binary 1100, so each cpuset has two adjacent bits set.
Windows numbers the two logical cores of 2-way hyperthreaded physical core #0 as 0,1,
while Linux numbers them 0,n, where n is the number of hyperthreaded physical cores present in the system.
So it appears to me that prime95, a Windows application, may be using an inappropriate affinity mask for Windows. We don't usually want two prime95 compute threads running on the same physical core.
... except when using hyperthreading while performing TF, which we'd rather do on GPUs anyway for performance. I've been starting the dual instances with the equivalent of /node 0 and /node 1 of the following,
Code:
start /node 0 /affinity 0x5555 /d "C:\Users\ ... \prime95-x64" prime95.exe
It never occurred to me that prime95 would default to no hyperthreading for FFT multiplies, and generally benchmark faster without hyperthreading, yet insist on being able to set both hyperthreads of the same physical core as available, even trying to override the user.
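
To make the mask arithmetic above concrete, here is a minimal Python sketch (not prime95 code). It lists the logical processors selected by a mask and checks whether a cpuset from the error message fits inside the mask passed to start /affinity. The function names are hypothetical, and the pairing of logical processors 2k and 2k+1 on one physical core is the Windows numbering described above, taken as an assumption.

```python
def logical_cpus(mask: int) -> list:
    """Return the logical processor numbers whose bits are set in mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def physical_cores(mask: int) -> list:
    """Map set bits to physical cores, ASSUMING siblings are numbered 2k and 2k+1."""
    return sorted({cpu // 2 for cpu in logical_cpus(mask)})


def fits_within(cpuset: int, process_mask: int) -> bool:
    """True if every logical processor in cpuset is also allowed by process_mask."""
    return cpuset & ~process_mask == 0


start_mask = 0x5555   # from: start /affinity 0x5555
cpuset = 0x000000C0   # from the error message

print(logical_cpus(start_mask))          # even-numbered logical processors
print(logical_cpus(cpuset))              # the two siblings requested
print(physical_cores(cpuset))            # physical core under the assumed numbering
print(fits_within(cpuset, start_mask))   # does 0x5555 allow the whole cpuset?
print(fits_within(cpuset, 0xFFFF))       # does 0xFFFF allow it?
```

Under these assumptions, cpuset 0x000000c0 asks for logical processors 6 and 7, while 0x5555 allows only the even-numbered ones. Whether that mismatch is what triggers the message is a guess, not a diagnosis.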

Should I be using 0xFFFF instead, which allows 16 threads on 8 cores, and let the threads flop about between logical processors? That seems counter to the documentation's guidance. From readme.txt:
Code:
Use hyperthreading
------------------

Except for trial factoring, which is best left for GPUs to do, hyperthreading often offers no performance
benefit while using more electricity.  You can try test if hyperthreading speeds up your worker windows by
selecting these options.
Or for that matter, "not recommended" built into the GUI dialog boxes.

I found in early experimentation that without the affinity mask included, the Windows 7 start /Node 1 command did not properly place the second instance on NUMA node 1, instead running both instances on the various hyperthreads of node 0 (CPU package 0), saturating all its logical processors, and leaving the second processor package idle, reducing performance.

To restate:
Code:
start /node 0 /affinity 0x5555 /d "C:\Users\ ... \prime95-x64" prime95.exe
start /node 1 /affinity 0x5555 /d "C:\Users\ ... \2\prime95-x64" prime95.exe
Good performance, but error messages;

without affinity masks,
Code:
start /node 0 /d "C:\Users\ ... \prime95-x64" prime95.exe
start /node 1 /d "C:\Users\ ... \2\prime95-x64" prime95.exe
No error messages, but both land on node 0, so half the performance.

Using affinity mask 0xFFFF in the start command seems to resolve the error messages, but the selection of hyperthreads is irregular in that case. Its hyperthread use is also not stable over time, switching between logical cores of a physical core somewhat.
Attached Thumbnails: no HT configuration prime95.png; logical core hopping without hyperthread enabled in prime95.png

Last fiddled with by kriesel on 2022-10-02 at 20:43

2022-10-02, 22:18   #750
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×11×17×19 Posts
Rapid logical processor switching

On the dual E5-2670 Win 7, again:

Node 1, using affinity mask 0xFFFF, is alternately using all 16 logical cores, rapidly changing between them. This node is doing P-1. Affinity-related error messages remain, plus there's a new variant ("No such file or directory"). That variant does not relate to the optional proof storage directory or the temporary directory, because those fields are blank. The path specified in the start command is known to be valid since it was copied and pasted from Explorer, and it is on the system's boot drive.
Code:
[Oct 2 15:55:30] Worker starting
[Oct 2 15:55:30] Setting affinity to run worker on CPU core #1
[Oct 2 15:55:30] Error setting affinity to cpuset 0x00000003: No such file or directory
[Oct 2 15:55:30] Optimal P-1 factoring of M116509073 using up to 49152MB of memory.
[Oct 2 15:55:30] Assuming no factors below 2^77 and 1.3 primality tests saved if a factor is found.
[Oct 2 15:55:30] Optimal bounds are B1=805000, B2=201211000
[Oct 2 15:55:30] Chance of finding a factor is an estimated 5.23%
[Oct 2 15:55:30] 
[Oct 2 15:55:32] Setting affinity to run helper thread 1 on CPU core #2
[Oct 2 15:55:32] Setting affinity to run helper thread 2 on CPU core #3
[Oct 2 15:55:32] Using AVX FFT length 6400K, Pass1=640, Pass2=10K, clm=2, 4 threads
[Oct 2 15:55:32] Setting affinity to run helper thread 3 on CPU core #4
[Oct 2 15:55:32] Error setting affinity to cpuset 0x0000000c: No error
[Oct 2 15:55:32] Error setting affinity to cpuset 0x00000030: No error
[Oct 2 15:55:32] Error setting affinity to cpuset 0x000000c0: No error
[Oct 2 15:55:33] Ignoring suggested B2 value, using B2=201653100 from the save file
[Oct 2 15:55:37] Available memory is 49089MB.
[Oct 2 15:55:39] Setting affinity to run helper thread 1 on CPU core #2
[Oct 2 15:55:39] Setting affinity to run helper thread 2 on CPU core #3
[Oct 2 15:55:39] Error setting affinity to cpuset 0x00000030: No error
[Oct 2 15:55:39] Switching to AVX FFT length 7M, Pass1=448, Pass2=16K, clm=4, 4 threads
[Oct 2 15:55:39] Estimated stage 2 vs. stage 1 runtime ratio: 0.890
[Oct 2 15:55:39] Error setting affinity to cpuset 0x0000000c: No error
[Oct 2 15:55:39] Setting affinity to run helper thread 3 on CPU core #4
[Oct 2 15:55:39] Error setting affinity to cpuset 0x000000c0: No error
[Oct 2 15:55:40] Using 49055MB of memory.  D: 1650, 200x687 polynomial multiplication.
[Oct 2 15:55:51] Setting affinity to run polymult helper thread on CPU core #2
[Oct 2 15:55:51] Setting affinity to run polymult helper thread on CPU core #3
[Oct 2 15:55:51] Setting affinity to run polymult helper thread on CPU core #4
[Oct 2 15:55:51] Error setting affinity to cpuset 0x0000000c: No error
[Oct 2 15:55:51] Error setting affinity to cpuset 0x00000030: No error
[Oct 2 15:55:51] Error setting affinity to cpuset 0x000000c0: No error
[Oct 2 15:58:41] Stage 2 init complete. 5235 transforms. Time: 186.005 sec.
[Oct 2 15:59:52] M116509073 stage 2 at B2=58167450 [16.98%]
[Oct 2 16:25:20] M116509073 stage 2 at B2=72373950 [25.20%].  Time: 1528.234 sec.
[Oct 2 16:51:05] M116509073 stage 2 at B2=86580450 [33.42%].  Time: 1544.864 sec.
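
For tallying these messages across a longer run, a small Python sketch like the following could help. It is not part of prime95; the regular expression is an assumption based only on the line format shown in the log above, and the helper name is hypothetical.

```python
import re
from collections import Counter

# Pattern ASSUMED from the log lines quoted above, e.g.
# "[Oct 2 15:55:32] Error setting affinity to cpuset 0x0000000c: No error"
AFFINITY_ERROR = re.compile(
    r"Error setting affinity to cpuset 0x([0-9a-fA-F]+): (.+)$"
)


def tally_affinity_errors(lines):
    """Count (cpuset mask, reason) pairs among affinity error lines."""
    counts = Counter()
    for line in lines:
        match = AFFINITY_ERROR.search(line.rstrip())
        if match:
            mask = int(match.group(1), 16)
            counts[(mask, match.group(2))] += 1
    return counts


sample = [
    "[Oct 2 15:55:30] Error setting affinity to cpuset 0x00000003: No such file or directory",
    "[Oct 2 15:55:32] Error setting affinity to cpuset 0x0000000c: No error",
    "[Oct 2 15:55:32] Using AVX FFT length 6400K, Pass1=640, Pass2=10K, clm=2, 4 threads",
    "[Oct 2 15:55:39] Error setting affinity to cpuset 0x0000000c: No error",
]

# Print one summary line per distinct (mask, reason) pair.
for (mask, reason), n in sorted(tally_affinity_errors(sample).items()):
    print(f"cpuset 0x{mask:08x}: {reason} (x{n})")
```

Applied to the full log above, this would count one "No such file or directory" for 0x00000003 and three "No error" messages each for 0x0000000c, 0x00000030, and 0x000000c0.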
Attached Thumbnails: rapid logical processor switching.png

2022-10-02, 23:22   #751
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×11×17×19 Posts

Same system as previous post:
A second instance, specified in start command to run on NUMA node 1, without affinity mask used, lands on node 0 instead. But no error messages. Note first instance's iteration time more than doubled, and NUMA node 1 is idle.
Attached Thumbnails: node 1 without affinity mask lands on node 0 instead.png

2022-10-04, 23:35   #752
storm5510
Random Account
Aug 2009
Not U. + S.A.
2382₁₀ Posts

Quote:
Originally Posted by kriesel
Same system as previous post:
A second instance, specified in start command to run on NUMA node 1, without affinity mask used, lands on node 0 instead. But no error messages. Note first instance's iteration time more than doubled, and NUMA node 1 is idle.
NUMA: Non-Uniform Memory Access.

Off-topic: Why would the BIOS in a single CPU system have this enabled? It is on my old Xeon...

2022-10-05, 17:08   #753
bplenhart
"Brian Lenhart"
Oct 2013
B₁₆ Posts
FreeBSD version of mprime

Are there any plans to update the FreeBSD version past 30.7b9?

2022-10-07, 01:12   #754
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
1111110010100₂ Posts

Quote:
Originally Posted by bplenhart
Are there any plans to update the FreeBSD version past 30.7b9?
Try 30.9b1: https://www.dropbox.com/s/1ag6cxl8f5...64.tar.gz?dl=0

It actually will be easier to build 30.9b3 rather than a 30.8 version.

2022-10-10, 10:00   #755
tha
Dec 2002
2²·5·43 Posts

In light of all these changes, the math page listed on the home page could use an update.

2022-10-13, 15:11   #756
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×11×17×19 Posts
s2 s1 ratio

After a stop and continue during stage 2 P-1, the output line
Code:
Estimated stage 2 vs. stage 1 runtime ratio: x.xxx
seems to consider only the remaining portion of stage 2, not the full stage 2 run time.
So for an actual stage 2 / stage 1 ratio of ~1, if stage 2 is 80% complete when resumed, the resumed run will report a ratio of ~0.2.
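
The arithmetic in that example can be written out as a short Python sketch. It assumes, as the observation above suggests but does not confirm, that the reported figure is the full ratio scaled by the fraction of stage 2 still remaining. The function name is hypothetical and this is not prime95's code.

```python
def reported_ratio(full_ratio: float, fraction_done: float) -> float:
    """Ratio shown after a resume, IF only the remaining stage 2 work is counted."""
    if not 0.0 <= fraction_done <= 1.0:
        raise ValueError("fraction_done must be between 0 and 1")
    return full_ratio * (1.0 - fraction_done)


# Full stage 2 / stage 1 ratio of about 1, resumed with stage 2 80% complete.
print(round(reported_ratio(1.0, 0.80), 3))  # 0.2
```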

2022-10-23, 11:50   #757
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·11·17·19 Posts
logging feature request

Somewhat similar to how gpuowl does things, please add an option in prime95 to log everything that appears in a worker window to a log file (open for append, write, close). Overnight, Windows Update restarted a system that was in the midst of a ~3.5 day 100Mdigit P-1 benchmarking run, and all prime95 worker window content (timing for stages & GCD) was lost. That content does not appear anywhere else (prime.log, results.txt, etc.).
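
The open-for-append, write, close pattern requested above could look like the following Python sketch. It illustrates the idea only; it is not how prime95 or gpuowl implement logging, and the function name and file name are hypothetical.

```python
import os
import tempfile
from datetime import datetime


def append_log_line(path: str, text: str) -> None:
    """Open the log for append, write one timestamped line, and close it again."""
    stamp = datetime.now().strftime("%b %d %H:%M:%S")
    # The with-block closes the file after every line, so lines already
    # written are not left in this process's buffer if it is killed later.
    with open(path, "a", encoding="utf-8") as log:
        log.write(f"[{stamp}] {text}\n")


# Hypothetical file name, placed in the temp directory for this example.
log_path = os.path.join(tempfile.gettempdir(), "worker1_example.log")
append_log_line(log_path, "Worker starting")
append_log_line(log_path, "Stage 2 init complete.")
```

Closing after each write hands the line to the operating system, which should be enough to survive an orderly restart. Guarding against a sudden power loss would additionally need a flush followed by `os.fsync` before closing.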

2022-10-25, 21:46   #758
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×11×17×19 Posts
Low cpu speed on prime95 V30.8b14 Xeon Phi 7250

This combo reports various cpu speeds, with the value in the second line fluctuating around 1/4 to 1/3 of the actual speed.
I've seen 440 MHz down to 371 MHz while Task Manager shows 1500 MHz (a slight turbo clock).
Clearing the CpuSpeed entry out of local.txt and restarting the program does not resolve the issue.
This really throws off the ETA computations.

Its 7210 sibling running v30.7b9 reports cpu speed correctly.
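
As a rough check of the 1/4-1/3 figure, the Python sketch below divides the reported speeds by the Task Manager value. The ETA factor assumes, purely for illustration, that an ETA scales inversely with the reported clock speed; prime95's actual ETA calculation is not described in this thread.

```python
ACTUAL_MHZ = 1500.0             # Task Manager reading from the post
REPORTED_MHZ = [440.0, 371.0]   # values prime95 reported


def speed_fraction(reported_mhz: float, actual_mhz: float) -> float:
    """Fraction of the actual clock speed that was reported."""
    return reported_mhz / actual_mhz


def eta_inflation(reported_mhz: float, actual_mhz: float) -> float:
    """Factor by which an ETA would grow IF it scaled as 1 / reported speed (assumption)."""
    return actual_mhz / reported_mhz


for mhz in REPORTED_MHZ:
    print(f"{mhz:.0f} MHz: {speed_fraction(mhz, ACTUAL_MHZ):.3f} of actual, "
          f"ETA x{eta_inflation(mhz, ACTUAL_MHZ):.2f}")
```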
Attached Thumbnails: hydra not that slow.png

Last fiddled with by kriesel on 2022-10-25 at 21:49

2022-10-31, 10:24   #759
pepi37
Dec 2011
After milion nines:)
1,597 Posts

Two days ago my Intel i5 9600 (never overclocked, but running Prime95 24/7 for a few years) started to show the warning "Confidence in final result is very poor". So what else was there to do but go and buy a Ryzen 5 3600 and restart the sequence as a test. And why I wrote all this: despite the warning, I was surprised when the new CPU gave me exactly the same residue as my "old Intel". So for me Prime95 is the best software I have ever used and will use in the future!

All times are UTC. The time now is 20:00.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
