 2022-06-01, 06:34 #595 petrw1 1976 Toyota Corona years forever! "Wayne" Nov 2006 Saskatchewan, Canada 3×37×47 Posts
I'm confused....might be a dumb question
I understand that with v30.8 running P-1, very small FFTs are NOT multithreaded; or is that only Stage 1? I'm running Stage 2 on 0.2M exponents with FFT=16K on an 8-core PC with 1 worker using all 8 cores. Prime95 does NOT say "using 8 threads"; however, Task Manager says that all 8 (actually all 16) cores are about 75% busy. Nothing else is working the CPU on this PC.
And my completion times are pretty good: Stage 2 was taking almost 3 hours when 1 core was doing Stage 2 and the other 7 were doing Stage 1. It takes about 30 minutes when all 8 cores are allocated to 1 worker in Stage 2.
Interestingly the CPU temp is 72 for about 20 seconds then jumps to 85 for about 2 seconds then quickly drops back to 72 over and over.
Thanks
 2022-06-01, 07:07 #596 kruoli "Oliver" Sep 2017 Porta Westfalica, DE 2×5×109 Posts
Stage 2 can use multiple threads even if stage 1 cannot. That is because the two stages rely on different kinds of multithreading. Stage 1 needs FFT multithreading, which is usually not feasible for small exponents. But up to a limit, it is feasible to multithread a certain aspect of P-1 stage 2.
The jumps you see are likely from the polynomial helper threads. They work "in phases". During these phases, the CPU utilisation of that worker is approximately double that of the "normal" phase. This can cause the temperatures and power usage to jump periodically.
Task Manager sometimes reports weird percentages. I saw something like this on at least three machines: Prime95 was running on all physical cores (so "50 % usage" would be expected). The CPU base frequency was $$x$$ GHz, but due to boosting it ran at $$1.5x$$ GHz. Task Manager then reported the usage as 75 %. But the 75 % you reported is likely the average of the phases with and without polynomial helper threads.
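To put numbers on the Task Manager observation, here is a minimal Python sketch. It assumes Task Manager scales the busy fraction by the ratio of actual clock to base clock; that rule is only my reading of the behaviour described above, and the function name and the 3.0 GHz base clock are made up for illustration.

```python
def reported_utilisation(busy_fraction, base_ghz, actual_ghz):
    """Hypothetical model: busy fraction scaled by actual/base clock.

    busy_fraction is the share of logical cores that are busy (0.0 to 1.0).
    The result is capped at 1.0, i.e. 100 %.
    """
    return min(1.0, busy_fraction * actual_ghz / base_ghz)


# All physical cores busy on a hyperthreaded CPU: half the logical cores.
base = 3.0          # assumed base clock x, in GHz
boost = 1.5 * base  # boosted clock 1.5x, as in the post above
print(f"{reported_utilisation(0.5, base, boost):.0%}")
```

Under that assumption, 50 % busy at 1.5 times the base clock is shown as 75 %, which matches what I saw on those machines.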
 Originally Posted by petrw1 Interestingly the CPU temp is 72 for about 20 seconds then jumps to 85 for about 2 seconds then quickly drops back to 72 over and over.
I don't believe that. I think it is a motherboard or monitoring software anomaly. A properly heatsinked chip can't fluctuate that wildly that rapidly.

 Originally Posted by PhilF I don't believe that. I think it is a motherboard or monitoring software anomaly. A properly heatsinked chip can't fluctuate that wildly that rapidly.
The i7 I have runs a consistent temperature range. Different work types have different ranges. The highest I have ever seen it was 68°C.

"Properly heat sinked?" It doesn't seem like it would be. IMHO, the heat sink device needs to be removed and both surfaces cleaned really well. I use a razor blade for the excess and rubbing alcohol on a cotton ball for the remainder. I use the alcohol until there is no discoloration on the cotton. Then, I use a dry cotton ball to absorb the remaining moisture. Store-bought alcohol contains water. The highest I have ever seen is 92% pure. The water keeps it from flashing if exposed to an ignition source.

 Originally Posted by PhilF I don't believe that. I think it is a motherboard or monitoring software anomaly. A properly heatsinked chip can't fluctuate that wildly that rapidly.
IMHO it's too consistent to be a monitoring or MB issue.

My two sources of observation are CoreTemp and a LED temp display right on the MB.

They agree with each other: 12 seconds at 72 degrees; at second 13 it climbs quickly to 85; at second 14 it drops at the same rate back to 72.

 Originally Posted by storm5510 "Properly heat sinked?"
May not be; it is a little more than 3.5 years old.
It was built by a local reputable PC business.
It has always run in the 70-80 range when fully loaded.
Closer to 80 in Stage 1; closer to 70 in Stage 2.

 2022-06-01, 15:52 #601 James Heinrich "James Heinrich" May 2004 ex-Northern Ontario 3,797 Posts
The heatsink will keep the overall CPU package at a semi-constant (at least slow-changing) temperature, but spikes in load could drastically change the internal temperature of a single core for a short period before the overall CPU package temperature can catch up.
 Originally Posted by petrw1 May not be; it is a little more than 3.5 years old. It was built by a local reputable PC business. It has always run in the 70-80 range when fully loaded. Closer to 80 in Stage 1; closer to 70 in Stage 2.

Is it using an OEM heat sink and fan? Intel or AMD, for example. If so, an aftermarket type may do much better. There are lots to choose from. I put a liquid cooler on mine. Big difference.

 2022-06-03, 18:14 #603 kriesel "TF79LL86GIMPS96gpu17" Mar 2017 US midwest 6,679 Posts
MD5 errors in proof generation
Recently I've been experimenting with pushing proof power higher in prime95 to obtain number-of-squarings data versus proof power. In the process of obtaining 10 or more samples each for proof powers 5-12, a few trials so far (~4-5%) have failed to generate a proof file due to a recurring MD5 error. Usually it occurs in the first pass working toward hash0, but most recently it occurred after hash1.
If the disk or system is unreliable, bigger proof powers and bigger residues files could make MD5 errors more likely. If I recall correctly, all MD5 errors observed so far occurred with greater than compute-optimal proof power. Some ran locally; some were configured to use larger disk space on another system.
In the pilot error domain: changing the location for the residues files needs to be done carefully. I think it would work to stop the workers, copy the existing residues file(s) of PRP with proof runs in progress to the new location, change the location (in prime95, Options, Resource limits, Advanced, enter a new UNC in "Optional directory to hold large temporary files", save the new setting), then resume work.
Mprime/prime95 may not copy a residues file in progress when the location is changed during a run. So without manual copying at the right time, early residues may get stranded in the old location, and newer ones land in the new, forming an incomplete set which is likely to fail the MD5 check in the earliest (not stored in that file) residues. I'm not sure how many such Humpty-Dumpty runs George will be willing to try to put back together. And it will require multiple large residue files and perhaps an interim save file, uploaded and downloaded for each.
Using a network shared drive, which may be necessary for sufficient space, can be risky. If network access is interrupted for too long, storing residues could fail.
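The manual copy step could be scripted along these lines. This is only a sketch, to be run while the workers are stopped. The `*.residues` file name pattern and the function names are assumptions for illustration, and the whole-file MD5 comparison only checks that the copy is byte-identical; it is not the MD5 check prime95 itself performs during proof generation.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_residues(old_dir, new_dir, pattern="*.residues"):
    """Copy residues files to new_dir and verify each copy.

    The pattern is an assumed naming convention; adjust it to whatever
    files are actually present in the old location.
    """
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    new_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(old_dir.glob(pattern)):
        dst = new_dir / src.name
        shutil.copy2(src, dst)  # copies contents and timestamps
        if md5_of(src) != md5_of(dst):
            raise IOError(f"copy of {src.name} does not match the original")
        copied.append(dst)
    return copied
```

Only after every file has been copied and verified would I change the directory setting and resume the workers. The old files are left in place, so nothing is lost if the new location turns out to be unreliable.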
If an error in handling manual assignments occurs, such that two workers are running the same exponent, their residue files location had better be configured as separate subfolders (perhaps named for the system), or they could be altering the same residues file, which could be bad for both runs.
Last fiddled with by kriesel on 2022-06-03 at 18:15
 Originally Posted by kriesel Recently I've been experimenting with pushing proof power higher in prime95 to obtain number of squarings data versus proof power.
You are simply wasting time.
You could try small exponents such as Mersenne cofactors first (at 13M), then LL double checks as PRP (at 64M), then PRP double check with proof at 82M (even if your proof generation failed you could get a matching residue).

 Originally Posted by Zhangrc You are simply wasting time. You could try small exponents such as Mersenne cofactors first (at 13M), then LL double checks as PRP (at 64M), then PRP double check with proof at 82M (even if your proof generation failed you could get a matching residue).
I do not agree with your first statement, and think it indicates you assumed something incorrect about what I am doing to reach the conclusion you did.
In fact, many have been done as 60-64M PRP Proof as wavefront DC, with ProofPower=x in the [PrimeNet] section of prime.txt, so that I can specify the proof power needed to fill in table entries efficiently. Unless an MD5 error occurs, each forwards the DC wavefront in a single test, and provides data in days. Some others were PRP first test wavefront. I would run these anyway to forward the project.
At low powers, I ran some small exponents that were very fast (minutes), so that excessive Cert time would not be imposed on usual size exponents due to the low proof powers. I also have caught some samples of squarings for exponents and default powers I already had running for other reasons. (The squarings counts appear only in the prime95 worker window, not in any results or log files. So I could not collect any squarings data from past runs that had already overflowed the worker windows or been restarted.)
The proof generation cost for elevated powers is real but not very large. So there was very little compute time "wasted", unless you consider GIMPS normal operation a waste of time. (Also George Woltman and Jan S have sent me some data captured as a result of runs they were doing anyway.)
I do want multiple samples per proof power. There are a variety of exponents involved, spread over orders of magnitude, because GIMPS overall activity is spread that way, and so is the problem space; this also samples the exponent space better. There is noticeable fluctuation in squarings counts for a given proof power. There is also noticeable deviation in one direction from what is expected from a quick code analysis. George had requested I gather some data in a PM discussion about proof generation cost.
Proof generation failing occasionally was an unexpected result, but useful.
Results tables are periodically posted as attachments to https://www.mersenneforum.org/showpo...8&postcount=24

Last fiddled with by kriesel on 2022-06-04 at 13:01

