mersenneforum.org Prime95 v30.4/30.5/30.6

2021-01-04, 21:41   #12
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

1DFD₁₆ Posts

Quote:
 Originally Posted by lycorn I have not yet installed Build 5, Any ideas/thoughts/suggestions?
There was a bug or two in accessing the prime pairing bit array. Try build 5 and keep me updated.

Quote:
 Originally Posted by UBR47K I'm hitting another issue right now, mprime gets killed by OOM
Trying to replicate here on a Linux quad-core system with 8GB memory. Set max mem allocation to 7GB.

Quote:
 Originally Posted by PhilF I don't remember if the P-1/ECM memory allocation refers to per worker or not
The memory allocation is per system, not per worker. Only PRP emergency memory allocation is per worker.

One can get per-worker memory limits but not from the menus/dialog boxes. See undoc.txt.

Quote:
 Originally Posted by nordi I was also running into OOM problems when testing earlier builds of 30.4. They were supposed to be fixed in the new version, though. Which operating system are you using?
UBR47K's problem is different since the M1277 ECM was in stage 1. The primary fix for you was to limit stage 2 temporaries to 100,000.

Last fiddled with by Prime95 on 2021-01-04 at 21:42

2021-01-05, 01:00   #13
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

3²×853 Posts

The overallocating memory problem seems to be specific to Linux. If I add a malloc_trim() call at the end of each curve, the mprime process is not killed. If any Linux gurus have insights, I'd appreciate your sharing them. I'm a little baffled as my reading of the mallopt man page seems to indicate malloc_trim is called automatically once 128KB can be freed.

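
For readers who want to try the workaround described above, here is a minimal sketch of the "trim after every curve" pattern. mprime itself is C; this uses Python's ctypes to call glibc's `malloc_trim(size_t pad)` purely for illustration. The names `trim_heap` and `run_curves` are hypothetical and are not part of mprime. On systems without glibc the helper simply returns `None`.

```python
import ctypes
import ctypes.util


def trim_heap(pad: int = 0):
    """Ask the C library to hand free heap memory back to the OS.

    Returns 1 if memory was released, 0 if nothing could be released,
    or None when malloc_trim is not available (non-glibc systems).
    """
    name = ctypes.util.find_library("c")
    if name is None:
        return None
    try:
        libc = ctypes.CDLL(name)
        fn = libc.malloc_trim  # glibc extension; absent on musl, macOS, Windows
    except (OSError, AttributeError):
        return None
    fn.argtypes = [ctypes.c_size_t]
    fn.restype = ctypes.c_int
    return fn(pad)


def run_curves(num_curves, work):
    """Run `work(i)` once per curve and trim the heap after each one.

    `work` stands in for one ECM curve (hypothetical callback).
    """
    for i in range(num_curves):
        work(i)
        trim_heap()
```

One possibly relevant detail from the mallopt man page: the automatic trim controlled by M_TRIM_THRESHOLD applies to contiguous free memory at the top of the heap, so free blocks stranded below a live allocation may not be returned without an explicit call. Whether that is what happens in mprime is not established in this thread.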
2021-01-05, 02:02   #14
axn

Jun 2003

5,179 Posts

Quote:
 Originally Posted by Prime95 See undoc.txt. Please run a test on a known B-S factor.
Unless primes are being paired the exact same way from before, there is a good chance that 30.4 with B-S enabled will not find the factor

2021-01-05, 02:06   #15
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

3²×853 Posts

Quote:
 Originally Posted by axn Unless primes are being paired the exact same way from before, there is a good chance that 30.4 with B-S enabled will not find the factor
Doh! Of course you are right.

2021-01-05, 02:56   #16
LaurV
Romulan Interpreter

"name field"
Jun 2011
Thailand

2650₁₆ Posts

Quote:
 Originally Posted by axn Unless primes are being paired the exact same way from before, there is a good chance that 30.4 with B-S enabled will not find the factor
Or will find factors that the first run didn't

2021-01-05, 07:52   #17
tha

Dec 2002

827₁₀ Posts

Code:
Your choice: [Work thread Jan 5 08:51] Worker starting
[Work thread Jan 5 08:51] Setting affinity to run worker on CPU core #1
[Work thread Jan 5 08:51]
[Work thread Jan 5 08:51] P-1 on M15575663 with B1=1500000, B2=30000000
[Work thread Jan 5 08:51] Setting affinity to run helper thread 1 on CPU core #2
[Work thread Jan 5 08:51] Using FMA3 FFT length 800K, Pass1=320, Pass2=2560, clm=4, 4 threads
[Work thread Jan 5 08:51] Setting affinity to run helper thread 3 on CPU core #4
[Work thread Jan 5 08:51] Cannot continue stage 2 from old P-1 save file. Restarting stage 2 from the beginning.
[Work thread Jan 5 08:51] Setting affinity to run helper thread 2 on CPU core #3
[Work thread Jan 5 08:51] D: 840, relative primes: 1713, stage 2 primes: 1743704, pair%=90.11
[Work thread Jan 5 08:51] Using 11061MB of memory.
[Work thread Jan 5 08:51] Stage 2 init complete. 16961 transforms. Time: 12.169 sec.
Segmentation fault (core dumped)

reproducible

I renamed the file mF57663 so mprime couldn't find it and restarted it. Seems to work. About 33% increase in speed, what is the background behind that?

Last fiddled with by tha on 2021-01-05 at 08:30

2021-01-05, 12:56   #18
tha

Dec 2002

827 Posts

I turned on Brent–Suyama again manually to compare the results. About a 3% penalty for an occasional factor circumventing the B2 value. I leave it on.

2021-01-05, 19:27   #19
lycorn

"GIMFS"
Sep 2002
Oeiras, Portugal

5E3₁₆ Posts

Quote:
 Originally Posted by Prime95 There was a bug or two in accessing the prime pairing bit array. Try build 5 and keep me updated.
I just got home to find Prime95 (Build 5) had stopped after ~20 hours of work. The symptoms and exception code are the same as before. In case you're willing to do some debugging, the fault offset is 0x0000000002345399.
This was the only error recorded. The application had been functioning perfectly since I launched it. Just restarted it and it's happily chugging along.

2021-01-05, 22:52   #20
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

3²×853 Posts

Quote:
 Originally Posted by Prime95 The overallocating memory problem seems to be specific to Linux. If I add a malloc_trim() call at the end of each curve, the mprime process is not killed.
I'm still working on this. An Ubuntu build with debugging on seems to work. A CentOS build without debugging (the way official versions are built) does not. I presume the -g command line arg links in a different heap allocator.

Quote:
 Originally Posted by tha Code: [Work thread Jan 5 08:51] Cannot continue stage 2 from old P-1 save file. Restarting stage 2 from the beginning. Segmentation fault (core dumped) About 33% increase in speed, what is the background behind that?
I have a fix for this. Making new builds will be spotty as my wife has my laptop. Her Mac is in the shop for butterfly keyboard repair.

Dig around in the 20M thread. The speed boost comes from a new gwnum feature that does (a+b)*c in one call, saving some memory bandwidth. More speed comes from better prime pairing (~90% vs. ~30%) using a Mihai Preda idea.
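
To make the pairing percentages a little more concrete, here is a minimal sketch of the classic stage 2 pairing idea under a simplified model: two primes count as paired when they sit at equal distance on either side of the same multiple of D, so one multiplication can cover both. This is not prime95's code and it is not the Mihai Preda scheme mentioned above; the function names are hypothetical, and the percentage it reports is not claimed to match the `pair%` value in the log.

```python
def primes_between(lo, hi):
    """Return all primes p with lo < p <= hi using a simple sieve (hi >= 2)."""
    sieve = bytearray([1]) * (hi + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(hi ** 0.5) + 1):
        if sieve[i]:
            # Clear every multiple of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, hi + 1, i)))
    return [p for p in range(lo + 1, hi + 1) if sieve[p]]


def classic_pairing(b1, b2, d):
    """Count stage 2 primes in (b1, b2] that have a mirror-image partner.

    A prime p = k*d + r is 'paired' if k*d - r is also a stage 2 prime.
    Returns (total primes, paired primes, percentage paired).
    """
    primes = primes_between(b1, b2)
    prime_set = set(primes)
    paired = 0
    for p in primes:
        k = (p + d // 2) // d      # nearest multiple of d is k*d
        partner = 2 * k * d - p    # reflection of p around k*d
        if partner != p and partner in prime_set:
            paired += 1
    pct = 100.0 * paired / len(primes) if primes else 0.0
    return len(primes), paired, pct
```

Calling, for example, `classic_pairing(1000, 30000, 210)` returns the number of primes in range, how many of them found a partner, and the resulting percentage. Raising that percentage means fewer stage 2 multiplications for the same B2.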

Quote:
 Originally Posted by tha I turned on Brent–Suyama again manually to compare the results. About a 3% penalty for an occasional factor circumventing the B2 value. I leave it on.
BS will find fewer factors than before. With 3x better prime pairing, there are 1/3 fewer opportunities for a BS prime > B2 to be included.

Quote:
 Originally Posted by lycorn I just got home to find Prime95 (Build 5) had stopped after ~20 hours of work. The symptoms and exception code are the same as before. In case you're willing to do some debugging, the fault offset is 0x0000000002345399. This was the only error recorded. The application had been functioning perfectly since I launched it. Just restarted it and it's happily chugging along.
Can you send me details as to your machine, worktodo, and memory settings? Thanks.

2021-01-05, 23:54   #21
lycorn

"GIMFS"
Sep 2002
Oeiras, Portugal

11·137 Posts

Quote:
 Originally Posted by Prime95 Can you send me details as to your machine, worktodo, and memory settings? Thanks.
CPU: Intel i5-7400 (Kaby Lake) @ 3GHz. 16 GB DDR4 2400 memory (dual channel - 2 x 8GB). Everything at default settings. The machine was made by Dell, and has been rock solid since November 2018, when I started using it. Never had crashes, BSODs, etc., and it has found a fair number of ECM factors for exponents < 1M. It came preinstalled with Windows 10, and the regular updates have been done without any problems. The first time Prime95 died was on December 29th, while running v30.4 build 3 (or 4, I'm not 100% sure). Then it happened again on January 3rd, running build 4, and then today, running build 5. As I posted earlier, the symptoms were similar on all occasions.

The worktodo I'm currently using is:

[Worker #1]

ECM2=blah-blah,1,2,547273,-1,250000,25000000,400
ECM2=blah-blah,1,2,547291,-1,250000,25000000,400
ECM2=blah-blah,1,2,542911,-1,250000,25000000,300

[Worker #2]

ECM2=blah-blah,1,2,547453,-1,250000,25000000,400
ECM2=blah-blah,1,2,547487,-1,250000,25000000,400
ECM2=blah-blah,1,2,542987,-1,250000,25000000,300

[Worker #3]

ECM2=blah-blah,1,2,547583,-1,250000,25000000,400
ECM2=blah-blah,1,2,543019,-1,250000,25000000,300

[Worker #4]

ECM2=blah-blah,1,2,547397,-1,250000,25000000,400
ECM2=blah-blah,1,2,547609,-1,250000,25000000,400

Last fiddled with by LaurV on 2021-01-06 at 03:25 Reason: removed keys from the assignments - general wisdom is that it is not good to post those publicly
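
For anyone reading the worktodo lines above, here is a small sketch that splits an `ECM2=` entry into named fields. The field order assumed here (assignment key, k, b, n, c, B1, B2, curves, describing the number k*b^n+c) is inferred from the entries in this post and should be checked against the program's documentation; `Ecm2Entry` and `parse_ecm2` are hypothetical names, not part of prime95.

```python
from dataclasses import dataclass


@dataclass
class Ecm2Entry:
    assignment_id: str  # the key LaurV replaced with "blah-blah"
    k: int
    b: int
    n: int
    c: int
    b1: int
    b2: int
    curves: int

    def describe(self) -> str:
        """Human-readable summary of the number and the ECM bounds."""
        sign = "+" if self.c >= 0 else "-"
        return (f"{self.k}*{self.b}^{self.n}{sign}{abs(self.c)}, "
                f"B1={self.b1}, B2={self.b2}, {self.curves} curves")


def parse_ecm2(line: str) -> Ecm2Entry:
    """Parse one 'ECM2=...' line using the assumed field order."""
    keyword, _, rest = line.strip().partition("=")
    if keyword != "ECM2":
        raise ValueError(f"not an ECM2 line: {line!r}")
    fields = rest.split(",")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    k, b, n, c, b1, b2, curves = (int(f) for f in fields[1:8])
    return Ecm2Entry(fields[0], k, b, n, c, b1, b2, curves)
```

Under that assumed layout, the first entry for Worker #1 reads as 400 curves on 2^547273-1 with B1=250000 and B2=25000000.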

2021-01-06, 03:16   #22
LaurV
Romulan Interpreter

"name field"
Jun 2011
Thailand

2⁴·613 Posts

Quote:
 Originally Posted by lycorn The machine [...] has been rock solid since November 2018, when I started using it. Never had crashes, BSODs, etc, and it has found
What are the temperatures of the toy? (That's the most important info, which I didn't see in the post.)

It may or may not be related to switching to v30, or it may just be coincidental. The machine is old, so it may need some maintenance, you know, removing the dust clogs from the fans, re-seating the CPU (change/reapply the thermal paste), etc. We do this yearly, or even every 6 months or so. You know, my grandma was a virgin for a very long time, but suddenly she wasn't. Luckily for me, otherwise I wouldn't be here, and who would post stupid things on mersenneforum? Haha.

Your system may well need none of it, but the new version of the program may be stressing the hardware a bit more than the old one, pushing it over the limit of stability. When you (general you) say your computer is stable, it is/was for the conditions you used it at. Any stable computer becomes unstable if you push it, and any crap computer is stable if you only type text documents on it. You may try to temporarily revert to v29 (or v30.3?) that was stable before, and see if the machine is still stable for a week or so. If you do only ECM, it won't matter much anyhow. If it is not stable anymore, you need dusting/re-seating, changing or oiling the fans, etc., like I said. Stable computers can become suddenly unstable sometimes.

If it is still stable with the old version, you still don't know if the issue is the new version of the program. It may be a bug in the new version, but it also could be that the new version is pushing the system a bit more, beyond its stability limit, which you were very close to before. The best way in that case, after upgrading to v30.4 again, is to try reducing the clocks just a little. If it becomes stable again, then the issue is not with P95. You still need dusting. Take the mop.

On the other hand, it still could be some newly introduced bug in v30.4. It has happened in the past, so you did well reporting it. If so, George will fix it, as usual (for sure, he is now at home in quarantine and has absolutely nothing else to do).

Last fiddled with by LaurV on 2021-01-06 at 07:19
