mersenneforum.org  

2015-05-24, 11:39   #78
harlee
 
Sep 2006
Odenton, MD, USA

2²·41 Posts

I just installed the 32-bit Windows build of version 28.6 on my older P4 system. I'm doing P-1 testing and noticed that the B1 and B2 bounds are now lower. Is this correct? I didn't see anything about the bounds changing in the whatsnew.txt file.

Last fiddled with by harlee on 2015-05-24 at 11:42 Reason: fixed the number of bits from 35 to 32
2015-05-25, 03:06   #79
Madpoo
Serpentine Vermin Jar
 
Jul 2014

6335₈ Posts
Version 28.6 and "ScaleOutputFrequency=1"

I just updated my triple-checking systems to version 28.6 and I'm noticing a difference in how the "ScaleOutputFrequency=1" option is working.

It doesn't seem to scale the update frequency of vastly different workers like it did in 28.5. The most extreme example is one system where I'm testing M383838383 on one socket, and a little 30M exponent on the other.

I previously had the iterations between screen outputs set to 30000 and that worked fairly well. I could see progress on the big 383M and the 30M exponents moving along.

Now it seems to ignore that option entirely and updates each worker at the specified rate, with no scaling.

I peeked at the source code changes between 28.5 and 28.6 and do see some changes in that area, so I'm guessing that's the reason, but I haven't seen anyone else mention this yet.
2015-05-25, 05:03   #80
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2²·1,873 Posts

Quote:
Originally Posted by Madpoo
I just updated my triple-checking systems to version 28.6 and I'm noticing a difference in how the "ScaleOutputFrequency=1" option is working.

It doesn't seem to scale the update frequency of vastly different workers like it did in 28.5. The most extreme example is one system where I'm testing M383838383 on one socket, and a little 30M exponent on the other.
A bug was reported and it should be fixed in 28.7 --- please try that version if and when it becomes available.
2015-05-25, 11:27   #81
preda
 
"Mihai Preda"
Apr 2015

10101010000₂ Posts
128GB RAM, E=12 in P-1.

I run mprime on a system with 128GB of free memory, running a single thread of P-1. It uses about 30GB in stage 2 and the status is always E=12. The B2 bound is about 15M (B1 about 700K).

My questions are:
- What is the meaning of E=12?
- Would P-1 benefit from using more memory in stage 2? If so, why doesn't it use it?

In local.txt I have:
Memory=400000 during 7:30-1:00 else 400000
2015-05-25, 11:57   #82
TheJudger
 
"Oliver"
Mar 2005
Germany

11·101 Posts

AFAIK the current version of Prime95/mprime won't use more than ~30GB of memory in P-1 stage 2 for current P-1 wavefront assignments.
I've tried 2.5 TiB for a single instance of P-1; it uses ~30GB at most.

E=12 refers to the Brent-Suyama extension.

Oliver
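
For anyone wondering why a larger E can help, here is a minimal sketch of the idea behind the Brent-Suyama extension as I understand it (this is not Prime95's actual code). Stage 2 pairs up values a and b so that a - b hits each prime between B1 and B2. Using a^E and b^E instead still covers those primes, because a - b always divides a^E - b^E, and the extra factors of a^E - b^E occasionally catch a prime above B2. The function names and the sample values are made up for illustration.

```python
def covers_same_prime(a: int, b: int, e: int) -> bool:
    """True if (a - b) divides a**e - b**e, which holds for every e >= 1."""
    return (a**e - b**e) % (a - b) == 0


def extra_cofactor(a: int, b: int, e: int) -> int:
    """The part of a**e - b**e beyond (a - b).

    Any prime dividing this cofactor is 'bonus' coverage that plain
    stage 2 (e = 1) would not get from this (a, b) pair.
    """
    return (a**e - b**e) // (a - b)


if __name__ == "__main__":
    # Hypothetical pairing; E=12 matches the value in the status line.
    a, b, e = 2310, 13, 12
    print(covers_same_prime(a, b, e))
    # For even e, (a + b) is always among the extra factors.
    print(extra_cofactor(a, b, e) % (a + b) == 0)
```

A larger E gives a larger cofactor and therefore more chances of a bonus hit, at the cost of extra work and memory per pairing.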
2015-05-25, 15:54   #83
Madpoo
Serpentine Vermin Jar
 
Jul 2014

37·89 Posts

Quote:
Originally Posted by Prime95
A bug was reported and it should be fixed in 28.7 --- please try that version if and when it becomes available.
Cool, thanks. I don't have any Haswell systems, but how does the AVX2 stuff look in 28.6/28.7? I think I saw some changes in the source related to that, which would be cool. I'm speccing out some new servers which will have Haswell-E chips, so at some point I'll be able to do some burn-in with those if you need feedback down the road.
2015-05-25, 17:58   #84
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by preda
I run mprime on a system with 128GB of free memory, running a single thread of P-1. It uses about 30GB in stage 2 and the status is always E=12. The B2 bound is about 15M (B1 about 700K).

My questions are:
- What is the meaning of E=12?
- Would P-1 benefit from using more memory in stage 2? If so, why doesn't it use it?

In local.txt I have:
Memory=400000 during 7:30-1:00 else 400000
The periodic status output during stage 2 should include a message of the form "Processing x out of y relative primes", where x and y are numbers. Each relative prime takes up a fair amount of memory. If you had only 15GB available, Prime95 would use those 15GB and report an x of roughly half of y (processing the first half in 15GB, then the second half in 15GB, rather than all at once in 30GB). It just so happens that for a B2 of ~15M, appropriate for the current LL wavefront, 30GB is enough to process all relative primes at once.
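
The pass arithmetic described above can be sketched roughly as follows. The per-relative-prime memory figure and the total of 480 relative primes are hypothetical values chosen so that 30GB holds everything; they are not taken from Prime95, and the function names are mine.

```python
import math


def relprimes_that_fit(memory_gb: float, gb_per_relprime: float) -> int:
    """How many relative primes fit in the allowed memory (at least one)."""
    return max(1, int(memory_gb // gb_per_relprime))


def stage2_passes(total_relprimes: int, relprimes_per_pass: int) -> int:
    """Number of passes needed; memory beyond what all relprimes need is unused."""
    per_pass = min(relprimes_per_pass, total_relprimes)
    return math.ceil(total_relprimes / per_pass)


if __name__ == "__main__":
    total = 480          # hypothetical 'y' in "Processing x out of y"
    per_relprime = 0.0625  # hypothetical GB per relative prime
    for mem in (15, 30, 128):
        x = min(total, relprimes_that_fit(mem, per_relprime))
        print(mem, "GB ->", x, "of", total, "per pass,",
              stage2_passes(total, x), "pass(es)")
```

Under these assumptions, going from 30GB to 128GB changes nothing, because a single pass already covers every relative prime.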
2015-06-03, 01:23   #85
Batalov
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

9,433 Posts
Continuing on the previous GWNUM library modification

Quote:
Originally Posted by Prime95
Serge, try the attached gwnum.c
George,

You've already improved the special case k=1; how about k=2? Strictly k=2: there is no interest in higher k values, but b is relatively large.

A debug case is prepared:
Can a PRP test for "PRP=2,67607,371171,1" avoid the generic reduction AVX FFT length 640K (which the library chooses):
Code:
[Work thread Jun 2 18:08] Resuming PRP test of 2*67607^371171+1 using generic reduction AVX FFT length 640K, Pass1=640, Pass2=1K
[Work thread Jun 2 18:08] Iteration: 2 / 5955397 [0.00%].
[Work thread Jun 2 18:08] Iteration: 500 / 5955397 [0.00%], roundoff: 0.047, ms/iter: 18.580, ETA: 30:44:04
[Work thread Jun 2 18:08] Iteration: 1000 / 5955397 [0.01%], roundoff: 0.047, ms/iter: 17.244, ETA: 28:31:18
[Work thread Jun 2 18:08] Iteration: 1500 / 5955397 [0.02%], roundoff: 0.047, ms/iter: 17.248, ETA: 28:31:33
[Work thread Jun 2 18:09] Iteration: 2000 / 5955397 [0.03%], roundoff: 0.047, ms/iter: 17.241, ETA: 28:30:41
[Work thread Jun 2 18:09] Iteration: 2500 / 5955397 [0.04%], roundoff: 0.047, ms/iter: 17.246, ETA: 28:31:03
but instead use:
Code:
[Work thread Jun 2 18:05] Starting PRP test of 2*67607^371171+1 using all-complex AVX FFT length 1M, Pass1=256, Pass2=4K
[Work thread Jun 2 18:05] Iteration: 500 / 5955397 [0.00%], roundoff: 0.395, ms/iter: 31.972, ETA: 52:53:07
[Work thread Jun 2 18:05] Iteration: 1000 / 5955397 [0.01%], roundoff: 0.395, ms/iter:  7.315, ETA: 12:05:54
[Work thread Jun 2 18:06] Iteration: 1500 / 5955397 [0.02%], roundoff: 0.395, ms/iter:  7.326, ETA: 12:06:56
[Work thread Jun 2 18:06] Iteration: 2000 / 5955397 [0.03%], roundoff: 0.395, ms/iter:  7.319, ETA: 12:06:12
[Work thread Jun 2 18:06] Iteration: 2500 / 5955397 [0.04%], roundoff: 0.395, ms/iter:  7.309, ETA: 12:05:12
[Work thread Jun 2 18:06] Iteration: 3000 / 5955397 [0.05%], roundoff: 0.395, ms/iter:  7.318, ETA: 12:05:57
(which I can actually force it to choose with PRP=FFT2=1M,2,67607,371171,1).
Can we generalize/automate a similar all-complex FFT2 choice for any PRP=2,67607,n,1 up to n of, say, 1M?
Or maybe even improve on that with a better choice of a special FFT, if some light optimization is needed in the library?

This is a special base for which no primes 2*b^n+1 are known. A prime of this form would divide some Phi(M,2) where M can be determined post hoc (with what is essentially a modified base 2 PRP test); for a large b, M is very likely to be b^n or b^{n-1}.
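
To put the two logs in perspective, here is a small sketch that recomputes the ETA from the iteration count and ms/iter figures printed above. The formula (remaining iterations times time per iteration) is my assumption about how the ETA is derived; it agrees with the logged values to within a few seconds. The helper names are made up.

```python
def eta_seconds(total_iters: int, done_iters: int, ms_per_iter: float) -> float:
    """Remaining time in seconds, assuming a constant time per iteration."""
    return (total_iters - done_iters) * ms_per_iter / 1000.0


def hms(seconds: float) -> str:
    """Format a duration as H:MM:SS, truncating fractional seconds."""
    s = int(seconds)
    return f"{s // 3600}:{s % 3600 // 60:02d}:{s % 60:02d}"


if __name__ == "__main__":
    total = 5955397  # iteration count from the logs
    # ms/iter values are taken from the "Iteration: 1000" lines of each log.
    generic = eta_seconds(total, 1000, 17.244)     # generic reduction, 640K
    all_complex = eta_seconds(total, 1000, 7.315)  # all-complex, 1M
    print("generic reduction:", hms(generic))
    print("all-complex:      ", hms(all_complex))
    print("speedup:          ", round(generic / all_complex, 2))
```

The larger all-complex FFT is roughly 2.4 times faster per iteration here, which is why forcing it is attractive despite the higher roundoff.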
2015-06-03, 13:41   #86
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

16504₈ Posts

Did you look at the roundoff error on your 1M all-complex FFT? The gwnum code isn't using the 1M FFT because it is afraid of a fatal roundoff error during the PRP test.

gwnum does support some options to be a little less conservative in choosing FFT lengths. I'll look and see if prime95 exposes any of those features.
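
For readers unfamiliar with the roundoff figure in those logs, here is a toy illustration, not gwnum's implementation. When integers are multiplied with a floating-point FFT, each output coefficient should be an exact integer, and the roundoff is how far the computed value lands from the nearest one. Once it reaches 0.5, rounding can pick the wrong integer, so a value of 0.395 leaves far less margin than 0.047. The function names and the base-10^4 digit split are my own choices for the sketch.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def to_digits(v, base):
    """Little-endian digits of a non-negative integer."""
    digits = []
    while v:
        v, r = divmod(v, base)
        digits.append(r)
    return digits or [0]


def multiply_with_roundoff(x, y, base=10**4):
    """Return (x * y, max distance of any FFT output from an integer)."""
    dx, dy = to_digits(x, base), to_digits(y, base)
    n = 1
    while n < len(dx) + len(dy):
        n *= 2
    fa = fft(dx + [0] * (n - len(dx)))
    fb = fft(dy + [0] * (n - len(dy)))
    prod = fft([p * q for p, q in zip(fa, fb)], invert=True)
    coeffs = [c.real / n for c in prod]  # the inverse FFT needs the 1/n scaling
    roundoff = max(abs(c - round(c)) for c in coeffs)
    result = sum(round(c) * base**i for i, c in enumerate(coeffs))
    return result, roundoff


if __name__ == "__main__":
    product, err = multiply_with_roundoff(3**200, 7**150)
    print(product == 3**200 * 7**150, err)
```

Packing more bits into each FFT element (a bigger `base` here, a shorter FFT length in gwnum's terms) is faster but pushes the roundoff toward 0.5, which is the trade-off being discussed.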
2015-06-03, 15:33   #87
Batalov
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

22331₈ Posts

There is some blanket rule that kicks in at n>10000 for the FFT choice, and only the generic reduction AVX FFT is used for all such n.
(For n<10000, an all-complex AVX FFT is used most of the time; I can send you a lightly sieved set of n, or else for debugging tests you can use any n.)
If you use n=351111, for example, the error will be well controlled for the all-complex AVX FFT, yet it will not be chosen.

I think an appropriately sized all-complex AVX FFT can always be used for this form, but I cannot force it even with FFT2=NNN: for some ranges of n, the forced FFT2 selects a zero-padded FFT instead of an all-complex one.
2015-06-07, 15:00   #88
ramshanker
 
"Ram Shanker"
May 2015
Delhi

2×19 Posts
Official release of 28.7?

When will version 28.7 be released on the official download page?
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.