mersenneforum.org - Torture and Benchmarking test of Prome95. Doubts in implementation

2012-01-05, 18:38 #1
paramveer
Jan 2012
3² Posts

Torture and Benchmarking test of Prome95. Doubts in implementation

I was trying to understand the torture and benchmarking parts of Prime95, so that I can implement something like this. My doubts are:
1) Why do we need to provide memory size control settings (small FFT, large FFT, Blend) in the torture test? What effect do these have on the output of the torture test?
2) From a coding point of view, what is the procedure followed in the benchmark's third step, where the output is trial factoring a number to some bit length (like 69 bits)?
3) What is the procedure of the second check in the double-checking method?
Thanks in advance.
2012-01-05, 22:46   #2
cheesehead

"Richard B. Woods"
Aug 2002
Wisconsin USA

2²×3×641 Posts

Quote:
 Originally Posted by paramveer: I was trying to understand the torture and benchmarking parts of Prime95, so that I can implement something like this. My doubts are: 1) Why do we need to provide memory size control settings (small FFT, large FFT, Blend) in the torture test? What effect do these have on the output of the torture test?
That setting controls whether the torture test is to use small FFTs, large FFTs, or a blend of small and large FFTs.

If the test uses a small FFT, then it can execute mostly or entirely within L1 cache, so it stresses the circuitry of the ALU and L1 fetches without much or any effect from transfers between L2 and main RAM, or L1 and L2.

A test using a large FFT has to continually cycle through large amounts of data that won't fit entirely in L1, so it puts more stress on the paths to and from main RAM and L2.
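To make the distinction concrete, here is a minimal Python sketch of the working set an FFT of a given length touches. The 8-bytes-per-element figure and the cache sizes are illustrative assumptions for a typical desktop CPU of the era, not measurements of any specific chip.

```python
# Assumed cache sizes (illustrative, not measured from any particular CPU)
L1_BYTES = 32 * 1024    # typical L1 data cache
L2_BYTES = 256 * 1024   # typical per-core L2 cache

def working_set(fft_len):
    """Approximate bytes touched per pass, at 8 bytes (one double) per element."""
    return fft_len * 8

def stress_target(fft_len):
    """Which level of the memory hierarchy a torture FFT of this length mostly exercises."""
    size = working_set(fft_len)
    if size <= L1_BYTES:
        return "L1/ALU"
    if size <= L2_BYTES:
        return "L2"
    return "L3/RAM"

print(stress_target(2048))     # a small FFT fits in L1
print(stress_target(1 << 20))  # a 1M-element FFT spills to L3/RAM
```

So the "small FFT" choice keeps the data resident in cache and hammers the arithmetic units, while "large FFT" forces continual traffic to the outer levels of the hierarchy.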

Quote:
 2) From a coding point of view, what is the procedure followed in the benchmark's third step, where the output is trial factoring a number to some bit length (like 69 bits)?
I'm not sure how to interpret that question. Could you restate it in a different way?

Quote:
 3) What is the procedure of the second check in the double-checking method?
A GIMPS doublecheck is simply a second LL test, using the same algorithm and code as a first-time check, but it's intended to be performed on a different system, in case there was a hardware error in the system that performed the first-time LL test. Afterwards, PrimeNet compares the residues returned from the first and second tests. If they match, it's presumed that both are correct. If they don't match, then at least one of the LL tests had an error, so there's a triple-check assigned...

Last fiddled with by cheesehead on 2012-01-05 at 22:51

2012-01-05, 23:23   #3
Dubslow

"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1C35₁₆ Posts

Quote:
 Originally Posted by paramveer: 2) From a coding point of view, what is the procedure followed in the benchmark's third step, where the output is trial factoring a number to some bit length (like 69 bits)?
I'll take a stab at it.

Before running the expensive primality tests, GIMPS will attempt to find small factors for a number, proving it composite and eliminating the primality tests. This is done (mostly) with Trial Factoring. That means that, to Trial Factor up to (for example) 68 bits, the program will create a list of prime numbers x, where x < 2^68, and then test each and every x to see if it is a factor of the Mersenne number in question. If every x below 2^68 is tested and no factor is found, then the Mersenne number will be assigned to Trial Factor all 2^68 < x < 2^69, up to a certain point where the trial factoring becomes more expensive than the primality test.

The benchmark then just times how long it takes to do the initial factoring from 0 to 68 bits, i.e. it will try all 0 < x < 2^68.

Please tell me if I misinterpreted your question and you want more specifics of how this is done. Perhaps you mean how each candidate x is tested to determine if it's a factor?
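The search Dubslow describes can be sketched in a few lines of Python. It relies on two standard facts about Mersenne numbers: any factor q of M_p = 2^p - 1 has the form q = 2kp + 1 and satisfies q ≡ 1 or 7 (mod 8). The function name and structure below are my own for illustration, not Prime95's.

```python
def trial_factor(p, max_bits):
    """Look for a factor q < 2^max_bits of M_p = 2^p - 1.

    Candidates are restricted to q = 2*k*p + 1 with q ≡ ±1 (mod 8);
    q divides M_p exactly when 2^p ≡ 1 (mod q).
    Returns the first factor found, or None.
    """
    limit = 1 << max_bits
    k = 1
    while True:
        q = 2 * k * p + 1
        if q >= limit:
            return None
        if q % 8 in (1, 7) and pow(2, p, q) == 1:
            return q
        k += 1

print(trial_factor(11, 6))   # 23, since M_11 = 2047 = 23 * 89
print(trial_factor(29, 8))   # 233, the smallest factor of M_29
```

A benchmark in Prime95's style would then time how long this sweep takes up to a given bit level; real implementations sieve the candidates and use optimized modular arithmetic rather than testing every k.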

2012-01-06, 01:18 #4
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

7949₁₀ Posts

Small correction: The small FFT torture test runs out of the L1 and L2 caches. The large FFTs also stress your L3 cache (if any) and main memory.

The benchmark is useless outside of the context of the GIMPS project. If you want to compare Intel and AMD chips for normal workloads, the prime95 benchmark is not meaningful.
2012-01-06, 01:38 #5
Christenson

Dec 2010
Monticello

703₁₆ Posts

Slight correction for Mr Cheesehead, and possibly also for Paramveer: A GIMPS DC (double-check) arranges to test the same number for primality using different data, using the properties of the arithmetic to scramble all the bits being worked on. When the bits are unscrambled at the end, the residues from the two LL [Lucas-Lehmer primality] tests should match, assuming no errors crept into the calculation. There are no concerns about a CPU-based DC on the same system as the original LL check, because the chances of a fixed error affecting the data in the same way are either nearly or completely nil.

Trial Factoring is a standard GIMPS operation... when factors are found, which happens every so often, LL checks become unnecessary. In terms of GIMPS effectiveness, however, the operation is now basically obsolete on CPUs because GPUs are 10-100x faster at it.
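Christenson's "same number, different data" trick can be made concrete. Below is a simplified Python model of a shifted Lucas-Lehmer run, not Prime95's actual code: starting from s0 = 4·2^k instead of 4 multiplies every intermediate value by a power of two (the shift doubles mod p at each squaring, since 2^p ≡ 1 mod M_p), so two runs with different shifts compute on completely different bit patterns yet yield the same residue once unscrambled.

```python
def ll_residue(p, shift=0):
    """Lucas-Lehmer residue of M_p = 2^p - 1, starting from a shifted s0.

    Invariant: at every step the internal s equals the unshifted value
    times 2^k (mod M_p), where k doubles mod p after each squaring.
    """
    M = (1 << p) - 1
    k = shift % p
    s = (4 * pow(2, k, M)) % M
    for _ in range(p - 2):
        k = (2 * k) % p                      # squaring doubles the shift
        s = (s * s - 2 * pow(2, k, M)) % M   # subtract the shifted "2"
    return (s * pow(2, (p - k) % p, M)) % M  # undo the final shift

print(ll_residue(11, 0), ll_residue(11, 5))  # 1736 1736 -> M_11 composite
print(ll_residue(13, 3))                     # 0 -> M_13 prime
```

Because the two runs touch different bit patterns throughout, a DC on the same machine as the first test can still catch a fixed hardware error, which is exactly the point above.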
2012-01-06, 03:18   #6
sdbardwick

Aug 2002
North San Diego County

743 Posts

Quote:
 Originally Posted by paramveer 2)What is the procedure followed to benchmark as the 3rd step where output is Trial factoring a number with some bit length(like 69bit etc) factor,from coding point of view?
My interpretation of this question is "Given the line below, what is measured that takes 3.118 ms to complete?"
Code:
Best time for 58 bit trial factors: 3.118 ms
(Tortured and potentially inaccurate explanation follows - others should aggressively correct)

As I understand it, that is the time it takes for that processor to check whether one value from the range of values representable in 58 binary bits evenly divides the potential prime x, and is thus a factor of x that establishes x is composite.

To simplify, 10 bits contains 1024 possible values (0-1023 decimal; 0000000000 to 1111111111 binary), so a line like
Code:
Best time for 10 bit trial factors: 0.118 ms
would mean that it took 0.118 ms to check one of the values from that range.

Last fiddled with by sdbardwick on 2012-01-06 at 03:25 Reason: bits vs. power of 2 exponent error
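sdbardwick's reading can be checked directly: time a single modular divisibility test, averaged over many repetitions. A hypothetical Python sketch follows; the exponent and candidate values are illustrative, and Python's built-in pow is orders of magnitude slower than Prime95's hand-tuned kernels, so only the shape of the measurement carries over, not the numbers.

```python
import time

def time_one_candidate(p, q, reps=100_000):
    """Average time for one trial-factoring test: does q divide 2^p - 1?
    The test itself is a single modular exponentiation, 2^p mod q."""
    t0 = time.perf_counter()
    for _ in range(reps):
        pow(2, p, q)
    return (time.perf_counter() - t0) / reps

# Illustrative values: a Mersenne exponent and an arbitrary 58-bit candidate.
per_test = time_one_candidate(57885161, (1 << 57) + 1)
print(f"{per_test * 1e3:.6f} ms per 58-bit candidate")
```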

2012-01-06, 03:51 #7
LaurV
Romulan Interpreter

"name field" Jun 2011 Thailand

10013₁₀ Posts

As I understood the questions, the OP is not interested in finding primes, or in the algorithms behind them. In fact, he does not give a sh*t, judging by the fact that even P95's name is spelled wrong more than once. His interest is in implementing some torture/benchmark test, and I would guess he works in the domain (for some CPU or systems factory) and needs some stuff to test or burn in the thingies; hence his second question mentions "from coding POV". I would guess the right answer is "P95 takes an exponent for which the result of TF is known, does a bit of sieving, a bit of exponentiation, then compares the times and the results with previously known times and results", if I got the question right, and I hope he is not asking what TF is, or how to TF to "some bit level".

I went through this myself, trying to port parts of P95 to WinCE5 and xscale/arm (pxa 320), and during that time I was asking myself almost the same questions. Therefore, going into detail about how and why GIMPS does what it does, and why LL and DC are different, would not make much sense for him. The algorithms are explained well on the GIMPS page and on the wiki, and even on this forum, if he wants them. I took his questions more like "how do these things work, and WHY do they work?", i.e. "why can this stuff be a reliable benchmark or stress test, and how can I program something like that without going too deep into algorithm details?" To which I don't really know how to answer.

edit: However, I love Christenson's explanation, "arranges to test the same exponent using different data". Wonderfully said. I will copy it somewhere to have it ready to be posted when some other guys ask about the LL/DC difference.

Last fiddled with by LaurV on 2012-01-06 at 04:07
2012-01-06, 07:00 #8
paramveer
Jan 2012
3² Posts

Thanks for the reply. @LaurV: sorry for the typo :(

1) So now my understanding is that the memory size control is used to select the range of FFT sizes to be used for calculations. So if my concern is to stress only the CPU, then I should not bother about these settings?

2) Thanks for the comments, I got the solution to my doubt. Thanks all :)

3) (Please correct if wrong.) According to me, the torture test steps are:
a) Find a Mersenne prime candidate.
b) Run the LL primality test to confirm its primality (S0 = 4).
c) [First check of double checking] Run the LL test a second time on a different machine and compare the residues from (b) and (c) for equality (S0 = 4).
d) [Second check of double checking] Run the LL test with a modified algorithm, in which S0 is assigned 4 left-shifted by a random amount, then perform the test and compare the residue with (b).

[My doubt is:] According to the LL algorithm, S0 should be assigned the value 4, but in step (d) we are assigning it a value other than 4, so how will this test give the correct output? Please refer to http://en.wikipedia.org/wiki/Lucas-Lehmer_primality_test:

Code:
// Determine if M_p = 2^p − 1 is prime
Lucas–Lehmer(p)
    var s = 4          // (value assigned to S0)
    var M = 2^p − 1
    repeat p − 2 times:
        s = ((s × s) − 2) mod M
    if s = 0 return PRIME
    else return COMPOSITE
END

I hope this time I have made my points clear. If more clarification is needed, please ask. Sorry for last time.
2012-01-06, 08:06   #9
cheesehead

"Richard B. Woods"
Aug 2002
Wisconsin USA

1111000001100₂ Posts

Quote:
 Originally Posted by paramveer: Thanks for the reply. @LaurV: sorry for the typo :( 1) So now my understanding is that the memory size control is used to select the range of FFT sizes to be used for calculations. So if my concern is to stress only the CPU, then I should not bother about these settings?
You can't stress only the ALU if that's what you meant. Any FFT work _has_ to involve at least L1 and probably L2 cache memory (noting that L3 also exists, which I forgot before). To concentrate the stress on only those to the maximum extent (which I think is your intent), choose small FFTs only.

Large FFTs will spread out the stress to include more L3 and main memory, thus diluting the stress on the ALU and L1/L2 cache. That choice would be good _if_ you wanted to test your memory. It stresses memory in a different way than memtest does, so sometimes it'll find faulty memory when memtest won't (and vice versa).

Quote:
 3) (Please correct if wrong.) According to me, the torture test steps are:
The torture test uses a table of known values of L-L results for several exponents (which are not Mersenne primes).

For small FFTs, it uses only small exponents. It runs a fixed number of iterations (which is only part of a full L-L test) on each exponent, then compares the residue given by the last of those iterations to the known, precalculated, hard-coded residue value stored in the table for that exponent. If they match, there's no error. If they don't match, then something's wrong with your hardware.

Double-checking has nothing to do with torture testing. So, ignore double-checking, for your purposes.

The random shifting is also not part of torture testing. Ignore that, too.

Also, it's not necessary to find a Mersenne prime to do the torture test. The torture test includes the most stressful part of the L-L test that is used on any Mersenne number, whether prime or not. So, ignore Mersenne primes, for your purposes.

Last fiddled with by cheesehead on 2012-01-06 at 08:15
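The known-answer check cheesehead describes can be sketched as follows. The expected residue for p = 11 below was precomputed with the same recurrence; Prime95's real tables hold residues after a fixed number of iterations on much larger exponents, so treat the exponent, the table, and the names here as illustrative.

```python
# Known-answer table: exponent -> full Lucas-Lehmer residue (precomputed).
KNOWN_RESIDUES = {11: 1736}

def torture_pass(p):
    """Recompute the LL residue for a known exponent and compare it
    against the stored answer; a mismatch indicates a hardware error."""
    M = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % M
    return "OK" if s == KNOWN_RESIDUES[p] else "HARDWARE ERROR"

print(torture_pass(11))  # OK on a healthy machine
```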

2012-01-06, 08:35 #10
paramveer
Jan 2012
3² Posts

@cheesehead
1) I got that, but what I want to know is: if the only purpose is to utilize the CPU at 100%, irrespective of whether it uses RAM or cache (that is not my concern), then do I still have to implement these memory controls? Or in that case can I simply run the test for FFTs in a particular range so that CPU utilization is 100% (which currently I am able to do)? Currently my code runs the LL test for a range of FFT sizes and shows 100% CPU utilization. My purpose for the code is served, but what I am worried about is: do these memory settings play a critical role in the test, which I am unable to understand except that they target cache or RAM, such that they later create problems with execution of the code?

2) Regarding the torture test: in the output it displays Pass 1 and Pass 2 results, and these are the output of double checking {first check and second check}. So do you still think I should not implement these?
2012-01-06, 09:12 #11
fivemack
(loop (#_fork))

Feb 2006
Cambridge, England

1100100110110₂ Posts

Paramveer: could you tell us more exactly what it is that you want to do?

You can run a CPU to 100% utilisation as displayed in Task Manager in crores of ways; displaying a Web page containing an infinite loop written in JavaScript is a trivial one. You would want to use prime95 rather than something more trivial if you want to test that the CPU gives right answers when running unusually stressful work.

If you want to stick a 'CERTIFIED: runs torture test successfully for 24 hours at 4.7GHz' sticker on a machine you're selling, then the test to use should be the blended one, and you should write up somewhere on the certificate the fact that you used the blended one (and the precise version of prime95 that you used).

Last fiddled with by fivemack on 2012-01-06 at 09:13

