2012-01-05, 18:38  #1
Jan 2012
3^{2} Posts 
Torture and Benchmarking test of Prome95. Doubts in implementation
I am trying to understand the torture-test and benchmarking parts of Prime95 so that I can implement something similar.
My doubts are:
1) Why do we need memory size control settings (small FFT, large FFT, Blend) in the torture test? What effect do these have on the output of the torture test?
2) From a coding point of view, what procedure does the benchmark follow in its 3rd step, where the output is the time for trial factoring a number with a factor of some bit length (69 bits, etc.)?
3) What is the procedure for the second check in the double-checking method?
Thanks in advance.
2012-01-05, 22:46  #2
"Richard B. Woods"
Aug 2002
Wisconsin USA
2^{2}·3·641 Posts 
If the test uses a small FFT, it can execute mostly or entirely within the L1 cache, so it stresses the ALU circuitry and L1 fetches with little or no effect from transfers between L1 and L2, or between L2 and main RAM. A test using a large FFT has to cycle continually through large amounts of data that won't fit entirely in L1, so it puts more stress on the paths to and from L2 and main RAM.
Last fiddled with by cheesehead on 2012-01-05 at 22:51

2012-01-05, 23:23  #3
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 89<O<88
3×29×83 Posts 
Before running the expensive primality tests, GIMPS attempts to find small factors of a number, which proves it composite and makes the primality tests unnecessary. This is done (mostly) with Trial Factoring. To Trial Factor up to (for example) 68 bits, the program creates a list of prime numbers x*, where x < 2^{68}, and then tests each and every x to see whether it is a factor of the Mersenne number in question. If every x below 2^{68} is tested and no factor is found, the Mersenne number is then assigned for Trial Factoring of all 2^{68} < x < 2^{69}, and so on up to the point where trial factoring becomes more expensive than the primality test.
The benchmark then just times how long it takes to do the initial factoring from 0 to 68 bits, i.e. to try all 0 < x < 2^{68}.
Please tell me if I misinterpreted your question and you want more specifics of how this is done. Perhaps you mean how each candidate x is tested to determine whether it's a factor?
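For readers who do want the per-candidate test, here is a minimal Python sketch. It relies on two standard number-theory facts that are not stated in this thread: q divides 2^p − 1 exactly when 2^p mod q = 1, and for an odd prime exponent p every factor has the form q = 2kp + 1 with q ≡ ±1 (mod 8). The function name `trial_factor` is hypothetical, and this is an illustration of the idea, not how Prime95 implements it.

```python
def trial_factor(p, max_bits):
    """Return the smallest factor q < 2**max_bits of 2**p - 1, or None.

    Assumes p is an odd prime. Candidates are restricted to q = 2*k*p + 1
    with q % 8 in (1, 7); composite candidates are not sieved out here.
    """
    limit = 1 << max_bits
    k = 1
    while True:
        q = 2 * k * p + 1
        if q >= limit:
            return None
        # q divides 2**p - 1 exactly when 2**p is congruent to 1 mod q.
        if q % 8 in (1, 7) and pow(2, p, q) == 1:
            return q
        k += 1


if __name__ == "__main__":
    print(trial_factor(11, 6))   # 2**11 - 1 = 2047 = 23 * 89
    print(trial_factor(13, 12))  # 2**13 - 1 = 8191 is prime
```

The three-argument `pow` does the modular exponentiation without ever forming 2^p itself, which is what makes testing one candidate cheap even for very large exponents.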

2012-01-06, 01:18  #4
P90 years forever!
Aug 2002
Yeehaw, FL
5^{2}·311 Posts 
Small correction: The small FFT torture test runs out of the L1 and L2 caches. The large FFTs also stress your L3 cache (if any) and main memory.
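A rough way to picture the cache-level distinction is to compare an FFT's data size with each cache size. The sketch below assumes 8 bytes per FFT element (one double) and uses made-up cache sizes; the real working set also includes tables and scratch space, so treat this as arithmetic for intuition only. The name `smallest_level_that_fits` and the `HYPOTHETICAL_CACHES` values are illustrative assumptions, not Prime95 settings.

```python
# Hypothetical cache sizes in bytes; substitute the values for your own CPU.
HYPOTHETICAL_CACHES = [
    ("L1", 32 * 1024),
    ("L2", 256 * 1024),
    ("L3", 8 * 1024 * 1024),
]


def smallest_level_that_fits(fft_len, caches=HYPOTHETICAL_CACHES,
                             bytes_per_element=8):
    """Return the first cache level that holds fft_len elements, else "RAM"."""
    working_set = fft_len * bytes_per_element
    for name, size in caches:
        if working_set <= size:
            return name
    return "RAM"


if __name__ == "__main__":
    for length in (2048, 16 * 1024, 512 * 1024, 4 * 1024 * 1024):
        print(length, smallest_level_that_fits(length))
```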
The benchmark is useless outside of the context of the GIMPS project. If you want to compare Intel and AMD chips for normal workloads, the prime95 benchmark is not meaningful. 
2012-01-06, 01:38  #5
Dec 2010
Monticello
5×359 Posts 
Slight correction for Mr Cheesehead, and possibly also for Paramveer:
A GIMPS DC (double-check) arranges to test the same number for primality using different data, using the properties of the arithmetic to scramble all the bits being worked on. When the bits are unscrambled at the end, the residues from the two LL (Lucas-Lehmer primality) tests should match, assuming no errors crept into the calculation. There are no concerns about running a CPU-based DC on the same system as the original LL check, because the chances of a fixed error affecting the data in the same way are either nearly or completely nil.
Trial Factoring is a standard GIMPS operation. When factors are found, which happens every so often, LL checks become unnecessary. In terms of GIMPS effectiveness, however, the operation is now basically obsolete on CPUs because GPUs are 10-100x faster at it.
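The "different data, same residue" idea can be sketched with plain Python integers. The assumption here is that the scrambling is a left shift of the starting value: the test carries s·2^shift instead of s, the shift doubles (mod p) at every squaring because 2^p ≡ 1 (mod 2^p − 1), and the constant 2 is shifted to match. The function name `ll_residue` is hypothetical, and Prime95 itself works on FFT data rather than big integers.

```python
def ll_residue(p, shift=0):
    """Lucas-Lehmer residue for 2**p - 1, computed on shifted data.

    The loop carries s * 2**shift (mod M) rather than s, so every
    intermediate value differs between runs with different shifts.
    """
    m = (1 << p) - 1
    s = (4 << shift) % m
    for _ in range(p - 2):
        shift = (2 * shift) % p              # squaring doubles the shift
        s = (s * s - (2 << shift)) % m       # subtract 2, shifted to match
    # Undo the shift: multiplying by 2**(p - shift) equals dividing by 2**shift.
    return (s << ((p - shift) % p)) % m


if __name__ == "__main__":
    # Same exponent, different starting data, same final residue.
    print([ll_residue(11, k) for k in range(4)])
    print(ll_residue(13, 5))  # 0 means 2**13 - 1 is prime
```

Because the intermediate values differ from run to run, a hardware fault that corrupts one particular bit pattern is unlikely to corrupt both runs in the same way.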
2012-01-06, 03:18  #6
Aug 2002
North San Diego County
19×37 Posts 
Code:
Best time for 58 bit trial factors: 3.118 ms
As I understand it, that is the time it takes for that processor to check whether one value from the range of values representable in 58 binary bits evenly divides the potential prime, and is thus a factor that establishes it is composite. To simplify, 10 bits can hold 1024 possible values (0-1023 decimal; 0000000000 to 1111111111 binary), so a line like
Code:
Best time for 10 bit trial factors: 0.118 ms
Last fiddled with by sdbardwick on 2012-01-06 at 03:25 Reason: bits vs. power of 2 exponent error
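A "best time" figure of this kind can be produced by running the same piece of work several times and keeping the fastest run. The sketch below is a generic timing harness under that assumption; `best_time_ms` and `check_candidates` are hypothetical names, and nothing here is taken from Prime95's source.

```python
import time


def best_time_ms(func, repeats=5):
    """Run func() `repeats` times and return the fastest run in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best


def check_candidates(p, count, k_start=1):
    """Test `count` candidates q = 2*k*p + 1 against 2**p - 1.

    Returns the candidates that divide 2**p - 1.
    """
    found = []
    for k in range(k_start, k_start + count):
        q = 2 * k * p + 1
        if pow(2, p, q) == 1:
            found.append(q)
    return found


if __name__ == "__main__":
    ms = best_time_ms(lambda: check_candidates(11, 1000))
    print(f"Best time for 1000 candidates: {ms:.3f} ms")
```

Keeping the minimum rather than the average reduces the influence of other processes that happen to run during one of the repeats.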

2012-01-06, 03:51  #7
Romulan Interpreter
"name field"
Jun 2011
Thailand
2·3·5·7·47 Posts 
As I understood the questions, the OP is not interested in finding primes, or in the algorithms behind it. In fact, he does not give a sh*t, judging by the fact that even P95's name is spelled wrong more than once. His interest is in implementing some torture/benchmark test, and I would guess he works in the domain (for some CPU or systems factory) and needs some stuff to test/burn out the thingies. Hence his second question mentions "from coding POV", and I would guess the right answer is "P95 takes an exponent for which the result of TF is known, does a bit of sieving and a bit of exponentiation, then compares the times and the results with previously known times and results", if I got the question right (and I hope he is not asking what TF is, or how to TF to "some bit level"). I went through this myself, trying to port parts of P95 to WinCE5 and XScale/ARM (PXA 320), and during that time I was asking myself almost the same questions.
Therefore, going into detail about how and why GIMPS does what it does, and why LL and DC are different, would not make much sense for him. The algorithms are explained well on the GIMPS page and on the wiki, and even on this forum, if he wants them. I took his questions more like "how do these things work, and WHY do they work?", i.e. "why can this stuff be a reliable benchmark or stress test, and how can I program something like that without going too far into the details of the algorithms?" To which I don't really know how to answer.
edit: However, I love Christenson's explanation, "arranges to test the same exponent using different data", wonderfully said. I will copy it somewhere to have it ready to post when some other guys ask about the LL/DC difference.
Last fiddled with by LaurV on 2012-01-06 at 04:07
2012-01-06, 07:00  #8
Jan 2012
3^{2} Posts 
Thanks for the replies.
@LaurV: sorry for the typo :(
1) So my understanding now is that the memory size control selects the range of FFT sizes used for the calculations. So if my only concern is to stress the CPU, I should not bother about these settings?
2) Thanks for the comments, I got the solution to my doubt. Thanks all :)
3) (Please correct me if I am wrong.) As I understand it, the steps in the torture test are:
a) Find a Mersenne prime.
b) Run the LL primality test to confirm its primality (*S0 = 4).
c) [First check of double-checking] Run the LL test a second time on a different machine and compare the residues from (b) and (c) for equality (*S0 = 4).
d) [Second check of double-checking] Run the LL test with a modified algorithm in which *S0 is assigned 4 left-shifted by a random amount, perform the test, then compare the residue with (b).
My doubt is: according to the LL algorithm, S0 should be assigned the value 4, but in step (d) we assign it a value other than 4, so how will this test give the correct output?
Please refer to "http://en.wikipedia.org/wiki/LucasLehmer_primality_test" for the algorithm:
Code:
// Determine if Mp = 2^p − 1 is prime
Lucas–Lehmer(p)
    var s = 4            // value assigned to *S0
    var M = 2^p − 1
    repeat p − 2 times:
        s = ((s × s) − 2) mod M
    if s = 0 return PRIME else return COMPOSITE
I hope I have made my points clear this time. If more clarification is needed, please ask. Sorry about last time.
2012-01-06, 08:06  #9
"Richard B. Woods"
Aug 2002
Wisconsin USA
2^{2}×3×641 Posts 
Large FFTs will spread the stress out to include more of L3 and main memory, thus diluting the stress on the ALU and the L1/L2 caches. That choice would be good _if_ you wanted to test your memory. It stresses memory in a different way than memtest does, so sometimes it'll find faulty memory when memtest won't (and vice versa).
For small FFTs, it uses only small exponents. It runs a fixed number of iterations (only part of a full LL test) on each exponent, then compares the residue produced by the last of those iterations with the known, precalculated residue hardcoded in the table for that exponent. If they match, there's no error. If they don't match, then something's wrong with your hardware.
Double-checking has nothing to do with torture testing. So, ignore double-checking, for your purposes. The random shifting is also not part of torture testing. Ignore that, too.
Also, it's not necessary to find a Mersenne prime to do the torture test. The torture test includes the most stressful part of the LL test, which is used on any Mersenne number, whether prime or not. So, ignore Mersenne primes, for your purposes.
Last fiddled with by cheesehead on 2012-01-06 at 08:15
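The compare-against-a-table procedure described in this post can be sketched in a few lines of Python. The table below holds toy values worked out by hand for tiny exponents; it is not Prime95's table, and the names `KNOWN_RESIDUES`, `partial_ll` and `torture_pass` are hypothetical. A real stress test would use large exponents and FFT-based multiplication so that the hardware is actually loaded.

```python
# Toy reference table: (exponent p, iterations) -> expected residue.
# Hand-computed for illustration only; NOT the table shipped with Prime95.
KNOWN_RESIDUES = {
    (7, 3): 42,
    (11, 5): 119,
    (11, 9): 1736,
}


def partial_ll(p, iterations):
    """Run a fixed number of Lucas-Lehmer iterations on 2**p - 1."""
    m = (1 << p) - 1
    s = 4
    for _ in range(iterations):
        s = (s * s - 2) % m
    return s


def torture_pass(table=KNOWN_RESIDUES):
    """Return the (p, iterations) entries whose residue did not match."""
    failures = []
    for (p, iterations), expected in table.items():
        if partial_ll(p, iterations) != expected:
            failures.append((p, iterations))
    return failures


if __name__ == "__main__":
    bad = torture_pass()
    print("no errors" if not bad else f"possible hardware error: {bad}")
```

The point of the design is that the expected values are fixed in advance, so a mismatch can only come from the machine running the test, not from the input.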

2012-01-06, 08:35  #10
Jan 2012
11_{8} Posts 
@cheesehead
1) I got that, but what I want to know is this: if my only purpose is to utilize the CPU at 100%, irrespective of whether it uses RAM or cache (that is not my concern), do I still have to implement these memory controls, or can I simply run the test for FFT sizes in a particular range so that CPU utilization is 100% (which I am currently able to do)? Currently my code runs the LL test for a range of FFT sizes and shows 100% CPU utilization. That serves the purpose of my code, but what I am worried about is whether these memory settings play some critical role in the test that I have not understood, beyond targeting cache or RAM, such that it later creates a problem with the execution of the code.
2) Regarding the torture test: its output displays Pass 1 and Pass 2 results, and these are the output of double-checking (first check and second check). So do you still think I should not implement these?
2012-01-06, 09:12  #11
(loop (#_fork))
Feb 2006
Cambridge, England
2^{2}·1,613 Posts 
Paramveer: could you tell us more exactly what it is that you want to do?
You can run a CPU to 100% utilisation as displayed in Task Manager in crores of ways; displaying a Web page containing an infinite loop written in Javascript is a trivial one. You would want to use prime95 rather than something more trivial if you want to test that the CPU gives right answers when running unusually stressful work.
If you want to stick a 'CERTIFIED: runs torture test successfully for 24 hours at 4.7GHz' on a machine you're selling, then the test to use should be the blended one, and you should write up somewhere on the certificate the fact that you used the blended one (and the precise version of prime95 that you used).
Last fiddled with by fivemack on 2012-01-06 at 09:13