mersenneforum.org  

Old 2017-03-24, 18:45   #1
rudi_m
 
Jul 2005

2×7×13 Posts
Default TF benchmarks, mprime vs GPU

Hi,

Since mprime 29.1 I've been really excited about comparing x86 vs. GPU trial factoring benchmarks again. Also, I don't have an AVX-512 machine and would like to see what benefit a Xeon would have over a consumer i5/i7.

Here I start with a real job benchmark, using a plain i7-6700 CPU. The table below shows the times needed to progress 1.27% of factoring M701000023 from 78 to 79 bits.

The columns show:
1. number of CPU cores used
2. with (+ht) or without (-ht) hyper-threading
3. speedup relative to a single thread (1 core, -ht)
4. absolute time in seconds for 1.27% progress (OutputIterations=500000)
5. hyper-threading benefit in percent

Code:
1 cores -ht x1.00 4543.147
1 cores +ht x1.22 3714.403 (+22.3%)
2 cores -ht x1.99 2285.129
2 cores +ht x2.43 1869.446 (+22.2%)
3 cores -ht x2.94 1545.755
3 cores +ht x3.59 1264.282 (+22.3%)
4 cores -ht x3.89 1168.383
4 cores +ht x4.65  977.141 (+19.6%)
As you can see, the multi-threading benefit is almost perfectly linear. (The 3- and 4-core results may look a bit worse because I had many other processes running on the server, and also because of "less turbo GHz".)
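
The speed factors and hyper-threading percentages in the table can be recomputed from the raw times. The following is only a minimal sketch in Python, not something mprime produces; the dictionary layout and the function names `speedup` and `ht_benefit_percent` are made up for illustration.

```python
# Seconds needed for 1.27% progress, copied from the table above.
# Key: (cores, hyper_threading_enabled)
times = {
    (1, False): 4543.147, (1, True): 3714.403,
    (2, False): 2285.129, (2, True): 1869.446,
    (3, False): 1545.755, (3, True): 1264.282,
    (4, False): 1168.383, (4, True): 977.141,
}

# Reference point: one core without hyper-threading.
baseline = times[(1, False)]


def speedup(cores: int, ht: bool) -> float:
    """Speed factor relative to a single thread (column 3)."""
    return baseline / times[(cores, ht)]


def ht_benefit_percent(cores: int) -> float:
    """Throughput gain from hyper-threading at a given core count (column 5)."""
    return (times[(cores, False)] / times[(cores, True)] - 1.0) * 100.0


for cores in range(1, 5):
    print(f"{cores} cores -ht x{speedup(cores, False):.2f}")
    print(f"{cores} cores +ht x{speedup(cores, True):.2f} "
          f"(+{ht_benefit_percent(cores):.1f}%)")
```

Dividing the baseline by each time (rather than the other way round) is what turns "fewer seconds" into a "larger speed factor".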

So now I'm curious how AVX-512 or a GPU would perform for a similar job.

Last fiddled with by rudi_m on 2017-03-24 at 18:53
Old 2017-03-24, 18:58   #2
TheJudger
 
"Oliver"
Mar 2005
Germany

11×101 Posts
Default

A stock GTX 1060 takes around 4h 30m for M701000023 from 2⁷⁸ to 2⁷⁹.
Your 977s for 1.27% translates to ~21h 20m.
So don't waste your CPU time on TF; do some LL tests instead. We (GIMPS) are doing way too much TF already.

Oliver

Last fiddled with by TheJudger on 2017-03-24 at 18:58
Old 2017-03-24, 19:38   #3
rudi_m
 
Jul 2005

2×7×13 Posts
Default

Quote:
Originally Posted by TheJudger View Post
Stock GTX 1060 is around 4h 30m for M701000023 from 2⁷⁸ to 2⁷⁹.
Your 977s for 1.27% translates to ~21h 20m.
OK, that is still a factor of 4.7 faster than the CPU (without AVX-512), but I estimate it is only a factor of ~2.3 better in terms of power consumption.
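
The time figures quoted above can be checked with a little arithmetic. This sketch assumes the run progresses at a constant rate, takes 977.141 s per 1.27% from the "4 cores +ht" row of the table, and treats the GPU time as exactly 4.5 hours. The power-consumption estimate is not modeled, because no wattages are given in this thread.

```python
# Assumed inputs, taken from the posts above.
cpu_seconds_per_chunk = 977.141  # "4 cores +ht" row of the benchmark table
chunk_fraction = 0.0127          # each chunk is 1.27% of the whole job
gpu_hours = 4.5                  # "around 4h 30m" on a stock GTX 1060

# Extrapolate the CPU time linearly to 100% of the job.
cpu_hours = cpu_seconds_per_chunk / chunk_fraction / 3600.0

# How many times faster the GPU finishes the same job.
speed_ratio = cpu_hours / gpu_hours

whole_hours = int(cpu_hours)
minutes = round((cpu_hours - whole_hours) * 60)
print(f"CPU estimate: {whole_hours}h {minutes:02d}m")
print(f"GPU is about {speed_ratio:.1f}x faster")
```

The linear extrapolation lands within a few minutes of the ~21h 20m quoted, and the ratio rounds to the factor of 4.7 mentioned.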

Quote:
Originally Posted by TheJudger View Post
So don't waste your CPU time with TF and do some LL tests, we (GIMPS) are doing way too much TF already.
Yes Sir ;) I only have to finish my project of writing a "column of ones" into the "P-1 Available" field on the "work distribution map" https://www.mersenne.org/primenet/

It's all about fun ;)
Old 2017-03-24, 19:55   #4
rudi_m
 
Jul 2005

2×7×13 Posts
Default

BTW, in case anyone wants to help save my CPU power for LL, you may pick up these jobs for me:

Code:
Factor=602000051,78,79
Factor=603000011,78,79
Factor=604000003,78,79
Factor=605000003,78,79
Factor=606000001,78,79
Factor=607000003,78,79
Factor=608000017,78,79
Factor=609000137,78,79
Factor=611000003,78,79
Factor=612000029,78,79
Factor=613000009,78,79
Factor=614000021,78,79
Factor=615000019,78,79
Factor=616000127,78,79
Factor=617000107,78,79
Factor=618000017,78,79
Factor=619000009,78,79
Factor=621000007,78,79
Factor=622000039,78,79
Factor=623000033,78,79
Factor=625000069,78,79
Factor=626000047,78,79
Factor=627000053,78,79
Factor=628000057,78,79
Factor=629000011,78,79
Factor=631000001,78,79
Factor=632000011,78,79
Factor=633000013,78,79
Factor=634000019,78,79
Factor=635000089,78,79
Factor=636000031,78,79
Factor=637000009,78,79
Factor=638000003,78,79
Factor=639000083,78,79
Factor=641000117,78,79
Factor=642000031,78,79
Factor=643000031,78,79
Factor=644000011,78,79
Factor=645000007,78,79
Factor=646000049,78,79
Factor=647000017,78,79
Factor=648000083,78,79
Factor=649000073,78,79
Factor=651000047,78,79
Factor=652000003,78,79
Factor=653000003,78,79
Factor=654000059,78,79
Factor=655000097,78,79
Factor=656000003,78,79
Factor=657000121,78,79
Factor=658000139,78,79
Factor=659000137,78,79
Factor=661000117,78,79
Factor=662000011,78,79
Factor=663000031,78,79
Factor=665000029,78,79
Factor=666000019,78,79
Factor=667000001,78,79
Factor=668000131,78,79
Factor=669000047,78,79
Factor=671000017,78,79
Factor=672000013,78,79
Factor=673000037,78,79
Factor=674000023,78,79
Factor=675000037,78,79
Factor=676000007,78,79
Factor=677000201,78,79
Factor=678000031,78,79
Factor=679000019,78,79
Factor=681000013,78,79
Factor=682000063,78,79
Factor=683000053,78,79
Factor=684000067,78,79
Factor=685000003,78,79
Factor=686000009,78,79
Factor=687000049,78,79
Factor=688000009,78,79
Factor=689000027,78,79
Factor=691000099,78,79
Factor=692000017,78,79
Factor=693000109,78,79
Factor=694000171,78,79
Factor=695000003,78,79
Factor=696000013,78,79
Factor=697000013,78,79
Factor=699000017,78,79
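
Judging only from the list itself, each entry appears to follow the pattern `Factor=<exponent>,<from bits>,<to bits>`. Below is a rough sketch of how such lines could be read in Python. `FactorJob` and `parse_factor_line` are hypothetical names for illustration, not part of mprime or mfaktc, and the field meanings are an assumption based on the "from 78 to 79 bit" description earlier in the thread.

```python
from typing import NamedTuple


class FactorJob(NamedTuple):
    """One trial-factoring assignment (field meanings are assumed)."""
    exponent: int   # p in M(p) = 2^p - 1
    bits_from: int  # bit level already done
    bits_to: int    # bit level to reach


def parse_factor_line(line: str) -> FactorJob:
    """Parse a line such as 'Factor=602000051,78,79'."""
    key, _, value = line.strip().partition("=")
    if key != "Factor":
        raise ValueError(f"not a Factor line: {line!r}")
    fields = value.split(",")
    if len(fields) != 3:
        raise ValueError(f"expected 3 comma-separated fields: {line!r}")
    exponent, bits_from, bits_to = (int(field) for field in fields)
    return FactorJob(exponent, bits_from, bits_to)


job = parse_factor_line("Factor=602000051,78,79")
print(job.exponent, job.bits_from, job.bits_to)
```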
Old 2017-04-02, 05:13   #5
storm5510
Random Account
 
Aug 2009

11110100100₂ Posts
Default Too Much Trial Factoring

Quote:
Originally Posted by TheJudger View Post
... we (GIMPS) are doing way too much TF already.

Oliver
I've been using mfaktc since this past November. I can run about 173 GHz-days per day on this hardware (GTX 750 Ti). This is in the 146 million to 152 million range.

I did a double-check test with an exponent in the 44 million range. CuLu estimated completion in 6 days and 19 hours. Prime95 indicated 6 days and 9 hours. I was a bit baffled that Prime95 could do this faster. Note: I did not make any changes to the CuLu configuration file.

Should I be doing other work or continue with mfaktc?

Thanks!
Old 2017-04-02, 10:08   #6
VictordeHolland
 
"Victor de Hollander"
Aug 2011
the Netherlands

2³·3·7² Posts
Default

Quote:
Originally Posted by storm5510 View Post
I've been using mfaktc since this past November. I can run about 173 GHz-days per day on this hardware (GTX 750 Ti). This is in the 146 million to 152 million range.

I did a double-check test with an exponent in the 44 million range. CuLu estimated completion in 6 days and 19 hours. Prime95 indicated 6 days and 9 hours. I was a bit baffled that Prime95 could do this faster. Note: I did not make any changes to the CuLu configuration file.

Should I be doing other work or continue with mfaktc?

Thanks!
GPUs excel at TF, but for LL they're about the same as a quad-core CPU.
Old 2017-04-02, 12:56   #7
ATH
Einyen
 
Dec 2003
Denmark

2·1,579 Posts
Default

Quote:
Originally Posted by storm5510 View Post
I did a double-check test with an exponent in the 44 million range. CuLu estimated completion in 6 days and 19 hours. Prime95 indicated 6 days and 9 hours. I was a bit baffled that Prime95 could do this faster. Note: I did not make any changes to the CuLu configuration file.
The problem is that the LL test requires double precision while TF only uses single precision. If you look at your GTX 750 Ti here:
https://en.wikipedia.org/wiki/GeForc...eries#Products

You can see it does 1306 GFLOPS in single precision but only 40.8 GFLOPS (1/32 * 1306) in double precision, which is why the LL test is so slow compared to TF.

Almost all "consumer" graphics cards have DP = 1/32 * SP or DP = 1/24 * SP, except the original Titan, Titan Black and Titan Z from 2013/2014, which have DP = 1/3 * SP. You can see them at the bottom of that same list.
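
As a quick check of the arithmetic above, here is a small Python sketch. The function name `dp_gflops` is made up for illustration, and the only numbers used are the 1306 GFLOPS and the 1/32 ratio quoted for the GTX 750 Ti.

```python
def dp_gflops(sp_gflops: float, sp_to_dp_divisor: int) -> float:
    """Double-precision throughput when DP = SP / divisor."""
    return sp_gflops / sp_to_dp_divisor


# GTX 750 Ti figures as quoted in this post.
gtx_750_ti_sp = 1306.0  # single-precision GFLOPS
gtx_750_ti_divisor = 32  # DP = 1/32 * SP

print(f"{dp_gflops(gtx_750_ti_sp, gtx_750_ti_divisor):.1f} GFLOPS double precision")
```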
Old 2017-04-02, 12:57   #8
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2·5·293 Posts
Default

The GTX 750 Ti isn't a particularly powerful GPU. A 780 or 580 is about 4 times faster for LL. A 1080 is about 7 times faster.

The project is getting more than enough TF work to keep ahead of LL. I understand we're only a little ahead of the P-1 work. We're probably two decades away from the 146M to 152M range.

The GTX 750 Ti may not be the world's fastest GPU, but put it this way: you can get double your LL throughput by using it for LL, so why not.
Old 2017-04-02, 15:39   #9
storm5510
Random Account
 
Aug 2009

7A4₁₆ Posts
Default

Quote:
Originally Posted by Mark Rose View Post
The GTX 750 Ti isn't a particularly powerful GPU. A 780 or 580 is about 4 times faster for LL. A 1080 is about 7 times faster.

The project is getting more than enough TF work to keep ahead of LL. I understand we're only a little ahead of the P-1 work. We're probably two decades away from the 146M to 152M range.

The GTX 750 Ti may not be the world's fastest GPU, but put it this way: you can get double your LL throughput by using it for LL, so why not.
The 580 and 780 are way outside my cost range.

The CPU in this machine is an i5-3570 quad @ 3.4 GHz. Under a load, it will go up to 3.7. I've done pairs of P-1 tests with Prime95 in about 20 hours using two worker windows. The other two cores assist. This system can use an i7. So far, I haven't seen any reason to change it.

Now, I'll ask a question that I've been wondering about: What is the connection between TF and P-1, ECM, and so on?
Old 2017-04-02, 19:01   #10
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2·5·293 Posts
Default

Quote:
Originally Posted by storm5510 View Post
The 580 and 780 are way outside my cost range.
Shouldn't be. I've bought a used GTX 580 as cheap as $50 here in Canada. That being said, it consumes far more power than a GTX 750 Ti. Gaming performance between the two will be about the same.

Quote:
The CPU in this machine is an i5-3570 quad @ 3.4 GHz. Under a load, it will go up to 3.7. I've done pairs of P-1 tests with Prime95 in about 20 hours using two worker windows. The other two cores assist. This system can use an i7. So far, I haven't seen any reason to change it.
There's no point getting an i7 for Prime95.

Quote:
Now, I'll ask a question that I've been wondering about: What is the connection between TF and P-1, ECM, and so on?
GP2 wrote an excellent post answering that.

Last fiddled with by Mark Rose on 2017-04-02 at 19:07
Old 2017-04-02, 23:46   #11
Gordon
 
Nov 2008

501₁₀ Posts
Default

Quote:
Originally Posted by Mark Rose View Post
Shouldn't be. I've bought a used GTX 580 as cheap as $50 here in Canada. That being said, it consumes far more power than a GTX 750 Ti. Gaming performance between the two will be about the same.
Remembering that a 580 on full power sounds like a jet engine taking off and the power consumption....
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.