mersenneforum.org

rudi_m 2017-03-24 18:45

TF benchmarks, mprime vs GPU
 
Hi,

Since mprime 29.1 I've been really excited about comparing x86 and GPU trial-factoring benchmarks again. Also, I don't have an AVX-512 machine and would like to see what the benefit of a Xeon over a consumer i5/i7 would be.

I'll start with a real-job benchmark on a plain i7-6700 CPU. The table below shows the time needed to progress 1.27% of factoring M701000023 from 78 to 79 bits.

The columns show
1. number of CPU cores used
2. with/without hyper-threading
3. speedup factor relative to a single thread
4. absolute time in seconds for 1.27% progress (OutputIterations=500000)
5. hyper-threading benefit in percent

[CODE]
1 cores -ht x1.00 4543.147
1 cores +ht x1.22 3714.403 (+22.3%)
2 cores -ht x1.99 2285.129
2 cores +ht x2.43 1869.446 (+22.2%)
3 cores -ht x2.94 1545.755
3 cores +ht x3.59 1264.282 (+22.3%)
4 cores -ht x3.89 1168.383
4 cores +ht x4.65 977.141 (+19.6%)
[/CODE]

As you can see, the multi-threading benefit is almost perfectly linear. (The 3- and 4-core results may look a bit worse because I had many other processes running on the server, and also because of the lower turbo clock when more cores are loaded.)
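The derived columns can be recomputed from the raw timings. The following is a minimal sketch, not part of mprime: the helper names (`speedup`, `ht_benefit_percent`, `parallel_efficiency`) are made up for illustration, and the timings are copied from the table above.

```python
# Seconds per 1.27% of progress, copied from the table above.
# Keys are (physical cores, hyper-threading enabled).
timings = {
    (1, False): 4543.147,
    (1, True): 3714.403,
    (2, False): 2285.129,
    (2, True): 1869.446,
    (3, False): 1545.755,
    (3, True): 1264.282,
    (4, False): 1168.383,
    (4, True): 977.141,
}

# Reference point: one core, no hyper-threading.
baseline = timings[(1, False)]


def speedup(cores, ht):
    """Speedup factor relative to a single thread (column 3)."""
    return baseline / timings[(cores, ht)]


def ht_benefit_percent(cores):
    """Throughput gain from hyper-threading at a fixed core count (column 5)."""
    return (timings[(cores, False)] / timings[(cores, True)] - 1.0) * 100.0


def parallel_efficiency(cores):
    """Speedup without hyper-threading divided by core count; 1.0 is perfectly linear."""
    return speedup(cores, False) / cores


for cores in range(1, 5):
    print(
        f"{cores} cores: x{speedup(cores, False):.2f} -ht, "
        f"x{speedup(cores, True):.2f} +ht, "
        f"HT +{ht_benefit_percent(cores):.1f}%, "
        f"efficiency {parallel_efficiency(cores):.1%}"
    )
```

The efficiency figure is the interesting one for the "almost perfectly linear" claim: it stays above 97% even with all four cores in use.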

So now I'm curious how AVX-512 or a GPU would perform on a similar job.

TheJudger 2017-03-24 18:58

Stock GTX 1060 is around 4h 30m for M701000023 from 2[SUP]78[/SUP] to 2[SUP]79[/SUP].
Your 977s for 1.27% translates to ~21h 20m.
So don't waste your CPU time on TF and do some LL tests instead; we (GIMPS) are doing way too much TF already.

Oliver

rudi_m 2017-03-24 19:38

[QUOTE=TheJudger;455425]Stock GTX 1060 is around 4h 30m for M701000023 from 2[SUP]78[/SUP] to 2[SUP]79[/SUP].
Your 977s for 1.27% translates to ~21h 20m.
[/QUOTE]
OK, so the GPU is still a factor of 4.7 faster than the CPU (without AVX-512), but I estimate it is only a factor of ~2.3 better in terms of power consumption.
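For anyone who wants to redo the extrapolation with their own numbers, here is a rough sketch. It assumes progress is linear over the whole bit level, and the function names are invented for this example. The inputs are the 977.141 s per 1.27% from the table and the quoted "around 4h 30m" for the GTX 1060.

```python
def full_run_seconds(chunk_seconds, chunk_percent):
    """Extrapolate the time for 100% from the time for one chunk (assumes linear progress)."""
    return chunk_seconds * 100.0 / chunk_percent


def format_hm(seconds):
    """Format a duration as hours and minutes, rounded to the nearest minute."""
    total_minutes = round(seconds / 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


# CPU: 4 cores with hyper-threading, 977.141 s per 1.27% (from the table).
cpu_seconds = full_run_seconds(977.141, 1.27)

# GPU: "around 4h 30m" as quoted, so treat this value as approximate.
gpu_seconds = 4.5 * 3600

print("CPU estimate:", format_hm(cpu_seconds))
print("GPU/CPU speed ratio:", round(cpu_seconds / gpu_seconds, 1))
```

The CPU estimate lands within a couple of minutes of the ~21h 20m figure, and the ratio rounds to the factor of 4.7 mentioned above.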

[QUOTE=TheJudger;455425]
So don't waste your CPU time on TF and do some LL tests instead; we (GIMPS) are doing way too much TF already.
[/QUOTE]

Yes Sir ;) I only have to finish my project of writing a "column of ones" into the "P-1 Available" field on the "work distribution map": [url]https://www.mersenne.org/primenet/[/url]

It's all about fun ;)

rudi_m 2017-03-24 19:55

BTW, in case anyone wants to help save my CPU power for LL, you may pick up these jobs for me :smile: :

[CODE]
Factor=602000051,78,79
Factor=603000011,78,79
Factor=604000003,78,79
Factor=605000003,78,79
Factor=606000001,78,79
Factor=607000003,78,79
Factor=608000017,78,79
Factor=609000137,78,79
Factor=611000003,78,79
Factor=612000029,78,79
Factor=613000009,78,79
Factor=614000021,78,79
Factor=615000019,78,79
Factor=616000127,78,79
Factor=617000107,78,79
Factor=618000017,78,79
Factor=619000009,78,79
Factor=621000007,78,79
Factor=622000039,78,79
Factor=623000033,78,79
Factor=625000069,78,79
Factor=626000047,78,79
Factor=627000053,78,79
Factor=628000057,78,79
Factor=629000011,78,79
Factor=631000001,78,79
Factor=632000011,78,79
Factor=633000013,78,79
Factor=634000019,78,79
Factor=635000089,78,79
Factor=636000031,78,79
Factor=637000009,78,79
Factor=638000003,78,79
Factor=639000083,78,79
Factor=641000117,78,79
Factor=642000031,78,79
Factor=643000031,78,79
Factor=644000011,78,79
Factor=645000007,78,79
Factor=646000049,78,79
Factor=647000017,78,79
Factor=648000083,78,79
Factor=649000073,78,79
Factor=651000047,78,79
Factor=652000003,78,79
Factor=653000003,78,79
Factor=654000059,78,79
Factor=655000097,78,79
Factor=656000003,78,79
Factor=657000121,78,79
Factor=658000139,78,79
Factor=659000137,78,79
Factor=661000117,78,79
Factor=662000011,78,79
Factor=663000031,78,79
Factor=665000029,78,79
Factor=666000019,78,79
Factor=667000001,78,79
Factor=668000131,78,79
Factor=669000047,78,79
Factor=671000017,78,79
Factor=672000013,78,79
Factor=673000037,78,79
Factor=674000023,78,79
Factor=675000037,78,79
Factor=676000007,78,79
Factor=677000201,78,79
Factor=678000031,78,79
Factor=679000019,78,79
Factor=681000013,78,79
Factor=682000063,78,79
Factor=683000053,78,79
Factor=684000067,78,79
Factor=685000003,78,79
Factor=686000009,78,79
Factor=687000049,78,79
Factor=688000009,78,79
Factor=689000027,78,79
Factor=691000099,78,79
Factor=692000017,78,79
Factor=693000109,78,79
Factor=694000171,78,79
Factor=695000003,78,79
Factor=696000013,78,79
Factor=697000013,78,79
Factor=699000017,78,79
[/CODE]
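If you want to handle such a list with a script, here is a small sketch for reading and writing these lines. It only covers the three-field form shown above (exponent, starting bit level, ending bit level). The `TFJob` type and both function names are hypothetical, and lines carrying extra fields are rejected rather than guessed at.

```python
from typing import NamedTuple


class TFJob(NamedTuple):
    """One trial-factoring job: exponent p of M_p and the bit range to cover."""
    exponent: int
    bit_from: int
    bit_to: int


def parse_factor_line(line):
    """Parse a line of the form 'Factor=<exponent>,<bit_from>,<bit_to>'."""
    key, _, value = line.strip().partition("=")
    if key != "Factor":
        raise ValueError(f"not a Factor line: {line!r}")
    fields = value.split(",")
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    exponent, bit_from, bit_to = (int(field) for field in fields)
    return TFJob(exponent, bit_from, bit_to)


def format_factor_line(job):
    """Write a job back out in the same three-field form."""
    return f"Factor={job.exponent},{job.bit_from},{job.bit_to}"


sample = ["Factor=602000051,78,79", "Factor=603000011,78,79"]
jobs = [parse_factor_line(line) for line in sample]
for job in jobs:
    print(job.exponent, job.bit_from, job.bit_to)
```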

storm5510 2017-04-02 05:13

Too Much Trial Factoring
 
[QUOTE=TheJudger;455425]... we (GIMPS) are doing way too much TF already.

Oliver[/QUOTE]

I've been using [I]mfaktc[/I] since this past November. I can run about 173 GHz days per day on this hardware (GTX-750 Ti). This is in the 146 million to 152 million range.

I did a double-check test with an exponent in the 44 million range. [I]CuLu[/I] estimated completion in 6 days and 19 hours. [I]Prime95[/I] indicated 6 days and 9 hours. I was a bit baffled that [I]Prime95[/I] could do this faster. Note: I did not make any changes to the [I]CuLu[/I] configuration file.

Should I be doing other work or continue with [I]mfaktc[/I]?

Thanks!

VictordeHolland 2017-04-02 10:08

[QUOTE=storm5510;456007]I've been using [I]mfaktc[/I] since this past November. I can run about 173 GHz days per day on this hardware (GTX-750 Ti). This is in the 146 million to 152 million range.

I did a double-check test with an exponent in the 44 million range. [I]CuLu[/I] estimated completion in 6 days and 19 hours. [I]Prime95[/I] indicated 6 days and 9 hours. I was a bit baffled that [I]Prime95[/I] could do this faster. Note: I did not make any changes to the [I]CuLu[/I] configuration file.

Should I be doing other work or continue with [I]mfaktc[/I]?

Thanks![/QUOTE]
GPUs excel at TF, but for LL they're about the same as a quad-core CPU.

ATH 2017-04-02 12:56

[QUOTE=storm5510;456007]I did a double-check test with an exponent in the 44 million range. [I]CuLu[/I] estimated completion in 6 days and 19 hours. [I]Prime95[/I] indicated 6 days and 9 hours. I was a bit baffled that [I]Prime95[/I] could do this faster. Note: I did not make any changes to the [I]CuLu[/I] configuration file.[/QUOTE]

The problem is that the LL test requires double precision, while TF only uses single precision. If you look up your GTX 750 Ti here:
[url]https://en.wikipedia.org/wiki/GeForce_700_series#Products[/url]

You can see it does 1306 GFLOPS in single precision but only 40.8 GFLOPS (1/32 * 1306) in double precision, which is why the LL test is so slow compared to TF.

Almost all "consumer" graphics cards have DP = 1/32 * SP or DP = 1/24 * SP. The exceptions are the original Titan, Titan Black and Titan Z from 2013/2014, which have DP = 1/3 * SP. You can see them at the bottom of that same list.
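To make the ratios concrete, here is a small sketch of the arithmetic. The 1306 GFLOPS figure is the GTX 750 Ti number quoted above. Applying the other two ratios to that same figure is purely hypothetical, just to show how much the DP:SP ratio matters. The function name is made up.

```python
from fractions import Fraction


def dp_gflops(sp_gflops, dp_ratio):
    """Double-precision throughput implied by a single-precision figure and a DP:SP ratio."""
    return float(sp_gflops * dp_ratio)


# Single-precision figure for the GTX 750 Ti, as quoted above.
SP_GFLOPS = 1306

# The three DP:SP ratios mentioned above. Only 1/32 applies to the 750 Ti itself;
# the other two are applied to the same SP figure as a hypothetical comparison.
for ratio in (Fraction(1, 32), Fraction(1, 24), Fraction(1, 3)):
    print(f"DP = {ratio} * SP -> {dp_gflops(SP_GFLOPS, ratio):.1f} GFLOPS")
```

At the same single-precision rating, a 1/3 card would have more than ten times the double-precision throughput of a 1/32 card.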

Mark Rose 2017-04-02 12:57

The GTX 750 Ti isn't a particularly powerful GPU. A 780 or 580 is about 4 times faster for LL. A 1080 is about 7 times faster.

The project is getting more than enough TF work to keep ahead of LL. I understand we're only a little ahead of the P-1 work. We're probably two decades away from the 146M to 152M range.

The GTX 750 Ti may not be the world's fastest GPU, but put it this way: you can get double your LL throughput by using it for LL, so why not.

storm5510 2017-04-02 15:39

[QUOTE=Mark Rose;456032]The GTX 750 Ti isn't a particularly powerful GPU. A 780 or 580 is about 4 times faster for LL. A 1080 is about 7 times faster.

The project is getting more than enough TF work to keep ahead of LL. I understand we're only a little ahead of the P-1 work. We're probably two decades away from the 146M to 152M range.

The GTX 750 Ti may not be the world's fastest GPU, but put it this way: you can get double your LL throughput by using it for LL, so why not.[/QUOTE]

The 580 and 780 are way outside my cost range. :sad:

The CPU in this machine is a quad-core i5-3570 @ 3.4 GHz. Under load, it goes up to 3.7 GHz. I've done pairs of P-1 tests with Prime95 in about 20 hours using two worker windows. The other two cores assist. This system can take an i7, but so far I haven't seen any reason to change it.

Now, I'll ask a question that I've been wondering about: What is the connection between TF and P-1, ECM, and so on?

Mark Rose 2017-04-02 19:01

[QUOTE=storm5510;456039]The 580 and 780 are way outside my cost range. :sad:[/quote]

Shouldn't be. I've bought a used GTX 580 as cheap as $50 here in Canada. That being said, it consumes far more power than a GTX 750 Ti. Gaming performance between the two will be about the same.

[quote]
The CPU in this machine is a quad-core i5-3570 @ 3.4 GHz. Under load, it goes up to 3.7 GHz. I've done pairs of P-1 tests with Prime95 in about 20 hours using two worker windows. The other two cores assist. This system can take an i7, but so far I haven't seen any reason to change it.[/quote]

There's no point getting an i7 for Prime95.

[quote]Now, I'll ask a question that I've been wondering about: What is the connection between TF and P-1, ECM, and so on?[/QUOTE]

GP2 wrote an [url=http://www.mersenneforum.org/showpost.php?p=453273&postcount=3]excellent post[/url] answering that.

Gordon 2017-04-02 23:46

[QUOTE=Mark Rose;456047]Shouldn't be. I've bought a used GTX 580 as cheap as $50 here in Canada. That being said, it consumes far more power than a GTX 750 Ti. Gaming performance between the two will be about the same.
[/QUOTE]

Remember that a 580 at full power sounds like a jet engine taking off, and then there's the power consumption...

