mersenneforum.org

fivemack 2017-01-07 12:05

Second-hand CPU vs brand new GPU
 
A bargainhardware.com dual E5-2650v1 machine costs £300, uses 220 watts, and does 7.2ms per iteration on four 44M exponents in parallel.

A GTX1080 costs £665, uses about 150 watts, and does 2.4ms per iteration on a single 45M exponent.

So the machine designed five years ago for no-holds-barred double-precision wins quite handily even without the benefits of AVX2, and I should probably stop running CUDALucas.

Running the equivalent comparison for ECM now.
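
For anyone who wants to redo the arithmetic behind that comparison, here is a minimal sketch. It uses only the figures quoted above and assumes that one LL test of exponent p takes roughly p iterations and that the quoted wattage is drawn for the whole run. The `Setup` class and its method names are made up for illustration, and the GPU price ignores the host machine it needs.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Setup:
    name: str
    price_gbp: float
    watts: float
    ms_per_iter: float   # time per iteration on each running test
    parallel_tests: int  # tests running side by side
    exponent: int        # approximate exponent p (assumed round figure)

    def effective_ms_per_iter(self) -> float:
        # Per-iteration time divided across the tests running in parallel.
        return self.ms_per_iter / self.parallel_tests

    def days_per_test(self) -> float:
        # An LL test of 2^p - 1 takes p - 2 iterations; p is close enough here.
        return self.exponent * self.ms_per_iter / 1000 / SECONDS_PER_DAY

    def tests_per_day(self) -> float:
        return self.parallel_tests / self.days_per_test()

    def kwh_per_test(self) -> float:
        hours = self.days_per_test() * 24
        return self.watts * hours / 1000 / self.parallel_tests


cpu = Setup("dual E5-2650v1", 300, 220, 7.2, 4, 44_000_000)
gpu = Setup("GTX 1080", 665, 150, 2.4, 1, 45_000_000)

for s in (cpu, gpu):
    print(f"{s.name}: {s.effective_ms_per_iter():.2f} ms/iter effective, "
          f"{s.tests_per_day():.2f} tests/day, {s.kwh_per_test():.2f} kWh/test")
```

On these assumptions the CPU box works out to an effective 1.8 ms per iteration against 2.4 ms for the GPU, so it is ahead on throughput and well ahead on throughput per pound, while the energy per test comes out fairly close for the two.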

Gordon 2017-01-07 12:28

[QUOTE=fivemack;450624]A bargainhardware.com dual E5-2650v1 machine costs £300, uses 220 watts, and does 7.2ms per iteration on four 44M exponents in parallel

A GTX1080 costs £665, uses about 150 watts, and does 2.4ms per iteration on a single 45M exponent.

So the machine designed five years ago for no-holds-barred double-precision wins quite handily even without the benefits of AVX2, and I should probably stop running cudalucas.

Running equivalent comparison for ECM now[/QUOTE]

Isn't the real question, why on earth would you run LL testing on a GPU? The 1080 is a beast at running TF.

fivemack 2017-01-07 12:38

[QUOTE=Gordon;450625]Isn't the real question, why on earth would you run LL testing on a GPU? The 1080 is a beast at running TF.[/QUOTE]

I get the impression there is more than enough GPU TF effort already in place. Mostly I got the GPU for factorisation, running ECM and polynomial selection, which it does pretty well.

ATH 2017-01-07 15:10

The GTX 1080 is one of those newer cards with DP = 1/32 of SP performance (257 GFLOPS vs 8228 GFLOPS), so it is not the best for LL tests. There is really a need for a new consumer card with better DP performance.
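
As a quick check of those two numbers, the DP figure is just the SP figure scaled by the ratio. A minimal sketch follows; the function name `dp_gflops` is hypothetical and the only inputs are the figures quoted above.

```python
def dp_gflops(sp_gflops: float, dp_to_sp_ratio: float) -> float:
    """Double-precision throughput implied by an SP figure and a DP:SP ratio."""
    return sp_gflops * dp_to_sp_ratio


# GTX 1080 figures from the post: 8228 SP GFLOPS at a 1/32 DP:SP ratio.
gtx1080_dp = dp_gflops(8228, 1 / 32)
print(f"GTX 1080 DP: {gtx1080_dp:.1f} GFLOPS")
```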

Mark Rose 2017-01-07 18:09

Apparently this year's Vega from AMD will have 1/16th DP.

That dual processor machine looks like a good deal for LL.

Batalov 2017-01-07 18:47

[QUOTE=fivemack;450624].., I should probably stop running cudalucas.
[/QUOTE]
Amen to that!

In contrast to cudalucas (which makes calls to some vanilla FFT and hits all the artificial NVIDIA-imposed bottlenecks), one should really run some DWT or NTT algorithm (like geneFer or Cyclo) to make GPUs really shine.

Prime95 2017-01-07 19:11

[QUOTE=Batalov;450645]In contrast to cudalucas (which makes calls to some vanilla FFT and hits all the artificial NVIDIA-imposed bottlenecks), one should really run some DWT or NTT algorithm.[/QUOTE]

CUDALucas does use a DWT algorithm.

I briefly explored an all-integer solution (not NTT though). My conclusion was that I was unlikely to significantly beat the current CUDALucas timings. IIRC, it would be roughly +/- 20%. I think this was on a 6xx GPU.
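
For readers wondering what one "iteration" is in the timings in this thread: a Lucas-Lehmer test of 2^p - 1 starts from s = 4 and repeats s = s^2 - 2 (mod 2^p - 1) for p - 2 steps, and the number is prime exactly when the final s is 0. The sketch below uses plain Python big integers for the squaring. That is an assumption made for readability, not how CUDALucas or Prime95 do it; they perform the squaring with a floating-point FFT and a weighted transform (the DWT mentioned above).

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime; p must itself be prime."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the loop below needs odd p
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        # One LL iteration: a modular squaring followed by subtracting 2.
        s = (s * s - 2) % m
    return s == 0


prime_exponents = [p for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31) if lucas_lehmer(p)]
print(prime_exponents)
```

Each pass through the loop is one of the iterations being timed, and for a 44M or 45M exponent the squaring is of a number with tens of millions of bits, which is why the FFT speed dominates.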

mackerel 2017-01-07 19:35

[QUOTE=fivemack;450624]A bargainhardware.com dual E5-2650v1 machine costs £300, uses 220 watts, and does 7.2ms per iteration on four 44M exponents in parallel[/QUOTE]

They're taking offers at £250 nominal on eBay for the 32GB model. They seem to use 4GB 2R modules so you shouldn't run into RAM bandwidth problems. When I got the 64GB model I offered a bit lower than their asking price and they took it.

It would also be fairer to compare against a used previous-generation card. Taking a 980Ti for example, it has approx 2/3 the rated boost SP FLOPS with a target cost under half that of a 1080, although TDP is higher at 250W.

If you really want DP, what about the R9 280X? It was possibly the last fast consumer card before they started to cripple DP. A quick look on eBay shows them under £150, and that gets you in the ballpark of 1 DP TFLOPS. Still 250W TDP though. If anyone can give me idiot-proof instructions on how to bench it, I can do it on mine. I've lowered the voltage with a BIOS mod, so in practice it only takes around 200W now.

[QUOTE=ATH;450635]The GTX 1080 is one of those newer cards with DP = 1/32 of SP performance (257 GFLOPS vs 8228 GFLOPS), so it is not the best for LL tests. There is really a need for a new consumer card with better DP performance.[/QUOTE]

Unfortunately probably not going to happen, unless you can find a compelling consumer DP requirement. If anything, the trend seems to be going the other way, with ever more FLOPS at lower precision.

Batalov 2017-01-07 19:38

Re: DWT
 
Right! (I must have been thinking about llrCUDA, which simply calls FFTW.)

NTT gave GeNeFer a new life (b ranges were extended, and now that it is implemented in OCL it is free of NVIDIA shackles).

Mark Rose 2017-01-07 22:57

[QUOTE=mackerel;450650]Unfortunately probably not going to happen, unless you can find a compelling consumer DP requirement. If anything, the trend seems to be going the other way, with ever more FLOPS at lower precision.[/QUOTE]

Rumours are Vega 20 will have 1/2 DP in 2018.

pepi37 2017-01-07 23:06

Can llrCUDA be rewritten as llrocl?

