Second-hand CPU vs brand new GPU
A bargainhardware.com dual E5-2650v1 machine costs £300, uses 220 watts, and does 7.2 ms per iteration on four 44M exponents in parallel.
A GTX 1080 costs £665, uses about 150 watts, and does 2.4 ms per iteration on a single 45M exponent. So the machine designed five years ago for no-holds-barred double precision wins quite handily even without the benefits of AVX2, and I should probably stop running CUDALucas. Running the equivalent comparison for ECM now.
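Running the arithmetic on the figures above: the dual-CPU box retires four iterations every 7.2 ms (one per concurrent test) versus one every 2.4 ms on the GPU, so throughput per pound and per watt can be compared directly. A quick sketch using only the numbers quoted in this post (the exponents differ slightly, 44M vs 45M, so this is approximate):

```python
# Throughput comparison from the figures quoted above.
# CPU box: four 44M exponents in parallel, 7.2 ms per iteration.
# GPU:     one 45M exponent, 2.4 ms per iteration.

def throughput(tests_in_parallel, ms_per_iter):
    """Total iterations per second across all concurrent tests."""
    return tests_in_parallel / (ms_per_iter / 1000.0)

cpu = throughput(4, 7.2)   # ~555.6 iter/s
gpu = throughput(1, 2.4)   # ~416.7 iter/s

print(f"CPU: {cpu:.1f} iter/s, {cpu/300:.2f} per £, {cpu/220:.2f} per W")
print(f"GPU: {gpu:.1f} iter/s, {gpu/665:.2f} per £, {gpu/150:.2f} per W")
```

On these numbers the CPU box is roughly 3x ahead per pound of purchase price, while per watt the two come out close (about 2.5 vs 2.8 iter/s/W).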
[QUOTE=fivemack;450624]A bargainhardware.com dual E5-2650v1 machine costs £300, uses 220 watts, and does 7.2ms per iteration on four 44M exponents in parallel
A GTX1080 costs £665, uses about 150 watts, and does 2.4ms per iteration on a single 45M exponent. So the machine designed five years ago for no-holds-barred double-precision wins quite handily even without the benefits of AVX2, and I should probably stop running cudalucas. Running equivalent comparison for ECM now[/QUOTE] Isn't the real question, why on earth would you run LL testing on a GPU? The 1080 is a beast at running TF.
[QUOTE=Gordon;450625]Isn't the real question, why on earth would you run LL testing on a GPU? The 1080 is a beast at running TF.[/QUOTE]
I get the impression there is more than enough GPU TF effort already in place. Mostly I got the GPU for factorisation, running ECM and polynomial selection, which it does pretty well.
GTX 1080 is one of those newer cards with DP = 1/32nd of SP performance, 257 GFLOPS vs 8228 GFLOPS, so it is not the best for LL tests. There is really a need for a new consumer card with better DP performance.
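The two figures quoted are self-consistent, which is a quick way to check the 1/32 ratio:

```python
# Sanity check of the 1/32 DP:SP ratio quoted for the GTX 1080.
sp_gflops = 8228
dp_gflops = sp_gflops / 32
print(dp_gflops)  # 257.125, matching the quoted ~257 GFLOPS DP
```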
Apparently this year's Vega from AMD will have 1/16th DP.
That dual-processor machine looks like a good deal for LL.
[QUOTE=fivemack;450624]... I should probably stop running cudalucas.
[/QUOTE] Amen to that! In contrast to cudalucas (which makes calls to some vanilla FFT and hits all the artificially NVIDIA-imposed bottlenecks), one should really run a DWT or NTT algorithm (like geneFer or Cyclo) to make GPUs really shine.
[QUOTE=Batalov;450645]In contrast to cudalucas (which makes calls to some vanilla FFT and hits all the artificially NVIDIA-imposed bottlenecks), one should really want to run some DWT or NTT algorithm.[/QUOTE]
CUDALucas does use a DWT algorithm. I briefly explored an all-integer solution (not NTT, though). My conclusion was that I was unlikely to beat the current CUDALucas timings significantly; IIRC it would be roughly ±20%. I think this was on a 6xx GPU.
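For reference, the Lucas-Lehmer recurrence that these programs implement is simple to state; all the engineering effort goes into the squaring step, which CUDALucas performs with an FFT-based DWT on multi-million-digit numbers. A naive sketch for small exponents, using Python's built-in big integers instead of any FFT machinery:

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test: 2**p - 1 is prime iff s_(p-2) == 0,
    where s_0 = 4 and s_(k+1) = s_k**2 - 2 (mod 2**p - 1).

    Naive version for small odd primes p; real LL software replaces
    the squaring with FFT/DWT multiplication.
    """
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

print(lucas_lehmer(7))    # True:  2**7 - 1 = 127 is prime
print(lucas_lehmer(11))   # False: 2**11 - 1 = 2047 = 23 * 89
```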
[QUOTE=fivemack;450624]A bargainhardware.com dual E5-2650v1 machine costs £300, uses 220 watts, and does 7.2ms per iteration on four 44M exponents in parallel[/QUOTE]
They're taking offers at £250 nominal on ebay for the 32GB model. They seem to use 4GB 2R modules so you shouldn't run into ram bandwidth problems. When I got the 64GB model I offered a bit lower than their asking and they took it. It would also be more fair to compare against previous generation used. Taking a 980Ti for example, it is approx 2/3 the rated boost SP FLOPS with a target cost under half a 1080, although TDP is higher at 250W. If you really want DP, what about the R9 280X? It was possibly the last fast consumer card before they started to cripple DP. A quick look on ebay shows them under £150, and that gets you ball park of 1 DP TFLOP. Still 250W TDP though. If anyone can give me idiot proof instructions on how to bench it, I can do it on mine. I've bios mod lowered voltage so in practice it only takes around 200W now. [QUOTE=ATH;450635]GTX 1080 is one of those newer card with DP = 1/32th of SP performance, 257 GFLOPS vs 8228 GFLOPS, so it is not best for LL tests. There is really a need for a new consumer card with better DP performance.[/QUOTE] Unfortunately probably not going to happen, unless you can find a compelling consumer DP requirement. If anything, the trend seems to be going the other way, with ever more FLOPS at lower precision. |
Re: DWT
Right! (I must have been thinking of llrCUDA, which simply calls FFTW.)
NTT gave GeNeFer a new life (b ranges were extended, and now that it is implemented in OpenCL it is free of NVIDIA shackles).
[QUOTE=mackerel;450650]Unfortunately probably not going to happen, unless you can find a compelling consumer DP requirement. If anything, the trend seems to be going the other way, with ever more FLOPS at lower precision.[/QUOTE]
Rumours are Vega 20 will have 1/2 DP in 2018.
Can llrCUDA be rewritten as llrocl?