[QUOTE=msft;223307]llrpsrc.zip include small treasure, Complex FFT version IBDWT.[/QUOTE]
But CUFFT(CUDA3.1) non power of 2 performance is miserable.:no: |
The core i7 machine I have draws about 300W and runs 21Gflops peak (four cores * 2.66GHz * two flops per cycle), so say 70MFLOP/J
|
[QUOTE=fivemack;223345]The core i7 machine I have draws about 300W and runs 21Gflops peak (four cores * 2.66GHz * [B]two flops[/B] per cycle), so say 70MFLOP/J[/QUOTE]
Four flops / cycle. |
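For reference, the efficiency estimate above can be redone with both per-cycle figures. This is only a rough sketch: the function names are made up for illustration, and the 300 W draw and 2.66 GHz clock are simply the numbers quoted in the post, treated as constant under load.

```python
def peak_gflops(cores, ghz, flops_per_cycle):
    """Theoretical peak in GFLOP/s: cores * clock (GHz) * flops per cycle."""
    return cores * ghz * flops_per_cycle


def mflop_per_joule(gflops, watts):
    """1 W is 1 J/s, so GFLOP/s divided by W is GFLOP/J; scale to MFLOP/J."""
    return gflops * 1000.0 / watts


# Figures quoted in the thread: four cores, 2.66 GHz, about 300 W at the wall.
two_flops = peak_gflops(4, 2.66, 2)    # original estimate
four_flops = peak_gflops(4, 2.66, 4)   # with the correction to four flops/cycle

print(f"2 flops/cycle: {two_flops:.2f} GFLOP/s, {mflop_per_joule(two_flops, 300):.0f} MFLOP/J")
print(f"4 flops/cycle: {four_flops:.2f} GFLOP/s, {mflop_per_joule(four_flops, 300):.0f} MFLOP/J")
```

With four flops per cycle the peak doubles, so the efficiency estimate doubles as well, from roughly 70 to roughly 140 MFLOP/J.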
[quote=msft;223319]But CUFFT(CUDA3.1) non power of 2 performance is miserable.:no:[/quote]
Yeah, LLR includes code for doing N-1/N+1 tests on bases other than 2 (since LLR tests only work on k*2^n-1), as well as Proth tests for k*2^n+1. Even if you can't get the non-power-of-2 stuff very fast, just having base 2 available at full speed would be awesome! :smile: |
[QUOTE=mdettweiler;223368]Yeah, LLR includes code for doing N-1/N+1 tests on bases other than 2 (since LLR tests only work on k*2^n-1), as well as Proth tests for k*2^n+1. Even if you can't get the non-power-of-2 stuff very fast, just having base 2 available at full speed would be awesome! :smile:[/QUOTE]
Not for those who've spent thousands of dollars on their own CPU farm. I feel sorry for the person who bought a lot of quad cores for prime hunting, only to have those primes wiped off the top 5000 list by a few GPUs :sad: And to add insult to injury, that person may be forced to unreserve ranges too, since his/her progress will be considered "too slow".

Edit: mdettweiler, I know you mentioned limiting GPUs to certain parts of your project, like megabit primes and primes too small for the top 5000 list. While that's a reasonable idea, it's unlikely that PrimeGrid would show the same restraint. |
I find oddball's attitude here pretty much incomprehensible.
If someone had the resources to buy a cluster, they did so to play with the big boys, and probably have the resources to stick GF460 cards in it and continue playing with the big boys. If someone reserved a range which takes a time significant in comparison with the rate of change of technology, that was an error. |
[QUOTE=axn;223346]Four flops / cycle.[/QUOTE]
Indeed, 4 flops/cycle. However, using that formula still does not yield an answer that matches the reported throughput of the i7 980 XE:

flops/cycle * freq * cores = 4 * 3.33 * 6 = 80 DP GFLOP/s

But the reported performance is 107.55, not 80. Wikipedia gives 107.55 with no source, and a lot of other places give the same figure but appear to have copied it straight from the Wikipedia page.

If the multiplier were increased from 25 to 27 (as it can be with turbo mode, but only when one core is active), and instead of 4 flops/cycle it were somehow 5, then you could get close to that number:

5 * 3.6 * 6 = 108 GFLOP/s

So either it is an error that was copied and reported as fact everywhere, or something weird is going on. |
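The same check can be run the other way round: given the reported 107.55 GFLOP/s, what flops/cycle would each clock imply? A quick sketch, assuming nothing beyond the numbers in the post above (the function names are hypothetical, and 3.6 GHz is the 27x turbo clock mentioned there):

```python
def peak_gflops(flops_per_cycle, ghz, cores):
    """Theoretical peak in GFLOP/s: flops/cycle * clock (GHz) * cores."""
    return flops_per_cycle * ghz * cores


def implied_flops_per_cycle(gflops, ghz, cores):
    """Invert the peak formula to see what per-cycle rate a figure implies."""
    return gflops / (ghz * cores)


REPORTED = 107.55  # unsourced figure discussed in the thread

base_peak = peak_gflops(4, 3.33, 6)     # stock clock, 4 flops/cycle
turbo_guess = peak_gflops(5, 3.6, 6)    # 27x multiplier, hypothetical 5 flops/cycle

implied_at_base = implied_flops_per_cycle(REPORTED, 3.33, 6)
implied_at_turbo = implied_flops_per_cycle(REPORTED, 3.6, 6)

print(f"base peak:   {base_peak:.2f} GFLOP/s")
print(f"turbo guess: {turbo_guess:.2f} GFLOP/s")
print(f"implied flops/cycle at 3.33 GHz: {implied_at_base:.2f}")
print(f"implied flops/cycle at 3.60 GHz: {implied_at_turbo:.2f}")
```

Neither clock gives a whole number of flops per cycle for 107.55 (about 5.38 at 3.33 GHz and about 4.98 at 3.6 GHz), which fits the suspicion that the figure does not come from this formula at all.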
What I find funny is that Intel reports much lower numbers: [URL]http://www.intel.com/support/processors/sb/cs-023143.htm[/URL]
Oh wait that's the export compliance metrics page :razz: |
[QUOTE=fivemack;223373]If someone had the resources to buy a cluster, they did so to play with the big boys, and probably have the resources to stick GF460 cards in it and continue playing with the big boys.[/QUOTE]
First of all, there may be compatibility issues. Second of all, even if they do have the resources to "stick GF460 cards in it", those GF460 cards are still a significant cost.

The situation is like if someone decided to introduce a new law that required all cars to have a sunroof. If you have the resources to buy a car, you probably have the resources to pay to have that sunroof installed and continue driving, but I bet you'd be pretty upset about the situation. |
[QUOTE=axn;223346]Four flops / cycle.[/QUOTE]
Just out of curiosity, is that true for all the recent AMD processors too? By recent, I mean the K10 series. |
[quote=Oddball;223379]Second of all, even if they do have the resources to "stick GF460 cards in it", those GF460 cards are still a significant cost. The situation is like if someone decided to introduce a new law that required all cars to have a sunroof. If you have the resources to buy a car, you probably have the resources to pay to have that sunroof installed and continue driving, but I bet you'd be pretty upset about the situation.[/quote]
I don't understand... There's no law, just some much faster CPU/GPU that will make your computer/cluster useless. Unless you were thinking of Moore's law? :smile: I see many Ferraris and Porsches where I live. I can still go to work, but I'd never race against such cars. I guess for number crunching it's the same: if your hardware is obsolete, use it for some less demanding tasks, and leave all the glory to the big boys with deep pockets. |