[QUOTE=Uncwilly;198570]Time for a new prime!!![/QUOTE]
Yeah! We can run the verification run on a GPU. :) |
[quote=ET_;198551]And by chance a multi-threaded trial-factoring?[/quote]
George has previously pointed out that performance improvements in TF have relatively little effect on GIMPS throughput.
If we [I]doubled[/I] TF speed, the optimum bit level for an exponent range would be raised by only 1. For example, if the current TF limit were 73, a doubling of TF speed would mean taking it to 74 instead, which has only a 1/73 extra chance of finding a factor in return for the doubling of TF speed. Not much throughput leverage there. |
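A rough sketch of the arithmetic above, under a toy cost model that is an assumption and not taken from GIMPS or Prime95: going from bit level b to b+1 costs twice as much as the previous level, that step finds a factor with probability about 1/b, and `SAVED` is a made-up constant chosen only so that the baseline optimum lands on 73. The function name `optimal_tf_limit` is hypothetical.

```python
def optimal_tf_limit(saved, speed=1.0, start=60):
    """Return the bit level at which further trial factoring stops paying off.

    Assumptions (hypothetical, for illustration only):
      - going from bit level b to b+1 costs 2**b / speed work units
      - that step finds a factor with probability about 1/b
      - a found factor saves `saved` work units of primality testing
    """
    b = start
    # Keep factoring deeper while the expected saving exceeds the cost.
    while 2.0 ** b / speed < saved / b:
        b += 1
    return b


# Made-up constant picked so the baseline optimum is 73, as in the example.
SAVED = 100 * 2.0 ** 72

base = optimal_tf_limit(SAVED, speed=1.0)
doubled = optimal_tf_limit(SAVED, speed=2.0)
print(base, doubled)
```

Because the cost doubles with every bit level while the 1/b payoff barely moves, doubling `speed` shifts the break-even point by a single level in this model.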
[quote=Prime95;198567]Nine weeks. South America.[/quote]
Iguassu Falls, by any chance?
(Coincidentally, David Letterman's guest just said she's going to Buenos Aires soon.) |
[QUOTE=willmore;198598]Because of CPU execution improvements, the balance between the CPU and memory being the bottleneck will be pushed more towards the memory? Just guessing here.[/QUOTE]
In two-pass FFTs, you can reduce memory requirements with the cost of some extra complex multiplies. The current FFTs were optimized for a P4 where a cache line took ~150 clocks to read in. My Core i7 takes ~30 clocks to read a cache line. It makes sense to re-evaluate some of the FFT design choices in light of these new circumstances. |
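One generic way a memory-for-multiplies trade can look, sketched here as an illustration and not as a description of what Prime95 actually does: instead of storing one twiddle factor per index, store two small tables and rebuild each twiddle with one extra complex multiply. The names `full_table`, `split_tables`, and `twiddle`, and the sizes `N` and `R`, are hypothetical.

```python
import cmath


def full_table(n):
    """One stored twiddle factor per index: n complex values."""
    return [cmath.exp(-2j * cmath.pi * m / n) for m in range(n)]


def split_tables(n, r):
    """Two small tables whose products reproduce the full table.

    Assumes n is divisible by r. Index m is split as m = (m // r) * r + (m % r).
    """
    lo = [cmath.exp(-2j * cmath.pi * m / n) for m in range(r)]
    hi = [cmath.exp(-2j * cmath.pi * (m * r) / n) for m in range(n // r)]
    return lo, hi


def twiddle(m, lo, hi, r):
    """Rebuild twiddle m from the small tables with one extra complex multiply."""
    return hi[m // r] * lo[m % r]


N, R = 1024, 32
full = full_table(N)
lo, hi = split_tables(N, R)

stored = len(lo) + len(hi)  # values kept in memory with the split tables
max_err = max(abs(twiddle(m, lo, hi, R) - full[m]) for m in range(N))
print(stored, len(full), max_err)
```

Whether the smaller tables or the saved multiplies win depends on how expensive a cache-line read is relative to a multiply, which is the balance the post says has shifted.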
[QUOTE=cheesehead;198601]Iguassu Falls, by any chance?
(Coincidentally, David Letterman's guest just said she's going to Buenos Aires soon.)[/QUOTE]
Both of those are on the itinerary :) |
What about an X3450?
Googled this relatively fresh review --
[URL]http://ixbtlabs.com/articles3/cpu/intel-xeon-x3450-p1.html[/URL]
Did anyone even consider an X3450 (possibly with 16GB+ of RDIMM/UDIMM)? (Well, I know that [I]I[/I] didn't until tonight. I am mainly thinking about this in the context of an efficient algebra box with more than 12GB of memory... Had to look up workstation boards like ASUS P7F-E or X. The desktop boards specifically appear to disclaim ECC even though [I]now[/I] it seems to cost them nothing to allow for it - the CPU does all the work anyway!) |
[QUOTE=Prime95;198637]In two-pass FFTs, you can reduce memory requirements with the cost of some extra complex multiplies. The current FFTs were optimized for a P4 where a cache line took ~150 clocks to read in. My Core i7 takes ~30 clocks to read a cache line. It makes sense to re-evaluate some of the FFT design choices in light of these new circumstances.[/QUOTE]
I agree that every new CPU that changes the balance in costs of any operation, be it +, *, etc. or memory access, should be accompanied by a reconsideration of the code structure. It's just that the small gains to be had may not be enough to justify the work of rewriting the code. I take it the changes between the P4 (NetBurst) and i7 (Nehalem) are finally enough to make it worthwhile? Yay. :)
I, along with the others, eagerly await what this refactoring will bring. But, first, the vacation.
Ahh, okay. You're trading fewer CPU operations for more memory operations. Makes sense to me.
Happy hunting! Oh, and have fun on the vacation. We'll keep the CPUs warm while you're away. :) |
[QUOTE=petrw1;197956]Modest OC...I just used EasyTune and for the first attempt went to 3.0 Ghz. CoreTemp went to between 61 and 65.[/QUOTE]
Went to the next (top) level of OC using EasyTune. Default is 2.67; steps are 2.8, 3.0, 3.2.
A few interesting observations:
- Going from the default 2.67 to 3.0 changed the iteration times (1260FFT) from 0.024 to 0.021. Going from 3.0 to 3.2 changed the iteration times very little: some cores went to 0.020 and others stayed at 0.021 with the odd 0.020. I calculated (and hoped) it would have dropped to just under 0.020 (I guess it depends on whether it started near 0.02400 or closer to 0.02449).
- The CoreTemp has not changed(!); actually, so far it is 1 degree cooler.
- The P4 equivalency factor changed back to 100%. It will now take a few weeks for it to adjust properly; in the meantime, estimated completion times are about triple what they should be. |
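The "calculated (and hoped)" figure can be sketched as follows, assuming iteration time scales inversely with clock speed (an idealization that ignores memory bottlenecks). The helper name `ideal_iteration_time` is hypothetical; the clock speeds and times are the ones quoted in the post.

```python
def ideal_iteration_time(base_time, base_ghz, new_ghz):
    """Iteration time expected if it scaled perfectly with clock speed."""
    return base_time * base_ghz / new_ghz


# A displayed 0.024 could hide anything from about 0.02400 up to 0.02449.
for start in (0.02400, 0.02449):
    at_30 = ideal_iteration_time(start, 2.67, 3.0)
    at_32 = ideal_iteration_time(start, 2.67, 3.2)
    print(f"start {start:.5f}: 3.0 GHz -> {at_30:.5f}, 3.2 GHz -> {at_32:.5f}")

ideal_32 = ideal_iteration_time(0.02400, 2.67, 3.2)
```

Under this assumption the 3.2 GHz step lands right around 0.020, so a display rounded to three decimals can easily hide most of the gain.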
The i5-750 DOES scale well...
[CODE]
DC Benchmark (FFT 1280)

PC              Benchmark   All Cores   1 P1-S1   1 P1-S2
                (1 core)    DC
E6550 (Duo)     30          33          33        34
Q9550 (Quad)    22          28          29        30
i5-750 (Quad)   20          21          20        21

Note: i5 is OC'd to 3.2
[/CODE]
To clarify, the 4 numerical columns represent:
1. The time from the Benchmark page, which represents the theoretical best time running DC on 1 core only.
2. My observed time when all (2 or 4) cores are running DC.
3. My observed DC times when 1 core is running P1-Stage 1 while the rest (1 or 3) are doing DC.
4. My observed DC times when 1 core is running P1-Stage 2 while the rest (1 or 3) are doing DC. |
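To put a number on "scales well", here is a small sketch that computes the slowdown from the single-core benchmark figure to the all-cores DC figure for each machine in the table above. The dictionary layout and the helper `slowdown_pct` are hypothetical; the values are copied from the first two numerical columns.

```python
# (single-core benchmark time, observed time with all cores running DC)
benchmarks = {
    "E6550 (Duo)": (30, 33),
    "Q9550 (Quad)": (22, 28),
    "i5-750 (Quad)": (20, 21),
}


def slowdown_pct(single, loaded):
    """Percent increase in iteration time when all cores are busy."""
    return 100.0 * (loaded - single) / single


slowdowns = {pc: slowdown_pct(*times) for pc, times in benchmarks.items()}
for pc, pct in slowdowns.items():
    print(f"{pc:15s} {pct:5.1f}% slower with all cores running DC")
```

By this measure the i5-750 loses the least when every core is loaded, and the Q9550 loses the most.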
Okay, so the i5 and the i7 run at full speed regardless of memory BW pressure. Good to know. Not like I wouldn't be aware of the impact that memory BW pressure has on Prime95 with my C2Q 6600 @ 3.2GHz. My good 1066-5-5-5-15 memory died and was temporarily replaced with 800-5-6-6-15 memory. Yes, the difference in Prime95 speed was *very* visible.
Short summary: 'dual bank' memory on 'Core' chipsets really isn't. It is, at best, two banks interleaved. If you want true dual banks, you need an AMD chip or an i5/i7. |
Note that i5 seems more flexible on overclocking than i7.
Luigi |