#1
Oct 2008
40₁₆ Posts
Hello, this is my first post, and I must say I feel overwhelmed by all the mathematical terminology that goes around in this place.
Anyhoo, I'm a regular folder, so I am in the 'know' in regards to distributed computing and PC hardware. I know that for folding, the general consensus is that the bigger the L2 cache, the quicker the CPU can process data. Does this apply to GIMPS? Also (don't quote me on this), I heard that AMDs are better for GIMPS as they have the integrated memory controller. Is this true? Is it better to have a lower-clocked, say, Athlon X3 than a higher-clocked Intel E5200 proc? (These CPUs are the ones I'm thinking of getting.) What about Phenoms, are they any good for GIMPS? And how much RAM should you dedicate to the application? (The default is 8MB.)
Finally, I've downloaded the latest client, and it shows Worker #1 and Worker #2 on the main screen. I'm assuming these are the 2 instances of the work being divided on my dual core? (My CPU is running at 100% in Task Manager.)
Sorry for the noob questions, I've tried the FAQs and whatnot, but there's still a lot I don't get.
Cheers
hj

#2
Aug 2002
Termonfeckin, IE
3·919 Posts
No, AMD is NOT better than Intel for Prime95. The Phenoms are better, but I still don't think there is parity at equal clock speed.
A bigger CPU cache helps, but only to a degree. Prime95 is coded in assembly and has been optimized for certain cache sizes. I believe the benefits are marginal beyond 1 or 2MB per core. The determining factor for Prime95 is memory bandwidth.
Unless you are doing P-1 factoring, which most people don't, 128MB should be more than enough. If you do trial factoring, 8MB is enough.
Yes, the two workers are for two cores.

#3
Oct 2008
40₁₆ Posts
Hi, thanks for your reply.
So for clarification, would an AMD 9550 be better or worse off than an overclocked E5200?
Cheers

#4
"Richard B. Woods"
Aug 2002
Wisconsin USA
7692₁₀ Posts
Quote:
That is, on most current CPUs, prime95's main compute loops will execute about as fast as the memory controller can feed the caches. So, if you can determine that speed, that's the best single measure of potential prime95 speed.
Furthermore, if on a multi-core system you have prime95 running on more than one core, that limitation is true for each of the instances. If each core has its own dedicated memory controller, that will usually be faster than a system with only one memory controller shared by all cores, because usually a single memory controller cannot feed multiple caches all at their top speeds simultaneously.
Note that I wrote "caches"! If all cores share a single cache, there may be contention between cores for cache space. A dual-core system with a single shared 1MB L2 cache may be slower than an otherwise-identical system that has a 512KB L2 cache dedicated to each core.
If you read some of the benchmark threads, you'll see examples where 2 simultaneous L-L (Lucas-Lehmer) tests will each have a slightly slower iteration speed than a single L-L running alone, and that's usually because of memory contention. Three or four simultaneous L-Ls on separate cores will each have significantly slower iterations than when fewer instances are simultaneous. This nonlinearity may be caused by a shared memory controller, a shared cache (usually L2), or both.
One recommended way to get around this limitation is to assign trial factoring (TF) to one or two cores, and do L-L testing on the other(s), because TF uses less memory than any other function.
If you want to do four (or eight, or whatever) simultaneous L-Ls, you can go ahead and do so; it just won't give you four (or eight, or whatever) times the total throughput of a single L-L running alone, so from a GIMPS perspective (though not from a $150,000 prize-winning perspective) sharing a bit of TF with L-Ls is better.
Quote:
"Daytime available memory" and "Nighttime available memory" are only for you to specify how much extra memory prime95 can use for special work areas during stage 2 of P-1 factoring or stage 2 of ECM factoring. (Yes, it ought to say so more prominently!) The default values of 8MB there are enough for prime95 to perform those stages, but if you specify higher amounts, it can search faster/farther for factors during P-1 stage 2 and ECM stage 2.
A few past threads where this stuff has been discussed:
http://mersenneforum.org/showthread.php?t=2157
http://mersenneforum.org/showthread.php?t=3828
http://mersenneforum.org/showthread.php?t=10198
Yes, most people will not be performing explicit P-1 assignments.
Quote:
Last fiddled with by cheesehead on 2008-10-27 at 08:36 Reason: Revised responses to garo.

#5
Jan 2008
France
3·181 Posts
Quote:
I know what you meant by the above statement; I just wanted to make it clearer in case some beginner reads too quickly.

#6
"Richard B. Woods"
Aug 2002
Wisconsin USA
2²×3×641 Posts
Quote:
I was thinking of memory bandwidth in terms of how fast contents of RAM could be transferred to (and from) L2 (and L1) cache. I was assuming that data transfer between L1 cache and any more-inner parts of a CPU would always be at least as fast as RAM->L2->L1. (So, does that put me in the "many people" category, or not?)
Can you explain more about memory bandwidth, and tighten up any previous statements I made that need it?
Will you please interpret for us the meaning of the "Sandra XII SP1 Memory Bandwidth" chart at http://www.legitreviews.com/article/597/4 and explain what's shown there that is, or is not, relevant to prime95 performance?
Last fiddled with by cheesehead on 2008-10-27 at 12:06

#7
Jan 2008
France
3×181 Posts
Quote:
The reason the Phenom is so much faster is the integrated memory controller. The soon-to-be-released Core i7 will also have one, and will surely fly (IIRC, it can reach 16 GB/s using 3 banks of DDR3).
The obvious conclusion is that bandwidth to main memory is not enough on its own to characterize the speed of GIMPS, given how Phenom and C2 compare :)
There are many other factors that come into play from the memory subsystem point of view (where "memory subsystem" means all the components that sit between the RAM and the computation units). For example:
- efficiency of preload instructions (how many can be in flight? do they block other parts of the processor?)
- efficiency of the TLB (how many entries in the TLB? how many levels of TLB?)
- cache access latency and bandwidth.
This list can get very long. As usual when comparing two things, a single criterion is far from enough.
As far as prime95 is concerned, I don't know the source code well enough to tell you how all these factors play a role; what I can say is that:
- the TLB plays a very important role; a single entry usually maps 4 KB of data, and you will very quickly run out of TLB entries when working with huge data sets, so having more entries at level 0 is very important (I don't know the numbers for Phenom and C2)
- prefetching is also extremely important, and it looks like the C2 is much more efficient here
- the C2 has much higher bandwidth to the L1 and L2 caches, IIRC, and for prime95 that's very important.
To sum up: memory subsystems are hugely complex beasts that can be more difficult to design than a CPU, and also more difficult to use efficiently in a program.
I'm afraid this was very technical...
Last fiddled with by ldesnogu on 2008-10-27 at 13:59
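
To put a rough number on the TLB point above, here is a back-of-the-envelope sketch. It only uses the 4 KB page size from the post; the TLB entry counts are hypothetical (the post says the real Phenom and C2 numbers are not known), and the FFT size is an assumption, taking the "2560K" FFT mentioned later in the thread to mean 2560 × 1024 double-precision values.

```python
# Back-of-the-envelope TLB reach estimate.
# ASSUMPTIONS: the TLB entry counts and the FFT size below are illustrative
# only; they are not measured values for any Phenom or Core 2 part.

PAGE_SIZE_BYTES = 4 * 1024   # one TLB entry maps a 4 KB page (from the post)
BYTES_PER_DOUBLE = 8         # FFT data assumed to be double precision


def pages_needed(fft_length: int) -> int:
    """Number of 4 KB pages covered by an FFT of fft_length doubles."""
    data_bytes = fft_length * BYTES_PER_DOUBLE
    return -(-data_bytes // PAGE_SIZE_BYTES)  # ceiling division


def tlb_coverage(fft_length: int, tlb_entries: int) -> float:
    """Fraction of the FFT data the TLB can map at once (capped at 1.0)."""
    return min(1.0, tlb_entries / pages_needed(fft_length))


if __name__ == "__main__":
    fft_length = 2560 * 1024          # assumed meaning of a "2560K" FFT
    for entries in (64, 512):         # hypothetical TLB sizes
        pages = pages_needed(fft_length)
        share = tlb_coverage(fft_length, entries)
        print(f"{entries} entries, {pages} pages, {share:.1%} mapped at once")
```

Under these assumptions the data set spans thousands of pages while the TLB maps only a small fraction of them at a time, which is the "you will very quickly run out of TLB entries" effect described above.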

#8
Sep 2006
Brussels, Belgium
11001110011₂ Posts
I cannot compare AMD's products to those of Intel, but I know from experience that memory chip speed is a first-order limiting factor. On one and the same processor (P4 or Core 2 Duo/Quad), Prime95 performance is directly proportional to memory speed. It is possible that memory controller speed kicks in as a limiting factor with very fast DDR3, but I have no experience with that.
Jacob |
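
For anyone who wants a rough feel for how fast memory can be streamed on their own box, here is a minimal sketch that times large buffer copies. It is only a crude proxy: it includes Python and allocation overhead, it is not how prime95 accesses memory, and the function name and default sizes are made up for illustration.

```python
import time


def copy_bandwidth_gb_s(size_mb: int = 256, repeats: int = 5) -> float:
    """Best observed copy rate in GB/s for a buffer of size_mb megabytes.

    Crude estimate only: the buffer should be much larger than the CPU
    caches, otherwise the copy is served from cache rather than RAM.
    """
    # Fill with non-zero bytes so every page of the source is really written.
    src = bytearray(b"\x01") * (size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full read of src plus one full write of dst
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    best = max(best, 1e-9)  # guard against a zero reading from the timer
    # Count both the bytes read and the bytes written.
    return 2 * len(src) / best / 1e9


if __name__ == "__main__":
    print(f"approx. copy bandwidth: {copy_bandwidth_gb_s():.1f} GB/s")
```

Running it before and after a memory speed change gives a quick sanity check on whether the change actually shows up, though the prime95 benchmark itself remains the better measure.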

#9
Oct 2008
2⁶ Posts
Yeah, this is pretty technical. So basically Intel and AMD CPUs each come out ahead in some tests?
So is it possible to give some answer to my original question (whether an E5200 is better for prime95 than a Phenom 9550)? Sorry for the [annoying] questions, I just want to get this cleared up :)
Cheers people ;)

#10
Aug 2002
North San Diego County
2×11×31 Posts
The Phenom 9550 will do more work per unit of time, but it will use more power. My 9500 is equal to about 2 of my E7200 boxes (although the 7200s are oddly slow - probably due to the cheapo G31 boards they are on).

#11
Oct 2008
Germany, Hamburg
5×13 Posts
Hi, please take a look at
http://v5www.mersenne.org/report_benchmarks/
There you can compare many CPU types and their performance directly (no E5200 yet, but it may appear later...).
But keep in mind that the times listed are for one active core only, so they don't show the penalty when all cores are active and using the memory.
I own a Q9450 @ 3200 and got very lucky with my memory sticks: they are running at 1200 MHz (original speed is 800 MHz). So all 4 of my cores can run with nearly no penalty (2560K FFT) at 55 ms/iteration (47 ms with one core active, which is faster than the fastest listed Phenom :-)
The Phenom runs at 2.4 GHz and the Core 2 at 2.5 GHz, so no big difference there. But when you think of overclocking, the Core 2 is much better. Some guys have managed a stable 4 GHz (I wouldn't try that with Prime). It would also produce less heat and use less power than the Phenom.
With a dual core (Phenom or Intel), memory bandwidth shouldn't be that much of a problem.
Last fiddled with by Phantomas on 2008-10-28 at 20:20
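
The two timings quoted in this post (47 ms/iteration with one core active, 55 ms/iteration with all four cores busy on a 2560K FFT) are enough for a quick scaling calculation. The sketch below is just that arithmetic; the function names are made up for illustration.

```python
def aggregate_throughput(ms_per_iter: float, workers: int = 1) -> float:
    """Total iterations per second across all workers."""
    return workers * 1000.0 / ms_per_iter


def scaling_efficiency(single_ms: float, loaded_ms: float) -> float:
    """Per-core speed under full load relative to a single core running alone."""
    return single_ms / loaded_ms


if __name__ == "__main__":
    single_ms, loaded_ms, cores = 47.0, 55.0, 4  # figures from the post
    one = aggregate_throughput(single_ms, 1)
    four = aggregate_throughput(loaded_ms, cores)
    print(f"1 worker : {one:.1f} iter/s")
    print(f"{cores} workers: {four:.1f} iter/s")
    print(f"speedup  : {four / one:.2f}x of an ideal {cores}x")
    print(f"per-core efficiency: {scaling_efficiency(single_ms, loaded_ms):.0%}")
```

With these numbers, four workers deliver about 3.4 times the throughput of one worker rather than 4 times, which is the kind of memory-contention penalty the single-core benchmark table cannot show.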