mersenneforum.org  

2008-11-26, 11:12   #12
ldesnogu
 
 
Jan 2008
France

1000100110₂ Posts

Quote:
Originally Posted by S485122
I do not agree: on P4 D and on Core2 Quad the performance of Prime95 was proportional to the memory speed (measured from 533 MHz DDR2 to 1066 MHz DDR2).

At equal core frequency and different memory speeds? Could you please provide detailed benchmarks? :)
2008-11-26, 16:05   #13
S485122
 
 
Sep 2006
Brussels, Belgium

13·131 Posts

I and others posted details in the Hardware subforum. I do not have the time to find them now (I did a quick search: the threads "Quad Core" and "Quad Core and P95" should contain the necessary data). All parameters except memory were constant (motherboard, FSB speed, CPU).

Jacob
2008-12-02, 03:42   #14
db597
 
 
Jan 2003

CB₁₆ Posts

Quote:
Originally Posted by Phantomas
Yes, that's right. And if I interpret my results correctly, then each LL will run on one dual core, so (my impression) one LL can use the 6 MB L2 cache alone, and it doesn't need to access ordinary RAM so often.

But it seems to be important to run one test on cores [1,2] and the other on cores [0,3]. Otherwise my iteration time went up 20%.

Interesting... must be the combined cache kicking in. What settings are needed to ensure we run on cores [1,2] and cores [0,3]? Is it achievable only on 25.8 with the affinityscramble setting?
2008-12-02, 05:37   #15
petrw1
1976 Toyota Corona years forever!
 
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2²×3×17×23 Posts

Quote:
Originally Posted by db597
Interesting... must be the combined cache kicking in. What settings are needed to ensure we run on core [1,2] and core [0,3]? Is it achievable only on 25.8 with the affinityscramble setting?

Yes, so says George, after my attempts to do the same with 25.7 gave mixed results. See post ...

http://www.mersenneforum.org/showpos...0&postcount=29

... and the next 3

Last fiddled with by petrw1 on 2008-12-02 at 05:38
2008-12-02, 17:29   #16
Phantomas
 
 
Oct 2008
Germany, Hamburg

5·13 Posts

Quote:
Originally Posted by db597
Interesting... must be the combined cache kicking in. What settings are needed to ensure we run on core [1,2] and core [0,3]? Is it achievable only on 25.8 with the affinityscramble setting?

Yes, only 25.8 gives you full control over which core to use. But, at least on my system, I noticed that the core binding varies with the FSB and/or CPU speed. I can't explain why, but it is reproducible.
See http://mersenneforum.org/showpost.ph...2&postcount=24
and http://mersenneforum.org/showpost.ph...2&postcount=26

Last fiddled with by Phantomas on 2008-12-02 at 17:31
2008-12-03, 07:39   #17
db597
 
 
Jan 2003

7×29 Posts

Thanks guys. I'll download 25.8 and give it a try tonight. Even running 24/7, it takes me over a month to complete one LL (first-time test), so it's a welcome thought to be able to get 2 workers on the same exponent without sacrificing any speed (even a tiny speedup would be a fantastic bonus!).
2008-12-12, 13:34   #18
stars10250
 
 
Jul 2008
San Francisco, CA

11001001₂ Posts

While examining the quad core performance of my system I noticed something interesting. When running 4 LL tests I get the equivalent of about 3.2 cores-worth of performance if I pick as a reference the speed of a single core operating on a single exponent. This is a well known issue and agrees with the observations of others (aka memory bottleneck). This made me initially think it is only minimally worth the effort of running the 4th core for LL, as getting 0.2x performance out of it isn't all that good. However, when I run 3 cores on LL I don't get 3 cores-worth of performance. I get 2.6. Only when I go down to 2 cores do I get twice the single core performance. So running the fourth core on LL has more than a 0.2 effect, as it takes me from 2.6 to 3.2. I believe others have noticed this too, as I've seen some recommend running 2 LL and 2 TF (instead of 3 LL and 1 TF). My quad is overclocked to 3.2GHz, with 1066DDR2 memory running at 533MHz, and yet I still see this behavior. Nonetheless, I'm happy with its performance as it far exceeds the stock performance and is exactly double that of my dual-core E8500 (3.16GHz) which I always thought was fast and not suffering from a memory bottleneck.
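The scaling argument in this post can be made explicit with a short calculation. This is only a sketch using the core-equivalent figures quoted above (1.0, 2.0, 2.6, 3.2); the names `core_equiv` and `marginal_gains` are made up for illustration and are not part of Prime95.

```python
# Core-equivalent throughput reported in the post, keyed by number of LL workers.
core_equiv = {1: 1.0, 2: 2.0, 3: 2.6, 4: 3.2}


def marginal_gains(scaling):
    """Return the extra core-equivalents contributed by each additional worker."""
    gains = {}
    prev = 0.0
    for n in sorted(scaling):
        # Round to hide floating-point noise such as 0.6000000000000001.
        gains[n] = round(scaling[n] - prev, 2)
        prev = scaling[n]
    return gains


gains = marginal_gains(core_equiv)
for n, g in gains.items():
    print(f"worker {n}: +{g:.1f} core-equivalents")
```

Seen this way, the third and fourth workers each add about 0.6 of a core, so the loss sets in with the third worker rather than falling entirely on the fourth.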

Last fiddled with by stars10250 on 2008-12-12 at 13:35
2008-12-14, 13:41   #19
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

2×3³×109 Posts

Quote:
Originally Posted by stars10250
While examining the quad core performance of my system I noticed something interesting. When running 4 LL tests I get the equivalent of about 3.2 cores-worth of performance if I pick as a reference the speed of a single core operating on a single exponent. This is a well known issue and agrees with the observations of others (aka memory bottleneck). This made me initially think it is only minimally worth the effort of running the 4th core for LL, as getting 0.2x performance out of it isn't all that good. However, when I run 3 cores on LL I don't get 3 cores-worth of performance. I get 2.6. Only when I go down to 2 cores do I get twice the single core performance. So running the fourth core on LL has more than a 0.2 effect, as it takes me from 2.6 to 3.2. I believe others have noticed this too, as I've seen some recommend running 2 LL and 2 TF (instead of 3 LL and 1 TF). My quad is overclocked to 3.2GHz, with 1066DDR2 memory running at 533MHz, and yet I still see this behavior. Nonetheless, I'm happy with its performance as it far exceeds the stock performance and is exactly double that of my dual-core E8500 (3.16GHz) which I always thought was fast and not suffering from a memory bottleneck.

I bet that if you remove your overclock but keep the memory at the same speed, it will scale better.
2008-12-14, 15:25   #20
stars10250
 
 
Jul 2008
San Francisco, CA

3×67 Posts

Quote:
Originally Posted by henryzz
I bet that if you remove your overclock but keep the memory at the same speed, it will scale better.

I tried this and did get better scaling but overall lower performance. Here are the numbers:

3.2 GHz Q6600 (8x), 400 MHz FSB, 533 MHz DRAM
  4 cores (0,1,2,3): 3.2 core-equivalent performance (total iterations in 1 hr: 239016)
  3 cores (1,2,3):   2.6 core-equivalent performance
  2 cores (1,3):     2.0 core-equivalent performance
  1 core  (3):       1.0 core-equivalent performance (48 ms iteration time, M47.8)

2.8 GHz Q6600 (7x), 400 MHz FSB, 533 MHz DRAM
  4 cores (0,1,2,3): 3.4 core-equivalent performance (total iterations in 1 hr: 220699)
  3 cores (1,2,3):   2.7 core-equivalent performance
  2 cores (1,3):     2.0 core-equivalent performance
  1 core  (3):       1.0 core-equivalent performance (55 ms iteration time, M47.8)

2.4 GHz Q6600 (6x), 400 MHz FSB, 533 MHz DRAM
  4 cores (0,1,2,3): 3.5 core-equivalent performance (total iterations in 1 hr: 200000)
  3 cores (1,2,3):   2.8 core-equivalent performance
  2 cores (1,3):     2.0 core-equivalent performance
  1 core  (3):       1.0 core-equivalent performance (63 ms iteration time, M47.8)

Overall, the maximum number of iterations performed in a given time is achieved by running all 4 cores at the highest CPU (and memory) speed.
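As a rough cross-check of the figures above, the hourly totals can be estimated from the single-core iteration time and the 4-core scaling factor. This sketch assumes that "core-equivalent" means a multiple of single-core throughput at the same clock; the dictionary layout and the function name `expected_iterations_per_hour` are illustrative only.

```python
# Figures copied from the post: CPU clock (GHz) -> single-core iteration time (ms),
# 4-core scaling in core-equivalents, and measured total iterations in one hour.
runs = {
    3.2: {"iter_ms": 48, "core_equiv": 3.2, "measured": 239016},
    2.8: {"iter_ms": 55, "core_equiv": 3.4, "measured": 220699},
    2.4: {"iter_ms": 63, "core_equiv": 3.5, "measured": 200000},
}


def expected_iterations_per_hour(iter_ms, core_equiv):
    """Single-core iterations per hour, scaled by the core-equivalent factor."""
    single_core = 3600 * 1000 / iter_ms  # milliseconds in an hour / ms per iteration
    return single_core * core_equiv


for ghz, r in runs.items():
    est = expected_iterations_per_hour(r["iter_ms"], r["core_equiv"])
    diff_pct = 100 * (est - r["measured"]) / r["measured"]
    print(f"{ghz} GHz: estimated {est:.0f}, measured {r['measured']} ({diff_pct:+.1f}%)")

# The clock setting with the highest measured 4-core throughput.
best = max(runs, key=lambda ghz: runs[ghz]["measured"])
print(f"Highest measured throughput: {best} GHz")
```

Under that assumption the estimates land within about one percent of the measured totals, and the 3.2 GHz run has the highest absolute throughput even though it scales worst.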
2008-12-14, 19:21   #21
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

2×3³×109 Posts

Exactly as I expected.
Computer speed isn't as dependent on CPU speed as people used to think.
At some point I will do some benchmarks with different memory speeds to show the difference.
2008-12-15, 00:20   #22
jasong
 
 
"Jason Goatcher"
Mar 2005

3·7·167 Posts

This is not to cause a stink, but Prime95 is specifically made for Intel processors. I've heard opinions that modern AMD processors would kick butt if there were a publicly available LLR client for AMDs.

If one were made publicly available (the one I heard about is integer-based and probably still alpha), would it be something that a decent number of people would be interested in?

I guess I should be more direct: if an LLR client (Prime95 is an LLR client made specifically for Mersenne numbers) were made available for AMD computers, producing the same residues (when there's no error), would a good chunk of the community be interested in using that program?

Last fiddled with by jasong on 2008-12-15 at 00:21

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.