mersenneforum.org

Beta version 24.6 - Athlon users wanted (https://www.mersenneforum.org/showthread.php?t=3387)

stippix 2004-12-17 07:33

small bugs ...
 
Hi all!

Prime95 v24.6 still has some minor bugs (this applies to both the 7 Dec and 14 Dec 2004 versions):

1.) for example M7, M19, ... crash Prime95 completely
2.) for example M31, M61, M107, ... give shift counter errors/corruptions
3.) generally slower than 23.8 at 4096K

greetings stippix

Dresdenboy 2004-12-17 09:21

[QUOTE=sdbardwick]The 24.6 version increases my CPU temps by about 5 degrees Fahrenheit![/QUOTE]
That's OK. The faster code means the same work is done in a shorter time. More work done per unit of time means more power consumed by the CPU (less idling, which was caused by waiting for L2 data), and more power consumed means more heat produced.

stippix 2004-12-17 11:13

24.6 good for duallies
 
Hi all!

I discovered the following with 24.6:

M20996011: 107.912 ms/it for 1 thread, 118.936 ms/it for 2 threads in parallel
M34286443: 156.622 ms/it for 1 thread, 167.962 ms/it for 2 threads in parallel
M79299959: 395.164 ms/it for 1 thread, 404.041 ms/it for 2 threads in parallel

So a second thread running in parallel on a dual-CPU machine does not slow the first one down much ...

23.8:

M20996011: 136.390 ms/it for 1 thread, 195.300 ms/it for 2 threads in parallel
M34286443: 198.750 ms/it for 1 thread, 277.630 ms/it for 2 threads in parallel
M79299959: 573.540 ms/it for 1 thread, 754.360 ms/it for 2 threads in parallel

So in the old version the slowdown on a duallie is much, much worse ...
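
To put a number on "much worse", here is a small sketch that turns the timings above into a percentage slowdown per exponent. The function name and the dictionary layout are only my own illustration, not anything that comes with Prime95.

```python
# Timings in ms/iteration copied from the two lists above:
# (1 thread alone, 1 thread while a second runs in parallel)
timings = {
    "24.6": {
        "M20996011": (107.912, 118.936),
        "M34286443": (156.622, 167.962),
        "M79299959": (395.164, 404.041),
    },
    "23.8": {
        "M20996011": (136.390, 195.300),
        "M34286443": (198.750, 277.630),
        "M79299959": (573.540, 754.360),
    },
}


def slowdown_percent(single_ms, dual_ms):
    """Percent increase in iteration time when a second thread runs alongside."""
    return (dual_ms - single_ms) / single_ms * 100.0


for version, exponents in timings.items():
    for exponent, (single_ms, dual_ms) in exponents.items():
        pct = slowdown_percent(single_ms, dual_ms)
        print(f"{version} {exponent}: {pct:.1f}% slower with 2 threads")
```

With these numbers the slowdown is roughly 2% to 10% for 24.6 and roughly 32% to 43% for 23.8.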

greetings stippix

stippix 2004-12-17 15:09

Sorry, I forgot to post the system specs:

Dual Athlon MP 2200+, 1GB RAM


And also remember: Athlon XP/MP systems have only one memory channel ...

greetings stippix

Jeff Gilchrist 2004-12-17 17:47

Interesting observation. I know each instance of Prime95 on my dual Xeon box slows down quite a bit when I have two running compared to just one. I wonder if it will make a difference on a P4 as well. Time to check it out...

Jeff Gilchrist 2004-12-17 19:44

As I suspected, since the SSE2 code wasn't changed, there was no difference.

On a dual-processor P4 Xeon 2.8 box, with both 24.6 and the older code, an exponent around the size of M34323451 took [b]0.075[/b] sec/iter with two instances of Prime95 running and [b]0.066[/b] sec/iter with only one.

With the SSE2 code turned off, the time rose to [b]0.146[/b] sec/iter, so there is obviously no point in doing that.

sdbardwick 2004-12-17 20:42

[QUOTE=stippix]
I discovered the following with 24.6:

M20996011: 107.912 ms/it for 1 thread, 118.936 ms/it for 2 threads in parallel[/QUOTE]
stippix, by "2 threads in parallel" do you mean you have 2 instances of Prime95 running simultaneously?
If so, is processor affinity set correctly? Just asking, because I get very little increase in iteration time when running two processes compared to one. (24.6, 2x MP1900+, 256K L2)
M20996011: 119 ms/it for 1 instance, 123 ms/it on each of 2 instances

sdbardwick 2004-12-17 21:01

[Oops! Hit wrong button]

24.6 does bring large improvements to my box as well, esp. regarding 2 instances. Here are the 23.8 numbers:
M20996011: 138 ms/it for 1 instance, 166 ms/it on each of 2 instances
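
Another way to look at it is combined throughput: how much total work two instances get done relative to one. A rough sketch, using my 24.6 figures from the previous post (119 and 123 ms/it) and the 23.8 figures above; the function name is just for illustration.

```python
def scaling_factor(single_ms, dual_ms):
    """Combined throughput of two instances relative to a single instance.

    One instance completes 1/single_ms iterations per ms; two instances
    together complete 2/dual_ms. A result of 2.0 would be perfect scaling.
    """
    return 2.0 * single_ms / dual_ms


# (ms/it for 1 instance, ms/it on each of 2 instances) for M20996011
results = {"24.6": (119, 123), "23.8": (138, 166)}

for version, (single_ms, dual_ms) in results.items():
    print(f"{version}: {scaling_factor(single_ms, dual_ms):.2f}x one instance")
```

That comes out at about 1.93x for 24.6 against about 1.66x for 23.8, so the second CPU is used far better with the new code.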

E_tron 2004-12-18 07:46

I'm running M13474399 on my Barton chip with v24.6. It appears to be stable.

I've done some power testing with the new code and it seems to be using a little more electrical power. For example, with version 23.8 I can run my Barton at 2.2 GHz with a 1.65 V Vcore (prime stable), but with version 24.6 I can't. I have to raise Vcore to 1.675 V to be stable. The same scenario plays out on this chip at multiple clock settings (I tested 1.4-2.5 GHz). The gap got worse as the clock speed increased. I'll see if I get the same results from other chips as well.
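
For a rough idea of what the Vcore bump alone costs, here is a back-of-the-envelope sketch. It assumes the usual first-order model in which dynamic CPU power scales with voltage squared times frequency; that model is an assumption on my part, it ignores leakage, and it says nothing about the extra power drawn because the new code keeps the CPU busier. Nothing here was measured.

```python
def dynamic_power_ratio(v_old, v_new, f_old=1.0, f_new=1.0):
    """Ratio of new to old dynamic power under the assumed P ~ V^2 * f model."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Same clock, Vcore raised from 1.65 V to 1.675 V
increase = (dynamic_power_ratio(1.65, 1.675) - 1.0) * 100.0
print(f"Estimated extra dynamic power from the Vcore bump: {increase:.1f}%")
```

Under that assumption the voltage increase by itself accounts for roughly 3% more power.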

garo 2004-12-18 09:33

E_tron, that is to be expected as the new code is more efficient and hence does more work per unit time and therefore needs more power.

cheesehead 2004-12-18 21:37

[QUOTE=Dresdenboy]That's ok. The faster code means the same work is done in shorter time. More work done per time unit means more power consumed by the CPU (less idling, which was caused by waiting for L2 data), and more consumed power means more heat produced.[/QUOTE]
[quote=garo]E_tron, that is to be expected as the new code is more efficient and hence does more work per unit time and therefore needs more power.[/quote]
Whoa, guys! Let's not confuse (a) "work" in the sense of accomplishing a calculation, and (b) "work" in the thermodynamic sense.

It's true that there is an ultimate lower limit on the amount of thermodynamic work needed to perform a given calculation, but that's not what we're talking about here -- we're a long, long way from that limit.

Just because the faster code does the same [i]calculation work[/i] in a shorter time doesn't necessarily mean that it causes greater power consumption. An increase in calculation speed theoretically might have been achieved by using instructions which collectively cause the CPU to consume less power (performing less thermodynamic work) than before. Now, in our actual case it seems that the increase in speed came primarily from eliminating some of the time spent waiting for memory accesses, which does indeed increase CPU power consumption, but a speedup need not work that way. It's not justified to say that both kinds of work necessarily change in the same direction.

Also, "more efficient" needs to be considered in the context of what efficiency is being measured. Efficiency = [i]calculation work[/i] / elapsed time? or = [i]thermodynamic work[/i] / [i]calculation work[/i]? or ... what? If one lowered the electrical power required to perform an L-L test, would you really object to saying that that was [i]more[/i] efficient because it used [i]less[/i] power?
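
To make the two candidate definitions concrete, here is a small sketch that computes both for an old and a new version. The timings are stippix's single-thread M20996011 figures from earlier in the thread; the wattages are HYPOTHETICAL numbers picked purely for illustration, since nobody here has posted measured power draw.

```python
def iterations_per_second(ms_per_iter):
    """Efficiency as calculation work per unit of elapsed time."""
    return 1000.0 / ms_per_iter


def joules_per_iteration(watts, ms_per_iter):
    """Efficiency as energy spent per unit of calculation work."""
    return watts * ms_per_iter / 1000.0


# ms/it from stippix's post; watts are HYPOTHETICAL, not measured
versions = {
    "23.8": {"ms": 136.390, "watts": 60.0},
    "24.6": {"ms": 107.912, "watts": 65.0},
}

for name, v in versions.items():
    speed = iterations_per_second(v["ms"])
    energy = joules_per_iteration(v["watts"], v["ms"])
    print(f"{name}: {speed:.2f} it/s, {energy:.2f} J/it at {v['watts']:.0f} W")
```

With these made-up wattages the newer version draws more power yet spends less energy per iteration, so it counts as more efficient under either definition. Pick a large enough power increase and the two definitions disagree, which is exactly why the term needs pinning down.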

