mersenneforum.org  

Old 2003-06-10, 18:23   #12
AP
 
Sep 2002

2×7 Posts

Hi all,

Any news about the Pentium-M's Prime95 performance? Will there be a special version of the programme that runs faster on the Pentium-M than the current one does?

Regards
Achim
Old 2003-06-10, 18:38   #13
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

2057₁₆ Posts

Quote:
Any news about the Pentium-M's Prime95 performance? Will there be a special version of the programme that runs faster on the Pentium-M than the current one does?
Pentium-M's poor performance remains a mystery. Without access to one it is impossible to investigate any further.
Old 2003-07-18, 13:46   #14
patrik
 
 
"Patrik Johansson"
Aug 2002
Uppsala, Sweden

5²·17 Posts

I have also bought one now, an Inspiron 500m.
The CPU speed below is reported as about 600 MHz, but while Prime95 is running it rises to 1494 MHz; i.e. the processor was in a slower power state when Prime95 measured its speed, and stepped up to full speed once the iterations started.
---------------------
Intel(R) Pentium(R) M processor 1500MHz
CPU speed: 597.54 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 32 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 128
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 42.585 ms.
Best time for 448K FFT length: 51.091 ms.
Best time for 512K FFT length: 57.896 ms.
Best time for 640K FFT length: 71.241 ms.
Best time for 768K FFT length: 87.245 ms.
Best time for 896K FFT length: 104.281 ms.
Best time for 1024K FFT length: 117.193 ms.
Best time for 1280K FFT length: 149.718 ms.
Best time for 1536K FFT length: 183.071 ms.
Best time for 1792K FFT length: 217.146 ms.
Best time for 2048K FFT length: 244.745 ms.
Old 2003-08-15, 16:57   #15
Dresdenboy
 
 
Apr 2003
Berlin, Germany

192 Posts

Maybe here we see a reason...

The full thread can be found here:
http://www.aceshardware.com/forum?read=105030620

A snippet:
Code:
BLAS 
Banias' implementation of SSE2 seems really really crummy for some reason. Here's the results: 

Pentium-M 1300 mhz: 
DGEMM 
795.63459 SSE2 scalar 
963.83745 SSE2 packed 
853.25131 compiler 
1060.59689 x87 
SGEMM 
2669.249 SSE 
795.5295 compiler
The values appear to be MFLOPS for matrix multiplication (D = double precision, S = single precision) in ScienceMark 2.0, and I think "compiler" means plain C code.

Could someone test Prime95 on a Pentium M with SSE2 disabled?

DDB
Old 2003-08-15, 20:22   #16
patrik
 
 
"Patrik Johansson"
Aug 2002
Uppsala, Sweden

425₁₀ Posts

With SSE2:
Intel(R) Pentium(R) M processor 1500MHz
CPU speed: 597.56 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 32 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 128
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 43.602 ms.
Best time for 448K FFT length: 51.810 ms.
Best time for 512K FFT length: 58.217 ms.
Best time for 640K FFT length: 71.767 ms.
Best time for 768K FFT length: 88.046 ms.
Best time for 896K FFT length: 104.792 ms.
Best time for 1024K FFT length: 118.280 ms.
Best time for 1280K FFT length: 151.527 ms.
Best time for 1536K FFT length: 184.617 ms.
Best time for 1792K FFT length: 218.575 ms.
Best time for 2048K FFT length: 247.353 ms.

With CpuSupportsSSE2=0 in local.ini:
Intel(R) Pentium(R) M processor 1500MHz
CPU speed: 597.55 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 32 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 128
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 40.883 ms.
Best time for 448K FFT length: 51.315 ms.
Best time for 512K FFT length: 57.026 ms.
Best time for 640K FFT length: 74.742 ms.
Best time for 768K FFT length: 93.959 ms.
Best time for 896K FFT length: 110.191 ms.
Best time for 1024K FFT length: 128.112 ms.
Best time for 1280K FFT length: 168.165 ms.
Best time for 1536K FFT length: 199.251 ms.
Best time for 1792K FFT length: 245.195 ms.
Best time for 2048K FFT length: 267.028 ms.
Old 2003-08-16, 18:45   #17
Dresdenboy
 
 
Apr 2003
Berlin, Germany

192 Posts

Thank you.

It seems that Banias is held back only by its clock speed. For FFTs up to 512K the non-SSE2 code is even slightly faster. This resembles the Opteron, where SSE2 uses the same fast fmul/fadd units as x87 code, whereas on the P4 SSE2 uses faster units than normal x87 code.
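
The crossover can be checked directly from patrik's two Pentium M runs above. This is only a rough sketch in Python with the timings copied from his post; the names `sse2_ms`, `x87_ms` and `faster_path` are made up for illustration, and "x87" here simply means the run with CpuSupportsSSE2=0.

```python
# Best iteration times in ms from patrik's Pentium M runs (Prime95 23.5),
# keyed by FFT length in K. Values are copied from the post above.
sse2_ms = {384: 43.602, 448: 51.810, 512: 58.217, 640: 71.767,
           768: 88.046, 896: 104.792, 1024: 118.280, 1280: 151.527,
           1536: 184.617, 1792: 218.575, 2048: 247.353}

# Same machine with CpuSupportsSSE2=0 in local.ini (non-SSE2 code path).
x87_ms = {384: 40.883, 448: 51.315, 512: 57.026, 640: 74.742,
          768: 93.959, 896: 110.191, 1024: 128.112, 1280: 168.165,
          1536: 199.251, 1792: 245.195, 2048: 267.028}


def faster_path(fft_k):
    """Return 'x87' or 'SSE2', whichever had the lower time at this FFT length."""
    return "x87" if x87_ms[fft_k] < sse2_ms[fft_k] else "SSE2"


if __name__ == "__main__":
    for k in sorted(sse2_ms):
        ratio = x87_ms[k] / sse2_ms[k]  # a ratio above 1 means SSE2 is faster
        print(f"{k:5d}K  x87/SSE2 = {ratio:.3f}  faster: {faster_path(k)}")
```

With these figures the non-SSE2 path wins at 384K, 448K and 512K, and the SSE2 path wins from 640K upwards.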
Old 2003-08-16, 19:24   #18
trif
 
 
Aug 2002

2·101 Posts

Seems like more than just the clock. How do the timings compare to a Northwood 1.6 GHz with and without SSE2?
Old 2003-08-17, 02:20   #19
sdbardwick
 
 
Aug 2002
North San Diego Coun

821 Posts

P4 1600 Northwood with SSE2:
[code:1]
Intel(R) Pentium(R) 4 CPU 1.60GHz
CPU speed: 1614.45 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 23.685 ms.
Best time for 448K FFT length: 28.112 ms.
Best time for 512K FFT length: 32.148 ms.
Best time for 640K FFT length: 39.006 ms.
Best time for 768K FFT length: 47.399 ms.
Best time for 896K FFT length: 55.532 ms.
Best time for 1024K FFT length: 62.595 ms.
Best time for 1280K FFT length: 83.227 ms.
Best time for 1536K FFT length: 101.208 ms.
Best time for 1792K FFT length: 120.354 ms.
Best time for 2048K FFT length: 136.787 ms.
[/code:1]

With CpuSupportsSSE2=0 in local.ini:
[code:1]
Intel(R) Pentium(R) 4 CPU 1.60GHz
CPU speed: 1614.39 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 70.493 ms.
Best time for 448K FFT length: 84.616 ms.
Best time for 512K FFT length: 94.171 ms.
Best time for 640K FFT length: 121.662 ms.
Best time for 768K FFT length: 152.774 ms.
Best time for 896K FFT length: 180.725 ms.
Best time for 1024K FFT length: 206.301 ms.
Best time for 1280K FFT length: 269.297 ms.
Best time for 1536K FFT length: 322.897 ms.
Best time for 1792K FFT length: 383.031 ms.
Best time for 2048K FFT length: 438.059 ms.
[/code:1]
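
To put a number on how much the P4 depends on SSE2, the two runs above can be divided by each other. A minimal sketch in Python, assuming nothing beyond the timings listed in this post; `p4_sse2_ms`, `p4_x87_ms` and `sse2_speedup` are illustrative names, not anything from Prime95.

```python
# Best iteration times in ms on the P4 1.6 GHz Northwood (Prime95 23.5),
# keyed by FFT length in K. Values are copied from the two runs above.
p4_sse2_ms = {384: 23.685, 448: 28.112, 512: 32.148, 640: 39.006,
              768: 47.399, 896: 55.532, 1024: 62.595, 1280: 83.227,
              1536: 101.208, 1792: 120.354, 2048: 136.787}

# Same machine with CpuSupportsSSE2=0 in local.ini.
p4_x87_ms = {384: 70.493, 448: 84.616, 512: 94.171, 640: 121.662,
             768: 152.774, 896: 180.725, 1024: 206.301, 1280: 269.297,
             1536: 322.897, 1792: 383.031, 2048: 438.059}


def sse2_speedup(fft_k):
    """How many times faster the SSE2 run is than the non-SSE2 run."""
    return p4_x87_ms[fft_k] / p4_sse2_ms[fft_k]


if __name__ == "__main__":
    for k in sorted(p4_sse2_ms):
        print(f"{k:5d}K  SSE2 speedup = {sse2_speedup(k):.2f}x")
```

On these numbers the SSE2 code is roughly three times as fast at every FFT length, in contrast to the Pentium M, where the two code paths are close.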
Old 2003-08-17, 18:45   #20
Dresdenboy
 
 
Apr 2003
Berlin, Germany

192 Posts

And here is a comparable Athlon system (found in the v23.5 benchmarks thread):
Quote:
Originally Posted by lycorn
AMD Athlon(TM) XP1800+
CPU speed: 1544.73 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 256
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 43.030 ms.
Best time for 448K FFT length: 49.304 ms.
Best time for 512K FFT length: 53.280 ms.
Best time for 640K FFT length: 70.483 ms.
Best time for 768K FFT length: 84.218 ms.
Best time for 896K FFT length: 100.923 ms.
Best time for 1024K FFT length: 113.367 ms.
Best time for 1280K FFT length: 150.287 ms.
Best time for 1536K FFT length: 179.212 ms.
Best time for 1792K FFT length: 223.835 ms.
Best time for 2048K FFT length: 257.452 ms.
Here it looks as if Banias is Intel's Athlon (an idea that came up on another board) ;) Clock for clock they are equally fast on x87 code, with a small gain for Banias when running SSE2 code (thanks to easier register addressing and code density).
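
A clock-for-clock comparison can be sketched by converting each time into clock cycles per iteration (time in ms multiplied by clock in MHz gives thousands of cycles). The Python below is only an illustration: it assumes the Pentium M was actually running at the 1494 MHz patrik observed under load rather than the 597 MHz in the benchmark header, and `mcycles`, `PM_MHZ` and `ATHLON_MHZ` are made-up names.

```python
PM_MHZ = 1494.0       # assumption: patrik's observed speed under load
ATHLON_MHZ = 1544.73  # from lycorn's benchmark header

# Best times in ms at three FFT lengths (in K), copied from the posts above.
pm_x87_ms = {512: 57.026, 1024: 128.112, 2048: 267.028}
pm_sse2_ms = {512: 58.217, 1024: 118.280, 2048: 247.353}
athlon_ms = {512: 53.280, 1024: 113.367, 2048: 257.452}


def mcycles(time_ms, clock_mhz):
    """Millions of clock cycles per iteration: ms * MHz = kilocycles, / 1000."""
    return time_ms * clock_mhz / 1000.0


if __name__ == "__main__":
    for k in sorted(athlon_ms):
        print(f"{k:5d}K  "
              f"Banias x87 {mcycles(pm_x87_ms[k], PM_MHZ):7.1f}  "
              f"Banias SSE2 {mcycles(pm_sse2_ms[k], PM_MHZ):7.1f}  "
              f"Athlon {mcycles(athlon_ms[k], ATHLON_MHZ):7.1f}  (Mcycles)")
```

Under the 1494 MHz assumption, the x87 cycle counts of the two chips come out within about 10% of each other at these three sizes.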
Old 2003-08-18, 03:07   #21
QuintLeo
 
 
Oct 2002
Lost in the hills of Iowa

26·7 Posts

More like the Banias uses the P-III core with some enhancements, such as SSE2 support and a larger cache.

The P-III was VERY close to the Athlon in IPC in most cases: a small edge one way on some workloads, a small edge the other way on others, typically less than a 5% speed difference overall.

Intel just had MAJOR problems getting the P-III architecture to ramp up in clock speed (the 1.13 GHz P-III was recalled because it had too many problems), so they decided to change the architecture completely for the P-IV.
Old 2003-08-18, 06:47   #22
Dresdenboy
 
 
Apr 2003
Berlin, Germany

192 Posts

Quote:
Originally Posted by QuintLeo
More like the Banias uses the P-III core with some enhancements, like SSE2 support and a larger cache.
I know. That was described in detail in several articles before Banias was released to the public. It is more accurate to say that it is based on the P-III core. They also modified the decoders (I think this was called "µOP Fusion") and reworked the design so that every unit can be dynamically switched on and off to save power. So besides the faster FSB and larger cache, these improved decoder structures account for the higher SPEC scores, clock for clock, compared with the P-III. It seems they reached the same IPC level as the Athlon Palomino and newer cores.

DDB

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
