mersenneforum.org  

Old 2005-05-31, 20:19   #56
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

23·1,021 Posts
Default

Quote:
Originally Posted by db597
When compiling some subroutines, it says "vectorised". That could be the culprit instead.
Vectorised means the compiler found a way to do two floating point ops with one instruction. This is the key (on a P4 using SSE2) to getting double the throughput. If the compiler cannot vectorise the code, then you are stuck with the same throughput as plain old x87 FPU code.
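
As a rough illustration of the point above, here is a toy model in Python. It does not use SSE2 (Python does not expose it); it only counts how many "instructions" a scalar loop and a two-lane packed loop would issue for the same work. The function names and the `lanes` parameter are made up for this sketch.

```python
def scalar_add(a, b):
    """Add element by element: one modelled instruction per floating point op."""
    out, instructions = [], 0
    for x, y in zip(a, b):
        out.append(x + y)
        instructions += 1
    return out, instructions


def packed_add(a, b, lanes=2):
    """Add `lanes` elements per modelled instruction (SSE2 holds 2 doubles)."""
    out, instructions = [], 0
    for i in range(0, len(a), lanes):
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        instructions += 1
    return out, instructions


a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 0.5, 0.5, 0.5]
scalar_out, scalar_count = scalar_add(a, b)
packed_out, packed_count = packed_add(a, b)
print(scalar_count, packed_count)
```

Both loops produce the same sums; the packed one just needs half as many modelled instructions, which is the doubling in throughput described above. Real speedups depend on memory access and on whether the compiler can prove the loop is safe to vectorise.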
Old 2005-05-31, 22:16   #57
Joël Harismendy
 
Aug 2004
France (Aquitaine)

168 Posts
Default Here is the test I ran on my machine this evening.

[Tue May 31 23:31:32 2005]
Compare your results to other computers at http://www.mersenne.org/bench.htm
That web page also contains instructions on how your results can be included.

Intel(R) Pentium(R) 4 CPU 3.20GHz
CPU speed: 3207.39 MHz
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2
L1 cache size: 16 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 32-bit version 24.12, RdtscTiming=1
Time FFTlen=4K, Levels2=0, clm=0: 0.081 ms.
Time FFTlen=5K, Levels2=0, clm=0: 0.102 ms.
Time FFTlen=6K, Levels2=0, clm=0: 0.127 ms.
Time FFTlen=7K, Levels2=0, clm=0: 0.154 ms.
Time FFTlen=8K, Levels2=0, clm=0: 0.166 ms.
Time FFTlen=10K, Levels2=8, clm=4: 0.219 ms.
Time FFTlen=12K, Levels2=8, clm=4: 0.260 ms.
Time FFTlen=14K, Levels2=8, clm=4: 0.323 ms.
Time FFTlen=16K, Levels2=8, clm=4: 0.345 ms.
Time FFTlen=20K, Levels2=8, clm=4: 0.448 ms.
Time FFTlen=24K, Levels2=8, clm=4: 0.547 ms.
Time FFTlen=28K, Levels2=8, clm=4: 0.679 ms.
Time FFTlen=32K, Levels2=8, clm=4: 0.727 ms.
Time FFTlen=40K, Levels2=11, clm=4: 1.007 ms.
Time FFTlen=48K, Levels2=11, clm=4: 1.219 ms.
Time FFTlen=56K, Levels2=11, clm=4: 1.474 ms.
Time FFTlen=64K, Levels2=11, clm=4: 1.608 ms.
Time FFTlen=80K, Levels2=11, clm=4: 2.141 ms.
Time FFTlen=96K, Levels2=11, clm=4: 2.605 ms.
Time FFTlen=112K, Levels2=11, clm=4: 3.198 ms.
Time FFTlen=128K, Levels2=11, clm=4: 3.490 ms.
Time FFTlen=160K, Levels2=11, clm=4: 4.544 ms.
Time FFTlen=192K, Levels2=11, clm=4: 5.527 ms.
Time FFTlen=224K, Levels2=11, clm=4: 6.771 ms.
Time FFTlen=256K, Levels2=11, clm=4: 7.464 ms.
Time FFTlen=320K, Levels2=11, clm=4: 9.773 ms.
Time FFTlen=320K, Levels2=11, clm=4: 10.134 ms.
Time FFTlen=384K, Levels2=11, clm=4: 11.969 ms.
Time FFTlen=384K, Levels2=11, clm=4: 12.634 ms.
Time FFTlen=448K, Levels2=11, clm=4: 14.472 ms.
Time FFTlen=448K, Levels2=11, clm=4: 15.527 ms.
Time FFTlen=512K, Levels2=11, clm=4: 16.406 ms.
Time FFTlen=512K, Levels2=11, clm=2: 16.558 ms.
Time FFTlen=512K, Levels2=11, clm=1: 16.686 ms.
Time FFTlen=512K, Levels2=11, clm=4: 17.383 ms.
Time FFTlen=640K, Levels2=12, clm=4: 20.407 ms.
Time FFTlen=640K, Levels2=11, clm=8: 20.769 ms.
Time FFTlen=640K, Levels2=11, clm=4: 20.439 ms.
Time FFTlen=640K, Levels2=11, clm=2: 20.810 ms.
Time FFTlen=640K, Levels2=11, clm=1: 21.836 ms.
Time FFTlen=640K, Levels2=8, clm=1: 22.914 ms.
Time FFTlen=768K, Levels2=12, clm=4: 24.800 ms.
Time FFTlen=768K, Levels2=11, clm=4: 24.996 ms.
Time FFTlen=768K, Levels2=11, clm=2: 25.509 ms.
Time FFTlen=768K, Levels2=11, clm=1: 26.472 ms.
Time FFTlen=768K, Levels2=11, clm=1: 30.693 ms.
Time FFTlen=768K, Levels2=8, clm=1: 27.923 ms.
Time FFTlen=896K, Levels2=12, clm=4: 29.964 ms.
Time FFTlen=896K, Levels2=11, clm=4: 29.962 ms.
Time FFTlen=896K, Levels2=11, clm=2: 30.835 ms.
Time FFTlen=896K, Levels2=11, clm=1: 32.130 ms.
Time FFTlen=896K, Levels2=11, clm=1: 36.863 ms.
Time FFTlen=896K, Levels2=8, clm=1: 33.855 ms.
Time FFTlen=1024K, Levels2=12, clm=4: 33.632 ms.
Time FFTlen=1024K, Levels2=11, clm=4: 33.826 ms.
Time FFTlen=1024K, Levels2=11, clm=2: 34.622 ms.
Time FFTlen=1024K, Levels2=11, clm=1: 35.441 ms.
Time FFTlen=1024K, Levels2=11, clm=1: 40.703 ms.
Time FFTlen=1024K, Levels2=8, clm=1: 38.164 ms.
Time FFTlen=1280K, Levels2=13, clm=4: 44.265 ms.
Time FFTlen=1280K, Levels2=12, clm=4: 42.350 ms.
Time FFTlen=1280K, Levels2=11, clm=4: 44.512 ms.
Time FFTlen=1280K, Levels2=11, clm=2: 45.337 ms.
Time FFTlen=1280K, Levels2=11, clm=1: 47.114 ms.
Time FFTlen=1280K, Levels2=11, clm=1: 53.827 ms.
Time FFTlen=1536K, Levels2=13, clm=4: 53.583 ms.
Time FFTlen=1536K, Levels2=12, clm=4: 51.520 ms.
Time FFTlen=1536K, Levels2=11, clm=4: 54.132 ms.
Time FFTlen=1536K, Levels2=11, clm=2: 54.928 ms.
Time FFTlen=1536K, Levels2=11, clm=1: 56.787 ms.
Time FFTlen=1536K, Levels2=11, clm=1: 64.775 ms.
Time FFTlen=1792K, Levels2=13, clm=4: 65.109 ms.
Time FFTlen=1792K, Levels2=12, clm=4: 62.115 ms.
Time FFTlen=1792K, Levels2=11, clm=4: 65.089 ms.
Time FFTlen=1792K, Levels2=11, clm=2: 66.399 ms.
Time FFTlen=1792K, Levels2=11, clm=1: 68.588 ms.
Time FFTlen=1792K, Levels2=11, clm=1: 77.281 ms.
Time FFTlen=2048K, Levels2=13, clm=4: 72.979 ms.
Time FFTlen=2048K, Levels2=12, clm=4: 69.123 ms.
Time FFTlen=2048K, Levels2=11, clm=4: 72.724 ms.
Time FFTlen=2048K, Levels2=11, clm=2: 74.115 ms.
Time FFTlen=2048K, Levels2=11, clm=1: 76.767 ms.
Time FFTlen=2048K, Levels2=11, clm=1: 86.107 ms.
Time FFTlen=2560K, Levels2=13, clm=4: 91.366 ms.
Time FFTlen=2560K, Levels2=12, clm=4: 90.617 ms.
Time FFTlen=2560K, Levels2=12, clm=2: 93.312 ms.
Time FFTlen=2560K, Levels2=11, clm=2: 92.472 ms.
Time FFTlen=2560K, Levels2=11, clm=1: 95.622 ms.
Time FFTlen=2560K, Levels2=11, clm=1: 109.816 ms.
Time FFTlen=3072K, Levels2=13, clm=4: 111.032 ms.
[Tue May 31 23:36:33 2005]
Time FFTlen=3072K, Levels2=12, clm=4: 110.138 ms.
Time FFTlen=3072K, Levels2=12, clm=2: 112.246 ms.
Time FFTlen=3072K, Levels2=11, clm=2: 115.892 ms.
Time FFTlen=3072K, Levels2=11, clm=1: 118.505 ms.
Time FFTlen=3072K, Levels2=11, clm=1: 135.116 ms.
Time FFTlen=3584K, Levels2=13, clm=4: 133.286 ms.
Time FFTlen=3584K, Levels2=12, clm=4: 132.951 ms.
Time FFTlen=3584K, Levels2=12, clm=2: 135.589 ms.
Time FFTlen=3584K, Levels2=11, clm=2: 139.702 ms.
Time FFTlen=3584K, Levels2=11, clm=1: 143.545 ms.
Time FFTlen=3584K, Levels2=11, clm=1: 162.577 ms.
Time FFTlen=4096K, Levels2=13, clm=4: 149.720 ms.
Time FFTlen=4096K, Levels2=13, clm=2: 152.760 ms.
Time FFTlen=4096K, Levels2=12, clm=4: 148.850 ms.
Time FFTlen=4096K, Levels2=12, clm=2: 151.702 ms.
Time FFTlen=4096K, Levels2=11, clm=2: 155.541 ms.
Time FFTlen=4096K, Levels2=11, clm=1: 159.041 ms.
Time FFTlen=4096K, Levels2=11, clm=1: 180.927 ms.
Time FFTlen=5120K, Levels2=13, clm=4: 195.857 ms.
Time FFTlen=5120K, Levels2=13, clm=2: 201.852 ms.
Time FFTlen=5120K, Levels2=12, clm=4: 186.872 ms.
Time FFTlen=5120K, Levels2=12, clm=2: 189.250 ms.
Time FFTlen=5120K, Levels2=12, clm=1: 196.209 ms.
Time FFTlen=5120K, Levels2=12, clm=1: 223.416 ms.
Time FFTlen=5120K, Levels2=11, clm=1: 206.253 ms.
Time FFTlen=5120K, Levels2=11, clm=1: 233.297 ms.
Time FFTlen=6144K, Levels2=13, clm=4: 236.791 ms.
[Tue May 31 23:41:35 2005]
Time FFTlen=6144K, Levels2=13, clm=2: 240.638 ms.
Time FFTlen=6144K, Levels2=12, clm=4: 237.514 ms.
Time FFTlen=6144K, Levels2=12, clm=2: 235.479 ms.
Time FFTlen=6144K, Levels2=12, clm=1: 241.567 ms.
Time FFTlen=6144K, Levels2=11, clm=1: 251.771 ms.
Time FFTlen=6144K, Levels2=11, clm=1: 284.030 ms.
Time FFTlen=7168K, Levels2=13, clm=4: 284.626 ms.
Time FFTlen=7168K, Levels2=13, clm=2: 294.381 ms.
Time FFTlen=7168K, Levels2=12, clm=4: 293.611 ms.
Time FFTlen=7168K, Levels2=12, clm=2: 283.338 ms.
Time FFTlen=7168K, Levels2=12, clm=1: 292.341 ms.
Time FFTlen=7168K, Levels2=11, clm=1: 306.402 ms.
Time FFTlen=7168K, Levels2=11, clm=1: 343.604 ms.
Time FFTlen=8192K, Levels2=13, clm=4: 313.635 ms.
Time FFTlen=8192K, Levels2=13, clm=2: 320.222 ms.
Time FFTlen=8192K, Levels2=12, clm=4: 336.785 ms.
Time FFTlen=8192K, Levels2=12, clm=2: 310.891 ms.
[Tue May 31 23:46:50 2005]
Time FFTlen=8192K, Levels2=12, clm=1: 321.482 ms.
Time FFTlen=8192K, Levels2=11, clm=1: 345.232 ms.
Time FFTlen=8192K, Levels2=11, clm=1: 381.611 ms.
Time FFTlen=10240K, Levels2=13, clm=4: 396.255 ms.
Time FFTlen=10240K, Levels2=13, clm=2: 400.127 ms.
Time FFTlen=10240K, Levels2=12, clm=2: 411.073 ms.
Time FFTlen=10240K, Levels2=12, clm=1: 416.539 ms.
Time FFTlen=10240K, Levels2=12, clm=1: 470.059 ms.
Time FFTlen=12288K, Levels2=13, clm=4: 509.040 ms.
Time FFTlen=12288K, Levels2=13, clm=2: 498.231 ms.
Time FFTlen=12288K, Levels2=12, clm=2: 529.746 ms.
Time FFTlen=12288K, Levels2=12, clm=1: 516.208 ms.
[Tue May 31 23:52:22 2005]
Time FFTlen=12288K, Levels2=12, clm=1: 573.755 ms.
Time FFTlen=14336K, Levels2=13, clm=4: 635.748 ms.
Time FFTlen=14336K, Levels2=13, clm=2: 597.758 ms.
Time FFTlen=14336K, Levels2=12, clm=2: 677.349 ms.
Time FFTlen=14336K, Levels2=12, clm=1: 631.310 ms.
Time FFTlen=14336K, Levels2=12, clm=1: 692.449 ms.
Time FFTlen=16384K, Levels2=13, clm=4: 745.504 ms.
Time FFTlen=16384K, Levels2=13, clm=2: 662.271 ms.
[Tue May 31 23:57:34 2005]
Time FFTlen=16384K, Levels2=12, clm=2: 816.375 ms.
Time FFTlen=16384K, Levels2=12, clm=1: 735.943 ms.
Time FFTlen=16384K, Levels2=12, clm=1: 780.150 ms.
Time FFTlen=20480K, Levels2=13, clm=2: 893.250 ms.
Time FFTlen=20480K, Levels2=13, clm=1: 882.563 ms.
Time FFTlen=20480K, Levels2=13, clm=1: 991.682 ms.
[Wed Jun 01 00:02:40 2005]
Time FFTlen=24576K, Levels2=13, clm=2: 1151.838 ms.
Time FFTlen=24576K, Levels2=13, clm=1: 1087.809 ms.
Time FFTlen=24576K, Levels2=13, clm=1: 1207.410 ms.
Time FFTlen=28672K, Levels2=13, clm=2: 1521.905 ms.
Time FFTlen=28672K, Levels2=13, clm=1: 1323.539 ms.
[Wed Jun 01 00:08:31 2005]
Time FFTlen=28672K, Levels2=13, clm=1: 1456.507 ms.
Time FFTlen=32768K, Levels2=13, clm=2: 1875.072 ms.
Time FFTlen=32768K, Levels2=13, clm=1: 1552.439 ms.
Time FFTlen=32768K, Levels2=13, clm=1: 1657.770 ms.
Best time for 58 bit trial factors: 9.790 ms.
Best time for 59 bit trial factors: 8.549 ms.
Best time for 60 bit trial factors: 8.595 ms.
Best time for 61 bit trial factors: 8.603 ms.
Best time for 62 bit trial factors: 12.062 ms.
Best time for 63 bit trial factors: 12.091 ms.
Best time for 64 bit trial factors: 13.688 ms.
Best time for 65 bit trial factors: 13.678 ms.
Best time for 66 bit trial factors: 13.655 ms.
Best time for 67 bit trial factors: 13.583 ms.

I hope it will be useful to you.
@+ Joël Harismendy
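
For anyone who wants to boil a verbose log like the one above down to one figure per FFT length, here is a minimal sketch. It assumes the fastest configuration per length is the number of interest; the regular expression and the `best_times` helper are my own, not part of Prime95.

```python
import re

# Matches lines such as "Time FFTlen=512K, Levels2=11, clm=4: 16.406 ms."
LINE = re.compile(r"Time FFTlen=(\d+)K, Levels2=\d+, clm=\d+: ([\d.]+) ms\.")


def best_times(log_text):
    """Return {FFT length in K: fastest time in ms} for a verbose benchmark log."""
    best = {}
    for match in LINE.finditer(log_text):
        size_k = int(match.group(1))
        ms = float(match.group(2))
        if size_k not in best or ms < best[size_k]:
            best[size_k] = ms
    return best


sample = """\
Time FFTlen=512K, Levels2=11, clm=4: 16.406 ms.
Time FFTlen=512K, Levels2=11, clm=2: 16.558 ms.
Time FFTlen=512K, Levels2=11, clm=1: 16.686 ms.
Time FFTlen=512K, Levels2=11, clm=4: 17.383 ms.
Time FFTlen=640K, Levels2=12, clm=4: 20.407 ms.
Time FFTlen=640K, Levels2=11, clm=8: 20.769 ms.
"""

for size_k, ms in sorted(best_times(sample).items()):
    print(f"Best time for {size_k}K FFT length: {ms:.3f} ms.")
```

The sample lines are copied from the log above; timestamp lines and the trial factoring lines are simply ignored by the pattern.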
Old 2005-06-02, 04:40   #58
sHORTY
 
Jun 2005

616 Posts
Default

Running WinXP64 - willing to run 64 bit benchmarks if needed, let me know.

[Wed Jun 01 22:50:39 2005]
Compare your results to other computers at http://www.mersenne.org/bench.htm
That web page also contains instructions on how your results can be included.

AMD Athlon(tm) 64 Processor 3200+
CPU speed: 2210.82 MHz
CPU features: RDTSC, CMOV, Prefetch, 3DNow!, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 32-bit version 24.12, RdtscTiming=1
Time FFTlen=10K, Levels2=8, clm=4: 0.334 ms.
Time FFTlen=12K, Levels2=8, clm=4: 0.396 ms.
Time FFTlen=14K, Levels2=8, clm=4: 0.482 ms.
Time FFTlen=16K, Levels2=8, clm=4: 0.553 ms.
Time FFTlen=20K, Levels2=8, clm=4: 0.702 ms.
Time FFTlen=24K, Levels2=8, clm=4: 0.845 ms.
Time FFTlen=28K, Levels2=8, clm=4: 1.001 ms.
Time FFTlen=32K, Levels2=8, clm=4: 1.127 ms.
Time FFTlen=40K, Levels2=11, clm=4: 1.556 ms.
Time FFTlen=48K, Levels2=11, clm=4: 1.953 ms.
Time FFTlen=56K, Levels2=11, clm=4: 2.282 ms.
Time FFTlen=64K, Levels2=11, clm=4: 2.709 ms.
Time FFTlen=80K, Levels2=11, clm=4: 3.620 ms.
Time FFTlen=96K, Levels2=11, clm=4: 4.221 ms.
Time FFTlen=112K, Levels2=11, clm=4: 5.223 ms.
Time FFTlen=128K, Levels2=11, clm=4: 6.234 ms.
Time FFTlen=160K, Levels2=11, clm=4: 7.822 ms.
Time FFTlen=192K, Levels2=11, clm=4: 9.426 ms.
Time FFTlen=224K, Levels2=11, clm=4: 11.134 ms.
Time FFTlen=256K, Levels2=11, clm=4: 12.551 ms.
Time FFTlen=320K, Levels2=11, clm=4: 14.136 ms.
Time FFTlen=320K, Levels2=11, clm=4: 15.785 ms.
Time FFTlen=384K, Levels2=11, clm=4: 17.518 ms.
Time FFTlen=384K, Levels2=11, clm=4: 19.476 ms.
Time FFTlen=448K, Levels2=11, clm=4: 21.082 ms.
Time FFTlen=448K, Levels2=11, clm=4: 23.541 ms.
Time FFTlen=512K, Levels2=11, clm=4: 23.888 ms.
Time FFTlen=512K, Levels2=11, clm=4: 26.071 ms.
Time FFTlen=640K, Levels2=12, clm=8: 30.488 ms.
Time FFTlen=640K, Levels2=12, clm=4: 29.353 ms.
Time FFTlen=640K, Levels2=11, clm=8: 30.948 ms.
Time FFTlen=640K, Levels2=11, clm=4: 30.748 ms.
Time FFTlen=768K, Levels2=12, clm=4: 36.005 ms.
Time FFTlen=768K, Levels2=11, clm=4: 37.609 ms.
Time FFTlen=768K, Levels2=11, clm=2: 37.117 ms.
Time FFTlen=896K, Levels2=12, clm=4: 43.700 ms.
Time FFTlen=896K, Levels2=11, clm=4: 44.815 ms.
Time FFTlen=896K, Levels2=11, clm=2: 44.856 ms.
Time FFTlen=1024K, Levels2=12, clm=4: 48.828 ms.
Time FFTlen=1024K, Levels2=11, clm=4: 50.207 ms.
Time FFTlen=1024K, Levels2=11, clm=2: 50.211 ms.
Time FFTlen=1280K, Levels2=13, clm=4: 66.340 ms.
Time FFTlen=1280K, Levels2=12, clm=4: 63.036 ms.
Time FFTlen=1280K, Levels2=11, clm=4: 66.039 ms.
Time FFTlen=1280K, Levels2=11, clm=2: 66.113 ms.
Time FFTlen=1280K, Levels2=11, clm=1: 64.788 ms.
Time FFTlen=1536K, Levels2=13, clm=4: 82.212 ms.
Time FFTlen=1536K, Levels2=12, clm=4: 76.866 ms.
Time FFTlen=1536K, Levels2=11, clm=4: 80.873 ms.
Time FFTlen=1536K, Levels2=11, clm=2: 82.636 ms.
Time FFTlen=1536K, Levels2=11, clm=1: 79.742 ms.
Time FFTlen=1792K, Levels2=13, clm=4: 97.579 ms.
Time FFTlen=1792K, Levels2=12, clm=4: 92.277 ms.
Time FFTlen=1792K, Levels2=11, clm=4: 98.244 ms.
Time FFTlen=1792K, Levels2=11, clm=2: 98.549 ms.
Time FFTlen=1792K, Levels2=11, clm=1: 95.883 ms.
Time FFTlen=1792K, Levels2=11, clm=1: 104.855 ms.
Time FFTlen=2048K, Levels2=13, clm=4: 111.893 ms.
Time FFTlen=2048K, Levels2=12, clm=4: 103.335 ms.
Time FFTlen=2048K, Levels2=11, clm=4: 113.062 ms.
Time FFTlen=2048K, Levels2=11, clm=2: 112.444 ms.
Time FFTlen=2048K, Levels2=11, clm=1: 110.858 ms.
Time FFTlen=2048K, Levels2=11, clm=1: 117.527 ms.
Time FFTlen=2560K, Levels2=13, clm=4: 141.937 ms.
Time FFTlen=2560K, Levels2=12, clm=4: 136.707 ms.
Time FFTlen=2560K, Levels2=11, clm=2: 141.356 ms.
Time FFTlen=2560K, Levels2=11, clm=1: 142.781 ms.
Time FFTlen=2560K, Levels2=11, clm=1: 151.581 ms.
Time FFTlen=3072K, Levels2=13, clm=4: 172.440 ms.
Time FFTlen=3072K, Levels2=12, clm=4: 164.792 ms.
Time FFTlen=3072K, Levels2=11, clm=2: 174.573 ms.
Time FFTlen=3072K, Levels2=11, clm=1: 173.302 ms.
Time FFTlen=3072K, Levels2=11, clm=1: 184.935 ms.
Time FFTlen=3584K, Levels2=13, clm=4: 206.586 ms.
Time FFTlen=3584K, Levels2=12, clm=4: 203.400 ms.
[Wed Jun 01 22:55:41 2005]
Time FFTlen=3584K, Levels2=12, clm=2: 200.116 ms.
Time FFTlen=3584K, Levels2=11, clm=2: 220.878 ms.
Time FFTlen=3584K, Levels2=11, clm=1: 210.592 ms.
Time FFTlen=3584K, Levels2=11, clm=1: 223.519 ms.
Time FFTlen=4096K, Levels2=13, clm=4: 230.477 ms.
Time FFTlen=4096K, Levels2=13, clm=2: 232.587 ms.
Time FFTlen=4096K, Levels2=12, clm=4: 233.871 ms.
Time FFTlen=4096K, Levels2=12, clm=2: 224.869 ms.
Time FFTlen=4096K, Levels2=11, clm=2: 269.888 ms.
Time FFTlen=4096K, Levels2=11, clm=1: 241.456 ms.
Time FFTlen=4096K, Levels2=11, clm=1: 257.267 ms.
Time FFTlen=5120K, Levels2=13, clm=4: 303.423 ms.
Time FFTlen=5120K, Levels2=13, clm=2: 304.560 ms.
Time FFTlen=5120K, Levels2=12, clm=4: 338.058 ms.
Time FFTlen=5120K, Levels2=12, clm=2: 283.509 ms.
Time FFTlen=5120K, Levels2=12, clm=1: 287.935 ms.
Time FFTlen=5120K, Levels2=11, clm=1: 356.841 ms.
Time FFTlen=5120K, Levels2=11, clm=1: 347.363 ms.
Time FFTlen=6144K, Levels2=13, clm=4: 371.925 ms.
Time FFTlen=6144K, Levels2=13, clm=2: 381.796 ms.
Time FFTlen=6144K, Levels2=12, clm=4: 497.503 ms.
Time FFTlen=6144K, Levels2=12, clm=2: 359.281 ms.
[Wed Jun 01 23:00:51 2005]
Time FFTlen=6144K, Levels2=12, clm=1: 352.328 ms.
Time FFTlen=6144K, Levels2=11, clm=1: 493.170 ms.
Time FFTlen=6144K, Levels2=11, clm=1: 441.553 ms.
Time FFTlen=7168K, Levels2=13, clm=4: 447.400 ms.
Time FFTlen=7168K, Levels2=13, clm=2: 445.667 ms.
Time FFTlen=7168K, Levels2=12, clm=4: 688.384 ms.
Time FFTlen=7168K, Levels2=12, clm=2: 473.028 ms.
Time FFTlen=7168K, Levels2=12, clm=1: 423.424 ms.
Time FFTlen=7168K, Levels2=11, clm=1: 629.790 ms.
Time FFTlen=7168K, Levels2=11, clm=1: 559.439 ms.
Time FFTlen=8192K, Levels2=13, clm=4: 512.229 ms.
Time FFTlen=8192K, Levels2=13, clm=2: 500.789 ms.
Time FFTlen=8192K, Levels2=12, clm=4: 831.657 ms.
Time FFTlen=8192K, Levels2=12, clm=2: 598.135 ms.
[Wed Jun 01 23:06:04 2005]
Time FFTlen=8192K, Levels2=12, clm=1: 485.508 ms.
Time FFTlen=8192K, Levels2=11, clm=1: 781.230 ms.
Time FFTlen=8192K, Levels2=11, clm=1: 659.122 ms.
Time FFTlen=10240K, Levels2=13, clm=4: 736.201 ms.
Time FFTlen=10240K, Levels2=13, clm=2: 631.099 ms.
Time FFTlen=10240K, Levels2=12, clm=2: 936.888 ms.
Time FFTlen=10240K, Levels2=12, clm=1: 677.468 ms.
Time FFTlen=10240K, Levels2=12, clm=1: 690.486 ms.
Time FFTlen=12288K, Levels2=13, clm=4: 1079.738 ms.
Time FFTlen=12288K, Levels2=13, clm=2: 776.189 ms.
[Wed Jun 01 23:11:31 2005]
Time FFTlen=12288K, Levels2=12, clm=2: 1292.248 ms.
Time FFTlen=12288K, Levels2=12, clm=1: 923.493 ms.
Time FFTlen=12288K, Levels2=12, clm=1: 872.223 ms.
Time FFTlen=14336K, Levels2=13, clm=4: 1465.257 ms.
Time FFTlen=14336K, Levels2=13, clm=2: 1033.963 ms.
Time FFTlen=14336K, Levels2=12, clm=2: 1634.469 ms.
Time FFTlen=14336K, Levels2=12, clm=1: 1242.544 ms.
[Wed Jun 01 23:17:06 2005]
Time FFTlen=14336K, Levels2=12, clm=1: 1110.328 ms.
Time FFTlen=16384K, Levels2=13, clm=4: 1736.965 ms.
Time FFTlen=16384K, Levels2=13, clm=2: 1263.497 ms.
Time FFTlen=16384K, Levels2=12, clm=2: 1925.906 ms.
Time FFTlen=16384K, Levels2=12, clm=1: 1528.661 ms.
Time FFTlen=16384K, Levels2=12, clm=1: 1306.528 ms.
[Wed Jun 01 23:22:44 2005]
Time FFTlen=20480K, Levels2=13, clm=2: 1874.792 ms.
Time FFTlen=20480K, Levels2=13, clm=1: 1522.919 ms.
Time FFTlen=20480K, Levels2=13, clm=1: 1524.264 ms.
Time FFTlen=24576K, Levels2=13, clm=2: 2694.478 ms.
Time FFTlen=24576K, Levels2=13, clm=1: 2041.614 ms.
[Wed Jun 01 23:28:41 2005]
Time FFTlen=24576K, Levels2=13, clm=1: 1924.441 ms.
Time FFTlen=28672K, Levels2=13, clm=2: 3527.380 ms.
Time FFTlen=28672K, Levels2=13, clm=1: 2751.240 ms.
Time FFTlen=28672K, Levels2=13, clm=1: 2428.551 ms.
[Wed Jun 01 23:35:32 2005]
Time FFTlen=32768K, Levels2=13, clm=2: 4166.205 ms.
Time FFTlen=32768K, Levels2=13, clm=1: 3350.865 ms.
Time FFTlen=32768K, Levels2=13, clm=1: 2879.025 ms.
Best time for 58 bit trial factors: 5.323 ms.
Best time for 59 bit trial factors: 5.321 ms.
Best time for 60 bit trial factors: 5.303 ms.
Best time for 61 bit trial factors: 5.320 ms.
Best time for 62 bit trial factors: 9.949 ms.
Best time for 63 bit trial factors: 9.925 ms.
Best time for 64 bit trial factors: 12.615 ms.
Best time for 65 bit trial factors: 12.531 ms.
Best time for 66 bit trial factors: 12.539 ms.
Best time for 67 bit trial factors: 12.529 ms.
Old 2005-06-02, 07:26   #59
TheJudger
 
"Oliver"
Mar 2005
Germany

2×557 Posts
Default

Quote:
Originally Posted by db597
Having read your replies, I've just done a few more tests on the compiler options. In the code, there are some very small numbers in the order of 1E-10. Normally without the -xW switch, for one of the numbers I get -4.xxxxxE-10. Once I turn on the SSE2, I get +7.xxxxxxE-10. A big difference in the opposite direction!

My code has a lot of subtraction of numbers, since a differential:

dx
---
dy

in numerical simulation is (x1-x2)/(y1-y2). As TheJudger mentioned, if the numbers are very close to each other, small differences in precision make a big difference.

Still, I'm not entirely sure if it's the SSE2 or the other optimisations that the compiler does when I invoke the -xW switch. When compiling some subroutines, it says "vectorised". That could be the culprit instead.
You could try running your program with bigger reals, REAL*16 (if possible), or with software floating point of arbitrary precision, to see what happens... perhaps even your x87 results are far away from what you really want ;(
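
To show how a sign flip like this can come from precision alone, here is a small Python sketch. The input values are constructed for the illustration and have nothing to do with db597's actual code; exact rational arithmetic (`fractions.Fraction`) stands in for the higher-precision run suggested above, and Python floats are assumed to be IEEE 64-bit doubles.

```python
from fractions import Fraction


def slope(a, b, c, tiny, dy):
    """(a*b - c + tiny) / dy, evaluated left to right in the type of the inputs."""
    return (a * b - c + tiny) / dy


# Illustrative inputs, all exactly representable as 64-bit doubles.
a = 1.0 + 2.0 ** -28
b = 1.0 - 2.0 ** -28
c = 1.0
tiny = 2.0 ** -57
dy = 2.0 ** -20

# In double precision a*b (exactly 1 - 2**-56) rounds to 1.0,
# so the numerator collapses to +tiny.
double_result = slope(a, b, c, tiny, dy)

# Same inputs, no rounding: the numerator is -2**-56 + 2**-57, which is negative.
exact_result = slope(*(Fraction(v) for v in (a, b, c, tiny, dy)))

print(double_result, float(exact_result))
```

The two results have the same magnitude and opposite signs, purely because one intermediate product was rounded. If a higher-precision run of the real code disagrees with both the x87 and the SSE2 results, the formulation needs changing rather than the compiler switch.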
Old 2005-06-11, 07:03   #60
RMAC9.5
 
Jun 2003

32×17 Posts
Default

AMD Athlon(tm) XP 2400+
CPU speed: 2035.72 MHz
CPU features: RDTSC, CMOV, Prefetch, 3DNow!, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 256
Prime95 32-bit version 24.12, RdtscTiming=1
Best time for 512K FFT length: 33.028 ms.
Best time for 640K FFT length: 45.434 ms.
Best time for 768K FFT length: 55.494 ms.
Best time for 896K FFT length: 67.619 ms.
Best time for 1024K FFT length: 75.656 ms.
Best time for 1280K FFT length: 96.741 ms.
Best time for 1536K FFT length: 116.474 ms.
Best time for 1792K FFT length: 140.557 ms.
Best time for 2048K FFT length: 157.839 ms.
Best time for 2560K FFT length: 214.248 ms.
Best time for 3072K FFT length: 261.975 ms.
Best time for 3584K FFT length: 316.809 ms.
Best time for 4096K FFT length: 354.172 ms.
Best time for 58 bit trial factors: 6.151 ms.
Best time for 59 bit trial factors: 6.156 ms.
Best time for 60 bit trial factors: 6.129 ms.
Best time for 61 bit trial factors: 6.126 ms.
Best time for 62 bit trial factors: 11.490 ms.
Best time for 63 bit trial factors: 11.535 ms.
Best time for 64 bit trial factors: 27.085 ms.
Best time for 65 bit trial factors: 27.653 ms.
Best time for 66 bit trial factors: 28.269 ms.
Best time for 67 bit trial factors: 28.182 ms.
Old 2005-06-11, 07:35   #61
outlnder
 
Aug 2002

2·3·53 Posts
Default

This system has a PCI video card. Everything else is the same as the second benchmark: motherboard, RAM, PSU, OS.

Intel(R) Pentium(R) 4 CPU 3.20GHz
CPU speed: 3519.51 MHz
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2
L1 cache size: 16 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 32-bit version 24.12, RdtscTiming=1
Best time for 512K FFT length: 14.794 ms.
Best time for 640K FFT length: 19.220 ms.
Best time for 768K FFT length: 23.395 ms.
Best time for 896K FFT length: 27.566 ms.
Best time for 1024K FFT length: 31.651 ms.
Best time for 1280K FFT length: 39.083 ms.
Best time for 1536K FFT length: 47.293 ms.
Best time for 1792K FFT length: 56.141 ms.
Best time for 2048K FFT length: 63.335 ms.
Best time for 2560K FFT length: 81.853 ms.
Best time for 3072K FFT length: 100.654 ms.
Best time for 3584K FFT length: 119.830 ms.
Best time for 4096K FFT length: 135.195 ms.
Best time for 58 bit trial factors: 7.795 ms.
Best time for 59 bit trial factors: 7.810 ms.
Best time for 60 bit trial factors: 7.808 ms.
Best time for 61 bit trial factors: 7.788 ms.
Best time for 62 bit trial factors: 10.902 ms.
Best time for 63 bit trial factors: 10.920 ms.
Best time for 64 bit trial factors: 12.622 ms.
Best time for 65 bit trial factors: 12.630 ms.
Best time for 66 bit trial factors: 12.612 ms.
Best time for 67 bit trial factors: 12.662 ms.


Slower, but just a smidge.

Intel(R) Pentium(R) 4 CPU 3.20GHz
CPU speed: 3519.39 MHz
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2
L1 cache size: 16 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 32-bit version 24.12, RdtscTiming=1
Best time for 512K FFT length: 14.826 ms.
Best time for 640K FFT length: 19.283 ms.
Best time for 768K FFT length: 23.468 ms.
Best time for 896K FFT length: 27.726 ms.
Best time for 1024K FFT length: 31.725 ms.
Best time for 1280K FFT length: 39.022 ms.
Best time for 1536K FFT length: 47.539 ms.
Best time for 1792K FFT length: 56.220 ms.
Best time for 2048K FFT length: 63.582 ms.
Best time for 2560K FFT length: 82.120 ms.
Best time for 3072K FFT length: 101.099 ms.
Best time for 3584K FFT length: 119.798 ms.
Best time for 4096K FFT length: 135.728 ms.
Best time for 58 bit trial factors: 7.777 ms.
Best time for 59 bit trial factors: 7.809 ms.
Best time for 60 bit trial factors: 7.799 ms.
Best time for 61 bit trial factors: 7.832 ms.
Best time for 62 bit trial factors: 10.897 ms.
Best time for 63 bit trial factors: 10.917 ms.
Best time for 64 bit trial factors: 12.653 ms.
Best time for 65 bit trial factors: 12.618 ms.
Best time for 66 bit trial factors: 12.685 ms.
Best time for 67 bit trial factors: 12.694 ms.
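
To put a number on "a smidge", here is a short sketch that compares two benchmark summaries. Only three FFT lengths from the two runs above are included to keep it short; the helper names are my own.

```python
import re

# Matches lines such as "Best time for 512K FFT length: 14.794 ms."
BEST = re.compile(r"Best time for (\d+K) FFT length: ([\d.]+) ms\.")


def parse(summary):
    """Return {FFT length label: time in ms} from 'Best time for ...' lines."""
    return {size: float(ms) for size, ms in BEST.findall(summary)}


def percent_change(first, second):
    """Percent change from the first run to the second; positive means slower."""
    return {size: 100.0 * (second[size] - first[size]) / first[size]
            for size in first if size in second}


first_run = parse("""\
Best time for 512K FFT length: 14.794 ms.
Best time for 1024K FFT length: 31.651 ms.
Best time for 4096K FFT length: 135.195 ms.
""")
second_run = parse("""\
Best time for 512K FFT length: 14.826 ms.
Best time for 1024K FFT length: 31.725 ms.
Best time for 4096K FFT length: 135.728 ms.
""")

changes = percent_change(first_run, second_run)
for size, pct in changes.items():
    print(f"{size}: {pct:+.2f}%")
```

For these three lengths the second run comes out less than half a percent slower, which is within the sort of run-to-run noise a single benchmark pass can show.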

Last fiddled with by outlnder on 2005-06-11 at 07:36

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
