mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Hardware (https://www.mersenneforum.org/forumdisplay.php?f=9)
-   -   Perpetual benchmark thread... (https://www.mersenneforum.org/showthread.php?t=59)

R.D. Silverman 2009-12-07 12:30

NFS Threads
 
[QUOTE=S485122;196263]Looking at Garo's benchmark that seems not to be true any more with Intel's i5 (and i7) architecture that seems to have the necessary memory bandwidth : the slowdown for 4 threads is less than 10 % for FFT sizes from 1792 KB to 5120 KB.

But I agree that benchmarks with 4 different tests in parallel would be very interesting and would probably suffer less from inter core communication bottlenecks.

Jacob[/QUOTE]

I just bought a new toy: an i7/920 with 12 GB of RAM.
This is a 4-core machine with HT @ 2.66 GHz. I could have bought one
at 3.3 GHz, but the cost was 60% higher for only 25% more speed
(another $600 for the i7/975).

I have benchmark data for SNFS with my current number: 2,1149-
The factor bases are each 2.8 million ideals, the sieve region is 8K x 16K,
and the LP bound is 800 million (i.e. between 29 and 30 bits).

The data given is the number of special-q's processed per minute.

1 core: 7.4
2 : 15
3 : 22
4 : 29
5 : 31.5
6 : 34
7 : 37
8 : 40

This is for code compiled as 32-bit code. I intend to recompile as 64-bit.
However, I am busy with 60-hour weeks at my real job, so this will take
a while.
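The scaling in the table above can be made explicit with a short calculation. This is only a sketch: it assumes the rows are thread counts on the same machine, so that rows 5 to 8 are the hyperthreads sharing the 4 physical cores (the post labels only the first row, as "1 core"). The function name is made up for illustration.

```python
# Special-q's per minute by thread count, copied from the table above.
rates = {1: 7.4, 2: 15, 3: 22, 4: 29, 5: 31.5, 6: 34, 7: 37, 8: 40}


def speedup(rates, n):
    """Throughput with n threads relative to a single thread."""
    return rates[n] / rates[1]


for n in sorted(rates):
    s = speedup(rates, n)
    # Efficiency is the speedup divided by the number of threads used.
    print(f"{n} threads: {s:.2f}x speedup, {s / n:.0%} efficiency")

# Extra throughput from filling the hyperthreads (8 threads vs 4).
ht_gain = rates[8] / rates[4]
print(f"8 threads vs 4 threads: {ht_gain:.2f}x")
```

On these figures the first 4 threads scale almost linearly, and going from 4 to 8 threads adds roughly another 38%.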

starrynte 2009-12-23 01:14

[code][Mon Dec 21 11:46:07 2009]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz
CPU speed: 2660.19 MHz, 2 hyperthreaded cores
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2, SSE4
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 64-bit version 25.9, RdtscTiming=1
Best time for 768K FFT length: 11.604 ms.
Best time for 896K FFT length: 14.140 ms.
Best time for 1024K FFT length: 15.979 ms.
Best time for 1280K FFT length: 20.101 ms.
Best time for 1536K FFT length: 24.264 ms.
Best time for 1792K FFT length: 29.377 ms.
Best time for 2048K FFT length: 33.267 ms.
Best time for 2560K FFT length: 44.351 ms.
Best time for 3072K FFT length: 54.405 ms.
Best time for 3584K FFT length: 65.842 ms.
Best time for 4096K FFT length: 74.242 ms.
Best time for 5120K FFT length: 95.047 ms.
Best time for 6144K FFT length: 113.294 ms.
Best time for 7168K FFT length: 138.047 ms.
Best time for 8192K FFT length: 155.555 ms.
Timing FFTs using 2 threads on 1 physical CPUs.
Best time for 768K FFT length: 6.152 ms.
Best time for 896K FFT length: 7.403 ms.
Best time for 1024K FFT length: 8.465 ms.
Best time for 1280K FFT length: 10.617 ms.
Best time for 1536K FFT length: 12.875 ms.
Best time for 1792K FFT length: 15.602 ms.
Best time for 2048K FFT length: 17.595 ms.
Best time for 2560K FFT length: 23.082 ms.
Best time for 3072K FFT length: 27.964 ms.
Best time for 3584K FFT length: 33.929 ms.
Best time for 4096K FFT length: 38.249 ms.
Best time for 5120K FFT length: 49.172 ms.
Best time for 6144K FFT length: 59.253 ms.
Best time for 7168K FFT length: 73.083 ms.
Best time for 8192K FFT length: 82.076 ms.
Timing FFTs using 4 threads on 2 physical CPUs.
Best time for 768K FFT length: 4.839 ms.
Best time for 896K FFT length: 5.510 ms.
Best time for 1024K FFT length: 7.375 ms.
Best time for 1280K FFT length: 6.883 ms.
Best time for 1536K FFT length: 8.144 ms.
Best time for 1792K FFT length: 9.401 ms.
Best time for 2048K FFT length: 10.752 ms.
Best time for 2560K FFT length: 13.527 ms.
Best time for 3072K FFT length: 16.757 ms.
Best time for 3584K FFT length: 19.925 ms.
Best time for 4096K FFT length: 22.849 ms.
Best time for 5120K FFT length: 29.466 ms.
Best time for 6144K FFT length: 39.497 ms.
Best time for 7168K FFT length: 47.455 ms.
Best time for 8192K FFT length: 55.049 ms.
Best time for 58 bit trial factors: 2.428 ms.
Best time for 59 bit trial factors: 2.455 ms.
Best time for 60 bit trial factors: 2.479 ms.
Best time for 61 bit trial factors: 2.737 ms.
Best time for 62 bit trial factors: 2.877 ms.
Best time for 63 bit trial factors: 3.383 ms.
Best time for 64 bit trial factors: 4.177 ms.
Best time for 65 bit trial factors: 4.417 ms.
Best time for 66 bit trial factors: 4.626 ms.
Best time for 67 bit trial factors: 4.576 ms.[/code](Prime95 incorrectly thinks I have 2 hyperthreaded cores (actually 4 physical cores))
Note that the times for 1 and 2 threads may be faster than expected due to Turbo Boost.
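For anyone who wants to turn a log like the one above into speedup figures, here is a minimal parsing sketch. The regular expressions are written against the line formats visible in this post only; other Prime95 versions may print different wording. The function names and the shortened sample are illustrative.

```python
import re

# A few lines copied from the benchmark above (2048K and 4096K only).
SAMPLE = """\
Best time for 2048K FFT length: 33.267 ms.
Best time for 4096K FFT length: 74.242 ms.
Timing FFTs using 2 threads on 1 physical CPUs.
Best time for 2048K FFT length: 17.595 ms.
Best time for 4096K FFT length: 38.249 ms.
Timing FFTs using 4 threads on 2 physical CPUs.
Best time for 2048K FFT length: 10.752 ms.
Best time for 4096K FFT length: 22.849 ms.
Best time for 58 bit trial factors: 2.428 ms.
"""

THREADS_RE = re.compile(r"Timing FFTs using (\d+) threads")
TIME_RE = re.compile(r"Best time for (\d+)K FFT length: ([\d.]+) ms")


def parse_benchmark(text):
    """Return {threads: {fft_size_in_K: ms}}.

    Timings that appear before the first "Timing FFTs using N threads"
    header are treated as single-threaded results.
    """
    results = {}
    threads = 1
    for line in text.splitlines():
        header = THREADS_RE.search(line)
        if header:
            threads = int(header.group(1))
            continue
        timing = TIME_RE.search(line)
        if timing:  # trial-factoring lines do not match and are skipped
            size_k, ms = int(timing.group(1)), float(timing.group(2))
            results.setdefault(threads, {})[size_k] = ms
    return results


def speedups(results, threads):
    """Single-thread time divided by the time with the given thread count."""
    base = results[1]
    return {k: base[k] / ms for k, ms in results[threads].items() if k in base}


parsed = parse_benchmark(SAMPLE)
for size_k, s in sorted(speedups(parsed, 4).items()):
    print(f"{size_k}K FFT: {s:.2f}x with 4 threads")
```

For the two sizes in the sample, 4 threads come out a little over 3x faster than 1 thread. As the post notes, Turbo Boost may flatter the 1- and 2-thread times, which would make the computed speedup look smaller than it really is.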

Mini-Geek 2009-12-23 03:54

[quote=starrynte;199654](Prime95 incorrectly thinks I have 2 hyperthreaded cores (actually 4 physical cores))[/quote]
Here's a way to override it and make it know what the CPU really is: (from undoc.txt)
[CODE]The program automatically computes the number of CPUs, hyperthreading, and speed.
This information is used to calculate how much work to get.
If the program did not correctly figure out your CPU information,
you can override the info in local.txt:
NumCPUs=n
CpuNumHyperthreads=1 or 2
CpuSpeed=s
Where n is the number of physical CPUs or cores, not logical CPUs created by
hyperthreading. Choose 1 for non-hyperthreaded and 2 for hyperthreaded. Finally,
s is the speed in MHz.
[/CODE](yes, I know there's a possibility that you already knew this, and were pointing it out as a bug needing fixing...just figured it might help)
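As a concrete illustration of the override described in the undoc.txt excerpt, the sketch below builds the three lines for the i5 750 from the previous post: 4 physical cores, no hyperthreading, about 2660 MHz. The helper function is hypothetical, and using a whole number of MHz for CpuSpeed is an assumption; the excerpt only says the value is in MHz.

```python
def local_txt_overrides(physical_cores, hyperthreaded, speed_mhz):
    """Build the local.txt override lines named in the undoc.txt excerpt."""
    return [
        f"NumCPUs={physical_cores}",
        # Per the excerpt: 1 for non-hyperthreaded, 2 for hyperthreaded.
        f"CpuNumHyperthreads={2 if hyperthreaded else 1}",
        f"CpuSpeed={speed_mhz}",
    ]


# i5 750 as described above: 4 physical cores, no HT, ~2660 MHz.
print("\n".join(local_txt_overrides(4, False, 2660)))
```

The printed lines would then be added to local.txt by hand.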

hj47 2010-01-16 03:29

Core i5 3.2GHz & DDR3 @ 2GHz
 
[CODE][Sat Jan 16 14:23:00 2010]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz
CPU speed: 3200.00 MHz, 4 cores
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2, SSE4
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 64-bit version 25.9, RdtscTiming=1
Best time for 768K FFT length: 11.587 ms.
Best time for 896K FFT length: 14.058 ms.
Best time for 1024K FFT length: 15.821 ms.
Best time for 1280K FFT length: 19.929 ms.
Best time for 1536K FFT length: 24.000 ms.
Best time for 1792K FFT length: 29.000 ms.
Best time for 2048K FFT length: 32.711 ms.
Best time for 2560K FFT length: 43.344 ms.
Best time for 3072K FFT length: 52.723 ms.
Best time for 3584K FFT length: 64.006 ms.
Best time for 4096K FFT length: 72.155 ms.
Best time for 5120K FFT length: 92.282 ms.
Best time for 6144K FFT length: 110.128 ms.
Best time for 7168K FFT length: 134.087 ms.
Best time for 8192K FFT length: 150.194 ms.
Timing FFTs using 2 threads.
Best time for 768K FFT length: 6.040 ms.
Best time for 896K FFT length: 7.331 ms.
Best time for 1024K FFT length: 8.382 ms.
Best time for 1280K FFT length: 10.397 ms.
Best time for 1536K FFT length: 12.526 ms.
Best time for 1792K FFT length: 15.116 ms.
Best time for 2048K FFT length: 17.055 ms.
Best time for 2560K FFT length: 22.350 ms.
Best time for 3072K FFT length: 27.049 ms.
Best time for 3584K FFT length: 32.737 ms.
Best time for 4096K FFT length: 36.823 ms.
Best time for 5120K FFT length: 47.122 ms.
Best time for 6144K FFT length: 56.323 ms.
Best time for 7168K FFT length: 68.502 ms.
Best time for 8192K FFT length: 76.710 ms.
Timing FFTs using 3 threads.
Best time for 768K FFT length: 4.719 ms.
Best time for 896K FFT length: 5.320 ms.
Best time for 1024K FFT length: 7.018 ms.
Best time for 1280K FFT length: 7.032 ms.
Best time for 1536K FFT length: 8.427 ms.
Best time for 1792K FFT length: 10.221 ms.
Best time for 2048K FFT length: 11.446 ms.
Best time for 2560K FFT length: 15.141 ms.
Best time for 3072K FFT length: 18.297 ms.
Best time for 3584K FFT length: 22.220 ms.
Best time for 4096K FFT length: 25.163 ms.
Best time for 5120K FFT length: 32.006 ms.
Best time for 6144K FFT length: 38.256 ms.
Best time for 7168K FFT length: 46.596 ms.
Best time for 8192K FFT length: 52.521 ms.
Timing FFTs using 4 threads.
Best time for 768K FFT length: 4.173 ms.
Best time for 896K FFT length: 4.696 ms.
Best time for 1024K FFT length: 6.224 ms.
Best time for 1280K FFT length: 5.876 ms.
Best time for 1536K FFT length: 7.000 ms.
Best time for 1792K FFT length: 8.099 ms.
Best time for 2048K FFT length: 9.215 ms.
Best time for 2560K FFT length: 11.506 ms.
Best time for 3072K FFT length: 13.909 ms.
Best time for 3584K FFT length: 16.817 ms.
Best time for 4096K FFT length: 18.955 ms.
Best time for 5120K FFT length: 24.315 ms.
Best time for 6144K FFT length: 29.556 ms.
Best time for 7168K FFT length: 36.312 ms.
Best time for 8192K FFT length: 41.955 ms.
Best time for 58 bit trial factors: 2.421 ms.
Best time for 59 bit trial factors: 2.451 ms.
Best time for 60 bit trial factors: 2.449 ms.
Best time for 61 bit trial factors: 2.731 ms.
Best time for 62 bit trial factors: 2.857 ms.
Best time for 63 bit trial factors: 3.371 ms.
Best time for 64 bit trial factors: 4.121 ms.
Best time for 65 bit trial factors: 4.396 ms.
Best time for 66 bit trial factors: 4.601 ms.
Best time for 67 bit trial factors: 4.569 ms.[/CODE]

hj47 2010-03-02 05:24

Core i5 3.6GHz & DDR3 @ 2GHz
 
[code]
[Tue Mar 02 16:19:45 2010]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz
CPU speed: 3600.00 MHz, 4 cores
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2, SSE4
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 64-bit version 25.9, RdtscTiming=1
Best time for 768K FFT length: 10.290 ms.
Best time for 896K FFT length: 12.511 ms.
Best time for 1024K FFT length: 14.059 ms.
Best time for 1280K FFT length: 17.697 ms.
Best time for 1536K FFT length: 21.340 ms.
Best time for 1792K FFT length: 25.805 ms.
Best time for 2048K FFT length: 29.111 ms.
Best time for 2560K FFT length: 38.653 ms.
Best time for 3072K FFT length: 47.120 ms.
Best time for 3584K FFT length: 57.132 ms.
Best time for 4096K FFT length: 64.416 ms.
Best time for 5120K FFT length: 82.394 ms.
Best time for 6144K FFT length: 98.321 ms.
Best time for 7168K FFT length: 119.650 ms.
Best time for 8192K FFT length: 134.385 ms.
Timing FFTs using 2 threads.
Best time for 768K FFT length: 5.376 ms.
Best time for 896K FFT length: 6.506 ms.
Best time for 1024K FFT length: 7.432 ms.
Best time for 1280K FFT length: 9.301 ms.
Best time for 1536K FFT length: 11.198 ms.
Best time for 1792K FFT length: 13.517 ms.
Best time for 2048K FFT length: 15.247 ms.
Best time for 2560K FFT length: 19.986 ms.
Best time for 3072K FFT length: 24.189 ms.
Best time for 3584K FFT length: 29.249 ms.
Best time for 4096K FFT length: 32.948 ms.
Best time for 5120K FFT length: 42.181 ms.
Best time for 6144K FFT length: 50.503 ms.
Best time for 7168K FFT length: 61.683 ms.
Best time for 8192K FFT length: 68.924 ms.
Timing FFTs using 3 threads.
Best time for 768K FFT length: 4.212 ms.
Best time for 896K FFT length: 4.663 ms.
Best time for 1024K FFT length: 6.274 ms.
Best time for 1280K FFT length: 6.284 ms.
Best time for 1536K FFT length: 7.545 ms.
Best time for 1792K FFT length: 9.162 ms.
Best time for 2048K FFT length: 10.280 ms.
Best time for 2560K FFT length: 13.537 ms.
Best time for 3072K FFT length: 16.419 ms.
Best time for 3584K FFT length: 19.979 ms.
Best time for 4096K FFT length: 22.601 ms.
Best time for 5120K FFT length: 28.552 ms.
Best time for 6144K FFT length: 36.026 ms.
Best time for 7168K FFT length: 42.251 ms.
Best time for 8192K FFT length: 47.758 ms.
Timing FFTs using 4 threads.
Best time for 768K FFT length: 3.719 ms.
Best time for 896K FFT length: 4.191 ms.
Best time for 1024K FFT length: 5.551 ms.
Best time for 1280K FFT length: 5.250 ms.
Best time for 1536K FFT length: 6.272 ms.
Best time for 1792K FFT length: 7.238 ms.
Best time for 2048K FFT length: 8.231 ms.
Best time for 2560K FFT length: 10.318 ms.
Best time for 3072K FFT length: 12.519 ms.
Best time for 3584K FFT length: 15.104 ms.
Best time for 4096K FFT length: 17.038 ms.
Best time for 5120K FFT length: 21.903 ms.
Best time for 6144K FFT length: 27.059 ms.
Best time for 7168K FFT length: 33.409 ms.
Best time for 8192K FFT length: 39.547 ms.
Best time for 58 bit trial factors: 2.152 ms.
Best time for 59 bit trial factors: 2.177 ms.
Best time for 60 bit trial factors: 2.174 ms.
Best time for 61 bit trial factors: 2.426 ms.
Best time for 62 bit trial factors: 2.538 ms.
Best time for 63 bit trial factors: 2.995 ms.
Best time for 64 bit trial factors: 3.681 ms.
Best time for 65 bit trial factors: 3.903 ms.
Best time for 66 bit trial factors: 4.088 ms.
Best time for 67 bit trial factors: 4.059 ms.
[/code]
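The two hj47 runs (3.2 GHz and 3.6 GHz, both with DDR3 @ 2GHz) allow a rough check of how much of the clock increase shows up in the timings. The sketch below uses four timings copied from those two posts; the function names are illustrative. Reading a low scaling fraction as a sign of memory-bound work is an inference, not something the posts state.

```python
# CPU clocks reported in the two posts (MHz).
CLOCK_MHZ = (3200.0, 3600.0)

# (threads, FFT size in K) -> (ms at 3.2 GHz, ms at 3.6 GHz)
TIMES = {
    (1, 768): (11.587, 10.290),
    (1, 8192): (150.194, 134.385),
    (4, 768): (4.173, 3.719),
    (4, 8192): (41.955, 39.547),
}


def time_ratio(slow_ms, fast_ms):
    """How many times faster the 3.6 GHz run is than the 3.2 GHz run."""
    return slow_ms / fast_ms


def scaling_fraction(slow_ms, fast_ms):
    """Fraction of the clock increase realised in the timing (1.0 = all)."""
    clock_gain = CLOCK_MHZ[1] / CLOCK_MHZ[0] - 1.0
    return (time_ratio(slow_ms, fast_ms) - 1.0) / clock_gain


for (threads, size_k), (slow, fast) in sorted(TIMES.items()):
    print(
        f"{threads} thread(s), {size_k}K FFT: "
        f"{time_ratio(slow, fast):.3f}x faster, "
        f"{scaling_fraction(slow, fast):.0%} of the clock gain"
    )
```

The clock went up by 12.5%. The single-threaded 768K timing improves by about the same amount, while the 4-thread 8192K timing improves by only about 6%, i.e. roughly half of the clock gain.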

petrw1 2010-03-15 05:00

VERY ODD results ... anyone have any suggestions?
 
I just installed Prime95 on:
[CODE]Software Version Windows,Prime95,v25.9,build 4
Model AMD Athlon Dual-Core QL-62
Features Dual core, Prefetch,3DNow!,SSE,SSE2
Speed 1.999 GHz (1.765 GHz P4 effective equivalent)
L1/L2 Cache 64 / 512 KB
Computer Memory 2814 MB configured usage 1000 MB day / 1000 MB night
--- it has 3 GB of RAM
[/CODE]

I started with Worker 1: P-1; Worker 2: DC
The iteration times seemed high for a DC (0.130), so I ran a benchmark.
The results really surprised me.

With 1 core the numbers seemed slow: e.g. 63 ms for the 1280K FFT.
But when it went to 1 core plus 1 helper thread it got very weird: the iteration time went up by about 40%, to about 80 ms for the 1280K FFT, and likewise for every other FFT below 7K (at which point the times were slightly better than with 1 core).

So I changed both cores to DC. With only 1 worker active running DC I was getting iteration times of about 70 ms. When the second DC worker started, both iteration times rose to 132 ms.

Task manager showed Prime95 consistently in the 98-99% range.

Any thoughts? Anyone help?
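To put the two-worker figures in perspective, it helps to compare total throughput, not per-worker iteration time. A small sketch using the numbers quoted above (about 70 ms with one DC worker, 132 ms each with two); the function name is illustrative.

```python
def iterations_per_second(ms_per_iteration, workers=1):
    """Combined iterations per second for workers running at the same pace."""
    return workers * 1000.0 / ms_per_iteration


one_worker = iterations_per_second(70)               # one DC worker alone
two_workers = iterations_per_second(132, workers=2)  # both DC workers active

print(f"1 worker : {one_worker:.2f} iterations/s")
print(f"2 workers: {two_workers:.2f} iterations/s")
print(f"combined gain: {two_workers / one_worker - 1:.1%}")
```

On these numbers the second worker adds only about 6% to total throughput, so the second core is contributing very little to this kind of work.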

lycorn 2010-03-15 08:32

The QL-62 works with DDR2-667 (333 MHz bus speed). It's not much in terms of memory bandwidth. I would say that might be the limiting factor in the benchies you posted. The way the numbers vary suggests some memory access congestion.

petrw1 2010-03-15 14:31

[QUOTE=lycorn;208431]The QL-62 works with DDR2-667 (333 MHz bus speed). It's not much in terms of memory bandwidth. I would say that might be the limiting factor in the benchies you posted. The way the numbers vary suggests some memory access congestion.[/QUOTE]

I just looked, and the 6 other benchmarks for the QL-62 are very close to mine ... that tells me this PC is working as expected and that it [U]probably[/U] does NOT have a [U]significant[/U] issue with spyware, viruses or overheating.

So maybe it is only/best suited for TF?

lycorn 2010-03-15 23:46

Try one core DC and one core TF.

petrw1 2010-04-06 20:10

Deleting benchmarks???
 
On the benchmark page there is a "Delete?" column but no corresponding "Delete" button or option.

Am I missing something OR is this feature not fully implemented yet?

lfm 2010-04-07 06:06

[QUOTE=petrw1;210784]On the benchmark page there is a "Delete?" column but no corresponding "Delete" button or option.

Am I missing something OR is this feature not fully implemented yet?[/QUOTE]

Ya, it's implemented at a deeper level somewhere; I was confused by this at first too. The top-level delete check boxes are non-functional.

