mersenneforum.org

Perpetual benchmark thread... (https://www.mersenneforum.org/showthread.php?t=59)

TheJudger 2007-07-06 20:22

Hello Tony,

[QUOTE=T.Rex;109712]
BTW, George, would it be possible to add an option to bind prime95 threads to a set of processors, in order to reduce the NUMA effect when using 4 cores out of 8?
T.[/QUOTE]

At least the 24.xx versions can be pinned to a specific CPU; I'm not sure about 25.xx.

On Linux I usually use "numactl" when I want to pin a process (or a process group) to a set of CPUs and memory ranges.
Example: numactl --physcpubind=0,2 --localalloc <application>
This restricts the application to CPU0 and CPU2 (numbers from /proc/cpuinfo).
This works great for many applications, especially on Opterons, since they are true NUMA systems...

And now the bad news: at least mprime 24.xx calls sched_setaffinity() even if it was NOT told via the ini file to pin to a specific CPU. In this case mprime calls sched_setaffinity() with a CPU mask of 0xFFFFFFFF, which means that it wants to run on any of the first 32 CPUs...
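
To make the mask concrete, here is a minimal Python sketch (not mprime's code) that converts between an affinity bitmask and the CPU numbers it allows. The helper names mask_to_cpus and cpus_to_mask are hypothetical, and the sketch assumes the usual Linux convention that bit n of the mask stands for CPU n.

```python
def mask_to_cpus(mask):
    """Return the CPU numbers whose bits are set in an affinity mask.

    Assumption: bit n set means CPU n is allowed (usual Linux convention).
    """
    cpus = []
    n = 0
    while mask:
        if mask & 1:
            cpus.append(n)
        mask >>= 1
        n += 1
    return cpus


def cpus_to_mask(cpus):
    """Build an affinity mask from an iterable of CPU numbers."""
    mask = 0
    for n in cpus:
        mask |= 1 << n
    return mask


if __name__ == "__main__":
    # The mask mprime 24.xx is reported to pass: CPUs 0 through 31.
    print(mask_to_cpus(0xFFFFFFFF))
    # The numactl example above (CPU0 and CPU2) expressed as a mask.
    print(hex(cpus_to_mask([0, 2])))
```

So 0xFFFFFFFF covers CPUs 0 to 31, while the numactl example corresponds to the mask 0x5. On Linux, os.sched_getaffinity(0) should report the set that is actually in effect for the calling process.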

geoff 2007-08-01 01:40

E6750
 
Intel Core 2 Duo E6750, stepping 0B.
ASUS P5K-VM, G33 chipset.
Transcend JM2GDDR2-8K, DDR2 800, CL5.

All stock settings, but with speedstep disabled in BIOS.
Running Linux/x86-64 from an Ubuntu live CD, with a fake libcurl.so.4.
[code]
Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz
CPU speed: 2671.50 MHz, 2 cores
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2
L1 cache size: 32 KB
L2 cache size: 4096 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 256
Prime95 64-bit version 25.3, RdtscTiming=1
Best time for 768K FFT length: 15.720 ms.
Best time for 896K FFT length: 18.772 ms.
Best time for 1024K FFT length: 21.442 ms.
Best time for 1280K FFT length: 26.580 ms.
Best time for 1536K FFT length: 32.532 ms.
Best time for 1792K FFT length: 38.715 ms.
Best time for 2048K FFT length: 43.067 ms.
Best time for 2560K FFT length: 56.481 ms.
Best time for 3072K FFT length: 69.312 ms.
Best time for 3584K FFT length: 82.517 ms.
Best time for 4096K FFT length: 92.188 ms.
Best time for 5120K FFT length: 117.745 ms.
Best time for 6144K FFT length: 142.917 ms.
Best time for 7168K FFT length: 174.043 ms.
Best time for 8192K FFT length: 190.819 ms.
Timing FFTs using 2 threads.
Best time for 768K FFT length: 8.157 ms.
Best time for 896K FFT length: 9.812 ms.
Best time for 1024K FFT length: 11.285 ms.
Best time for 1280K FFT length: 13.928 ms.
Best time for 1536K FFT length: 17.224 ms.
Best time for 1792K FFT length: 20.266 ms.
Best time for 2048K FFT length: 22.848 ms.
Best time for 2560K FFT length: 29.801 ms.
Best time for 3072K FFT length: 36.417 ms.
Best time for 3584K FFT length: 43.380 ms.
Best time for 4096K FFT length: 48.831 ms.
Best time for 5120K FFT length: 62.152 ms.
Best time for 6144K FFT length: 77.109 ms.
Best time for 7168K FFT length: 92.757 ms.
Best time for 8192K FFT length: 104.310 ms.
Best time for 58 bit trial factors: 3.336 ms.
Best time for 59 bit trial factors: 3.330 ms.
Best time for 60 bit trial factors: 3.641 ms.
Best time for 61 bit trial factors: 3.839 ms.
Best time for 62 bit trial factors: 4.475 ms.
Best time for 63 bit trial factors: 5.607 ms.
Best time for 64 bit trial factors: 6.028 ms.
Best time for 65 bit trial factors: 6.463 ms.
Best time for 66 bit trial factors: 6.397 ms.
Best time for 67 bit trial factors: 6.370 ms.
[/code]

Denahar 2007-08-04 18:40

For the sake of having a reference see the following results. Note that there is not that much of an improvement from running Prime95 on three or four cores (see memory bottleneck).

-D

[CODE]Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
CPU speed: 3204.05 MHz, 4 cores
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2
L1 cache size: 32 KB
L2 cache size: 4096 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 256
Prime95 32-bit version 25.3, RdtscTiming=1
Best time for 768K FFT length: 13.849 ms.
Best time for 896K FFT length: 16.749 ms.
Best time for 1024K FFT length: 19.023 ms.
Best time for 1280K FFT length: 23.947 ms.
Best time for 1536K FFT length: 29.319 ms.
Best time for 1792K FFT length: 35.038 ms.
Best time for 2048K FFT length: 39.119 ms.
Best time for 2560K FFT length: 51.507 ms.
Best time for 3072K FFT length: 62.919 ms.
Best time for 3584K FFT length: 74.852 ms.
Best time for 4096K FFT length: 83.869 ms.
Best time for 5120K FFT length: 107.006 ms.
Best time for 6144K FFT length: 128.722 ms.
Best time for 7168K FFT length: 155.989 ms.
Best time for 8192K FFT length: 171.460 ms.
Timing FFTs using 2 threads.
Best time for 768K FFT length: 14.217 ms.
Best time for 896K FFT length: 16.151 ms.
Best time for 1024K FFT length: 21.919 ms.
Best time for 1280K FFT length: 18.645 ms.
Best time for 1536K FFT length: 22.578 ms.
Best time for 1792K FFT length: 26.076 ms.
Best time for 2048K FFT length: 29.430 ms.
Best time for 2560K FFT length: 37.801 ms.
Best time for 3072K FFT length: 46.155 ms.
Best time for 3584K FFT length: 54.326 ms.
Best time for 4096K FFT length: 62.073 ms.
Best time for 5120K FFT length: 75.494 ms.
Best time for 6144K FFT length: 90.140 ms.
Best time for 7168K FFT length: 108.159 ms.
Best time for 8192K FFT length: 122.330 ms.
Timing FFTs using 3 threads.
Best time for 768K FFT length: 10.280 ms.
Best time for 896K FFT length: 11.991 ms.
Best time for 1024K FFT length: 19.685 ms.
Best time for 1280K FFT length: 15.145 ms.
Best time for 1536K FFT length: 18.247 ms.
Best time for 1792K FFT length: 21.375 ms.
Best time for 2048K FFT length: 24.486 ms.
Best time for 2560K FFT length: 31.278 ms.
Best time for 3072K FFT length: 38.313 ms.
Best time for 3584K FFT length: 45.167 ms.
Best time for 4096K FFT length: 51.682 ms.
Best time for 5120K FFT length: 62.440 ms.
Best time for 6144K FFT length: 76.929 ms.
Best time for 7168K FFT length: 90.957 ms.
Best time for 8192K FFT length: 102.885 ms.
Timing FFTs using 4 threads.
Best time for 768K FFT length: 11.297 ms.
Best time for 896K FFT length: 12.846 ms.
Best time for 1024K FFT length: 21.918 ms.
Best time for 1280K FFT length: 13.915 ms.
Best time for 1536K FFT length: 16.707 ms.
Best time for 1792K FFT length: 19.476 ms.
Best time for 2048K FFT length: 22.268 ms.
Best time for 2560K FFT length: 28.380 ms.
Best time for 3072K FFT length: 34.850 ms.
Best time for 3584K FFT length: 41.447 ms.
Best time for 4096K FFT length: 48.045 ms.
Best time for 5120K FFT length: 59.141 ms.
Best time for 6144K FFT length: 72.136 ms.
Best time for 7168K FFT length: 86.582 ms.
Best time for 8192K FFT length: 100.139 ms.
Best time for 58 bit trial factors: 3.374 ms.
Best time for 59 bit trial factors: 3.494 ms.
Best time for 60 bit trial factors: 3.508 ms.
Best time for 61 bit trial factors: 3.492 ms.
Best time for 62 bit trial factors: 5.722 ms.
Best time for 63 bit trial factors: 5.732 ms.
Best time for 64 bit trial factors: 5.550 ms.
Best time for 65 bit trial factors: 5.520 ms.
Best time for 66 bit trial factors: 5.516 ms.
Best time for 67 bit trial factors: 5.496 ms.
[/CODE]

R.D. Silverman 2007-08-06 12:01

[QUOTE=Denahar;111722]For the sake of having a reference see the following results. Note that there is not that much of an improvement from running Prime95 on three or four cores (see memory bottleneck).

-D

[/QUOTE]


I may be having a stupidity attack, but it is unclear to me whether this
represents one instance running in parallel on multiple cores, or
multiple instances...

With multi-core machines it is almost certainly better to run multiple
instances.

For example, I recently benchmarked my NFS siever on a CORE-2 at
2.4GHz.

With one instance running it took 8.3 seconds per special q.
With two, EACH one took about 8.45 seconds --> i.e. 1.964 times
the throughput. I was surprised. I expected cache/bus contention
to have more of an impact. I don't have access to a quad core machine,
so can't experiment further.
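
The arithmetic behind that figure, as a small Python sketch. The function name throughput_ratio is only illustrative; the inputs are the two timings quoted above.

```python
def throughput_ratio(single_time, concurrent_time, instances):
    """Throughput of `instances` concurrent runs relative to one run alone.

    Each concurrent instance finishes one unit of work in `concurrent_time`,
    so together they finish `instances` units in that time.
    """
    return instances * single_time / concurrent_time


# 8.3 s per special q alone, about 8.45 s each with two instances running.
ratio = throughput_ratio(8.3, 8.45, 2)
slowdown_pct = (8.45 / 8.3 - 1) * 100

print(f"{ratio:.3f}x throughput")
print(f"{slowdown_pct:.1f}% slowdown per instance")
```

In other words, each instance slows down by a little under 2%, and the pair together does nearly twice the work.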

Denahar 2007-08-06 16:01

[QUOTE=R.D. Silverman;111820]I may be having a stupidity attack, but it is unclear to me whether this
represents one instance running in parallel on multiple cores, or
multiple instances...

With multi-core machines it is almost certainly better to run multiple
instances.

For example, I recently benchmarked my NFS siever on a CORE-2 at
2.4GHz.

With one instance running it took 8.3 seconds per special q.
With two, EACH one took about 8.45 seconds --> i.e. 1.964 times
the throughput. I was surprised. I expected cache/bus contention
to have more of an impact. I don't have access to a quad core machine,
so can't experiment further.[/QUOTE]

IMO, the first is the case: "threads" here refers to one instance running in parallel on multiple cores. For example, I can run Prime95 as one instance on four cores (giving the above benchmark), as two instances on two cores each, or even as four instances on a single core each (perhaps the most effective way?).

-D

Prime95 2007-08-06 18:13

[QUOTE=R.D. Silverman;111820]I may be having a stupidity attack, but it is unclear to me whether this represents one instance running in parallel on multiple cores, or multiple instances...

With multi-core machines it is almost certainly better to run multiple instances.[/QUOTE]

The benchmark is for one instance running in parallel on 1, 2, 3, or 4 cores.

The info is for reference purposes only as you are correct that it is better to test a different exponent on each core. The default behavior for prime95 does not use the parallel code.

Also, DO NOT attribute the poor scaling to the Intel architecture or memory bandwidth. It could easily be prime95's parallel implementation that is in need of improvement. Different applications may scale much better or much worse.
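
For anyone who wants to put a number on the scaling, here is a small Python sketch that turns the 2048K FFT times from Denahar's Xeon 5160 results into speedup and parallel efficiency. The function name scaling is just illustrative, and the figures only describe what was measured; they say nothing about the cause.

```python
# Best times in ms for the 2048K FFT from the Xeon 5160 benchmark,
# keyed by the number of threads used by the single instance.
times_ms = {1: 39.119, 2: 29.430, 3: 24.486, 4: 22.268}


def scaling(times):
    """Return (threads, speedup, efficiency) rows relative to one thread."""
    base = times[1]
    rows = []
    for threads in sorted(times):
        speedup = base / times[threads]
        efficiency = speedup / threads  # 1.0 would be perfect scaling
        rows.append((threads, speedup, efficiency))
    return rows


for threads, speedup, eff in scaling(times_ms):
    print(f"{threads} threads: speedup {speedup:.2f}, efficiency {eff:.0%}")
```

The same function can be fed any other FFT length from the benchmark to see whether the picture changes with problem size.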

Bundu 2007-08-08 23:47

Benchmark on v25.3
 
Processor (CPU) INTEL® Core 2 Extreme QX6800 (4 X 2.93GHz) 1066MHz FSB/2x4MB Cache
Memory (RAM) 2048 MB CORSAIR XMS2 800MHz (2x1GB)
Operating System WINDOWS® VISTA Home Premium

Compare your results to other computers at [url]http://www.mersenne.org/bench.htm[/url]
Intel(R) Core(TM)2 Quad CPU @ 2.93GHz
CPU speed: 2935.73 MHz, 4 cores
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2
L1 cache size: 32 KB
L2 cache size: 4096 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 256
Prime95 32-bit version 25.3, RdtscTiming=1
Best time for 768K FFT length: 14.803 ms.
Best time for 896K FFT length: 17.646 ms.
Best time for 1024K FFT length: 20.317 ms.
Best time for 1280K FFT length: 25.242 ms.
Best time for 1536K FFT length: 30.821 ms.
Best time for 1792K FFT length: 36.638 ms.
Best time for 2048K FFT length: 40.677 ms.
Best time for 2560K FFT length: 53.766 ms.
Best time for 3072K FFT length: 65.308 ms.
Best time for 3584K FFT length: 77.869 ms.
Best time for 4096K FFT length: 86.908 ms.
Best time for 5120K FFT length: 111.457 ms.
Best time for 6144K FFT length: 135.721 ms.
Best time for 7168K FFT length: 165.223 ms.
Best time for 8192K FFT length: 180.015 ms.
Timing FFTs using 2 threads.
Best time for 768K FFT length: 8.156 ms.
Best time for 896K FFT length: 9.651 ms.
Best time for 1024K FFT length: 12.689 ms.
Best time for 1280K FFT length: 13.602 ms.
Best time for 1536K FFT length: 16.670 ms.
Best time for 1792K FFT length: 19.749 ms.
Best time for 2048K FFT length: 22.135 ms.
Best time for 2560K FFT length: 28.863 ms.
Best time for 3072K FFT length: 35.391 ms.
Best time for 3584K FFT length: 41.811 ms.
Best time for 4096K FFT length: 47.075 ms.
Best time for 5120K FFT length: 60.604 ms.
Best time for 6144K FFT length: 74.535 ms.
Best time for 7168K FFT length: 90.009 ms.
Best time for 8192K FFT length: 101.340 ms.
Timing FFTs using 3 threads.
Best time for 768K FFT length: 14.710 ms.
Best time for 896K FFT length: 15.555 ms.
Best time for 1024K FFT length: 25.734 ms.
Best time for 1280K FFT length: 13.235 ms.
Best time for 1536K FFT length: 15.729 ms.
Best time for 1792K FFT length: 18.493 ms.
Best time for 2048K FFT length: 20.730 ms.
Best time for 2560K FFT length: 27.075 ms.
Best time for 3072K FFT length: 32.825 ms.
Best time for 3584K FFT length: 37.862 ms.
Best time for 4096K FFT length: 43.170 ms.
Best time for 5120K FFT length: 52.882 ms.
Best time for 6144K FFT length: 64.273 ms.
Best time for 7168K FFT length: 77.444 ms.
Best time for 8192K FFT length: 88.054 ms.
Timing FFTs using 4 threads.
Best time for 768K FFT length: 13.965 ms.
Best time for 896K FFT length: 14.985 ms.
Best time for 1024K FFT length: 22.347 ms.
Best time for 1280K FFT length: 14.399 ms.
Best time for 1536K FFT length: 15.385 ms.
Best time for 1792K FFT length: 15.406 ms.
Best time for 2048K FFT length: 17.497 ms.
Best time for 2560K FFT length: 21.795 ms.
Best time for 3072K FFT length: 26.702 ms.
Best time for 3584K FFT length: 31.239 ms.
Best time for 4096K FFT length: 35.859 ms.
Best time for 5120K FFT length: 44.613 ms.
Best time for 6144K FFT length: 54.274 ms.
Best time for 7168K FFT length: 65.224 ms.
Best time for 8192K FFT length: 75.270 ms.
Best time for 58 bit trial factors: 3.653 ms.
Best time for 59 bit trial factors: 3.685 ms.
Best time for 60 bit trial factors: 3.633 ms.
Best time for 61 bit trial factors: 3.657 ms.
Best time for 62 bit trial factors: 6.090 ms.
Best time for 63 bit trial factors: 6.109 ms.
Best time for 64 bit trial factors: 5.910 ms.
Best time for 65 bit trial factors: 5.875 ms.
Best time for 66 bit trial factors: 5.900 ms.
Best time for 67 bit trial factors: 5.873 ms.

Bruno__ 2007-09-28 23:53

Running Windows Vista Business Edition.

Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
CPU speed: 2666.56 MHz
CPU features: RDTSC, CMOV, Prefetch, MMX, SSE, SSE2
L1 cache size: 32 KB
L2 cache size: unknown
L1 cache line size: 64 bytes
L2 cache line size: unknown
Prime95 32-bit version 24.14, RdtscTiming=1
Best time for 512K FFT length: 9.574 ms.
Best time for 640K FFT length: 13.030 ms.
Best time for 768K FFT length: 16.053 ms.
Best time for 896K FFT length: 19.159 ms.
Best time for 1024K FFT length: 21.228 ms.
Best time for 1280K FFT length: 27.003 ms.
Best time for 1536K FFT length: 33.030 ms.
Best time for 1792K FFT length: 39.252 ms.
Best time for 2048K FFT length: 43.670 ms.
Best time for 2560K FFT length: 57.669 ms.
Best time for 3072K FFT length: 70.901 ms.
Best time for 3584K FFT length: 84.567 ms.
Best time for 4096K FFT length: 94.978 ms.
Best time for 58 bit trial factors: 4.207 ms.
Best time for 59 bit trial factors: 4.222 ms.
Best time for 60 bit trial factors: 4.173 ms.
Best time for 61 bit trial factors: 4.208 ms.
Best time for 62 bit trial factors: 6.710 ms.
Best time for 63 bit trial factors: 6.721 ms.
Best time for 64 bit trial factors: 6.422 ms.
Best time for 65 bit trial factors: 6.432 ms.
Best time for 66 bit trial factors: 6.388 ms.
Best time for 67 bit trial factors: 6.392 ms.

ixfd64 2007-10-03 07:29

Dang, Bundu, that's a nice system. :D

fivemack 2007-10-03 12:36

Dual Opteron 2216 (4 cores, 2.4GHz) [company's compute server]

This machine is only a year old, but is just about half the speed of a QX6800.

There is a protein-analysis job on the fourth core (which has been running for a month so far), which meant the timings for four threads were in the three-second range, so I've not mentioned those.

[code]
[Worker #1 Oct 3 13:24] Timing 39 iterations at 768K FFT length. Best time: 31.022 ms.
[Worker #1 Oct 3 13:24] Timing 34 iterations at 896K FFT length. Best time: 36.985 ms.
[Worker #1 Oct 3 13:24] Timing 29 iterations at 1024K FFT length. Best time: 41.030 ms.
[Worker #1 Oct 3 13:24] Timing 23 iterations at 1280K FFT length. Best time: 52.455 ms.
[Worker #1 Oct 3 13:24] Timing 19 iterations at 1536K FFT length. Best time: 63.697 ms.
[Worker #1 Oct 3 13:24] Timing 17 iterations at 1792K FFT length. Best time: 77.128 ms.
[Worker #1 Oct 3 13:24] Timing 14 iterations at 2048K FFT length. Best time: 86.075 ms.
[Worker #1 Oct 3 13:24] Timing 11 iterations at 2560K FFT length. Best time: 113.336 ms.
[Worker #1 Oct 3 13:24] Timing 10 iterations at 3072K FFT length. Best time: 137.694 ms.
[Worker #1 Oct 3 13:24] Timing 10 iterations at 3584K FFT length. Best time: 166.261 ms.
[Worker #1 Oct 3 13:24] Timing 10 iterations at 4096K FFT length. Best time: 185.472 ms.
[Worker #1 Oct 3 13:24] Timing 10 iterations at 5120K FFT length. Best time: 242.171 ms.
[Worker #1 Oct 3 13:24] Timing 10 iterations at 6144K FFT length. Best time: 296.867 ms.
[Worker #1 Oct 3 13:24] Timing 10 iterations at 7168K FFT length. Best time: 359.789 ms.
[Worker #1 Oct 3 13:25] Timing 10 iterations at 8192K FFT length. Best time: 420.367 ms.
[Worker #1 Oct 3 13:25] Timing FFTs using 2 threads.
[Worker #1 Oct 3 13:25] Timing 39 iterations at 768K FFT length. Best time: 19.157 ms.
[Worker #1 Oct 3 13:25] Timing 34 iterations at 896K FFT length. Best time: 22.871 ms.
[Worker #1 Oct 3 13:25] Timing 29 iterations at 1024K FFT length. Best time: 25.871 ms.
[Worker #1 Oct 3 13:25] Timing 23 iterations at 1280K FFT length. Best time: 34.670 ms.
[Worker #1 Oct 3 13:25] Timing 19 iterations at 1536K FFT length. Best time: 41.658 ms.
[Worker #1 Oct 3 13:25] Timing 17 iterations at 1792K FFT length. Best time: 49.989 ms.
[Worker #1 Oct 3 13:25] Timing 14 iterations at 2048K FFT length. Best time: 56.124 ms.
[Worker #1 Oct 3 13:25] Timing 11 iterations at 2560K FFT length. Best time: 74.350 ms.
[Worker #1 Oct 3 13:25] Timing 10 iterations at 3072K FFT length. Best time: 90.026 ms.
[Worker #1 Oct 3 13:25] Timing 10 iterations at 3584K FFT length. Best time: 106.669 ms.
[Worker #1 Oct 3 13:25] Timing 10 iterations at 4096K FFT length. Best time: 119.611 ms.
[Worker #1 Oct 3 13:25] Timing 10 iterations at 5120K FFT length. Best time: 133.539 ms.
[Worker #1 Oct 3 13:25] Timing 10 iterations at 6144K FFT length. Best time: 181.363 ms.
[Worker #1 Oct 3 13:25] Timing 10 iterations at 7168K FFT length. Best time: 221.455 ms.
[Worker #1 Oct 3 13:26] Timing 10 iterations at 8192K FFT length. Best time: 258.508 ms.
[Worker #1 Oct 3 13:26] Timing FFTs using 3 threads.
[Worker #1 Oct 3 13:26] Timing 39 iterations at 768K FFT length. Best time: 15.866 ms.
[Worker #1 Oct 3 13:26] Timing 34 iterations at 896K FFT length. Best time: 18.178 ms.
[Worker #1 Oct 3 13:26] Timing 29 iterations at 1024K FFT length. Best time: 20.588 ms.
[Worker #1 Oct 3 13:26] Timing 23 iterations at 1280K FFT length. Best time: 32.625 ms.
[Worker #1 Oct 3 13:26] Timing 19 iterations at 1536K FFT length. Best time: 37.852 ms.
[Worker #1 Oct 3 13:26] Timing 17 iterations at 1792K FFT length. Best time: 43.261 ms.
[Worker #1 Oct 3 13:26] Timing 14 iterations at 2048K FFT length. Best time: 48.171 ms.
[Worker #1 Oct 3 13:26] Timing 11 iterations at 2560K FFT length. Best time: 67.688 ms.
[Worker #1 Oct 3 13:26] Timing 10 iterations at 3072K FFT length. Best time: 79.096 ms.
[Worker #1 Oct 3 13:26] Timing 10 iterations at 3584K FFT length. Best time: 90.504 ms.
[Worker #1 Oct 3 13:26] Timing 10 iterations at 4096K FFT length. Best time: 102.211 ms.
[Worker #1 Oct 3 13:26] Timing 10 iterations at 5120K FFT length. Best time: 93.737 ms.
[Worker #1 Oct 3 13:26] Timing 10 iterations at 6144K FFT length. Best time: 115.809 ms.
[Worker #1 Oct 3 13:26] Timing 10 iterations at 7168K FFT length. Best time: 149.458 ms.
[Worker #1 Oct 3 13:26] Timing 10 iterations at 8192K FFT length. Best time: 176.192 ms.

[/code]
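
As a rough check of the "half the speed" remark, this Python sketch divides the single-thread times above by the QX6800 single-thread times Bundu posted earlier in the thread. Only a handful of FFT lengths are used, and the variable names are my own.

```python
# Single-thread best times in ms, keyed by FFT length in K.
opteron_2216 = {768: 31.022, 1024: 41.030, 2048: 86.075,
                4096: 185.472, 8192: 420.367}
qx6800 = {768: 14.803, 1024: 20.317, 2048: 40.677,
          4096: 86.908, 8192: 180.015}

# Ratio > 1 means the Opteron takes that many times longer per iteration.
ratios = {fft: opteron_2216[fft] / qx6800[fft] for fft in opteron_2216}

for fft in sorted(ratios):
    print(f"{fft}K FFT: Opteron takes {ratios[fft]:.2f}x as long")
```

The ratio stays a little above 2 for these lengths and grows somewhat at the largest FFT sizes.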

fivemack 2007-10-03 12:40

Core2 E6600, version 25.5 under linux-x64

[code]
[Worker #1 Oct 3 13:36] Timing 39 iterations at 768K FFT length. Best time: 17.423 ms.
[Worker #1 Oct 3 13:36] Timing 34 iterations at 896K FFT length. Best time: 20.847 ms.
[Worker #1 Oct 3 13:36] Timing 29 iterations at 1024K FFT length. Best time: 23.692 ms.
[Worker #1 Oct 3 13:36] Timing 23 iterations at 1280K FFT length. Best time: 29.493 ms.
[Worker #1 Oct 3 13:36] Timing 19 iterations at 1536K FFT length. Best time: 36.124 ms.
[Worker #1 Oct 3 13:36] Timing 17 iterations at 1792K FFT length. Best time: 43.026 ms.
[Worker #1 Oct 3 13:36] Timing 14 iterations at 2048K FFT length. Best time: 47.782 ms.
[Worker #1 Oct 3 13:36] Timing 11 iterations at 2560K FFT length. Best time: 62.770 ms.
[Worker #1 Oct 3 13:36] Timing 10 iterations at 3072K FFT length. Best time: 77.013 ms.
[Worker #1 Oct 3 13:36] Timing 10 iterations at 3584K FFT length. Best time: 91.654 ms.
[Worker #1 Oct 3 13:36] Timing 10 iterations at 4096K FFT length. Best time: 102.566 ms.
[Worker #1 Oct 3 13:36] Timing 10 iterations at 5120K FFT length. Best time: 130.828 ms.
[Worker #1 Oct 3 13:36] Timing 10 iterations at 6144K FFT length. Best time: 158.877 ms.
[Worker #1 Oct 3 13:36] Timing 10 iterations at 7168K FFT length. Best time: 193.376 ms.
[Worker #1 Oct 3 13:36] Timing 10 iterations at 8192K FFT length. Best time: 212.379 ms.
[Worker #1 Oct 3 13:36] Timing FFTs using 2 threads.
[Worker #1 Oct 3 13:36] Timing 39 iterations at 768K FFT length. Best time: 9.222 ms.
[Worker #1 Oct 3 13:36] Timing 34 iterations at 896K FFT length. Best time: 11.053 ms.
[Worker #1 Oct 3 13:36] Timing 29 iterations at 1024K FFT length. Best time: 13.235 ms.
[Worker #1 Oct 3 13:36] Timing 23 iterations at 1280K FFT length. Best time: 15.468 ms.
[Worker #1 Oct 3 13:36] Timing 19 iterations at 1536K FFT length. Best time: 19.167 ms.
[Worker #1 Oct 3 13:36] Timing 17 iterations at 1792K FFT length. Best time: 22.509 ms.
[Worker #1 Oct 3 13:36] Timing 14 iterations at 2048K FFT length. Best time: 25.278 ms.
[Worker #1 Oct 3 13:36] Timing 11 iterations at 2560K FFT length. Best time: 33.295 ms.
[Worker #1 Oct 3 13:36] Timing 10 iterations at 3072K FFT length. Best time: 40.454 ms.
[Worker #1 Oct 3 13:37] Timing 10 iterations at 3584K FFT length. Best time: 48.145 ms.
[Worker #1 Oct 3 13:37] Timing 10 iterations at 4096K FFT length. Best time: 53.886 ms.
[Worker #1 Oct 3 13:37] Timing 10 iterations at 5120K FFT length. Best time: 69.289 ms.
[Worker #1 Oct 3 13:37] Timing 10 iterations at 6144K FFT length. Best time: 85.362 ms.
[Worker #1 Oct 3 13:37] Timing 10 iterations at 7168K FFT length. Best time: 103.101 ms.
[Worker #1 Oct 3 13:37] Timing 10 iterations at 8192K FFT length. Best time: 116.227 ms.
[/code]


All times are UTC.