mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   Beta version 24.6 - Athlon users wanted (https://www.mersenneforum.org/showthread.php?t=3387)

Prime95 2004-12-14 19:25

A new [url]ftp://mersenne.org/gimps/p95v246.zip[/url] is available.

Fixes the bug that miscalculated completion dates and therefore fetched far too much work. (Uninitialized stack variable)

May have fixed the problem of allocating too much memory in P-1 stage 2. If not, please report the exponent and bounds.

Fixes the bug reported in running ECM on 2^n-3.

koekie 2004-12-14 19:53

Downloaded and installed the latest version. I'll keep you informed about any abnormalities I run into.

Prime95 2004-12-14 20:03

GDF, I've just uploaded a potential fix for machines that do not support prefetch. It is a little hard to debug here since all my CPUs do support prefetch! If it fails again, please post the crash address - that helped a lot.

Prime95 2004-12-14 20:24

Linux: [url]ftp://mersenne.org/gimps/mprime246.tar.gz[/url]
or [url]ftp://mersenne.org/gimps/sprime246.tar.gz[/url] (statically linked)

gdf 2004-12-14 20:35

K6-2 CXT core: better off with V23.5 for FFTs larger than 1024K

Results for a K6-2 (CXT core, 350 MHz overclocked to 400 MHz) on the legendary Asus P55T2P4 with 512 KB L2 and 192 MB EDO RAM:

AMD-K6(tm) 3D processor
CPU speed: 399.76 MHz
CPU features: RDTSC, MMX
L1 cache size: 32 KB
L2 cache size: 0 KB
L1 cache line size: 32 bytes
L2 cache line size: 0 bytes
L1 TLBS: 128
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 636.714 ms.
Best time for 448K FFT length: 765.981 ms.
Best time for 512K FFT length: 854.511 ms.
Best time for 640K FFT length: 1129.586 ms.
Best time for 768K FFT length: 1382.630 ms.
Best time for 896K FFT length: 1658.737 ms.
Best time for 1024K FFT length: 1867.738 ms.
Best time for 1280K FFT length: 2402.752 ms.
Best time for 1536K FFT length: 2870.065 ms.
Best time for 1792K FFT length: 3462.454 ms.
Best time for 2048K FFT length: 3878.508 ms.

AMD-K6(tm) 3D processor
CPU speed: 399.62 MHz
CPU features: RDTSC, MMX
L1 cache size: 32 KB
L2 cache size: unknown
L1 cache line size: 32 bytes
L2 cache line size: unknown
L1 TLBS: 128
Prime95 version 24.6, RdtscTiming=1
Best time for 512K FFT length: 807.001 ms.
Best time for 640K FFT length: 1090.182 ms.
Best time for 768K FFT length: 1323.925 ms.
Best time for 896K FFT length: 1608.689 ms.
Best time for 1024K FFT length: 1807.595 ms.
Best time for 1280K FFT length: 2612.892 ms.
Best time for 1536K FFT length: 3172.834 ms.
Best time for 1792K FFT length: 3795.632 ms.
Best time for 2048K FFT length: 4269.919 ms.


And I thought that 3DNow! did have a prefetch instruction (or is that only on the Athlon?)

gdf

Mystwalker 2004-12-14 21:01

Interestingly, the new version is slower on the K6/2 with FFT lengths of 1280K and bigger...
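
The crossover can be checked directly from gdf's two runs. What follows is only a minimal sketch in Python: the dictionary and function names are made up for illustration, and the timings are copied from the K6-2 output above for the FFT lengths that both versions report.

```python
# Best iteration times in ms, copied from gdf's K6-2 benchmark output above.
# Keys are FFT lengths in K; only lengths reported by both versions are listed.
K6_2_V23_5 = {
    512: 854.511, 640: 1129.586, 768: 1382.630, 896: 1658.737, 1024: 1867.738,
    1280: 2402.752, 1536: 2870.065, 1792: 3462.454, 2048: 3878.508,
}
K6_2_V24_6 = {
    512: 807.001, 640: 1090.182, 768: 1323.925, 896: 1608.689, 1024: 1807.595,
    1280: 2612.892, 1536: 3172.834, 1792: 3795.632, 2048: 4269.919,
}


def percent_change(old_ms, new_ms):
    """Relative change in iteration time; negative means the new version is faster."""
    return (new_ms - old_ms) / old_ms * 100.0


def compare(old, new):
    """Return {fft_k: percent change} for FFT lengths present in both runs."""
    return {k: percent_change(old[k], new[k]) for k in sorted(old.keys() & new.keys())}


if __name__ == "__main__":
    for fft_k, change in compare(K6_2_V23_5, K6_2_V24_6).items():
        verdict = "slower" if change > 0 else "faster"
        print(f"{fft_k:5d}K  {change:+6.2f}%  ({verdict})")
```

On these numbers 24.6 is faster up to and including 1024K and slower from 1280K upward, which matches the observation above.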

Xyzzy 2004-12-14 21:04

Mprime on a K8...

AMD Athlon(tm) 64 Processor 3400+
CPU speed: 2402.69 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version [b]24.6[/b], RdtscTiming=1
Best time for 4K FFT length: 0.093 ms.
Best time for 5K FFT length: 0.133 ms.
Best time for 6K FFT length: 0.168 ms.
Best time for 7K FFT length: 0.211 ms.
Best time for 8K FFT length: 0.235 ms.
Best time for 10K FFT length: 0.311 ms.
Best time for 12K FFT length: 0.361 ms.
Best time for 14K FFT length: 0.451 ms.
Best time for 16K FFT length: 0.514 ms.
Best time for 20K FFT length: 0.651 ms.
Best time for 24K FFT length: 0.787 ms.
Best time for 28K FFT length: 0.957 ms.
Best time for 32K FFT length: 1.065 ms.
Best time for 40K FFT length: 1.723 ms.
Best time for 48K FFT length: 2.120 ms.
Best time for 56K FFT length: 2.577 ms.
Best time for 64K FFT length: 2.946 ms.
Best time for 80K FFT length: 3.904 ms.
Best time for 96K FFT length: 4.733 ms.
Best time for 112K FFT length: 5.672 ms.
Best time for 128K FFT length: 6.513 ms.
Best time for 160K FFT length: 8.105 ms.
Best time for 192K FFT length: 9.764 ms.
Best time for 224K FFT length: 11.655 ms.
Best time for 256K FFT length: 13.041 ms.
Best time for 320K FFT length: 17.041 ms.
Best time for 384K FFT length: 20.524 ms.
Best time for 448K FFT length: 24.493 ms.
Best time for 512K FFT length: 27.615 ms.
Best time for 640K FFT length: 34.023 ms.
Best time for 768K FFT length: 41.731 ms.
Best time for 896K FFT length: 50.227 ms.
Best time for 1024K FFT length: 56.740 ms.
Best time for 1280K FFT length: 75.799 ms.
Best time for 1536K FFT length: 90.789 ms.
Best time for 1792K FFT length: 110.390 ms.
Best time for 2048K FFT length: 125.055 ms.
Best time for 2560K FFT length: 163.044 ms.
Best time for 3072K FFT length: 187.716 ms.
Best time for 3584K FFT length: 232.004 ms.
Best time for 4096K FFT length: 266.159 ms.

AMD Athlon(tm) 64 Processor 3400+
CPU speed: 2402.39 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version [b]23.9[/b], RdtscTiming=1
Best time for 384K FFT length: 20.539 ms.
Best time for 448K FFT length: 24.491 ms.
Best time for 512K FFT length: 27.511 ms.
Best time for 640K FFT length: 34.074 ms.
Best time for 768K FFT length: 41.511 ms.
Best time for 896K FFT length: 49.847 ms.
Best time for 1024K FFT length: 56.590 ms.
Best time for 1280K FFT length: 75.263 ms.
Best time for 1536K FFT length: 90.542 ms.
Best time for 1792K FFT length: 109.816 ms.
Best time for 2048K FFT length: 124.190 ms.
Best time for 2560K FFT length: 162.746 ms.
Best time for 3072K FFT length: 187.725 ms.
Best time for 3584K FFT length: 230.422 ms.
Best time for 4096K FFT length: 266.172 ms.

Uncwilly 2004-12-14 22:47

Do the areas that have been tweaked affect the code that is involved in TF? After watching the expected time for completion melt away on the DC I am doing, I would like to run 1 or 2 TFs too. I plan on posting the expo I am working on once it is done on Friday. About 50% of the DC has been on 24.6.

xan 2004-12-15 17:31

Is this behaviour normal?

Prime95 version [b]23.9[/b]
[code]
Mobile AMD Athlon(tm) 64 Processor 3000+
CPU speed: 1803.90 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 23.9, RdtscTiming=1
Best time for 384K FFT length: 26.581 ms.
Best time for 448K FFT length: 31.737 ms.
Best time for 512K FFT length: 35.747 ms.
Best time for 640K FFT length: 44.944 ms.
Best time for 768K FFT length: 54.817 ms.
Best time for 896K FFT length: 66.060 ms.
Best time for 1024K FFT length: 74.674 ms.
Best time for 1280K FFT length: 99.497 ms.
Best time for 1536K FFT length: 121.814 ms.
Best time for 1792K FFT length: 146.601 ms.
Best time for 2048K FFT length: 164.794 ms.
[/code]

Prime95 version [b]24.6[/b]
[code]
Mobile AMD Athlon(tm) 64 Processor 3000+
CPU speed: 1803.90 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 24.6, RdtscTiming=1
Best time for 512K FFT length: 35.780 ms.
Best time for 640K FFT length: 45.032 ms.
Best time for 768K FFT length: 55.045 ms.
Best time for 896K FFT length: 66.314 ms.
Best time for 1024K FFT length: 74.819 ms.
Best time for 1280K FFT length: 99.902 ms.
Best time for 1536K FFT length: 122.161 ms.
Best time for 1792K FFT length: 147.312 ms.
Best time for 2048K FFT length: 165.277 ms.
[/code]

No (big) differences between the old and new code :( It doesn't matter which OS I use (nearly the same results for Windows XP and Linux).

I can reduce the time if I switch off SSE2 support (tested only on Linux):

[code]
Mobile AMD Athlon(tm) 64 Processor 3000+
CPU speed: 1803.90 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 24.6, RdtscTiming=1
Best time for 512K FFT length: 33.247 ms.
Best time for 640K FFT length: 44.431 ms.
Best time for 768K FFT length: 53.769 ms.
Best time for 896K FFT length: 65.014 ms.
Best time for 1024K FFT length: 72.825 ms.
Best time for 1280K FFT length: 96.282 ms.
Best time for 1536K FFT length: 116.102 ms.
Best time for 1792K FFT length: 138.661 ms.
Best time for 2048K FFT length: 154.845 ms.
[/code]
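
For anyone who wants to compare pasted benchmark output without retyping numbers, here is a rough Python sketch. It assumes the "Best time for NK FFT length: X ms." line format shown in the output above; the function names are hypothetical, and the two sample strings contain only three of xan's 24.6 lines (with and without SSE2) to keep it short.

```python
import re

# Matches lines such as "Best time for 512K FFT length: 35.780 ms."
LINE_RE = re.compile(r"Best time for (\d+)K FFT length: ([\d.]+) ms\.")


def parse_benchmark(text):
    """Map FFT length in K -> best time in ms for every 'Best time' line in text."""
    return {int(k): float(ms) for k, ms in LINE_RE.findall(text)}


def time_saved_percent(with_sse2, without_sse2):
    """Percent of iteration time saved by the non-SSE2 run, per common FFT length."""
    common = sorted(with_sse2.keys() & without_sse2.keys())
    return {k: (with_sse2[k] - without_sse2[k]) / with_sse2[k] * 100.0 for k in common}


# Three lines from each of xan's 24.6 runs above (subset for brevity).
SSE2_ON = """\
Best time for 512K FFT length: 35.780 ms.
Best time for 1024K FFT length: 74.819 ms.
Best time for 2048K FFT length: 165.277 ms.
"""
SSE2_OFF = """\
Best time for 512K FFT length: 33.247 ms.
Best time for 1024K FFT length: 72.825 ms.
Best time for 2048K FFT length: 154.845 ms.
"""

if __name__ == "__main__":
    saved = time_saved_percent(parse_benchmark(SSE2_ON), parse_benchmark(SSE2_OFF))
    for fft_k, pct in saved.items():
        print(f"{fft_k:5d}K  {pct:.2f}% less time without SSE2")
```

For these three lengths the non-SSE2 run needs a few percent less time per iteration, between about 2.5% and about 7%.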

Uncwilly 2004-12-15 17:44

[QUOTE=xan]Is this behaviour normal?

Prime95-Version [b]23.9[/b]
[code]
Mobile AMD Athlon(tm) 64 Processor 3000+
CPU speed: 1803.90 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
[/code][/QUOTE][quote=Prime95]The good news: Athlons (except 64-bit CPUs) are about 15% faster. :banana:[/quote]

Looks like it. :mellow:

gbvalor 2004-12-15 18:12

This is for an Athlon XP 2500+ (Barton), SuSE Linux 8.2, kernel 2.4.20. It seems about 15% faster with the new 24.6 :grin:

NEW VERSION 24.6:

[Wed Dec 15 18:57:39 2004]
Compare your results to other computers at [url]http://www.mersenne.org/bench.htm[/url]
That web page also contains instructions on how your results can be included.

AMD Athlon(tm) XP 2500+
CPU speed: 1830.12 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 256
Prime95 version 24.6, RdtscTiming=1
Best time for 512K FFT length: 35.025 ms.
Best time for 640K FFT length: 47.908 ms.
Best time for 768K FFT length: 58.600 ms.
Best time for 896K FFT length: 71.383 ms.
Best time for 1024K FFT length: 79.911 ms.
Best time for 1280K FFT length: 102.648 ms.
Best time for 1536K FFT length: 123.361 ms.
Best time for 1792K FFT length: 149.243 ms.
Best time for 2048K FFT length: 166.819 ms.

OLD VERSION 23.5:

[Wed Dec 15 19:01:45 2004]
Compare your results to other computers at [url]http://www.mersenne.org/bench.htm[/url]
That web page also contains instructions on how your results can be included.

AMD Athlon(tm) XP 2500+
CPU speed: 1830.01 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 256
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 33.606 ms.
Best time for 448K FFT length: 39.149 ms.
Best time for 512K FFT length: 42.339 ms.
Best time for 640K FFT length: 55.299 ms.
Best time for 768K FFT length: 66.953 ms.
Best time for 896K FFT length: 80.526 ms.
Best time for 1024K FFT length: 89.689 ms.
Best time for 1280K FFT length: 120.169 ms.
Best time for 1536K FFT length: 142.668 ms.
Best time for 1792K FFT length: 176.829 ms.
Best time for 2048K FFT length: 192.572 ms.


Guillermo
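
The "about 15%" figure can be sanity-checked by averaging the per-length speedups. This is only a sketch in Python: the names are invented for illustration, the geometric mean is one reasonable way to average ratios rather than anything Prime95 itself reports, and the timings are copied from the two Athlon XP runs above for the FFT lengths both versions list.

```python
from statistics import geometric_mean

# Best iteration times in ms from gbvalor's Athlon XP 2500+ runs above.
ATHLON_XP_V23_5 = {
    512: 42.339, 640: 55.299, 768: 66.953, 896: 80.526, 1024: 89.689,
    1280: 120.169, 1536: 142.668, 1792: 176.829, 2048: 192.572,
}
ATHLON_XP_V24_6 = {
    512: 35.025, 640: 47.908, 768: 58.600, 896: 71.383, 1024: 79.911,
    1280: 102.648, 1536: 123.361, 1792: 149.243, 2048: 166.819,
}


def speedup_ratios(old, new):
    """old_time / new_time per common FFT length; values above 1 mean the new version is faster."""
    return [old[k] / new[k] for k in sorted(old.keys() & new.keys())]


def overall_speedup(old, new):
    """Geometric mean of the per-length speedup ratios."""
    return geometric_mean(speedup_ratios(old, new))


if __name__ == "__main__":
    ratios = speedup_ratios(ATHLON_XP_V23_5, ATHLON_XP_V24_6)
    print(f"per-length speedup: {min(ratios):.3f}x to {max(ratios):.3f}x")
    print(f"geometric mean:     {overall_speedup(ATHLON_XP_V23_5, ATHLON_XP_V24_6):.3f}x")
```

The per-length ratios fall roughly between 1.12x and 1.21x, so "about 15% faster" is a fair summary for this machine.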


All times are UTC.