mersenneforum.org  

Old 2004-12-08, 00:25   #1
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·3²·7·59 Posts
Beta version 24.6 - Athlon users wanted

As some of you know, I've been rewriting the FFT code to bring big speed gains for SoB, LLR, OpenPFGW, and other projects. This rewrite is now complete.

The good news: Athlons (except 64-bit CPUs) are about 15% faster.

The bad news: P3s are 33% slower. I didn't time P4s, but they should be ever so slightly slower. I expect P2 and Pentium-M will also be slower.

Athlon owners running Windows may want to try the new version at ftp://mersenne.org/gimps/p95v246.zip Let me know if you find any bugs. Running a doublecheck or two would be nice.

In the meantime, I'll work on further fine tuning the new FFT code and see if I can recover some of the loss in P3 timings.

Last fiddled with by Prime95 on 2004-12-08 at 00:26
Old 2004-12-08, 00:49   #2
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

9500₁₀ Posts

I went from about .156 to ~.114 on a 13,xxx,xxx number on my 1.2 GHz Athlon.

THANKS!!
Old 2004-12-08, 05:16   #3
sdbardwick
 
Aug 2002
North San Diego County

2·11·31 Posts

Neat!
I'll switch my Athlon 1900MP (2x 1.6GHz) box over to double checking as soon as the current factoring assignment finishes. It will probably take a little less than a month for the first results.

Given the various sizes of L1/L2 cache in the Athlon/Duron/Sempron processors, is the new code optimized for one version in particular?

Would you like benchmarks posted?

I'm sure SalemTheCat100 will be pleased

-Scott-
Old 2004-12-08, 05:49   #4
moo
 
Jul 2004
Nowhere

809₁₀ Posts

Why don't you do the best of both worlds? You already have good speed for Pentiums, so why not first check which CPU is installed and then use the right tweaks for it? Kind of like a driver, but only for the FFT: the right code matched to the right processor.
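
A minimal sketch of that idea, assuming the CPU can be identified from its brand string (like the "AMD Athlon(tm) XP 2400+" line in the benchmark output elsewhere in this thread). The function names `fft_v23`, `fft_v24`, and `select_fft` are hypothetical stand-ins, not Prime95 identifiers, and real code would more likely inspect CPUID feature and cache information than match substrings.

```python
# Hypothetical sketch: choose an FFT code path from the CPU brand string.
# All names here are illustrative and do not come from Prime95.

def fft_v23(data):
    """Stand-in for the older FFT code path (reported faster on P3)."""
    return ("v23", list(data))


def fft_v24(data):
    """Stand-in for the rewritten FFT code path (reported faster on Athlon)."""
    return ("v24", list(data))


# Ordered (substring, kernel) rules; the first matching rule wins.
# Substring matching is a simplification made for this sketch.
DISPATCH_RULES = [
    ("athlon", fft_v24),
    ("pentium", fft_v23),
]


def select_fft(cpu_brand):
    """Return the FFT routine to use for the given CPU brand string."""
    brand = cpu_brand.lower()
    for needle, kernel in DISPATCH_RULES:
        if needle in brand:
            return kernel
    return fft_v23  # conservative default for CPUs we do not recognise


kernel = select_fft("AMD Athlon(tm) XP 2400+")
print(kernel.__name__)
```

The table is checked once at startup, so the per-iteration cost of the dispatch is nil; the trade-off is that two full sets of FFT code have to be shipped and maintained.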
Old 2004-12-08, 07:08   #5
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

22434₈ Posts

I am already doing DCs with my Athlon. Here are the benchmarks in "non-safe mode" (i.e., how I typically run my machine).

The new:
Code:
AMD Athlon(tm) processor
CPU speed: 1127.85 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 24
L2 TLBS: 256
Prime95 version 24.6, RdtscTiming=1
Best time for 512K FFT length: 65.019 ms.
Best time for 640K FFT length: 87.656 ms.
Best time for 768K FFT length: 105.915 ms.
Best time for 896K FFT length: 129.876 ms.
Best time for 1024K FFT length: 145.119 ms.
Best time for 1280K FFT length: 195.657 ms.
Best time for 1536K FFT length: 235.093 ms.
Best time for 1792K FFT length: 280.729 ms.
Best time for 2048K FFT length: 321.764 ms.
The old:
Code:
AMD Athlon(tm) processor
CPU speed: 1127.80 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 24
L2 TLBS: 256
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 78.067 ms.
Best time for 448K FFT length: 90.489 ms.
Best time for 512K FFT length: 97.280 ms.
Best time for 640K FFT length: 123.678 ms.
Best time for 768K FFT length: 149.556 ms.
Best time for 896K FFT length: 175.282 ms.
Best time for 1024K FFT length: 200.824 ms.
Best time for 1280K FFT length: 271.674 ms.
Best time for 1536K FFT length: 328.457 ms.
Best time for 1792K FFT length: 399.580 ms.
Best time for 2048K FFT length: 477.366 ms.
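
The two listings above can be compared length by length. The sketch below copies the timings for the FFT lengths that appear in both runs and reports two figures per length, since percentages in this thread are quoted both ways: the fraction of time saved and the corresponding gain in iterations per second. The names `old_ms`, `new_ms`, and `compare` are made up for this example.

```python
# Timings in ms, copied from the 23.5 ("old") and 24.6 ("new") listings above,
# keyed by FFT length in K. Only lengths present in both runs are included.
old_ms = {512: 97.280, 640: 123.678, 768: 149.556, 896: 175.282,
          1024: 200.824, 1280: 271.674, 1536: 328.457, 1792: 399.580,
          2048: 477.366}
new_ms = {512: 65.019, 640: 87.656, 768: 105.915, 896: 129.876,
          1024: 145.119, 1280: 195.657, 1536: 235.093, 1792: 280.729,
          2048: 321.764}


def compare(old, new):
    """Return (fft_len, time_saved, throughput_gain) for each shared length."""
    rows = []
    for k in sorted(old.keys() & new.keys()):
        time_saved = 1 - new[k] / old[k]       # fraction of the old time no longer needed
        throughput_gain = old[k] / new[k] - 1  # extra iterations per unit of time
        rows.append((k, time_saved, throughput_gain))
    return rows


rows = compare(old_ms, new_ms)
for k, saved, gain in rows:
    print(f"{k:>5}K  time saved {saved:6.1%}  throughput gain {gain:6.1%}")
```

The same time reduction always reads as a larger percentage when it is expressed as a throughput gain, which is worth keeping in mind when comparing the figures people quote here.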
Old 2004-12-08, 14:29   #6
jebeagles
 
Jun 2004
Chicago

2²×7 Posts
Thanks!!

My Athlon XP 2400+ seems to be working well with it. Times are down, productivity is up... excellent work.
Old 2004-12-08, 16:32   #7
jebeagles
 
Jun 2004
Chicago

1C₁₆ Posts

The new:
Code:
AMD Athlon(tm) XP 2400+
CPU speed: 1991.65 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 256
Prime95 version 24.6, RdtscTiming=1
Best time for 512K FFT length: 36.591 ms.
Best time for 640K FFT length: 47.987 ms.
Best time for 768K FFT length: 58.558 ms.
Best time for 896K FFT length: 69.761 ms.
Best time for 1024K FFT length: 78.666 ms.
Best time for 1280K FFT length: 109.193 ms.
Best time for 1536K FFT length: 132.532 ms.
Best time for 1792K FFT length: 157.138 ms.
Best time for 2048K FFT length: 176.963 ms.
The old:
Code:
AMD Athlon(tm) XP 2400+
CPU speed: 1991.13 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 256
Prime95 version 23.8, RdtscTiming=1
Best time for 384K FFT length: 46.405 ms.
Best time for 448K FFT length: 57.387 ms.
Best time for 512K FFT length: 59.590 ms.
Best time for 640K FFT length: 77.468 ms.
Best time for 768K FFT length: 91.095 ms.
Best time for 896K FFT length: 108.234 ms.
Best time for 1024K FFT length: 121.520 ms.
Best time for 1280K FFT length: 165.685 ms.
Best time for 1536K FFT length: 190.610 ms.
Best time for 1792K FFT length: 242.729 ms.
Best time for 2048K FFT length: 273.883 ms.
I'm also getting a 33% increase on a 26,xxx,xxx number!

Last fiddled with by jebeagles on 2004-12-08 at 16:33
Old 2004-12-08, 17:16   #8
MrHappy
 
Dec 2003
Paisley Park & Neverland

B9₁₆ Posts

I don't know if this has anything to do with the new version. It's not harmful either, just unexpected:
I had only been doing LMH factoring lately, so I downloaded 24.6 and requested some doublechecks because my queue was empty. All of them were expected to be completed by tomorrow, so they kept coming in and in and in... until I clicked Stop!
Why were they expected to complete immediately?
Old 2004-12-08, 17:51   #9
Xyzzy
 
"Mike"
Aug 2002

3·2,689 Posts

Will there be an Mprime version?
Old 2004-12-08, 20:59   #10
JuanTutors
 
Mar 2004

509₁₀ Posts

Quote:
Originally Posted by Prime95
I didn't time P4s, but they should be ever so slightly slower.
Has anyone tried to benchmark a P4 yet? (I would myself, but I'm trying to get an exponent finished before I go on vacation for a few weeks.)
Old 2004-12-08, 21:05   #11
akruppa
 
"Nancy"
Aug 2002
Alexandria

100110100011₂ Posts

Dear George,

does this mean you implemented Colin's general DWT for non-SSE2 architectures as well?

Alex
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.