mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   Shortest time to complete a 2^67 trial factor (no factor) (https://www.mersenneforum.org/showthread.php?t=1006)

dsouza123 2003-08-20 12:59

Shortest time to complete a 2^67 trial factor (no factor)
 
What is the shortest amount of time to complete a 2^67 TF ( without finding a factor ) ?

Time in days::hours:minutes .

What setup ( CPU, speed, cache size, OS ) ?

For example using my Athlon 1200 the shortest elapsed time is 3::2:22 .

3::02:22 2^67 M23128319,57
Athlon, 1200, L1 128 L2 256, Win98SE

They do start with different amounts of factoring done: 57,58 etc. which does affect the amount of time so if you know the starting point include it.

Most M23 on my PC are M23xxxxxx,57 with one ,58.

Are there any systems that can complete a 2^67
( Mxxxxxxxx,57 without finding a factor) in 1 day ?

Xyzzy 2003-08-20 17:21

I bet the fastest will be a P4... Even though the AMD wins by a big margin up to 64 bits, the vast majority of the overall work will be from 64-67, which is where the SSE2 crap kicks in...

I can run 23128319 from 58 to 59 bits in 75 or so seconds on an overclocked (2100MHz) 2500+ Barton, so...

59 = 75
60 = 150
61 = 300
62 = 600
63 = 1,200
64 = 2,400
65 = 4,800
66 = 9,600
67 = 19,200

If you add all that up you get 38,325 seconds or 10 hours 38 minutes 45 seconds...

eepiccolo 2003-08-20 18:06

[quote="Xyzzy"]
59 = 75
60 = 150
61 = 300
62 = 600
63 = 1,200
64 = 2,400
65 = 4,800
66 = 9,600
67 = 19,200

If you add all that up you get 38,325 seconds or 10 hours 38 minutes 45 seconds...[/quote]
The problem with this is that once you get past 64 bits, things slow down. so for you 65 bits might be something more like 7,000 s, but I'm not sure exactly. The SSE2 in the P4 keeps it from slowing down as much, but it still might end up at something like 6,000 s for 65 bits, if 64 bits took 2,400 s.

sdbardwick 2003-08-20 18:49

[quote="eepiccolo"]The problem with this is that once you get past 64 bits, things slow down. so for you 65 bits might be something more like 7,000 s, but I'm not sure exactly. The SSE2 in the P4 keeps it from slowing down as much, but it still might end up at something like 6,000 s for 65 bits, if 64 bits took 2,400 s.[/quote]

The slowdown over 64 bits is significant; my Athlon XP 1900+(1600Mhz) takes over a day just to go from 66 bit to 67 bit factoring on 23M exponents.

OTOH, my P4 2.5 (OC'd 2.4) generally takes about 25 hours total to TF a 23M exponent to 67 bits (starting bits 58, 59, 60). I'll check my work 2.66 and see if it is less than a day.

Xyzzy 2003-08-20 20:51

We know the work being done is doubled for each additional bit we go deeper... Why exactly does the time increase so much?

Prime95 2003-08-20 20:54

64 bits fits in one floating point or two integer resisters. To do 65-bit factors you must use two floats or three integer registers, resulting in a big jump in the number of multiplies and adds.

dsouza123 2003-08-21 00:12

Time in hours to do a 2^67 on an Athlon 1200

skipping smaller steps
2^60 1/12 = 5 minutes
2^61 1/6
2^62 1/3
2^63 1 1/4
2^64 2 1/2
2^65 10
2^66 20
2^67 40

74 1/3 hours

Up to and including 2^64 the time doubles
There is a big jump at 2^65 the time quadruples.
At 2^66 it resumes time doubling.

This is using the x87 ( no SSE2 on the Athlon )

garo 2003-08-21 01:07

Well if you notice the time quadruples from the 2^62 to 2^63 step as well. This is because non-SSE2 machines have very efficient code upto that limit. Some of those optimizations are not applicable for 2^63 and 2^64. I believe this code was contributed by someone other than George. If you mosey around in the forum you should find George's post about this.

So to sum it all up, 2^65 is about 32 times as slow as 2^62 as opposed to 8 times which is what we would have expected if tha scaling were regular. This proportion is very different for P4s and is actually closer to 8 than to 32.
In fact here are some timing comparisons I made about 7 months ago.

[code:1]Time taken to complete 14366959 to 0.98%

To Bit On PII 450 On P4 2533 Improv On 1333TB Improv

59 6 2.5 2.4 1.85 3.25
60 12 5 2.4 3.7 3.25
61 24 10 2.4 7.4 3.25
62 48 20 2.4 14.8 3.25
63 170 37.5 4.5 51 3.33
64 340 75 4.5 102 3.33
65 1540 180 8.5 480 3.21

[/code:1]

bayanne 2003-08-21 07:35

I think that this topic has important considerations for Lone Mersenne Hunters. All work on slower boxen (which is what this sub-project seems to attract) must be attempting to t-f all exponents up to 62 bit, rather than trying to move on upwards to 64 bit.

Thoughts anyone, or is this beginning to move OT .....

cheesehead 2003-08-21 12:15

[quote="garo"]Well if you notice the time quadruples from the 2^62 to 2^63 step as well. This is because non-SSE2 machines have very efficient code upto that limit. Some of those optimizations are not applicable for 2^63 and 2^64.[/quote]
I thought the increase in iteration times between 2^62 and 2^63 was due to the switchover from integer arithmetic (used up to 2^62) to floating-point arithmetic (for the 2^63 and 2^64 ranges) on some CPUs.

Classic Pentia, compared to the 486s, have far more efficient floating-point because AFAIK that's when Intel introduced FP pipelining. But their integer performance is not as dramatically better per clock cycle than 486s as the FP ratio is. As a result, classic Pentia get their best throughput per clock cycle, relative to other types, on tasks that are FP-heavy but below the SSE2 range.


TF M67108913 on a P75:

2^61 to 2^62: 0.279 sec./iteration
2^62 to 2^63: 0.320 sec./iteration
2^63 to 2^64: 0.320 sec./iteration
2^64 to 2^65: 1.314 sec./iteration

Note the small step from 2^62 iteration time to 2^63/2^64 iteration time.

Its [b]relative[/b] total times per bit range are about:

2^61 to 2^62: 1.0
2^62 to 2^63: 2.3
2^63 to 2^64: 4.6
2^64 to 2^65: 37.7

[quote="bayanne"]All work on slower boxen (which is what this sub-project seems to attract) must be attempting to t-f all exponents up to 62 bit, rather than trying to move on upwards to 64 bit.[/quote]
It would make sense for classic Pentia to concentrate on 2^62 -> 2^64 rather than any other range.

patrik 2003-08-21 16:38

Re: Shortest time to complete a 2^67 trial factor (no factor
 
[quote="dsouza123"]What is the shortest amount of time to complete a 2^67 TF ( without finding a factor ) ?
Time in days::hours:minutes .[/quote]0::20:42 2^67 M23052089,60
P4, 3141 MHz, L1 8 L2 512, RedHat Linux[quote="dsouza123"]Are there any systems that can complete a 2^67
( Mxxxxxxxx,57 without finding a factor) in 1 day ?[/quote] 8)

dsouza123 2003-08-21 18:13

Wow, under 1 day 2^67 TF is already being done !

I thought it needed about a 3.6 GHz but the SSE2 instructions
on a P4 must be quite efficient.

A friend who is also doing TF is going to build a new machine
this may influence what chip/speed he gets.

I have been very curious about TF speeds and which CPU's are best,
but I don't have access to any PCs with SSE2.

All the posts have given really good information and insight.

Thanks

I wonder if the Opteron/Athlon64 will be this quick or if raw clock speed
will dominate ?

dsouza123 2003-08-21 18:38

What are the facts on integer / FP x87 / int x87 / FP SSE2 / int SSE2 ?

I noticed the x87 has a 64 bit signed integer format and
the SSE2 has 64 and 128 integer support.

What is used at what levels ?

Does it start as 64 bit integer up to 2^62 using 2 32 bit registers ?
Switch to x87 then at 2^64 or 2^65 go to SSE2 if available ?
Is the SSE2 using integer or floating point ?

Would a Prime95 lite ( just the TF code, interface could remain the same )
be more efficient ? ( a much smaller exe that could remain in the cache )

Or is the code for TF small enough that the section that does the TF can stay in the cache L1, L2 ?


All times are UTC. The time now is 20:12.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.