mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Msieve (https://www.mersenneforum.org/forumdisplay.php?f=83)
-   -   Feedback for new MPQS utility sought (https://www.mersenneforum.org/showthread.php?t=3240)

schickel 2004-12-17 00:46

Random number seeds?
 
Jason,
How do you generate the random seeds for MSieve? I've noticed that it looks like the first one is always (fff.....) with the high bits set on my machine. Is that going to make a difference? (I generally start MSieve as soon as my desktop loads; would random seeds close to each other affect the polynomials generated?)

This question was prompted by this.....four runs of MakeNum gave these results:
[Code]C:\Program Files\MSieve>makenum 20
factor 1: 7177440877
factor 2: 14141463367
101499517230903852859

Thu 12-16-2004 16:32
C:\Program Files\MSieve>makenum 20
factor 1: 7177440877
factor 2: 8780977949
63024950071188221273

Thu 12-16-2004 16:32
C:\Program Files\MSieve>makenum 20
factor 1: 7177440887
factor 2: 12576080897
90264077228347435639

Thu 12-16-2004 16:32
C:\Program Files\MSieve>makenum 20
factor 1: 7177440913
factor 2: 12716285693
91270389194334757709
[/Code]
Is it keying off a timer somewhere? If so, you might want to either hash the timer or iterate the generator a couple of times before generating your first number.....

Later,
Frank

jasonp 2004-12-17 02:14

[QUOTE=schickel]
How do you generate the random seeds for MSieve? I've noticed that it looks like the first one is always (fff.....) with the high bits set on my machine. Is that going to make a difference? (I generally start MSieve as soon as my desktop loads; would random seeds close to each other affect the polynomials generated?)
[/QUOTE]
Msieve uses a generator with 63 bits of internal state, seeded with two 32-bit
words. The first is the process ID and the second is the unix time. When the
generator is seeded the first word is reduced via a remainder computation and
the second is XOR'ed with a fixed constant. Process IDs in win9x are large
negative numbers, and in most other OSes are small positive numbers.

[QUOTE=schickel]
This question was prompted by this.....four runs of MakeNum gave these results:
[/QUOTE]
Yeah, makenum uses the first words out of the generator to build the first
random prime. Msieve actually uses random numbers while building the
factor base for your input, so the generator has been running for a while
by the time it is actually relied upon to generate sieving polynomials. In
fact, I have yet to see any duplicate relations appear in a factorization
that were not explicitly put there (which is what would happen with bad
random numbers)

If a good seed was really a concern, the code could always be modified
to read the initial state from /dev/random (at least on linux).

jasonp

schickel 2004-12-17 04:55

Thanks for the info!
 
Jason,
[Quote=jasonp]Msieve actually uses random numbers while building the factor base for your input, so the generator has been running for a while by the time it is actually relied upon to generate sieving polynomials.[/Quote]OK, that makes me feel a little better....I haven't looked at the code, so I didn't realize how heavily it used the random generator.
[Quote=jasonp]Process IDs in win9x are large negative numbers, and in most other OSes are small positive numbers.[/Quote]
Unca Billy probably does it that way just to ornery....

[Quote=jasonp]In fact, I have yet to see any duplicate relations appear in a factorization that were not explicitly put there (which is what would happen with bad random numbers)[/Quote]
That would have been my main worry: that starting it with the same randoms would be wasteful.


I'm still chugging away on a c100 that (hopefully) will be done tonight. What we need is a scaled table of run times for various composites. Looking at the timestamp for the working directory, I started this one on 12/10 (PIII-500).....time for a PC upgrade!

Later,
Frank

ET_ 2004-12-17 17:43

And now the last one...
 
[code]
Msieve v. 0.87
random seeds: 00000c60 41bf23bd
factoring 5024561734078743614019475672210465536770797634435009651390623763133271404343125421517983291123932203
Tue Dec 14 18:32:45 2004
using multiplier of 3
Tue Dec 14 18:32:48 2004
using sieve block of 65536
using a sieve bound of 1872109 (69855 primes)
using large prime bound of 561632700
using double large prime bound of 5610664057485900


sieving in progress (press Ctrl-C to pause)
found 70071 relations (15043 full + 55028 partial), need 69983
begin with 1524278 relations
reduce to 180169 relations in 11 passes
attempting to read 15043 full and 180169 partial relations
recovered 15043 full and 180169 partial relations
recovered 193815 polynomials
attempting to build 55028 cycles
found 55028 cycles in 7 passes
distribution of cycle lengths:
length 2 : 11362
length 3 : 11330
length 4 : 9697
length 5 : 7813
length 6 : 5542
length 7 : 3652
length 8 : 2310
length 9+: 3322
largest cycle: 22 relations
Thu Dec 16 04:57:48 2004
69855 x 69919 system, weight 5346855 (avg 76.47/col)
reduce to 69258 x 69322 in 3 passes
lanczos halted after 1097 iterations
recovered 61 nontrivial dependencies
Thu Dec 16 05:10:51 2004
probable prime factor: 95615984081251482246538695683308609617345441049
probable prime factor: 52549391007773634411218693704643056941771279438002147
Thu Dec 16 05:14:22 2004
[/code]

Luigi

Mystwalker 2004-12-18 17:11

What came to my mind as an improvement for msieve:

Is it possible to optimize it for Intel CPUs (esp. P4s)? From what I've read, it's a lot slower there, compared to Athlon CPUs.

jasonp 2004-12-19 03:10

[QUOTE=Mystwalker]What came to my mind as an improvement for msieve:

Is it possible to optimize it for Intel CPUs (esp. P4s)? From what I've read, it's a lot slower there, compared to Athlon CPUs.[/QUOTE]
There's certainly room for improvement. The next version is going to
concentrate on incorporating the suggestions multiple people have
provided for msieve. There are also some improvements in sieving
performance (~8%) pending. Finally, there will be experimental code that
attempts to get the blocking right for Intel CPUs, so that if the
performance is not actually *good*, it may be less bad :smile:

I would expect the next version out in the next few days. For the
version after that, I have some half-baked ideas.

jasonp

R.D. Silverman 2004-12-19 03:28

[QUOTE=jasonp]There's certainly room for improvement. The next version is going to
concentrate on incorporating the suggestions multiple people have
provided for msieve. There are also some improvements in sieving
performance (~8%) pending. Finally, there will be experimental code that
attempts to get the blocking right for Intel CPUs, so that if the
performance is not actually *good*, it may be less bad :smile:

I would expect the next version out in the next few days. For the
version after that, I have some half-baked ideas.

jasonp[/QUOTE]

You have piqued my curiosity. What is it about the Athlons that makes
them faster? How much faster?

jasonp 2004-12-19 06:07

[QUOTE=R.D. Silverman]You have piqued my curiosity. What is it about the Athlons that makes
them faster? How much faster?[/QUOTE]
It appears that a factorization that takes time X on an Athlon or
AMD64 takes time about 1.5X on a 'similar speed' P4. The difference
appears narrower with the newer mobile P3 processors. Obviously
it's a little hard to directly compare the two kinds of machines.
(Somebody correct me if I'm off on this; I don't have access to P4s
outside of work).

My guess is that the difference boils down to two things:

- the L1 cache on AMD processors is larger than the L1
cache on Intel CPUs, and
- the L1 on AMDs is write-back while the L1 on the P4 is
write-through

The sieving code uses collections of hashtables to avoid having
to sieve with the large factor base primes, and interleaves the
sieving for several polynomials with the updates of roots for factor
base primes. The number of polynomials, and the number of factor
base primes that represent one 'chunk' of sieving work, determine
the size of the working set. Currently those parameters are hardwired
for a block size somewhere less than 32kB, and this sounds a bit
too large.

The second point means that the latency to memory is limited
by the latency to L2 cache on the P4, since most of the sieving
computations involve interleaved stream write operations. Again,
the use of hashtables limits the number of active cache lines in
use, and if this number is small enough the processor can use
store buffers to merge writes to the same cache line.

Both of these can be merged into one hypothesis: that given a working
set that involves lots of unpredictable writes, AMD processors have
a better latency to memory and so higher performance.

George's early notes on memory bandwidth and latency for the P4
(admittedly for a very different program) are still at

[url]www.mersenne.org/p4notes.doc[/url]

jasonp

R.D. Silverman 2004-12-19 15:15

[QUOTE=jasonp]It appears that a factorization that takes time X on an Athlon or
AMD64 takes time about 1.5X on a 'similar speed' P4.

<snip>

jasonp[/QUOTE]


Hi,

I was not aware that the two processors had the same instruction sets...

Does assembler code for the Pentium work on the Athlon?? Do executables
linked under Visual C++ for the Pentium run on an Athlon machine? I might
try benchmarking my code on an Athlon before buying my next machine....

jasonp 2004-12-19 17:03

[QUOTE=R.D. Silverman]Hi,

I was not aware that the two processors had the same instruction sets...

Does assembler code for the Pentium work on the Athlon?? Do executables
linked under Visual C++ for the Pentium run on an Athlon machine? I might
try benchmarking my code on an Athlon before buying my next machine....[/QUOTE]
They're completely binary compatible, with the possible exception of
multimedia-type instructions (that most compilers couldn't emit anyway)
existing for one processor type but not for another. The same applies
to AMD64 processors, whether running in 32-bit or 64-bit mode.

The win32 msieve binary on my web page is optimized for the athlon,
and nobody seems to have trouble running it on a pentium (modulo
performance concerns)

jasonp

jasonp 2004-12-21 14:25

Msieve 0.88 beta
 
I thought I'd try a beta release this time around before
going on to the real thing. The changes for version 0.88
are about complete, and have survived some local testing.
The web page has not been updated, but you can find
0.88 beta at

[url]www.boo.net/~jasonp/msieve088b.tar.gz[/url]
[url]www.boo.net/~jasonp/msieve_beta.exe[/url]

AFAICT this version incorporates just about all of the
suggestions I've received from others. You still cannot
enter a number as a formula, and it still cannot read
from multiple savefiles simultaneously. Be sure to check
the changelog and let me know if your feature request
has not been implemented. Also run the binary with
'-h' to get a list of the command-line options available.

As mentioned before, experimental code is now in place
to attempt to get Intel CPUs to sieve faster. If you
use such a CPU and have logs of old factorizations lying
around, I'm especially interested in whether the performance
changes when those factorizations are repeated with 0.88.
I honestly don't know if it will help, and it may hurt.

Good luck, and thanks for your patience.
jasonp


All times are UTC. The time now is 20:23.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.