mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Msieve (https://www.mersenneforum.org/forumdisplay.php?f=83)
-   -   Feedback for new MPQS utility sought (https://www.mersenneforum.org/showthread.php?t=3240)

Mystwalker 2005-03-30 15:16

I'm almost through with my 3rd application of msieve (C100, C98 and C101) - and I really like it.

It took some learning, though.
Quite confusing was the notion of "partial relations", as it seems to me that it is used for something I'd maybe also call "partial relations" as well as something like "full-equivalent relations that originated from partial relations"(?):

[code]Sat Mar 26 23:22:39 2005 restarting with 14681 full and 1428683 partial relations
Sat Mar 26 23:23:07 2005 found 65564 relations (14684 full + 50880 partial), need 69508[/code]

Apart from that, I was really surprised that I didn't need all the sieve files from the sieving machines to do the matrix step. I kept book, but I just added the reported relations - not knowing that new relations emerge once the files have been concatenated. A lot of hours of lost work. :sad:

Maybe it's possible to write up a little guide on how to use msieve?
I certainly doesn't have to be as detailed as the guide of ggnfs, but at least some hints are surely not bad.

Oh, and I found out that, under Windows, concatenation of the sieve files can be done simply with the "copy msieve1.dat + msieve2.dat msieve.dat" (thus the number to be factored appears a second time somewhere in the file). I seeme to work without a glitch, do you know of any problem that might occur?

jasonp 2005-03-31 02:59

[QUOTE=Mystwalker]I'm almost through with my 3rd application of msieve (C100, C98 and C101) - and I really like it.
[/QUOTE]Thanks!
[QUOTE=Mystwalker]
It took some learning, though.
Quite confusing was the notion of "partial relations", as it seems to me that it is used for something I'd maybe also call "partial relations" as well as something like "full-equivalent relations that originated from partial relations"(?):
[/QUOTE]
Yes, it's not clear from the progress messages that millions of partials are being accumulated in the background. Maybe something like "combined partials" would be more appropriate here.
[QUOTE=Mystwalker]
Apart from that, I was really surprised that I didn't need all the sieve files from the sieving machines to do the matrix step. I kept book, but I just added the reported relations - not knowing that new relations emerge once the files have been concatenated. A lot of hours of lost work. :sad:

Maybe it's possible to write up a little guide on how to use msieve?
I certainly doesn't have to be as detailed as the guide of ggnfs, but at least some hints are surely not bad.
[/QUOTE]
It would be a good idea to throw in a readme for the next version, at least. Of course I'd also need a thanks file, a references file, an algorithms file, etc. This particular issue has thrown many people who used msieve for distributed factoring.
[QUOTE=Mystwalker]
Oh, and I found out that, under Windows, concatenation of the sieve files can be done simply with the "copy msieve1.dat + msieve2.dat msieve.dat" (thus the number to be factored appears a second time somewhere in the file). I seeme to work without a glitch, do you know of any problem that might occur?[/QUOTE]
That should work fine. You can also use the Mingw unix utilities to do the same job.

jasonp

Mystwalker 2005-03-31 14:33

[QUOTE=jasonp]You can also use the Mingw unix utilities to do the same job.[/QUOTE]

I did so in the beginning.
The first time, I was only able to concatenate 2 files (maybe more, but not all 11-13) at once.
The second time, I only used a single PC...
The third time, I wanted to concatenate 2 files. With the head and grep mentioned in this thread, it did not work - msieve wanted to start from scratch. But "copy x+y z" did - and it is slightly less effort for me...

One other thing I noticed:
Could it be possible to do a screen output once the matrix stage starts? Right now, there are only log entries. Maybe some kind of progress update, too, but that should have a lower priority (implementation, not runtime)...

jasonp 2005-03-31 20:39

[QUOTE=Mystwalker]I did so in the beginning.
The first time, I was only able to concatenate 2 files (maybe more, but not all 11-13) at once.
The second time, I only used a single PC...
The third time, I wanted to concatenate 2 files. With the head and grep mentioned in this thread, it did not work - msieve wanted to start from scratch. But "copy x+y z" did - and it is slightly less effort for me...
[/QUOTE]
The head/grep method mentioned earlier, I believe, was for damage
control when the files were corrupted.

You can also use 'cat file1 file2 file3 ... > msieve.dat' and achieve the same as 'copy'. Note that I haven't tried that in Windows, but I'd be shocked if it didn't work.

[QUOTE=Mystwalker]
One other thing I noticed:
Could it be possible to do a screen output once the matrix stage starts? Right now, there are only log entries. Maybe some kind of progress update, too, but that should have a lower priority (implementation, not runtime)...[/QUOTE]
At C100 size you only have to wait a few minutes for all the postprocessing to finish; nonetheless it's easy to print something to the screen.

jasonp

smh 2005-03-31 21:10

Are you still working on speed optimalizations?

The new poly selection tool with GGNFS makes even 95 digit factorizations faster with GGNFS on my P4

jasonp 2005-04-01 02:19

[QUOTE=smh]Are you still working on speed optimalizations?

The new poly selection tool with GGNFS makes even 95 digit factorizations faster with GGNFS on my P4[/QUOTE]
Every once in a while I try something new, but nothing has made a difference so far. I still have a few ideas left, but I seriously doubt that major speedups are possible anymore without major surgery.

The 'market' for small factorizations seems to be drying up anyway :smile:

jasonp

Mystwalker 2005-04-11 12:57

(Possible) feature request:

Assuming it isn't possible so far, could you add an option to let msieve search for a set time?
That would come handy for me, as one could create work units of some sort that way.

jasonp 2005-04-12 01:06

[QUOTE=Mystwalker](Possible) feature request:

Assuming it isn't possible so far, could you add an option to let msieve search for a set time?
That would come handy for me, as one could create work units of some sort that way.[/QUOTE]
The right way to handle this is to break the code up into distinct executables, one of which is a siever that reads a job file (all the industrial-strength factoring programs do it this way). It would have the further advantage of reducing memory use by a lot during the sieve stage, but someone has to write that job file first :)

That being said, it should be easy to check the elapsed time every so often and give up when a time limit is reached. It would be even easier to set an alarm to go off at a preset time, but I don't know how portable that would be.

jasonp

Jahanbakhsh 2005-05-29 14:44

Dear Jason,
I want to add distributed computation feature to your msieve project by MPI for running this application under Linux cluster. I review your source code but encounter some problems in sieving portion. I can't find any documentation for the algorithms used by you in your homepage. I suggest doing a thorough documentation of the code for someone else wants to integrate improvements. :bow:

Jahanbakhsh

jasonp 2005-05-31 00:49

[QUOTE=Jahanbakhsh]Dear Jason,
I want to add distributed computation feature to your msieve project by MPI for running this application under Linux cluster. I review your source code but encounter some problems in sieving portion. I can't find any documentation for the algorithms used by you in your homepage. I suggest doing a thorough documentation of the code for someone else wants to integrate improvements. [/QUOTE]
I'll do what I can, in the time I can spare. You shouldn't wait for me, though:

If you're confused about how sieving works in general, there are two resources that really helped me get started: the quadratic sieve chapter of Crandall and Pomerance's book 'Prime Numbers: A Computational Perspective', and Scott Contini's paper 'Factoring Integers with the Self-Initializing Quadratic Sieve'. The first reference gives a good high-level view of how things work, and the second is really good at filling in the details of the sieving process that the first glosses over.

Once you understand these references, all the sieving code in msieve is packed into a single file, and the polynomial selection code is packed into another file. I've commented these very liberally, so hopefully you can find the points where the code corresponds to the description.

Unfortunately, msieve uses optimizations that do not appear in other QS programs. These are not documented anywhere but in the code, though if you can get a copy of ASIACRYPT 2004 there's a paper there called 'Sieving Using Bucket Sort', which describes the general idea.

I'd also suggest that if you're contemplating making a cluster-aware version of any factoring package that it be a number field sieve package. QS code would need an entire cluster to finish in the time that NFS code needs on a single machine, once the number to be factored is large enough.

Good luck,
jasonp

ValerieVonck 2005-06-01 06:46

What I like to is that the program can sieve a > 150 digit number in more doable time (like say 10 hours) in a distributed approach.


All times are UTC. The time now is 04:57.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.