mersenneforum.org > Factoring Projects > Msieve
Old 2010-11-20, 17:10   #12
aifbowman
Nov 2010
2×7 Posts
Mini-Geek and Jason, thank you both so much; you've been incredibly helpful :)

As I can't access the computers over the weekend (the room is locked) and I expect the relations to be at 30% by Monday morning, it is unlikely that changing the polynomial would make the sieving stage any faster, but I will do the math and find out.

From what you have both said, I think splitting the linear algebra step is a bit beyond me, but if I am nearing the deadline then I will run square roots for different dependency numbers on different machines.

I just have one more short question for now. I'm not sure how familiar you are with the factmsieve.py Python script that switches between GGNFS and msieve for the different stages, but that is what I am using. The script has built-in support for multiple clients (which is how I am using multiple computers), but when you start each client you have to specify that client's number and also the total number of clients. My question is this: now that I have started the sieving with 10 clients (by running 'factmsieve.py 1 10', '2 10', '3 10', etc. on each machine), can I add an 11th or 12th client by just putting '11 12' or '12 12'? Or will that not work / screw up the whole thing, seeing as the other 10 think the maximum is 10? Sorry if that isn't very clear, but I hope you get what I am saying.
Old 2010-11-20, 17:54   #13
Mini-Geek
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
1000010101011₂ Posts

Quote:
Originally Posted by aifbowman
I just have one more short question for now. I'm not sure how familiar you are with the factmsieve.py Python script that switches between GGNFS and msieve for the different stages, but that is what I am using. The script has built-in support for multiple clients (which is how I am using multiple computers), but when you start each client you have to specify that client's number and also the total number of clients. My question is this: now that I have started the sieving with 10 clients (by running 'factmsieve.py 1 10', '2 10', '3 10', etc. on each machine), can I add an 11th or 12th client by just putting '11 12' or '12 12'? Or will that not work / screw up the whole thing, seeing as the other 10 think the maximum is 10? Sorry if that isn't very clear, but I hope you get what I am saying.
The new client's work would be wasted because of how the splitting works. During post-processing, the duplicate relations would be ignored, so it wouldn't cause any serious problems, but wouldn't help a bit.
From the factmsieve.py file, this is how the splitting works:
Code:
    # For multiple clients, the q search space is divided 
    # into major blocks of length num_clients * fact_p['qstep'] so
    # that major block i starts and ends at:
    #
    #    QSTART + i * num_clients * fact_p['qstep']
    #    QSTART + (i + 1) * num_clients * fact_p['qstep'].
    #
    # Within each such major block, client k sieves the 
    # (k - 1)'th fact_p['qstep'] block. It then proceeds to its 
    # fact_p['qstep'] block within the next major block
As an example, if you start searching at q=10M with two clients working in 1M blocks, the first client sieves 10M-11M, 12M-13M, and so on, while the second client does the parts in between (11M-12M, 13M-14M, etc.). This fairly complicated scheme would be thrown off unless you can restart all the clients with their number out of 12. Alternatively, you could set the new clients to search a q range that the other clients won't reach (starting a bit higher than where you expect the 10 to end would be good).
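That splitting scheme can be sketched in a few lines of Python (a minimal illustration, not code from factmsieve.py itself; the function name and the qstart/qstep values are made up):

```python
def q_range(client, num_clients, qstart, qstep, major_block):
    """Start/end of the qstep-sized sub-block that `client` (1-based)
    sieves within the given major block, following the factmsieve.py
    comment quoted above."""
    base = qstart + major_block * num_clients * qstep
    lo = base + (client - 1) * qstep
    return lo, lo + qstep

# Two clients, 1M blocks, starting at q = 10M: client 1 gets
# 10M-11M, 12M-13M, ...; client 2 gets 11M-12M, 13M-14M, ...
print(q_range(1, 2, 10_000_000, 1_000_000, 0))  # (10000000, 11000000)
print(q_range(2, 2, 10_000_000, 1_000_000, 0))  # (11000000, 12000000)
print(q_range(1, 2, 10_000_000, 1_000_000, 1))  # (12000000, 13000000)
```

The key point is that each client's assignment depends on both its own number and `num_clients`, which is why changing the client count mid-run breaks the partition.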

Last fiddled with by Mini-Geek on 2010-11-20 at 17:55
Old 2010-11-20, 18:24   #14
schickel
"Frank <^>"
Dec 2004
CDP Janesville
2×1,061 Posts

Quote:
Originally Posted by Mini-Geek
The new client's work would be wasted because of how the splitting works. During post-processing, the duplicate relations would be ignored, so it wouldn't cause any serious problems, but wouldn't help a bit.

...

This fairly complicated scheme would be thrown off unless you can restart all the clients with their number out of 12. Alternatively, you could set the new clients to search a q range that the other clients won't reach (starting a bit higher than where you expect the 10 to end would be good).
Note that if you do that [start some new clients above the current range], start them off as '1 of 2' and '2 of 2'; otherwise the new range gets split up as if all 12 clients were working on it, and you'll skip over a lot of it....
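To put numbers on that (a hypothetical illustration; the qstart/qstep values are invented): if two new clients start above the old range as '11 of 12' and '12 of 12', each 12×qstep major block only gets its 11th and 12th sub-blocks sieved, because sub-blocks 1-10 belong to clients that will never reach this range. Started as '1 of 2' and '2 of 2', the same two machines cover the range contiguously.

```python
def covered(clients, num_clients, qstart, qstep, n_major_blocks):
    """q-ranges actually sieved by the listed clients under the
    factmsieve.py splitting scheme."""
    ranges = []
    for i in range(n_major_blocks):
        base = qstart + i * num_clients * qstep
        for k in clients:
            lo = base + (k - 1) * qstep
            ranges.append((lo, lo + qstep))
    return sorted(ranges)

qstart, qstep = 30_000_000, 1_000_000  # made-up starting point

as_11_of_12 = covered([11, 12], 12, qstart, qstep, 2)
as_1_of_2 = covered([1, 2], 2, qstart, qstep, 2)
print(as_11_of_12)  # (40M,41M), (41M,42M), (52M,53M), (53M,54M) -- big gaps
print(as_1_of_2)    # (30M,31M), (31M,32M), (32M,33M), (33M,34M) -- contiguous
```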

Last fiddled with by schickel on 2010-11-20 at 18:25 Reason: Minor edit for clarity
Old 2010-11-20, 23:23   #15
fivemack
(loop (#_fork))
Feb 2006
Cambridge, England
7²×131 Posts

Quote:
Originally Posted by aifbowman
You don't happen to know what quad core exactly was used for the c161? I looked through the thread but couldn't find any mention of it.
Looking through my records, I did that job with -t4 on my dual Opteron 2380 machine (so using half the available threads). Opteron 2380 chips were going for about €100 on eBay at the start of the year (they have since gone up substantially; I assume some large customer decommissioned quad-Shanghai machines when the Xeon 5500 chips came out, so there was briefly a glut on the market), and I thought it would be silly not to take advantage.

Last fiddled with by fivemack on 2010-11-20 at 23:24
Old 2010-11-21, 00:28   #16
aifbowman
Nov 2010
2·7 Posts

Ah, OK, so fivemack, I guess I'll be very lucky if I manage to do it in 4 days with my i7-920.

So if I set a new group of computers to do a different range, when the first group reaches that range, will they ignore it automatically (they are all writing to the same dat file)? Or will I need to stop them and restart them at a new range?

I think my best bet is to stop all the machines when I get access to them on Monday morning and then just restart them all, along with a few extra ones. My only concern is that the ones I stop will all create resume files. How will that affect things if I then pick up with more than 10 machines?
Old 2010-11-21, 02:59   #17
fivemack
(loop (#_fork))
Feb 2006
Cambridge, England
7²×131 Posts

Aargh! Having all the machines writing to the same dat file is in general a bad idea; it increases the opportunities for data corruption while not gaining anything.

I would strongly recommend that you write your own scheduling tools for this problem, if only because it's a useful exercise to write scheduling tools, and because it's profitable to be able to blame nobody but yourself when you accidentally set two rooms full of machines sieving the same region. Not that I've done that more than twice.

The i7/920 is (because of its three memory buses) an almost ideal machine for doing large GNFS jobs: the matrix work on a C160 that I did three weeks ago (7124576x7124803) took only 70 hours using -t4 on my i7/920. 55603893 relations, 44226266 unique; gnfs-lasieve4I15e for the sieving, 29-bit large primes, 3LP on the algebraic side, alim=rlim=25e6.
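As a side note, those relation counts imply a duplicate rate of roughly 20%, a typical amount of overlap from lattice sieving. A quick check, using the numbers above:

```python
# Relation counts from the C160 job described above.
total, unique = 55_603_893, 44_226_266
dup_rate = 1 - unique / total
print(f"{dup_rate:.1%} of the relations were duplicates")  # 20.5%
```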
Old 2010-11-21, 04:12   #18
schickel
"Frank <^>"
Dec 2004
CDP Janesville
2·1,061 Posts

Quote:
Originally Posted by aifbowman
So if I set a new group of computers to do a different range, when the first group reach that range will they ignore it automatically (they are all writing to them same dat file)? Or will I need to stop them and reset them at a new range?
The problem with keeping the current machines running and then adding new machines to the mix is just that: the current machines don't know anything about the new machines, and the range the new machines work on has no special meaning to the current ones. Your best bet is to start the new ones at, say, 1.5-2 times the factor-base bounds, let them sieve until you're close to done, then transfer the new relations to the current cluster to get a total relation count. Anything else greatly complicates the work required on your part.
Quote:
Originally Posted by aifbowman
I think my best bet is to stop all the machines when I get access to them on monday morning and then just restart them all along with a few extra ones. My only concern with this is that the ones I stop will all create resume files. How will that affect things if I then pick up with more than 10 machines?
Quote:
Originally Posted by fivemack
Aargh! Having all the machines writing to the same dat file is in general a bad idea; it increases the opportunities for data corruption while not gaining anything.
Actually, not having done more than glance at the Python script (I use the old(er) Perl script): all the client machines should be writing to an "spairs.add.*" file, which the "master" machine (#1 of x) scoops up and appends to the main msieve.dat file. So that's one less thing to worry about...
Quote:
Originally Posted by fivemack
I would strongly recommend that you write your own scheduling tools for this problem, if only that it's a useful exercise to write scheduling tools, and profitable to be able to blame nobody but yourself when you accidentally set two rooms full of machines to sieving the same region. Not that I've done that more than twice.
I keep trying to convince myself to sit down and do something along these lines; so far no luck. (And now it's going to be very hard, depending on how much snow we get here this winter... so far 4" (~10 cm).)
Old 2010-11-21, 14:14   #19
aifbowman
Nov 2010
2×7 Posts

As schickel rightly points out, the Python script makes all the clients write to spairs files, which the master machine periodically transfers to the main dat file. So I don't need to worry about any scheduling tools, right...?

Fivemack, out of interest, why did you use -t4 for post-processing on both your dual Opteron and your i7? In both cases you had 8 threads available, so were you just leaving some threads for general tasks, or is there a problem with using more than 4? I only ask because my i7s are currently sieving with 8 threads; would 4 actually make the process quicker?

It sounds like adding a second cluster into the mix is going to cause more trouble than it's worth. Would my idea of stopping the first cluster and then restarting with a larger cluster work? I'm still worried that there will be 10 resume files but more than 10 machines working on it.

Thanks for your time guys :)
Old 2010-11-21, 15:28   #20
fivemack
(loop (#_fork))
Feb 2006
Cambridge, England
6419₁₀ Posts

Quote:
Originally Posted by aifbowman
As schickel rightly points out the python script makes all the clients write to spairs files which the master machine periodically transfers to the main dat file. So I don't need to worry about any scheduling tools right..?
You ought to worry about scheduling tools unless you are entirely convinced that the python script works correctly when you enlarge your pool of sievers ... read the source code and figure out what would happen in that case.

Quote:
Fivemack out of interest why did you use t4 with both your dual opteron and i7 post processings? I mean in both cases you had 8 threads available so were you just leaving some threads for general tasks or is there a problem with using more than 4?
Linear algebra is a test of the machine's memory bandwidth; running eight threads on either the dual Opteron or the i7 gets answers more slowly than running four. You can't use the machine for anything else memory-intensive while it's running LA on half its threads, without slowing down the LA a lot.

Quote:
I only ask because my i7s are currently sieving using 8 threads, would 4 actually make the process quicker?
The sieving step is not particularly constrained by memory bandwidth, and I find that running eight threads gets me twice as many answers in about 60% more time than running four.

I think you should probably be testing this sort of thing yourself rather than asking us here; it uses a little compute time, of which you have a reasonable amount, and the result tends to stick with you better.
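In that spirit, here is a rough way to run the comparison yourself: time a short, fixed chunk of work at each thread count and compare relations per second. The timing helper below is generic (which command you time is up to your setup; no specific siever invocation is assumed), and the ratio function just encodes the arithmetic behind the "twice the answers in 60% more time" rule of thumb above.

```python
import subprocess
import time

def timed(cmd):
    """Run an external command and return elapsed wall-clock seconds.
    `cmd` is a placeholder for whatever your setup invokes, e.g. a
    short sieving run over a small fixed special-q range."""
    t0 = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - t0

def throughput_ratio(rels_a, secs_a, rels_b, secs_b):
    """Relations/second of run A relative to run B."""
    return (rels_a / secs_a) / (rels_b / secs_b)

# "Twice as many answers in about 60% more time" works out to roughly
# 1.25x the sieving throughput for 8 threads vs. 4:
print(round(throughput_ratio(2.0, 1.6, 1.0, 1.0), 2))  # 1.25
```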

Just out of curiosity, is it Kevin Buzzard lecturing for the class asking you to factor a C160?

Tom
Old 2010-11-21, 16:12   #21
aifbowman
Nov 2010
2×7 Posts

Tom, no, actually it is not; I don't think he lectures for any first-year courses (at least not the ones I have started so far). Why do you ask?

I will look into the Python script's code, but to be honest I have very little programming experience, so I will struggle to understand what is going on...

Thanks for the advice about the number of threads to use; that makes sense, and I will bear it in mind for the different stages.

You raise a fair point about me testing this stuff myself... and believe me, I am trying; it's just a bit of a struggle when barely a week ago I didn't even know what modular arithmetic was...

Anthony
Old 2010-11-21, 16:33   #22
aifbowman
Nov 2010
2×7 Posts

OK, I have tested my idea of stopping the 10 clients and restarting with an 11th, specifying the maximum number of clients as 11, but the 11th just starts sieving from the beginning again. So it looks like the only way I can use more machines is to specify different ranges for them. Can I do this through the Python script? Also, how would I merge the relations this second cluster finds with those from the first cluster? Is it simply a case of copying and pasting the contents of one dat file into the other?
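On the last point: since duplicate relations are discarded during msieve's filtering (as noted earlier in the thread), combining two relation files really can be a plain append. A hedged sketch; the helper name and filenames are made up, not part of any script, and it assumes each file ends with a newline:

```python
def append_relations(main_dat, extra_dat, chunk=1 << 20):
    """Append the relations in `extra_dat` onto the end of `main_dat`.
    msieve ignores duplicate relations during filtering, so a straight
    append is safe -- just be careful not to append the same file twice."""
    with open(main_dat, "ab") as out, open(extra_dat, "rb") as src:
        while True:
            block = src.read(chunk)
            if not block:
                break
            out.write(block)

# Hypothetical usage, with whatever .dat files your runs produced:
# append_relations("msieve.dat", "second_cluster_msieve.dat")
```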