mersenneforum.org  

Old 2003-09-30, 16:21   #1
xilman
Bamboozled!
 
May 2003
Down not across

2²·5·7²·11 Posts
Straw Poll: Off-line NFSNET sieving?

I posted this to nfsnet-l@nfsnet.org. Not everyone in this forum subscribes to the list (though they should do, IMO ;-) so apologies to those who see it twice. Please respond either to the list or the forum. The NFSNET admins monitor both.


I want to find out how many people would be prepared to spend some effort hand-holding machines so that they could increase their contribution to the NFSNET project. That is, I'm making a cost-benefit analysis and I'm not going to pretend that the suggestion is cost-free. If you haven't already been scared off, please read on.

The current project to factor 2,757-, aka M757, is approaching its end. About another two weeks should be all that's required.

As you know, we've been thinking about running a record-breaking factorization of M811 either next in line or the one after. I posted an estimate back in the summer of how difficult this project would be. Since then we've refined our estimates and in doing so have done about 0.05% of the sieving required. It will take close to 75 thousand WUs.

In the past, some of you expressed the opinion that you would be able to contribute more if NFSNET supported off-line and dial-up connected machines. There's no doubt that if we (the guys running the servers and writing the code) put in enough effort, we could support clients of that nature. However, it would not be an easy option, to say the least, to be able to support all and sundry while protecting the project from incompetence and from malicious attacks.

Here is a possible trade-off between effort and increased productivity. Suppose somewhere on www.nfsnet.org there was a form into which you typed an email address and clicked a button. Shortly afterwards, you would receive an email which contained the text to be saved in a file which was then fed to an instance of the linesiever binary on your machine. The file would contain a block of, say, 10K lines, or perhaps a week's work for an average machine. (10K is just an example. It would be easy to calibrate to 1 week or whatever). You would then fire up the linesiever and leave it running. When it had finished, you would find a results.txt file which contains a large number of relations. It's then your responsibility to get that file to an NFSNET ftp server. Thereafter, the results would be processed as normal and your work would appear in the stats as usual.
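As a rough illustration of the last step (getting results.txt to an ftp server), here is a minimal Python sketch using the standard ftplib module. The host name, login details and remote file name are placeholders, not real NFSNET values, and the linesiever invocation itself is left out because its command line has not been documented here.

```python
from ftplib import FTP


def upload_results(ftp, local_file, remote_name):
    """Send an open binary file to the server under remote_name.

    `ftp` is any object offering ftplib.FTP's storbinary() method, so the
    function can be exercised without a network connection.
    """
    ftp.storbinary(f"STOR {remote_name}", local_file)


def upload_results_file(host, user, password, path, remote_name):
    """Connect, log in and upload the file at `path`.

    Every argument is a placeholder supplied by the caller; nothing here
    is a real NFSNET server name or account.
    """
    with FTP(host) as ftp, open(path, "rb") as fh:
        ftp.login(user=user, passwd=password)
        upload_results(ftp, fh, remote_name)


# Hypothetical usage (not run here, since the server name is made up):
# upload_results_file("ftp.example.org", "anonymous", "me@example.org",
#                     "results.txt", "results-block-0001.txt")
```

A participant could wrap a call like this in whatever labour-saving script suits their own machines.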

We would, of course, document where to place the input file, how to call the linesiever and where its output would be found. We'd be able to give some tech support but wouldn't be able to do cradle-to-grave handholding. You'd be expected to be able to solve some problems by yourself. You would also be encouraged to write any labour-saving scripts you may want to make your life easier. Others using the same mechanism may be able to provide support too.


OK, so who would be interested in such a scheme, and how many additional clients could you provide?


Paul

P.S. The mechanism outlined above is essentially how The Cabal operated very successfully on several record-breaking factorizations. A number of us in NFSNET have been part of the Cabal (I helped found that group) and we know how it works in practice. I'm not proposing anything too radical.
Old 2003-10-12, 17:12   #2
dsouza123
 
 
Sep 2002

2·331 Posts

For returning the results, would it be as simple as copying the project.in and results.txt files into the ./processors/p0 directory on an internet-connected machine (while the nfsnet program isn't running), making sure the stop.txt file is not present, and then running the nfsnet client? It should see project.in and results.txt and upload the results automatically.

I am asking because there were a few times when, after finishing a result, the server wasn't responding, so the client kept cycling through sleep and retry. Eventually I stopped the program. On restarting it later, I believe it uploaded the results.txt file and got a new assignment.

It would help if there were a setting or a file that tells the program on an offline machine not to contact the server to get an assignment but just to use the project.in file, and, when done, not to contact the server but simply to indicate that the assignment is finished, so the two files could be copied to an internet-connected machine.

Given a choice between a setting and a file, a file is much preferred, since its existence alone would determine the state. It is much easier to create or delete a file than to modify a file to change or add a setting.
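To make the flag-file idea concrete, here is a minimal Python sketch. The file name offline.txt is purely hypothetical (the existing client only has stop.txt, as far as this thread shows), and this is not how the nfsnet client actually behaves; it only illustrates that state can be carried by a file's presence.

```python
from pathlib import Path

OFFLINE_FLAG = "offline.txt"  # hypothetical name, not an existing client file


def is_offline(work_dir):
    """Return True when the flag file exists in the working directory."""
    return (Path(work_dir) / OFFLINE_FLAG).exists()


def set_offline(work_dir, enabled):
    """Create or delete the flag file; its contents are irrelevant."""
    flag = Path(work_dir) / OFFLINE_FLAG
    if enabled:
        flag.touch()
    else:
        flag.unlink(missing_ok=True)  # missing_ok needs Python 3.8+
```

Switching modes is then a matter of creating or deleting one file, with no parsing or editing of a settings file involved.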

If there was a setting or a file that would ask for a large assignment of 1K or 10K lines, then getting an extended amount of work would be easier. If some other special setting or file were needed to allow this behavior, that is fine (for example, getting the enabling setting/file as a result of emailing, tied to a specific ID so it couldn't be played with).
Would there be an extended timeout for the large assignments so the work wouldn't be reassigned?

From what I have seen, the size of the results file in KB is roughly 1/4 the number of lines sieved.
Sieving 100 lines would produce a results.txt file of about 25 KB.
This is from the current 2^757-1 project and could differ in the next.
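The same rule of thumb, written out as a small Python calculation. The 0.25 KB per line figure is only the observation above from 2^757-1, so treat the output as a rough guide rather than a prediction for the next project.

```python
KB_PER_LINE = 0.25  # observed on 2^757-1: about 25 KB per 100 lines


def results_kb(lines):
    """Estimated results.txt size in KB for a given number of sieved lines."""
    return lines * KB_PER_LINE


# The assignment sizes mentioned in this thread: 100, 1K and 10K lines.
for lines in (100, 1_000, 10_000):
    print(f"{lines:>6} lines -> about {results_kb(lines):.0f} KB")
```

At this ratio a 10K-line block would come back as roughly 2500 KB, i.e. about 2.5 MB of results.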

One issue would be the switching to/from a sub range in the project. While doing the 2^757-1 project my PC switched at the start of September from a sub range to the main range. Would that also be automatic, with the project.in file having enough info to trigger the switchover?

I am trying to figure out a way that would be nearly automated, needing only to copy a few files to and from an offline machine but utilizing the very functional and straightforward system that is used when online.
Old 2003-10-12, 19:42   #3
Wacky
 
 
Jun 2003
The Texas Hill Country

3²×11² Posts

The procedure that you indicate will work for reasonably small files. However, there are many cases where it would not work for files that are larger. Our normal client uses an HTTP POST to send the results. Quite a few of the machines are behind a proxy that they may not even realize exists. For example, a cable company comes to mind. Unfortunately, large POSTs are not the norm and the proxy does make a difference.

Therefore, I suspect that you will really have to manually upload the files using a protocol, such as ftp, that is not proxied.

Remember that if you generate a reasonably sized file in an hour or two, it will be an order of magnitude greater if it accumulates for a whole day.

As a result, at least in the short term, any offline sieving that gets done will have to involve quite a bit more manual intervention on the part of a participant that wants to try it.
Old 2003-10-12, 21:48   #4
dsouza123
 
 
Sep 2002

2·331 Posts

Using a 1.2 GHz Athlon, the rate is about 1 line per minute with a moderate CPU load and about 3 lines per minute with essentially no other load. So 60 to 180 lines per hour would be 15 KB to 45 KB per hour.
What is the size limit on an HTTP POST?

If it is too big a couple of solutions come to mind.

1) A utility that breaks a single result file into a number of smaller ones.
Considering that internet protocols use packets to deal with the same issue of transferring large amounts of data, maybe an intermediate packetized format could be used. Nothing mandates that a large number of lines has to be sent as one enormous file. (If you think about it, sequential independent results from multiple machines are already the equivalent of packets.)

Start with the large results.txt file, have a separate utility that turns it into the packet form, then the same (or a different) utility sends the packets using HTTP POST (since there is already code that works well using this system). The receiving server could reprocess the packets into a results.txt file or whatever form is most useful. Calling another program shouldn't be a big issue; it is already the way nfsnet clients work.

Alternatively, if the packets are designed well enough they could be used without reassembling (the equivalent of smaller results.txt files; some info would be duplicated in each packet, but that is equivalent to the current system with sequential assignments).
Maybe there could be a cutoff: if the results.txt file is over some size, this special-purpose packetizing/sending program would be called.

2) The use of file compression.
With the results.txt file stored in a zip file it would be about a third the size.
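Both ideas can be sketched together in a few lines of Python. This is only an illustration under assumptions: the chunk size is arbitrary, the real results.txt format may need per-chunk header information that is not modelled here, and gzip from the standard library stands in for the zip file mentioned above.

```python
import gzip


def split_results(data, max_bytes):
    """Split bytes into chunks of at most max_bytes, cutting only at line ends.

    A single line longer than max_bytes is emitted as its own oversized
    chunk rather than being cut in the middle.
    """
    chunks, current, size = [], [], 0
    for line in data.splitlines(keepends=True):
        if current and size + len(line) > max_bytes:
            chunks.append(b"".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append(b"".join(current))
    return chunks


def pack_results(data, max_bytes):
    """Split as above, then gzip-compress each chunk independently."""
    return [gzip.compress(chunk) for chunk in split_results(data, max_bytes)]
```

Because every chunk ends on a line boundary and is compressed on its own, the receiving side could decompress and concatenate the chunks in order to get the original file back, or process each one separately.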

I figure the more automatic the system the more participants.
If it can be brought down to copying a few files to and from an offline machine and the program on the online machine automatically dealing with the transfers it has a better chance of success.

With a need for more participants / computers, reducing the hurdles for people to help should be the focus.

I wrote utilities to manage the stop.txt file for the command-line client when I couldn't get the GUI client to work, because I wanted it to be streamlined to a few clicks. Others may not want to, or be able to, write utilities to automate the system, so if the burden could be lifted from them everybody would benefit.

Thanks for an interesting distributed project and for providing the tools that allow me to participate.
Old 2003-10-13, 02:38   #5
Wacky
 
 
Jun 2003
The Texas Hill Country

10001000001₂ Posts

Thanks for your suggestions. For the most part, they are already part of future plans that are underway.

Unfortunately, I am the only primary researcher who is concerned about the sieving infrastructure.

Paul is not really a programmer and is concerned with the math side of things and the post processing (which is a non-trivial task). Don handles the web site and tries to help with the "customer service". Jeff has provided the Windows GUI and helps with Windows related problems. Chris has provided some code and is actively trying to resolve some OS specific problems on the server side which you may not even notice, but are causing us some difficulty.

But, that leaves the rest to me. Besides the operational administration problems, I have to both design and code new features. We are all very busy trying to keep things running and implement improvements. I'm spread too thin. My design goals are well ahead of the time that I have available to implement them.

Quite frankly, I'm unsure of the real "return on investment" that we will get from participants who do not have 24/7 connectivity. There are certainly a number of people who may think that they want to participate. But when they really realize the resources necessary to be effective, they may decide that this project is too demanding of them and, appropriately, choose another project that better fits the resources that they have to offer.

I would certainly like to develop an automated system to support offline sieving. I have some design ideas. But at the present time, they await additional programming resources.

In the interim, at the expense of some of our time, we are willing to allow a few experiments to see if there is any real promise of significant increased productivity which would result from off-line sieving.

I appreciate your interest and hope that you will continue to support our efforts.

Richard
Old 2003-10-13, 18:36   #6
dsouza123
 
 
Sep 2002

2×331 Posts

Fair enough, now I understand the significant amount of work being done by very few people. The focus is (correctly) on maintaining the existing systems, which requires the complete effort of those doing the behind-the-scenes work.

I'll continue with the regular dialup sieving. If I write some other utilities that may help, I'll post them as before.

Hope you have continued success with your efforts.

Thanks

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.