mersenneforum.org (https://www.mersenneforum.org/index.php)
-   No Prime Left Behind (https://www.mersenneforum.org/forumdisplay.php?f=82)
-   -   PRPnet (https://www.mersenneforum.org/showthread.php?t=12223)

mdettweiler 2009-07-28 15:54

PRPnet
 
Hi all,

With recent releases, PRPnet is stable enough to handle large numbers of clients on a wide variety of tests. As of now we have two PRPnet servers available: port 9000 running the 13th drive and port 2000 running the 14th drive.

For those unfamiliar with PRPnet, it was designed by Mark Rodenkirch (who goes by "rogue" on this forum) to replace the aging LLRnet software. LLRnet, due to its tight integration with the underlying LLR code, was hampered by the fact that it utilized the older, slower LLR version 3.5. PRPnet was designed modularly, so that its underlying LLR code can be upgraded easily, and thus does not suffer from this limitation.

In addition, PRPnet supports utilization of both LLR and PFGW for primality testing in cases where those respective programs are faster or otherwise better applicable. While this aspect of the program does not particularly benefit NPLB, it is a great boon for our sister project, Conjectures 'R Us (CRUS), which does a lot of testing on non-base-2 numbers, which require PRP or N+/-1 tests instead of LLR tests.
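
For readers wondering what a "PRP test" amounts to, here is a minimal Python sketch of a Fermat probable-prime test. It is only an illustration of the idea and makes no claim about how PFGW or LLR work internally; the function name, the choice of base 3, and the sample candidates of the form 2*3^n-1 are all assumptions made for the example.

```python
def is_fermat_prp(n, base=3):
    """Return True if n passes a Fermat probable-prime test to the given base.

    Illustrative only: a pass means "probably prime", not proven prime.
    Expects an odd n larger than the base.
    """
    if n <= base or n % 2 == 0:
        raise ValueError("expects an odd n larger than the base")
    # Fermat's little theorem: if n is prime and does not divide the base,
    # then base^(n-1) is congruent to 1 modulo n.
    return pow(base, n - 1, n) == 1


# Hypothetical non-base-2 candidates of the form 2*3^n-1.
for n in range(1, 5):
    candidate = 2 * 3**n - 1
    print(n, candidate, is_fermat_prp(candidate))
```

A candidate that passes would still need an N+/-1 test (or similar) to be proven prime.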

The PRPnet client is not difficult to set up and use. It works a little differently than the classic LLRnet client, behaving more akin to the updated LLRnet client that we are currently using. Rather than constantly topping off its work queue after each completed test, PRPnet finishes its entire queue before returning it and starting on the next batch. This limits the number of network connections PRPnet needs to make, and eliminates many of the strange behaviors that occurred in LLRnet and drove many of us nuts. :smile:

One interesting feature that PRPnet has is the ability to mix and match multiple servers in user-specified proportions. For example, you can have it spend 70% of its time working on server A, and 30% time working on server B. You can even configure servers with a 0% proportion, to be used as a backup to ensure your computer does not go idle if the primary server(s) cannot be reached. This feature opens the door to all sorts of interesting possibilities, by allowing almost BOINC-like flexibility yet retaining the simplicity of a standalone application.
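
To make the proportion idea concrete, here is a small Python sketch of one way a client could choose between servers. This is not PRPnet's actual selection logic, which the post does not describe; the function name, the server names "A", "B" and "C", and the "reachable" set are hypothetical.

```python
import random


def pick_server(servers, reachable, rng=random):
    """Pick a server name according to configured percentages.

    servers:   dict of name -> percentage (0 marks a backup-only server).
    reachable: set of server names that currently respond.
    Returns None if nothing is reachable.
    """
    # Prefer reachable servers with a non-zero share, weighted by that share.
    primary = {name: pct for name, pct in servers.items()
               if pct > 0 and name in reachable}
    if primary:
        names = list(primary)
        weights = [primary[name] for name in names]
        return rng.choices(names, weights=weights, k=1)[0]
    # Otherwise fall back to a 0% (backup) server so the machine is not idle.
    backups = [name for name, pct in servers.items()
               if pct == 0 and name in reachable]
    return backups[0] if backups else None


config = {"A": 70, "B": 30, "C": 0}  # hypothetical servers
print(pick_server(config, {"A", "B", "C"}))  # "A" or "B", never "C"
print(pick_server(config, {"C"}))            # only the backup is reachable
```

Over many picks, "A" would be chosen roughly 70% of the time, and "C" only comes into play when neither "A" nor "B" can be reached.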

You can download the latest PRPnet from the following link: [URL="http://uwin.mine.nu/prpnet"]Lennart's PRPnet clients[/URL].



To set up the client:
[LIST]
[*]Download the client package for your operating system from the link above.
[*]Extract the zip file to a new folder.
[*]Open the file prpclient.ini.
[*]Plug in your email address and username near the top of the file. Also, you'll need to provide an arbitrary name for the particular computer this client is running on. This is mainly useful in case we find a bad result from your computer in future doublechecking, so you can identify the culprit machine more easily.
[*]Below the area where you just put your user information, you'll see a section where you can configure what servers the client gets work from. Here, you can configure the proportions and queue sizes for each individual server. The instructions provided in the prpclient.ini file are rather self-explanatory, so we won't go into detail about that here. You can reference the prpnet_servers.txt file for a list of known active public PRPnet servers at the time of this version's release, or check our [URL="http://www.mersenneforum.org/showthread.php?t=12224"]PRPnet servers for NPLB[/URL] thread for an up-to-date list of NPLB servers.
[*]Farther down the file, you'll see two options, "startoption" and "stopoption". By default, each time the client is stopped or started, it asks you what to do with any work left in its queue. This can become rather redundant during normal operation, and can be especially annoying on Windows systems where PRPnet will hold up a system shutdown until you answer its prompt. We recommend configuring the client to startoption=9 (complete assigned work units), and stopoption=3 (return completed workunits, keep incomplete workunits in queue, and shut down).
[*]Once you're satisfied with the configuration, save the file.
[*]Now, run prpclient.exe (or ./prpclient from a terminal on Linux). The client will run, fetch its first batch of work from whatever server has the highest work proportion, and start working. When that batch is done, it will return the results and grab a new batch to work on.
[*]To stop the client, press Ctrl-C. The client will stop according to your specified stopoption.
[/LIST]
That's all there is to it! If you have any questions, feel free to post them in this thread and someone knowledgeable with PRPnet will answer your question.
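
If you manage several machines, the startoption/stopoption step above can be scripted. The following Python sketch assumes only that the options appear as plain `key=value` lines, as in the `startoption=9` notation used above; the helper name and the sample text are hypothetical, and the real prpclient.ini contains much more than this. When in doubt, edit the file by hand.

```python
def set_option(ini_text, key, value):
    """Return ini_text with the first `key=...` line changed to `key=value`.

    All other lines (comments, other options) are left as they are.
    If no such line exists, no line is modified.
    """
    lines = ini_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith(key + "="):
            lines[i] = f"{key}={value}"
            break
    return "\n".join(lines) + "\n"


# Hypothetical excerpt; not the real contents of prpclient.ini.
sample = "startoption=0\nstopoption=0\n"
updated = set_option(sample, "startoption", 9)
updated = set_option(updated, "stopoption", 3)
print(updated)
```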

Note on what we expect will be a frequently asked question: Most of the various options in prpclient.ini may look a little confusing, but are quite safely left at their defaults.

Max :smile:

Mini-Geek 2009-07-28 16:20

The Windows package includes what I presume to be the Linux executable (in addition to the Windows executable).
Attached is a zip file with all the text files modified to have the Windows-style endline markers so that Notepad can view them properly. (the LLR and Phrot readmes are the same as the ones already included)

mdettweiler 2009-07-28 16:31

[quote=Mini-Geek;183149]The Windows package includes what I presume to be the Linux executable (in addition to the Windows executable).
Attached is a zip file with all the text files modified to have the Windows-style endline markers so that Notepad can view them properly. (the LLR and Phrot readmes are the same as the ones already included)[/quote]
Oh, good catch! Thanks, I've got the fixed file uploading right now.

vaughan 2009-08-02 14:14

Typo: in line 10 of prpnet.ini it says "differtiates"; it should be "differentiates", i.e. it needs "en" inserting.

gd_barnes 2009-08-13 04:31

Since PRPnet cannot be used for project-level drives at this point, I have modified the 1st para in the 1st post here to reflect what we would currently recommend for people wanting to run a PRPnet server.

mdettweiler 2009-10-28 19:59

Since PRPnet 2.4.3 appears to be stable on both server and client ends with no apparent problems, I've upgraded all of the NPLB and CRUS servers to the latest version. Note that while the 2.4.3 server is backwards compatible with earlier clients, the 2.4 client (which I will be posting binary packages for soon) only works with >=2.4.0 servers.
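
The compatibility rule above can be written down as a small check. This is a Python sketch of the rule as stated in this post, not code from PRPnet; the function names are made up, and treating pre-2.4 clients as compatible with every server is an assumption that goes slightly beyond what is stated here (the post only says the 2.4.3 server accepts earlier clients).

```python
def parse_version(text):
    """Turn a version string such as "2.4.3" into a comparable tuple (2, 4, 3)."""
    return tuple(int(part) for part in text.split("."))


def client_can_use(client_version, server_version):
    """Return True if the client is expected to work with the server."""
    client = parse_version(client_version)
    server = parse_version(server_version)
    if client >= (2, 4):
        # 2.4 and later clients only work with 2.4.0 or later servers.
        return server >= (2, 4)
    # Assumption: earlier clients keep working with any server.
    return True


print(client_can_use("2.4.3", "2.3"))    # False
print(client_can_use("2.4.3", "2.4.0"))  # True
print(client_can_use("2.3", "2.4.3"))    # True
```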

This new version of PRPnet should be immune to the blank-residue problem that plagued earlier versions of the server and forced us to limit PRPnet usage drastically. At this point, we can run pretty much anything through PRPnet, and should start looking into what we might want to transition over from LLRnet in the near future. We'll still keep some of our servers on LLRnet for quite a while to come, but nonetheless any discussion on which drives we should transition and when is welcome.

Personally, I think it would be interesting to take one of the k=400-1001 drives and switch it to PRPnet. We currently have three nearly identical drives for that range, and switching one over would not really detract from the selection available in LLRnet servers.

mdettweiler 2009-11-02 18:49

Hi all,

I've now (finally--sorry for the wait) put together NPLB/CRUS client packages for the 2.4.3 release of PRPnet and posted them for download on the noprimeleftbehind.net website. Download links and setup instructions can be found in the first post of this thread.

A couple of notes on this release:
-Just to reiterate, the 2.4.3 client will not work on a pre-2.4 server. All of the NPLB and CRUS servers are running 2.4.3, though some other projects may not have yet upgraded from 2.3 or 2.2. PSP has upgraded to 2.4, but I don't think PrimeGrid has quite yet for all their servers (though they probably will soon).
-Version 2.4 added support for testing of Generalized Fermat Numbers (GFNs). While neither NPLB nor CRUS is testing these numbers, I've included binaries for genefer (the GFN PRP-testing application) in the client packages, in the interest of making a complete client package that will work on any server.

Max :smile:

Lennart 2009-11-02 19:19

[quote=mdettweiler;194569]

A couple of notes on this release:
-Just to reiterate, the 2.4.3 client will not work on a pre-2.4 server. All of the NPLB and CRUS servers are running 2.4.3, though some other projects may not have yet upgraded from 2.3 or 2.2. PSP has upgraded to 2.4, but I don't think PrimeGrid has quite yet for all their servers (though they probably will soon).
-Version 2.4 added support for testing of Generalized Fermat Numbers (GFNs). While neither NPLB nor CRUS is testing these numbers, I've included binaries for genefer (the GFN PRP-testing application) in the client packages, in the interest of making a complete client package that will work on any server.

Max :smile:[/quote]

They are all upgraded to 2.4.3 on PrimeGrid :)

Lennart

mdettweiler 2009-11-02 20:52

[quote=Lennart;194573]They are all upgraded to 2.4.3 on PrimeGrid :)

Lennart[/quote]
Cool! Now I can upgrade all my clients to 2.4.3. (I was holding off until the PrimeGrid servers were upgraded since I have them configured as backup servers in case Gary's network goes down or something like that.)

mdettweiler 2009-11-03 05:18

PRPnet 2.4.4 was just recently released to correct a few last-minute bugs that showed up in 2.4.3. It is recommended that all users upgrade to this latest version. (Note that if you already downloaded and set up 2.4.3, then all you have to do is stop the client, swap in the new prpclient.exe, and restart the client. This makes upgrading somewhat easier than having to replace and re-fill-in prpclient.ini as with earlier, more major releases that included changes to that. :smile:)

mdettweiler 2009-11-25 16:13

Hi all,

I have just posted binaries for PRPnet 2.4.6. This fixes some critical bugs in earlier versions that showed up on Linux for both the client and server, though since I think a few minor tweaks may affect Windows as well, I'd recommend everyone upgrade. The only files you'll need to replace are prpclient.exe (or prpclient on Linux) and prpnet_servers.txt (in which I added NPLB's new G3000 PRPnet server for the 5th Drive).

Max :smile:

MyDogBuster 2009-11-28 22:50

Is there a status page for this server port so admins can figure out what's going on here?

Nice one Bruce.

Mini-Geek 2009-11-29 01:14

[quote=MyDogBuster;197281]Is there a status page for this server port so admins can figure out what's going on here?[/quote]
There are the public [URL]http://nplb-gb1.no-ip.org:3000/[/URL] and [URL]http://nplb-gb1.no-ip.org:3000/user_stats.html[/URL] pages.
AFAIK (assuming no NPLB-specific stuff) there is no admin-only page with more info.

MyDogBuster 2009-11-29 01:44

[quote]There are the public [URL]http://nplb-gb1.no-ip.org:3000/[/URL] and [URL]http://nplb-gb1.no-ip.org:3000/user_stats.html[/URL] pages.
AFAIK (assuming no NPLB-specific stuff) there is no admin-only page with more info. [/quote]Thanks Tim, I was looking more for a page showing primes found, a summary of what's running, and a list of prior primes, much like the current GB status page.

[URL]http://nplb-gb1.no-ip.org/llrnet/[/URL]

gd_barnes 2009-11-29 01:48

Ian, Max is in the process of working with AMDavid on interfacing with the main NPLB stats and creating such a page that you are looking for. Max, can you confirm that for sure?

Bruce, I am moving your prime to the k<=1001 primes thread.

mdettweiler 2009-11-29 03:57

As for the status page, yes, I do eventually plan on getting the PRPnet servers wired into a page like what we currently have for the LLRnet servers at [URL]http://nplb-gb1.no-ip.org/llrnet/[/URL]. Currently it's somewhat low on my list of priorities, though, since PRPnet's built-in status pages can suffice in the meantime.

mdettweiler 2009-11-29 04:01

To answer a question that Bruce asked me in a PM recently and which I expect will be a frequently asked question: if you find a top-5000 prime on one of our PRPnet servers, report it just like you would with a prime found on LLRnet. That is, your prover code should include "NPLB, LLR, srsieve", and your name.

gd_barnes 2009-11-29 11:09

[quote=mdettweiler;197296]As for the status page, yes, I do eventually plan on getting the PRPnet servers wired into a page like what we currently have for the LLRnet servers at [URL]http://nplb-gb1.no-ip.org/llrnet/[/URL]. Currently it's somewhat low on my list of priorities, though, since PRPnet's built-in status pages can suffice in the meantime.[/quote]

What is your list of priorities?

See [URL]http://www.mersenneforum.org/showpost.php?p=197301&postcount=79[/URL].

I think that getting a proper status page set up and doing the interfacing to the DB for scoring should be at the top of the priority list. The problem with Bruce not getting Email notification of the prime is a potential nightmare. What if we have 3-4 servers and 4-5 primes coming in all at once and most of the primes are by Vaughan and Cipher, who solely rely on the notifications? (I also mostly rely on them.) Although you and I can look on the server files and see the primes that come in, other admins like Ian (or Karsten) need a web page that shows them. I will bet my last dollar that PRPnet will continue to have issues like this. Just the fact that Tim found an issue in the latest 2.4.6 release proves that point.

I don't want to put another major drive on PRPnet until the above is done.

The status and scores pages created by PRPnet are virtually useless for our needs.

Can you give us an estimate as to how long it will take to complete the interfacing and status pages?

On my end, I'm now making it top priority to finish building my computers by this coming Friday, which will allow a permanent place for our servers. I'll also contact my cable co. about a commercial account so that my IP stops changing.


Thanks,
Gary

mdettweiler 2009-11-29 14:34

[quote=gd_barnes;197307]What is your list of priorities?

See [URL]http://www.mersenneforum.org/showpost.php?p=197301&postcount=79[/URL].

I think that getting a proper status page set up and doing the interfacing to the DB for scoring should be at the top of the priority list. The problem with Bruce not getting Email notification of the prime is a potential nightmare. What if we have 3-4 servers and 4-5 primes coming in all at once and most of the primes are by Vaughan and Cipher, who solely rely on the notifications? (I also mostly rely on them.) Although you and I can look on the server files and see the primes that come in, other admins like Ian (or Karsten) need a web page that shows them. I will bet my last dollar that PRPnet will continue to have issues like this. Just the fact that Tim found an issue in the latest 2.4.6 release proves that point.

I don't want to put another major drive on PRPnet until the above is done.

The status and scores pages created by PRPnet are virtually useless for our needs.

Can you give us an estimate as to how long it will take to complete the interfacing and status pages?

On my end, I'm now making it top priority to finish building my computers by this coming Friday, which will allow a permanent place for our servers. I'll also contact my cable co. about a commercial account so that my IP stops changing.


Thanks,
Gary[/quote]
Just to clarify: getting PRPnet results imported into the DB is highest on my priorities right now. As I said over in the "Questions and stuff" thread, I'm waiting to hear back from Dave on that. However, what's a bit lower is getting them wired in to a status page like [URL]http://nplb-gb1.no-ip.org/llrnet/[/URL]; that's rather similar to the built-in stats pages as it's for informational purposes only. In fact, the DB replicates much of that page's functionality, so once we've got DB import going, we should be set for the most part.

Brucifer 2009-11-29 17:36

[QUOTE=gd_barnes;197307]On my end, I'm now making it top priority to finish building my computers by this coming Friday, which will allow a permanent place for our servers. I'll also contact my cable co. about a commercial account so that my IP stops changing.
[/QUOTE]

So what are you putting together Gary?

mdettweiler 2009-11-29 18:55

[quote=Brucifer;197317]So what are you putting together Gary?[/quote]
Gary's building two computers, one as a dedicated server box for NPLB, and the other as a replacement for his desktop which died a while back. The server is intended to run the stuff that is currently on David's servers (which he's shutting down next year) as well as the stuff currently at nplb-gb1.no-ip.org.

Since this will be running the noprimeleftbehind.net website and all of our other primary things, Gary's also getting a commercial-grade internet connection so that he can have a static IP address. The No-IP dynamic DNS thing we've been using so far for nplb-gb1.no-ip.org has been working largely OK, though whenever his IP address changes (which it does periodically, especially in the summer when his router goes offline a lot due to power outages) it takes a while for client computers to update their DNS caches, thus leading to up to a few hours of effective downtime. This should eliminate that problem.

Lastly, one part of the new setup which is already in place is a UPS backup for the server machine itself (currently it's on dumpford, which is running the GB servers until the new server's ready) and Gary's router, cable modem, etc. That should adequately protect the server from power outages of up to at least a few hours. Thus, theoretically the only outages we'll have to worry about are long-term ones (which probably don't happen too often in Kansas, where AFAIK they don't get too much ice and snow), and times when the cable lines themselves go out.

henryzz 2009-11-29 20:00

Wasn't Gary going to do a test to see how long the UPS backup would last?

mdettweiler 2009-11-29 21:34

[quote=henryzz;197326]Wasn't Gary going to do a test to see how long the UPS backup would last?[/quote]
Yes, he said he was going to do that around Thanksgiving. Gary, how'd that go? :smile:

gd_barnes 2009-11-30 06:59

We get a fair amount of snow and ice in KS (and MO, which I'm 10 miles from) but that is virtually irrelevant to my outages. The summertime is virtually irrelevant to my outages. What IS relevant are thunderstorms in the spring and early summer. Although KS and OK have many tornadoes, those are mostly concentrated in the open plains. I am in the large metropolitan Kansas City area. Tornadoes are surprisingly rare here. (Much to my chagrin because I do enjoy storm chasing. :smile:)

In other words, the outages are largely random except in the spring and early summer. The problem is that the house that I bought, while not too old (built in 1986), is near a main street that still has above ground power lines unlike most of the rest of the area. I see KCP&L working on them frequently. If they will finally pony up the cost to bury them, I suspect I will have almost no outages. Before I was divorced 5 years ago, I lived 5 miles directly west of here in an area mostly built in 1990-1992. All lines were buried. We probably had a total of 5-6 outages in 14 years. Here I've had an avg. of 2-3 per year.

So it's mostly the above ground power lines and to a much lesser extent the weather that affects how many outages we have in the metropolitan area. I've had outages when the weather was perfect because someone ran their car into an electric pole or a bird landed on a line wrong and shorted it out.

Max, you said the status page (i.e. for the LLRnet servers) is for informational purposes only. That's true but it's highly important information that is needed ASAP for the PRPnet servers. If someone doesn't get Email notification of a prime, then it may not get reported for weeks or ever. I realize that the admins can check the primes on the server but we shouldn't be relying on that. Also, we need to be able to quickly see how many pairs are remaining in a server without having to add up every k-value.

I don't know what you mean by "the DB replicates much of that page's functionality". There's nothing that replicates the LLRnet status page's functionality or usefulness. I'll reiterate: The current PRPnet status pages are effectively useless for NPLB.

To drive home the point further: I'm not inclined to move any machines to the new PRPnet server for the 5th drive until we KNOW that the Email notification is working OR we have a status page that shows the primes. I suspect issues like this are why Ian wanted to either continue with an LLRnet server or do the next range manually for the 7th drive. If 2 admins are not inclined to use PRPnet servers, then I'm sure others will avoid them also. The Email notification has to be permanently fixed and/or a status page is needed that everyone can see.


Gary

gd_barnes 2009-11-30 07:03

[quote=mdettweiler;197329]Yes, he said he was going to do that around Thanksgiving. Gary, how'd that go? :smile:[/quote]

I forgot about it. I'll do that now. One problem: The server machine is the only one on it now. I hate to let it shut itself down. I think what I'll do is let it run halfway down and double that time. Hopefully the little indicators of power remaining are reasonably accurate.

I'll report back here after the test is done.

Edit: Hah! I goofed. It makes a shrill beeping sound when it goes into battery mode. The instructions seemed to indicate that you had to hold the button down 1 second to make it go silent. Unfortunately that shuts it off. I subsequently found out that just a quick press silences it. So I just caused my own outage. Ergh! So you may see an outage on the GB servers of 5-10 mins. as I get them all restarted.

gd_barnes 2009-11-30 07:53

Max, after accidentally turning the backup off after unplugging it (in an effort to silence it), I've now restarted all of the GB servers. You might check them to make sure everything's OK. One thing a bit strange that you might check out: For port 3000 that you added since the last time I restarted all the servers, there is no prpserver244 program. I had to run the more usual prpserver program instead.

OK, everyone, here's the scoop: I'm quite disappointed. With my one server machine running full bore on all 4 cores along with other stuff connected that uses virtually no power like a land line phone, router, modem, and fan, it drained ~80% of the battery backup in ~40 mins.

After spending $250 for a top of the line backup, I'm ready to send it back. Ergh! Anyway, the bottom line is that the backup appears to only be good for < 50 mins. I guess to have a longer backup, I'd have to not run the machine full bore.
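
As a rough check on the "< 50 mins" figure: draining ~80% in ~40 minutes extrapolates to about 50 minutes for a full discharge, assuming the battery drains at a constant rate (which may well not hold for a real UPS, so treat it as an estimate only). The Python sketch below just spells out that arithmetic; the function name is made up.

```python
def estimated_runtime_minutes(percent_drained, minutes_elapsed):
    """Linearly extrapolate total runtime from a partial discharge.

    Assumes a constant drain rate, which is a simplification.
    """
    if not 0 < percent_drained <= 100:
        raise ValueError("percent_drained must be in (0, 100]")
    return minutes_elapsed * 100 / percent_drained


print(estimated_runtime_minutes(80, 40))  # 50.0
```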

The good news is that I can only recall 1 (maybe 2) power outages in 3 years of living here that lasted > 30 mins. From my perspective, that is certainly acceptable so I will leave all 4 cores running.

I guess what bothers me the most is that I had hoped to use it to back up 2 or 3 machines running all 4 cores for at least an hour. As it is, it will stay only on the server machine. I suppose that typical backups are not intended for modern machines running 4 cores full blast.


Gary

mdettweiler 2009-11-30 16:51

[quote=gd_barnes;197357]Max, after accidentally turning the backup off after unplugging it (in an effort to silence it), I've now restarted all of the GB servers. You might check them to make sure everything's OK. One thing a bit strange that you might check out: For port 3000 that you added since the last time I restarted all the servers, there is no prpserver244 program. I had to run the more usual prpserver program instead.[/quote]
Okay, looks good. FYI, don't use prpserver244 any more; that was only something temporary that I was using when Mark and I were debugging problems with the server. In fact, that has a few somewhat major bugs in it that are now fixed in the latest version (2.4.6). Fortunately none of those bugs came back to bite us in the short time they were running 2.4.4 just now. I've restarted all the servers with 2.4.6.

[quote]OK, everyone, here's the scoop: I'm quite disappointed. With my one server machine running full bore on all 4 cores along with other stuff connected that uses virtually no power like a land line phone, router, modem, and fan, it drained ~80% of the battery backup in ~40 mins.

After spending $250 for a top of the line backup, I'm ready to send it back. Ergh! Anyway, the bottom line is that the backup appears to only be good for < 50 mins. I guess to have a longer backup, I'd have to not run the machine full bore.

The good news is that I can only recall 1 (maybe 2) power outages in 3 years of living here that lasted > 30 mins. From my perspective, that is certainly acceptable so I will leave all 4 cores running.

I guess what bothers me the most is that I had hoped to use it to back up 2 or 3 machines running all 4 cores for at least an hour. As it is, it will stay only on the server machine. I suppose that typical backups are not intended for modern machines running 4 cores full blast.[/quote]
Hmm...yeah, I guess that sounds about right for a UPS of that size. Nonetheless, yes, given the specs I'd read for that model, it seemed like it should have more capacity than that...but then, what do I know about interpreting UPS specs? :smile:

BTW, about the status pages and PRPnet that you mentioned a couple posts up: first of all, the status pages do NOT run the email notification. That's the DB. Nonetheless, yes, the status pages are the next-best thing to email notification when checking for primes. I'll see if I can get something cooked up within the next day or two. Also, I'll try to figure out why PRPnet's built-in email notification (which was supposed to hold us over in the interim until we got DB import turned on for G3000) wasn't working.

mdettweiler 2009-11-30 17:16

Behold! We now have a PRPnet status page somewhat like the one we have for the GB LLRnet servers:
[URL]http://nplb-gb1.no-ip.org/prpnet/[/URL]
The first thing the script does is convert the results it's dealing with to LLRnet format, which saved me a lot of work on my end and also ensured that things are largely the same for the end user. The main difference is in the part that displays the #, and first and last k/n pairs remaining. I changed it to a more general "lines in prpserver.candidates", since PRPnet uses two lines to denote assigned pairs, thus meaning that this is close to, but not exactly, the # of k/n pairs remaining. (The exact number would be rather difficult to determine, though this should be close enough for most purposes.) Also, PRPnet sorts its prpserver.candidates file (its equivalent of LLRnet's knpairs.txt, for the uninitiated) by k primary and n secondary, regardless of what order it's actually going to hand out the pairs. Thus, the first and last k/n pairs shown will reflect that. (Again, this is still largely close enough for most purposes, as the lowest n for the first k will probably be rather close to the true lowest n. If you need more granular detail, you can get it per k on the built-in PRPnet status pages.)
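
To illustrate why the first and last pairs shown can differ from the true lowest and highest n, here is a short Python sketch. It works on plain (k, n) tuples with made-up values and does not attempt to parse the real prpserver.candidates format; the function name is hypothetical.

```python
def summarize(pairs):
    """Compare the k-then-n sort order with the true extremes of n."""
    by_k_then_n = sorted(pairs)  # tuples sort by k first, then by n
    return {
        "first_shown": by_k_then_n[0],
        "last_shown": by_k_then_n[-1],
        "true_lowest_n": min(pairs, key=lambda pair: pair[1]),
        "true_highest_n": max(pairs, key=lambda pair: pair[1]),
    }


# Made-up candidates, for illustration only.
pairs = [(405, 600123), (401, 600500), (403, 600007), (401, 600950)]
summary = summarize(pairs)
for label, pair in summary.items():
    print(label, pair)
```

Here the first pair shown is (401, 600500) even though the lowest remaining n belongs to k=403, which is the kind of small discrepancy described above.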

Note that as of now, the first and last lines have the text "N 0 active" on them. That's part of PRPnet's prpserver.candidates format, and I haven't gotten around yet to writing code to parse that out for this status page. I'll get around to that eventually, though it's only a cosmetic difference so not quite my highest priority. :smile:

Probably the biggest benefit of this status page is the fact that it keeps track of all primes found in a file, a la the LLRnet status pages. That should simplify prime-checking greatly until we can get email notification working again.

(BTW: yes, I know G2000 isn't on the status page yet as of this writing. I'll do that next.)

Brucifer 2009-11-30 18:21

Hey Max the page looks great!! Thanx!! :smile: Much much simpler than looking through a bunch of log files. :smile:

Mini-Geek 2009-11-30 19:12

[quote=mdettweiler;197378]Also, PRPnet sorts its prpserver.candidates file (its equivalent of LLRnet's knpairs.txt, for the uninitiated) by k primary and n secondary, regardless of what order it's actually going to hand out the pairs. Thus, the first and last k/n pairs shown will reflect that. (Again, this is still largely close enough for most purposes, as the lowest n for the first k will probably be rather close to the true lowest n. If you need more granular detail, you can get it per k on the built-in PRPnet status pages.)[/quote]
If you'd prefer it sorted by length, (which is practically by n for NPLB's purposes) this is a simple change in PRPnet's source: prpserver.cpp line 1446, change "theCandidate = g_CandidatesBy[B]BKCN[/B][i];" to "theCandidate = g_CandidatesBy[B]Length[/B][i];". I've tried this, and AFAICT it works perfectly. :smile:
Not sure if it's worth making the change each version, but there it is.
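
To show why sorting by length is "practically by n", here is a small Python sketch. It assumes candidates of the form k*2^n-1 and uses made-up (k, n) values; it has nothing to do with PRPnet's own code, and the function name is hypothetical.

```python
def decimal_length(k, n):
    """Number of decimal digits of k*2^n-1 (assumed candidate form)."""
    return len(str(k * 2**n - 1))


# Made-up pairs with a narrow range of k and well-separated n values.
pairs = [(401, 1000), (999, 1010), (403, 1020), (1001, 990)]

by_length = sorted(pairs, key=lambda pair: decimal_length(*pair))
by_n = sorted(pairs, key=lambda pair: pair[1])

print(by_length)
print(by_n)
```

For these values both orderings come out the same, since n dominates the size of the number. When two n values are very close and the k values differ a lot, the two orderings can disagree slightly.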

MyDogBuster 2009-11-30 19:57

Nice job Max. Now I don't have to be re-trained which at my age is next to impossible.:tu:

gd_barnes 2009-11-30 21:22

Excellent. That's a good stop gap page until we can get something that is more exact. The main useful thing is the primes file.

gd_barnes 2009-11-30 21:34

[quote=Mini-Geek;197384]If you'd prefer it sorted by length, (which is practically by n for NPLB's purposes) this is a simple change in PRPnet's source: prpserver.cpp line 1446, change "theCandidate = g_CandidatesBy[B]BKCN[/B][i];" to "theCandidate = g_CandidatesBy[B]Length[/B][i];". I've tried this, and AFAICT it works perfectly. :smile:
Not sure if it's worth making the change each version, but there it is.[/quote]

Outstanding!! Thanks for the great tip Tim.

Max, can you please make this change to the various PRPnet servers, recompile them, stop them, and restart them? Perhaps you can call the changed binary "prpservern", which indicates that the prpserver.candidates file would be sorted by n. The sorting of the candidates file has always annoyed me greatly. You can't quickly tell what candidates are handed out to people. You also can't quickly tell what the "true" lowest and highest remaining n-values are.

You'd also have to do this with each new version of the PRPnet server that comes out unless you can talk Mark into making it the default sorting method in the code. For that reason, I would not suggest putting the version # in the name of the changed binary. We'd have too many of them very quickly.

Another thing, can we delete all of the previous temporary prpserver244 binaries from the various prpnet folders? I also see that one of them has a prpserver230, which I assume can be deleted. That way I'll always know to restart them with the prpserver binary as needed.


Thanks,
Gary

mdettweiler 2009-12-01 00:49

[quote=gd_barnes;197400]Outstanding!! Thanks for the great tip Tim.

Max, can you please make this change to the various PRPnet servers, recompile them, stop them, and restart them? Perhaps you can call the changed binary "prpservern", which indicates that the prpserver.candidates file would be sorted by n. The sorting of the candidates file has always annoyed me greatly. You can't quickly tell what candidates are handed out to people. You also can't quickly tell what the "true" lowest and highest remaining n-values are.

You'd also have to do this with each new version of the PRPnet server that comes out unless you can talk Mark into making it the default sorting method in the code. For that reason, I would not suggest putting the version # in the name of the changed binary. We'd have too many of them very quickly.

Another thing, can we delete all of the previous temporary prpserver244 binaries from the various prpnet folders? I also see that one of them has a prpserver230, which I assume can be deleted. That way I'll always know to restart them with the prpserver binary as needed.


Thanks,
Gary[/quote]
First of all, ditto, thanks for the tip Tim. I'd come across this once before when I was looking for a way to make the PRPnet server hand out k/n pairs by k instead of by decimal size (which usually means by n). To do that, you need to make the reverse change to another line which controls the hand-out order. That was a while back and I haven't tried compiling the tweak into any recent versions. But, yes, good idea. Since this version of PRPnet will hopefully stick around for a while before it needs replacing, it shouldn't be too hard to compile a tweaked version and stick with it. I'll do it as soon as I get the chance.
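
For anyone following along, here is a rough sketch of the difference between the orderings being discussed: by decimal size, by n, and by k. It is written in Python purely for illustration and is not PRPnet's actual C++ code; the function names and sample pairs are made up, and it assumes candidates of the form k*2^n-1.

```python
import math

def decimal_length(k: int, n: int) -> int:
    """Approximate decimal digit count of k*2^n-1 (ignores the rare case
    where subtracting 1 drops a digit)."""
    return math.floor(math.log10(k) + n * math.log10(2)) + 1

def sort_by_size(pairs):
    """Order (k, n) pairs by decimal size, then by k and n as tie-breakers."""
    return sorted(pairs, key=lambda kn: (decimal_length(*kn), kn[0], kn[1]))

def sort_by_n(pairs):
    """Order (k, n) pairs by n first, then by k."""
    return sorted(pairs, key=lambda kn: (kn[1], kn[0]))

def sort_by_k(pairs):
    """Order (k, n) pairs by k first, then by n."""
    return sorted(pairs, key=lambda kn: (kn[0], kn[1]))

# Hypothetical sample pairs, chosen so the three orderings all differ.
pairs = [(2001, 10), (3, 25), (5, 11)]

print(sort_by_size(pairs))
print(sort_by_n(pairs))
print(sort_by_k(pairs))
```

With k values that are close together, sorting by size and sorting by n give nearly the same order, which is why "by decimal size" usually means "by n" in practice.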

@Gary about the various prpserver binaries: I've gotten in the habit of keeping around various older "milestone" versions as backups, primarily for use in case something goes majorly wrong with a new version that I'm testing. Now that 2.4.6 is out of the testing phase, yes, it would be OK to delete prpserver244 and prpserver230. I'll do that as soon as I get around to it.

rogue 2009-12-01 01:07

It seems to me that you want a custom webpage. If you know how to write HTML, it shouldn't be hard for you to create a custom one by modifying the appropriate class.

mdettweiler 2009-12-01 01:40

[quote=rogue;197412]It seems to me that you want a custom webpage. If you know how to write HTML, it shouldn't be hard for you to create a custom one by modifying the appropriate class.[/quote]
The problem is, I know nothing of C++; the closest thing I know is Java, which was enough to allow me to make rudimentary changes like switching the BKCN and Length variables.

Also, one particular advantage of having the web pages generated by a separate script is that I can copy off the actual results files, convert them to LLRnet format, and post it to the web in incremental updates throughout the day. This is useful for providing updates to the NPLB stats database in 1-hour increments throughout the day (rather than just once a day when the results files are copied off). See the "Results since last copy-off" link at [URL]http://nplb-gb1.no-ip.org/prpnet/[/URL] for an example.

But, yes, it would be an interesting idea to check out. Possibly we could make a couple tweaks to the built-in web pages to add some extra functionality. Nonetheless, probably the best solution for NPLB would still be to have a separate script-generated status page to complement PRPnet's built-in pages.

gd_barnes 2009-12-01 02:52

[quote=mdettweiler;197410]@Gary about the various prpserver binaries: I've gotten in the habit of keeping around various older "milestone" versions as backups, primarily for use in case something goes majorly wrong with a new version that I'm testing. Now that 2.4.6 is out of the testing phase, yes, it would be OK to delete prpserver244 and prpserver230. I'll do that as soon as I get around to it.[/quote]


Excellent thinking about keeping older milestone versions. That's a given in the "real world" programming industry. If you mess up a current version beyond repair, you can fairly quickly start anew from the prior one.

Since the testing phase is done on 2.4.6, I'll go ahead and delete prpserver244 and prpserver230 from the folders. The machine's under my finger tips...no use for you to do it.

Thanks all for the tips and work on the PRPnet server. It's been a serious challenge but slowly and surely we've gotten close enough that I'm comfortable graduating all of our LLRnet servers to PRPnet servers as soon as the stats are interfacing with the DB. As Max had mentioned earlier in another thread, we'll probably still leave at least one drive on an LLRnet server for those who prefer them. NPLB has always been about choice; small, medium, and large candidates to test; and now manual, LLRnet, and PRPnet methods of searching. :smile:


Gary

mdettweiler 2009-12-01 03:07

[quote=gd_barnes;197414]Thanks all for the tips and work on the PRPnet server. It's been a serious challenge but slowly and surely we've gotten close enough that I'm comfortable graduating all of our LLRnet servers to PRPnet servers as soon as the stats are interfacing with the DB. As Max had mentioned earlier in another thread, we'll probably still leave at least one drive on an LLRnet server for those who prefer them. NPLB has always been about choice; small, medium, and large candidates to test; and now manual, LLRnet, and PRPnet methods of searching. :smile:[/quote]
I believe I mentioned over in the 7th Drive thread that that would probably be a good one to keep on LLRnet long-term, since eventually we'll have two other nearly identical drives running PRPnet, and Ian's already comfortably set up with that one. Do you concur?

You know, come to think of it, this is perfect timing to begin the PRPnet transition. As we're currently on the eve of having the new server ready, this way we can rather easily transition all the IB servers into new PRPnet GB servers--no downtime to worry about since the original servers will be drying out gracefully. And if G4000 is the one server we keep on LLRnet, that works fine since it's already running on dumpford and won't be too hard to copy over to a similar setup on the new server (as opposed to the rather different system David uses for administrating the ones he's hosting).

gd_barnes 2009-12-01 04:59

[quote=mdettweiler;197415]I believe I mentioned over in the 7th Drive thread that that would probably be a good one to keep on LLRnet long-term, since eventually we'll have two other nearly identical drives running PRPnet, and Ian's already comfortably set up with that one. Do you concur?
[/quote]

Isn't that what I implied with "we'll probably still leave at least..."? You might wanna reread. :smile:

[quote=mdettweiler;197415]
You know, come to think of it, this is perfect timing to begin the PRPnet transition. As we're currently on the eve of having the new server ready, this way we can rather easily transition all the IB servers into new PRPnet GB servers--no downtime to worry about since the original servers will be drying out gracefully. And if G4000 is the one server we keep on LLRnet, that works fine since it's already running on dumpford and won't be too hard to copy over to a similar setup on the new server (as opposed to the rather different system David uses for administrating the ones he's hosting).[/quote]

I'm confused. How are we on the eve of having a new server ready? Isn't one already ready and running? Begin the PRPnet transition? What does that mean? To me, the plan all along was to wait for current ranges on the LLRnet servers to be about 3 days from drying and create a new PRPnet server in their place on my machines with the next n-range, like we did for port G3000.

But...one problem here. We can't just make everything (except the 7th drive) PRPnet servers without the stats interface and Email notification working. First things first...let's get that working for port 3000. Then we can transition more servers as they are near drying.

Here's how I see the transition going:

1. You coordinate with Dave on getting the stats interface and Email notification working. (this week?)
2. I get my machines built and a call put in on getting a commercial account set up. (call will be made Tues.; machines built by Fri.; actual changeover to commercial account will likely be 1-2 weeks)
3. I do the various techie stuff related to the Smoothwall router and the such and coordinate with you to test it to make sure it is working well. (by Tues. the 8th)
4. You/we move all of the current LLRnet/PRPnet servers from Dumpford to the new machine. No new servers are created at this point. (Weds.-Thurs. 8th-9th)
5. About 3 days before each LLRnet server dries, create an appropriate PRPnet server with the next higher n-range and update our threads as necessary to reflect them. Optionally this could be done well ahead of time with appropriate testing to make sure the stats interface and Email notification is working. (various after Dec. 9th)
6. After each PRPnet server from #5 is verified as working/interfacing/notifying correctly, once again update our threads to reflect the changeover and inform Ironbits to phase out the appropriate server.

To accomplish this in a reasonable time frame, we'll probably need to have Ironbits reduce the n-range for the k=1400-2000 drive and have someone like me put a bunch of cores on it to dry it out.

Ideally it'd be nice to have all of this done by year end but I know I've delayed things quite a bit by dragging my feet on building the machines. At this point, a realistic timeframe would be by ~Jan. 15th.

This effectively avoids stopping a server in the middle of processing an n-range in order to change it from LLRnet to PRPnet. IMHO, that is a potentially messy nightmare to make sure all pairs get processed.


Gary

mdettweiler 2009-12-01 15:59

[quote=gd_barnes;197421]Isn't that what I implied with "we'll probably still leave at least..."? You might wanna reread. :smile:[/quote]
Hmm...you're right, it was worded a bit unclearly. What I was intending to ask was whether you concurred that the 7th Drive/G4000 would be the one to keep on LLRnet.

[quote]I'm confused. How are we on the eve of having a new server ready? Isn't one already ready and running? Begin the PRPnet transition? What does that mean? To me, the plan all along was to wait for current ranges on the LLRnet servers to be about 3 days from drying and create a new PRPnet server in their place on my machines with the next n-range, like we did for port G3000.

But...one problem here. We can't just make everything (except the 7th drive) PRPnet servers without the stats interface and Email notification working. First things first...let's get that working for port 3000. Then we can transition more servers as they are near drying.

Here's how I see the transition going:

1. You coordinate with Dave on getting the stats interface and Email notification working. (this week?)
2. I get my machines built and a call put in on getting a commercial account set up. (call will be made Tues.; machines built by Fri.; actual changeover to commercial account will likely be 1-2 weeks)
3. I do the various techie stuff related to the Smoothwall router and the such and coordinate with you to test it to make sure it is working well. (by Tues. the 8th)
4. You/we move all of the current LLRnet/PRPnet servers from Dumpford to the new machine. No new servers are created at this point. (Weds.-Thurs. 8th-9th)
5. About 3 days before each LLRnet server dries, create an appropriate PRPnet server with the next higher n-range and update our threads as necessary to reflect them. Optionally this could be done well ahead of time with appropriate testing to make sure the stats interface and Email notification is working. (various after Dec. 9th)
6. After each PRPnet server from #5 is verified as working/interfacing/notifying correctly, once again update our threads to reflect the changeover and inform Ironbits to phase out the appropriate server.

To accomplish this in a reasonable time frame, we'll probably need to have Ironbits reduce the n-range for the k=1400-2000 drive and have someone like me put a bunch of cores on it to dry it out.

Ideally it'd be nice to have all of this done by year end but I know I've delayed things quite a bit by dragging my feet on building the machines. At this point, a realistic timeframe would be by ~Jan. 15th.

This effectively avoids stopping a server in the middle of processing an n-range in order to change it from LLRnet to PRPnet. IMHO, that is a potentially messy nightmare to make sure all pairs get processed.


Gary[/quote]
Yes, that's essentially what I had in mind. Sorry for any confusion on that.

As for the email notification and DB import: I'm still waiting for Dave on that one. According to his forum profile, he hasn't been on since the 24th. My guess is he went away for Thanksgiving and still hasn't come back. :smile:

Flatlander 2009-12-06 14:53

[QUOTE=gd_barnes;197348]...
I think what I'll do is let it run halfway down and double that time. Hopefully the little indicators of power remaining are reasonably accurate.
...[/QUOTE]
I think that you have already found this out but there is far too much on this forum (and CRUS) for me to read properly :wink: :
My experience with electric wheelchairs (and the emergency power pack in my wagon) tells me that indicator lights are not to be trusted.
The 2nd 'half' is usually much less than the 1st 'half'! (I still don't fully understand that newfangled electrickery.)

gd_barnes 2009-12-08 01:42

PRPnet is not ready for a full rollout at NPLB yet. Therefore we'll have only the 5th, 6th, 10th, and doublecheck drives on it for the foreseeable future. The 12th drive is currently on PRPnet. When the current k=2400-2600 range is near drying out, we'll be creating an LLRnet server for subsequent ranges. The small tests result in large loads that are creating too many problems for PRPnet.

As a SWAG, my feeling is that PRPnet should be able to handle 100-120 cores on tests for n>700K. It may be more but to be safe, that would be best.

I've made the applicable modifications to the 1st post here to account for the current situation.

mdettweiler 2009-12-08 03:26

[quote=gd_barnes;198133]PRPnet is not ready for a full rollout at NPLB yet. Therefore we'll have only the 5th, 6th, and doublecheck drives on it for the foreseeable future. The 12th drive is currently on PRPnet. When the current k=2400-2600 range is near drying out, we'll be creating an LLRnet server for subsequent ranges. The small tests result in large loads that are creating too many problems for PRPnet.

As a SWAG, my feeling is that PRPnet should be able to handle 40-60 cores on tests for n>700K. It may be more but to be safe, that would be best.

I've made the applicable modifications to the 1st post here to account for the current situation.[/quote]
Gary, see my post on this in the PRPnet servers for NPLB thread. I've observed that PRPnet can handle somewhat more than you're thinking, even if not the tiny n=50K-250K stuff.

gd_barnes 2009-12-08 04:45

[quote=mdettweiler;198138]Gary, see my post on this in the PRPnet servers for NPLB thread. I've observed that PRPnet can handle somewhat more than you're thinking, even if not the tiny n=50K-250K stuff.[/quote]

Based on it handling 30-50 cores from Bruce at n=200K, I've upped my previous post to 100-120 cores at n=700K. Perhaps it can handle that and maybe it can't. You can't do a direct timing comparison. Example:

Let's say n=200K tests take 100 secs. That means n=700K tests take 100*(7/2)^2=~1200 secs. (Not actual; only an example.)

That doesn't mean it can handle 1200/100=12 times as many clients. That's because what if all clients hit the n=700K range at once? There's no easy way to do an apples-to-apples comparison.
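
To make the arithmetic in that example explicit, here is a small Python sketch. It assumes the same rule of thumb as above, that test time grows with the square of n; the function name is just for illustration, and the 100-second baseline is the made-up figure from the example, not a measured timing.

```python
def scaled_test_time(base_secs: float, base_n: int, n: int) -> float:
    """Estimate test time at exponent n from a known time at base_n,
    assuming time grows with the square of n (rule of thumb only)."""
    return base_secs * (n / base_n) ** 2

# Example figures from the post: 100 secs at n=200K, scaled to n=700K.
secs_at_700k = scaled_test_time(100, 200_000, 700_000)
ratio = secs_at_700k / 100

print(secs_at_700k)  # 1225.0, i.e. roughly 1200 secs
print(ratio)         # 12.25, i.e. roughly 12 times longer per test
```

The ratio only describes how long each test takes. It says nothing about how many clients connect in the same instant, which is the part that cannot be compared directly.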

From what I can tell, it's based off of how many clients concurrently hit the server and that is regardless of the n-value size. If one person starts 100 cores all at once at n=40M, the server may still barf, even though the tests take 20 days or more. We just have to ask ourselves: What is an acceptable level of risk? If we keep risking these problems, especially on an increasingly higher percentage of our servers, our higher-resourced folks will find testing to do elsewhere.


Gary

rogue 2009-12-08 14:13

[QUOTE=gd_barnes;198142]From what I can tell, it's based off of how many clients concurrently hit the server and that is regardless of the n-value size. If one person starts 100 cores all at once at n=40M, the server may still barf, even though the tests take 20 days or more. We just have to ask ourselves: What is an acceptable level of risk? If we keep risking these problems, especially on an increasingly higher percentage of our servers, our higher-resourced folks will find testing to do elsewhere.[/QUOTE]

Gary, I would prefer that you avoid the use of that term because it is too generic.

There is one issue with the 2.4.6 server that Max has a patch for and has implemented. I have not released 2.4.7 yet. There are definitely no barfing issues of any kind in 2.4.6.

I did not write PRPNet to be multi-threaded or to compete directly with BOINC. It was intended for smaller projects. I would argue that having hundreds of clients is really pushing the limit of what PRPNet was intended for.

The issue here is that, since PRPNet is not multi-threaded, some clients need to wait to send/receive work to/from the server. The server should be able to handle up to about 10 clients connecting each minute. When you have a client connecting every 5 minutes to send/receive work and then add hundreds of clients doing the same thing every few minutes, that is problematic. Telling users to grab enough work to keep busy for a longer period of time greatly reduces the load on the server.
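
As a back-of-the-envelope illustration of that point, here is a short Python sketch. The limit of about 10 connections per minute is the rough figure given above; the function names and the client counts are only examples, and the calculation assumes connections are spread evenly over time, which a real burst of clients would not be.

```python
SERVER_LIMIT_PER_MINUTE = 10  # rough figure quoted above, not a measured value

def connections_per_minute(clients: int, minutes_between_connections: float) -> float:
    """Average connections per minute if every client connects once per interval."""
    return clients / minutes_between_connections

def min_minutes_between_connections(clients: int,
                                    limit: float = SERVER_LIMIT_PER_MINUTE) -> float:
    """Shortest per-client interval that keeps the average at or under the limit."""
    return clients / limit

# 200 clients each connecting every 5 minutes: well over the limit.
print(connections_per_minute(200, 5))

# The same 200 clients holding an hour of work each: comfortably under it.
print(connections_per_minute(200, 60))

# Interval each of 200 clients would need to stay at about 10 per minute.
print(min_minutes_between_connections(200))
```

In other words, the load depends on how often each client connects, so a larger work queue per client lowers it without reducing the number of cores.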

I don't intend to multi-thread PRPNet until there is a database behind it. That requires a lot of work and considering that I am doing the development on my own in my limited free time, it is unlikely to happen anytime soon.

Brucifer 2009-12-08 18:18

[QUOTE=rogue;198186]I did not write PRPNet to be multi-threaded or to compete directly with BOINC. It was intended for smaller projects. I would argue that having hundreds of clients is really pushing the limit of what PRPNet was intended for.[/QUOTE]

Before things get blasted into the stratosphere........ :) It would appear that there are multiple issues here that multiple people are looking at. The first thing though to keep in mind is the comment above about PRPNet being intended for smaller projects. I can understand that, and as for the issue of having hundreds of clients pushing the limits, I have seen that first hand with the very short tests, and have to agree with Rogue on that too.

So basically what we have here is a couple opposing thought lines. On the one hand is a project that likes to run rallies with a wad of clients hitting a server to try and get the maximum tests completed by a bunch of people in a short specified amount of time. That right there in itself is contrary to the author's intended purpose of writing the application as that specifically was not what he wrote it for. So based on that it would be reasonable to assume that we shouldn't be pushing this application for a purpose that it wasn't intended for. And what that really boils down to is for something like the Double Check work that prpnet is currently processing and working just fine on, then it's great for that purpose. For handling very high volume situations, then things are much better handled by llrnet which has been working better with large traffic. However, let's not forget that llrnet has its issues too, they are just different issues. The initial hassle on llrnet on port 7000 went nuts for some reason that really hasn't been explained well yet. The port blew up before Gary got to the issue of the infamous "UPS Test" scenario.

I have had zero issues with prpnet on the doublecheck port, or on the port 3000 work. So again, everything points back to the short tests. While we were hammering G7000, Lennart had better than a hundred cores hitting it at the time I was running around 30 cores. On a Q6600 each core is doing multiple tests per minute. That is just a horrendous hit on the server, not to forget that we also have added latency from standard internet hassles and all thrown into the pot too. I would dare say that prpnet would work just fine on the short tests if there was only a small number of clients hitting it at the same frequency rate as what hits Ports 3000 and 2000.

So the long and short of this is that there is a time and place for everything...... and it is wise to keep that in mind even when dealing with llr testing LOL Heavy hitters should really be running their own servers as it takes much much stress off the project's servers, it also significantly reduces the bandwidth drain across the board on the internet, and places it on an internal network which is much better at handling the load. The admin can watch his own stuff, and if he(she) needs to add more servers to the mix it is a simple thing to do.

Personally I think that all the port 7000 type work should just be put out for manual reservations. Not many people run that stuff anyway as most people are intent on hitting the top-5000 stuff. With the port 7000 stuff out of the way, then things would and should get back to their peaceful normal status quo.

Max is a geek and loves playing with this stuff. He has two apps to use now, llrnet and prpnet. Gives him lots of geeking time. Two apps gives boys like AMDave/Bok more little projects to write their stats stuff for. Put up both llrnet and prpnet servers so people can use whatever turns their little old hearts on. Then everyone will have their own little space in the playground here to play in..... and everyone can do what they enjoy doing as that's what we are here for anyway. :-) And through it all, we will still be finding primes which is what Gary (and tribe) was intending when creating this project.

As for Max, he's been working away hard at trying to keep us all happy, and your efforts don't go unnoticed Max. As for Rogue, I hope you don't get dismayed at all this controversy over Port 7000 tests. It is good you wrote PRPNet. It does what YOU intended for YOUR application to do. More power to you. I think it is cool that you have taken the step to let others use your app because in doing that you open yourself up to stuff like all this controversy over the short tests, etc. I sure hope that this stuff doesn't tick you off and you give up on it cause it's working really well for what you intended.

Meanwhile, if possible just get the rest of the port 7000 pairs ready and what Lennart doesn't want to do, I'll do, and we can get those puppies out of sight and out of mind and out of the way and move on down the road.

rogue 2009-12-08 19:22

[QUOTE=Brucifer;198238]As for Max, he's been working away hard at trying to keep us all happy, and your efforts don't go unnoticed Max. As for Rogue, I hope you don't get dismayed at all this controversy over Port 7000 tests. It is good you wrote PRPNet. It does what YOU intended for YOUR application to do. More power to you. I think it is cool that you have taken the step to let others use your app because in doing that you open yourself up to stuff like all this controversy over the short tests, etc. I sure hope that this stuff doesn't tick you off and you give up on it cause it's working really well for what you intended.[/QUOTE]

Thank you!

I'm starting a new thread that will be used to list requirements for PRPNet to replace LLRNet.

gd_barnes 2009-12-09 01:03

Let me also thank you, Mark, for your excellent contribution to prime searching. The new PFGW is a tremendous tool now. PRPnet is great for smaller volume operations.

On PRPnet, since it has been used at PrimeGrid, I was wholly under the impression that it was also for high volume use and as a smaller competitor for BOINC. I thought that was the intent of allowing the varying percentages against different servers.

Since it was your intent that PRPnet only be for smaller applications, I agree it is a great tool now. Unfortunately we've had to rein ourselves in a little here. It was our original intention to convert all NPLB servers except one to PRPnet but we've now realized that we can't do that. We have to define what is small and what is big at NPLB. For the time being, we're making n>600K small and n<600K big. That's the best we can do. It's reversed from the n-size because from the server's perspective, smaller tests are bigger because it means more load on them.

Yes, I have been harsh on PRPnet but none of it was ever intended as personal insults directed at you. I'm sorry if you felt that way. I was only under the impression that PRPnet was intended for high volume use and have become extremely frustrated at Max saying that it was "ready for high volume use", yet it wasn't. Max, were you aware that it was not intended for such high use? Somewhere along the line, I think both of us were misled.

Regardless, I don't know the techie reason of why there are still issues in higher volume situations. Max has said that multi-threading is needed so I have to go with that. Also what I heard earlier is memory utilization, especially in situations where there is an unexpected outage of the server or clients. I know these are huge changes so for the time being, we shall go with the assumption that it can only be used in lower volume situations.


Gary

mdettweiler 2010-01-22 20:20

Hi all,

I have now posted a Windows client package for PRPnet 3.1.3. (The Linux counterpart is still pending an inquiry I sent to Mark via email; it should be available within a day or two.) I recommend that all users upgrade their clients to the latest version; if you're upgrading from anything older than 2.4.4 or so, make sure that you use the new prpclient.ini included with the package as it has new options in it.

Note that as of yet, all of our PRPnet servers are still on version 2.4.6. I will be upgrading those to 3.1.3 over the next few days. In the meantime, the 3.1.3 client is fully backwards compatible with 2.4 servers (and vice versa).

I will be posting a new thread shortly with details about a new PRPnet 3.1.3 server which I've set up and loaded with small 12th Drive doublecheck tests for stress testing purposes. Stay tuned. :smile:

Max :smile:

gd_barnes 2010-01-23 07:06

[quote=mdettweiler;202858]Hi all,

I have now posted a Windows client package for PRPnet 3.1.3. (The Linux counterpart is still pending an inquiry I sent to Mark via email; it should be available within a day or two.) I recommend that all users upgrade their clients to the latest version; if you're upgrading from anything older than 2.4.4 or so, make sure that you use the new prpclient.ini included with the package as it has new options in it.

Note that as of yet, all of our PRPnet servers are still on version 2.4.6. I will be upgrading those to 3.1.3 over the next few days. In the meantime, the 3.1.3 client is fully backwards compatible with 2.4 servers (and vice versa).

I will be posting a new thread shortly with details about a new PRPnet 3.1.3 server which I've set up and loaded with small 12th Drive doublecheck tests for stress testing purposes. Stay tuned. :smile:

Max :smile:[/quote]

I assume we'll have a full stress test successfully completed before doing a wholesale upgrade of NPLB servers. 2.4.6 is running just fine. It can handle many clients on our higher n-ranges with no problem. It's just the lower n-ranges with many clients that it can't handle.


Gary

mdettweiler 2010-01-23 16:11

[quote=gd_barnes;202924]I assume we'll have a full stress test successfully completed before doing a wholesale upgrade of NPLB servers. 2.4.6 is running just fine. It can handle many clients on our higher n-ranges with no problem. It's just the lower n-ranges with many clients that it can't handle.


Gary[/quote]
Right, I'm not sure what I was thinking when I wrote that. :rolleyes: The only real problem with 2.4.6 is that the email notification in it is broken, but that's not a big deal since for all the NPLB public servers the DB can handle that. That, and there were a couple extra features (built-in sorting options, etc.) that were added in the also-stable 2.4.7, but I was too lazy to upgrade them since we didn't need any of those features right off the bat. I may as well keep them on 2.4.6 now until 3.1.4 (or 3.1.5 by the time we've nailed down these little issues here?) has passed testing.

mdettweiler 2010-01-23 17:35

Client packages for PRPnet 3.1.4 are now available. This fixes a bug for the Windows version in which the CPU was tied up erroneously during server communications, and it is the first 3.1 version posted here for Linux. Come and get 'em! :smile:

gd_barnes 2010-01-23 21:52

I just now ran a client on 3.1.4 for the first time. A couple of nits:

1. LLR is writing an output status every 10,000 iterations, which causes too much screen scrolling. That can be set in manual LLR to something higher. How can we change that in the client?

2. I see the timing for each candidate in the work_G7465.save file. We need that timing to be on the candidate in the prpclient.log file because the work_G7465.save file is only temporary.

As an example on #1, it is writing this:
2001*2^101941-1, iteration : 20000 / 101941 [19.61%]. Time per iteration : 0.15

I'll put a quad on it for a while. If working well, I'll dogpile most of my cores later tonight.

Max, I added a Greeting.txt file so that it stopped writing out those not found messages.


Gary

mdettweiler 2010-01-23 22:42

[quote=gd_barnes;202990]I just now ran a client on 3.1.4 for the first time. A couple of nits:

1. LLR is writing an output status every 10,000 iterations, which causes too much screen scrolling. That can be set in manual LLR to something higher. How can we change that in the client?[/quote]
That's a little difficult because PRPnet recreates llr.ini for each candidate. However, LLR is actually designed NOT to scroll, but rather just overwrite the previous line; it's prevented from doing this, though, by the smaller width of the console window pushing it onto the next line. This can be remedied by making the console window wider: on Linux, just stretch it sideways like you'd resize any window; on Windows, click on the little "C:\" logo in the command prompt's titlebar and then click Defaults; set Width to 90, click OK, and you should be all set.

[quote]2. I see the timing for each candidate in the work_G7465.save file. We need that timing to be on the candidate in the prpclient.log file because the work_G7465.save file is only temporary.[/quote]
Hmm, right. Mark, could you possibly add that info to prpclient.log in 3.1.5?

Speaking of which, it would be helpful to have that in the server's logs as well. Heck, I'm not even seeing [i]anything[/i] about returned tests in prpserver.log--is this an error?

[quote]As an example on #1, it is writing this:
2001*2^101941-1, iteration : 20000 / 101941 [19.61%]. Time per iteration : 0.15

I'll put a quad on it for a while. If working well, I'll dogpile most of my cores later tonight.[/quote]
I'll forewarn you, this may not be the definitive dogpile--I don't think this release has even addressed the "no candidates on server" message yet (Mark, correct me if I'm wrong here), so we'll need to re-test after that's fixed.

[quote]Max, I added a Greeting.txt file so that it stopped writing out those not found messages.[/quote]
Okay, thanks.

gd_barnes 2010-01-24 01:50

OK, I'll do that on the stretching of the window. I suppose it's no different than PFGW now so not a big deal.

Based on what you're saying, I think I'll hold off on the dogpile. I see that Lennart has brought some machines and we're starting to get some "volume related" type of errors. Here is a sampling:

[quote]
[2010-01-24 01:46:49 GMT] Error sending [WorkUnit: 2001*2^148837-1 1264297609 2001 2 148837 -1] to socket 17
[2010-01-24 01:46:49 GMT] ODBC Information: SQL_ERROR: [MySQL][ODBC 3.51 Driver][mysqld-5.0.51a-3ubuntu5.4]Duplicate entry '2001*2^148837-1-1264297609' for key 1
[2010-01-24 01:46:49 GMT] ODBC Information: SQL_ERROR: [MySQL][ODBC 3.51 Driver][mysqld-5.0.51a-3ubuntu5.4]Duplicate entry '2001*2^148837-1-1264297609' for key 1
[2010-01-24 01:46:49 GMT] Error sending [End of Message] to socket 17
[2010-01-24 01:46:49 GMT] ODBC Information: SQL_ERROR: [MySQL][ODBC 3.51 Driver][mysqld-5.0.51a-3ubuntu5.4]Duplicate entry '2001*2^148837-1-1264297609' for key 1
[2010-01-24 01:46:49 GMT] Error sending [Start Greeting] to socket 17
[2010-01-24 01:46:49 GMT] Error sending buffer to socket 17
[2010-01-24 01:46:49 GMT] ODBC Information: SQL_ERROR: [MySQL][ODBC 3.51 Driver][mysqld-5.0.51a-3ubuntu5.4]Duplicate entry '2001*2^148837-1-1264297609' for key 1
[2010-01-24 01:46:49 GMT] ODBC Information: SQL_ERROR: [MySQL][ODBC 3.51 Driver][mysqld-5.0.51a-3ubuntu5.4]Duplicate entry '2001*2^148837-1-1264297609' for key 1
[2010-01-24 01:46:49 GMT] ODBC Information: SQL_ERROR: [MySQL][ODBC 3.51 Driver][mysqld-5.0.51a-3ubuntu5.4]Duplicate entry '2001*2^148837-1-1264297609' for key 1
[2010-01-24 01:46:49 GMT] Error sending [End Greeting] to socket 17
[2010-01-24 01:46:49 GMT] ODBC Information: SQL_ERROR: [MySQL][ODBC 3.51 Driver][mysqld-5.0.51a-3ubuntu5.4]Duplicate entry '2001*2^148837-1-1264297609' for key 1
[/quote]Looks like that needs to be looked into.


Gary
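For anyone wading through a prpserver.log like the excerpt above, here is a minimal Python sketch that tallies the duplicate-key errors per key and the failed sends per socket. The two regular expressions are written only against the lines quoted above; the names `summarize_log`, `DUPLICATE_RE` and `SEND_ERROR_RE` are made up for this example and are not part of PRPnet.

```python
import re
from collections import Counter

# Matches the "Duplicate entry '<key>' for key 1" text seen in the excerpt.
DUPLICATE_RE = re.compile(r"Duplicate entry '([^']+)' for key")
# Matches "Error sending <something> to socket <n>".
SEND_ERROR_RE = re.compile(r"Error sending (.+) to socket (\d+)")


def summarize_log(lines):
    """Count duplicate-key errors per key and failed sends per socket."""
    duplicates = Counter()
    send_errors = Counter()
    for line in lines:
        dup = DUPLICATE_RE.search(line)
        if dup:
            duplicates[dup.group(1)] += 1
            continue
        err = SEND_ERROR_RE.search(line)
        if err:
            send_errors[int(err.group(2))] += 1
    return duplicates, send_errors


# Three lines copied from the log excerpt above.
sample = [
    "[2010-01-24 01:46:49 GMT] Error sending [WorkUnit: 2001*2^148837-1 "
    "1264297609 2001 2 148837 -1] to socket 17",
    "[2010-01-24 01:46:49 GMT] ODBC Information: SQL_ERROR: [MySQL]"
    "[ODBC 3.51 Driver][mysqld-5.0.51a-3ubuntu5.4]Duplicate entry "
    "'2001*2^148837-1-1264297609' for key 1",
    "[2010-01-24 01:46:49 GMT] Error sending buffer to socket 17",
]
duplicates, send_errors = summarize_log(sample)
print(duplicates.most_common(5))
print(send_errors.most_common(5))
```

Running it over the whole log would show whether the duplicate entries cluster on a few candidates or are spread across many.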

rogue 2010-01-24 03:39

[QUOTE=mdettweiler;202992]Hmm, right. Mark, could you possibly add that info to prpclient.log in 3.1.5?

Speaking of which, it would be helpful to have that in the server's logs as well. Heck, I'm not even seeing [i]anything[/i] about returned tests in prpserver.log--is this an error?


I'll forewarn you, this may not be the definitive dogpile--I don't think this release has even addressed the "no candidates on server" message yet (Mark, correct me if I'm wrong here), so we'll need to re-test after that's fixed.[/QUOTE]

In regards to your first question, it could be added, but I don't understand the need.

As for the second question, the tests should be logged to prpserver.log. I know they are written to completed_tests.log, so the issue is rather minor. I probably did something stupid. I'll investigate.

Regarding your last question, I don't have enough data to start looking into the issue.

For Gary's issue, it appears that the client disconnected while the server was trying to send a test. Gary, did the client disconnect during the connection? The duplicate keys are occurring on the CandidateTest table. I made a number of changes in 3.1.2 which should have addressed that problem, so I don't understand why it is happening. Can you send me more of the log so that I can see the communication between server and clients?

gd_barnes 2010-01-24 04:36

[quote=rogue;203004]In regards to your first question, it could be added, but I don't understand the need.[/quote]

I got confused as to your "1st question", "2nd question" numbering. I don't really know what question you're referring to.

I [I]hope[/I] this response is NOT you referring to the need for testing time on each candidate because I really don't want to keep going into the reason. Both Ian and I have brought that up and it has been mentioned at least twice and now 3 times. We need the testing time for a candidate on both the client and server side to determine workflow on our machines.

What we're saying is that it doesn't work to have the test time in a temporary file. It needs to be in a permanent file. LLR, PFGW, Proth, and LLRnet have the test time in a permanent file. PRPnet needs that test time too.

We are respectfully requesting that you add testing time per candidate to both the server and client side in permanent files with release 3.1.5. It is what we have asked for in the "requirements for PRPnet" thread.

That said, assuming you're not referring to the testing time, perhaps my server ignorance is getting the best of me here. I don't know what Max really means by the fact that he is "not even seeing [I]anything[/I] about returned tests in prpserver.log--is this an error?".

Is the previous sentence about returned tests what you are referring to when you said you don't see the need? If so, I'll leave that to you and him.

I'll see what I can get you as far as more of those error messages go. I stopped my client one time: To simply change it from batching 3 tests to 1 test. I then restarted it shortly afterwards. I'm running Linux 8.04 64-bit.

I have a suggestion: After making changes for a release but before releasing it to us, why not put 2 quads on it at a low n-range and let it rip for testing. Lennart and I each only had about a quad on it and these problems are coming out. I'm pretty sure you would have had the same.

We could send you a file. That way, you can see quite a few problems ahead of time before a public release. I know you only have a Windows set up but to me, these problems appear independent of the OS being used. It would save an immense amount of everyone's time if we didn't have to keep testing one change at a time.


Thank you,
Gary

gd_barnes 2010-01-24 06:51

1 Attachment(s)
Now a different error is appearing. I think that is because it's taking a very long time (10-15 secs.) each time it needs to retrieve a batch. In my case, the batch is 1 pair so we can stress test it as much as possible. The error message that is appearing is the dreaded:

"Nothing was received on socket 3, therefore the socket was closed."

This seems to happen every time it returns a batch.

Because I don't know all of what you need, I'm attaching the entire prpserver.log file. In the next post, I'll attach the entire client log file from one of my clients.

You'll also notice that the "MySQL" error referred to in my recent post is sprinkled out fairly regularly throughout the entire file.


Gary

gd_barnes 2010-01-24 07:20

Well, I've discovered something kind of interesting. Max, in the PRPnet thread, you've configured the .ini file in the 3.1.4 client to include most of NPLB's servers, even though this server is SQL and all prior servers are not.

Now, since all the servers were in the .ini file (with 100% on G7465) whenever there was a connect problem (or some problem of that nature) on this test, what would happen is that it would go and retrieve a pair from the 0% port 3000 or port 5000, which you have listed as 2nd and 3rd. But the problem with that is that those are PRPnet 2.4.6. Anyway, you get the drift. 3.1.4 client, 2.4.6 server. It kept trying to return them and it didn't work.

But...don't stop reading...

That said, I was still getting the dreaded "nothing was received" message on some of the port 7465 connects so the problem still exists.

In looking at the previous MySQL error messages and the nothing was received error messages, they are clearly still related to our current test. It appears that Lennart is still getting them too.

I'm bringing this up because I'm not going to attach the prpclient.log file just yet. I'm stopping my clients, changing the .ini file to only go after port 7465 and restarting them. But before restarting them, I'm going to rename the prpclient.log file so that we can tell what was happening before and after the change. I don't want prior "valid" error messages to be lost.

In a little while, I'll attach a prpclient.log file without any possible testing on ports 3000 and 5000.

I hope I haven't messed up any pairs on ports 3000 and 5000. In other words, if the server couldn't handle the pairs, I hope they'll eventually be handed back out to someone.

Max, can you please change the .ini file in the PRPnet client in the PRPnet thread so that this doesn't happen to others? Just listing one server there is all that is needed. Perhaps a comment right above it to indicate that more than one server with percentages can be added if needed.


Gary

gd_barnes 2010-01-24 08:11

1 Attachment(s)
With Lennart now throwing a whole bunch of cores on there, we may be able to isolate some things. I can say that we're still getting the MySQL error messages and "nothing returned" error messages from time to time. On my clients, since they are only configured to port 7465 now, whenever they can't connect within about 3 seconds, I get a "No available candidates are left on this server." and a "Could not connect to any servers and no work is pending. Pausing 1 minute." error message. It then waits a minute and seems to connect OK.

I'm now attaching a prpclient.log file.

Here is a thought: Shouldn't it wait more than 3 seconds before either looking for another server or before pausing for a minute? I know we don't want to wait too long because there are usually secondary servers to access but IMHO, it seems like it should wait as long as 10 seconds before looking elsewhere or pausing.


Gary
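The timeout-then-fallback behavior discussed above can be sketched roughly as follows. This is not how prpclient is written; `choose_server`, `tcp_connect` and the host names are hypothetical, and the 10-second default is simply the value suggested in the post.

```python
import socket


def tcp_connect(server, timeout_seconds):
    """Return True if a TCP connection to (host, port) opens in time."""
    host, port = server
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False


def choose_server(servers, try_connect=tcp_connect, timeout_seconds=10.0):
    """Try servers in priority order; return the first reachable one or None.

    A None result is the case where a client would pause (for example
    one minute) before trying the whole list again.
    """
    for server in servers:
        if try_connect(server, timeout_seconds):
            return server
    return None


# Demo with a stand-in connect function so no network access is needed.
# The host names below are placeholders, not real NPLB servers.
primary = ("primary.example.invalid", 7465)
backup = ("backup.example.invalid", 3000)


def fake_connect(server, timeout_seconds):
    return server == backup  # pretend only the backup answers


picked = choose_server([primary, backup], try_connect=fake_connect)
none_up = choose_server([primary], try_connect=fake_connect)
print(picked, none_up)
```

Passing the connect function as a parameter keeps the selection logic testable without a live server; a longer timeout trades a slower fallback for fewer spurious pauses.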

rogue 2010-01-24 13:49

[QUOTE=gd_barnes;203012]I [I]hope[/I] this response is NOT you referring to the need for testing time on each candidate because I really don't want to keep going into the reason. Both Ian and I have brought that up and it has been mentioned at least twice and now 3 times. We need the testing time for a candidate on both the client and server side to determine workflow on our machines.

What we're saying is that it doesn't work to have the test time in a temporary file. It needs to be in a permanent file. LLR, PFGW, Proth, and LLRnet have the test time in a permanent file. PRPnet needs that test time too.[/QUOTE]

The test time is in the database, but is not written to either the client or server log (unless debugging is enabled). I was trying to understand why the client would need to log that time. Note that the client does not log completed tests locally. Is that what you are really asking for?

I looked at the log files. Unfortunately I need the test run with debuglevel=4 (on the server) to diagnose. I see the errors (duplicate keys), but I don't understand why that is happening. When using InnoDB, MySQL supports transactions. I turn off autocommit and lock the database row to prevent other threads from accessing it. This is all done before the thread updates the database. It appears (at first glance) that the database row is not getting locked or that there is no transaction. I have a way around this, BUT I want to see that debug log before I can be certain that the problem is what I think it is.

Note that with debuglevel=4, that file will grow very quickly. It is best to get the "duplicate key" error a few times, then terminate the server and set it back to debuglevel=0.

I also need that same setting on the server to understand why you are getting the "No available candidates on server" message. I don't understand why the server is not finding any candidates for the client and setting debuglevel=4 on the server will help me diagnose that.
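To illustrate what a "Duplicate entry ... for key 1" error means, here is a small sketch that uses Python's built-in SQLite in place of MySQL/InnoDB, so it shows only the primary-key collision and not InnoDB row locking. The table name CandidateTest comes from this thread; the column names and the `record_test` helper are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (candidate, test id) pair.
conn.execute(
    "CREATE TABLE CandidateTest ("
    " CandidateName TEXT NOT NULL,"
    " TestId INTEGER NOT NULL,"
    " PRIMARY KEY (CandidateName, TestId))"
)


def record_test(conn, name, test_id):
    """Insert a test row inside a transaction; False if it already exists."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO CandidateTest (CandidateName, TestId)"
                " VALUES (?, ?)",
                (name, test_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Same situation the MySQL driver reports as a duplicate entry.
        return False


first = record_test(conn, "2001*2^148837-1", 1264297609)
second = record_test(conn, "2001*2^148837-1", 1264297609)
print(first, second)
```

If two server threads both decide to hand out the same candidate with the same test id, the second insert fails like this, which is why the row has to be locked before either thread picks it.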

mdettweiler 2010-01-24 17:45

[quote=rogue;203049]I looked at the log files. Unfortunately I need the test run with debuglevel=4 (on the server) to diagnose. I see the errors (duplicate keys), but I don't understand why that is happening. When using InnoDB, MySQL supports transactions. I turn off autocommit and lock the database row to prevent other threads from accessing it. This is all done before the thread updates the database. It appears (at first glance) that the database row is not getting locked or that there is no transaction. I have a way around this, BUT I want to see that debug log before I can be certain that the problem is what I think it is.

Note that with debuglevel=4, that file will grow very quickly. It is best to get the "duplicate key" error a few times, then terminate the server and set it back to debuglevel=0.

I also need that same setting on the server to understand why you are getting the "No available candidates on server" message. I don't understand why the server is not finding any candidates for the client and setting debuglevel=4 on the server will help me diagnose that.[/quote]
Mark, I've sent you a debug log from the last day or so on the server with debuglevel=3 as previously requested. Is this good, or should I set it to debuglevel=4 and get some more data?

rogue 2010-01-24 17:48

[QUOTE=mdettweiler;203068]Mark, I've sent you a debug log from the last day or so on the server with debuglevel=3 as previously requested. Is this good, or should I set it to debuglevel=4 and get some more data?[/QUOTE]

I received your e-mail, so I'll look into it. Hopefully it is enough when combined with the other logs.

mdettweiler 2010-01-24 17:54

BTW @Gary regarding the 2.4.6 servers with the 3.1.4 clients: oops, my bad. When I put them in there like that I was under the mistaken impression that they were backwards-compatible. No, it shouldn't mess up anything on the server; what's going to happen is just that those k/n pairs will be left stranded, and will expire in 2 days (maybe 1 day, I forget what they were set to) to be handed out to someone else.

I'd recommend that you, and anyone else in a similar situation, stop your 3.1.4 clients, delete any "work_G3000.save" or "work_G5000.save" files (i.e. any 2.4.6 servers); comment out the lines for those servers in prpclient.ini (put a // in front of them); and restart the clients. That will ensure that they won't keep trying to return work futilely to servers they're not compatible with.
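For those with many clients, the commenting-out step could be scripted along these lines. This is only a sketch: the `server=` lines in the example are a guess at the layout and use placeholder host names, so check them against your own prpclient.ini before relying on it. The helper name `comment_out_servers` is made up.

```python
def comment_out_servers(ini_text, server_tags):
    """Prefix matching server= lines with // so the client ignores them."""
    out = []
    for line in ini_text.splitlines():
        stripped = line.lstrip()
        is_active_server = stripped.startswith("server=")
        if is_active_server and any(tag in stripped for tag in server_tags):
            out.append("//" + line)
        else:
            out.append(line)  # other settings and existing comments
    return "\n".join(out)


# Hypothetical prpclient.ini fragment; the real server= syntax may differ.
example_ini = "\n".join([
    "server=G7465:100:1:example.invalid:7465",
    "server=G3000:0:1:example.invalid:3000",
    "server=G5000:0:1:example.invalid:5000",
])
edited = comment_out_servers(example_ini, ["G3000", "G5000"])
print(edited)
```

Stop the client first and keep a copy of the original file; the matching work_G3000.save and work_G5000.save files still need to be deleted by hand.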

gd_barnes 2010-01-24 22:55

[quote=rogue;203049]The test time is in the database, but is not written to either the client or server log (unless debugging is enabled). I was trying to understand why the client would need to log that time. Note that the client does not log completed tests locally. Is that what you are really asking for?[/quote]

We are not communicating because this should not be such a big issue. We've asked for this many times in order for PRPnet to replace LLRnet. Have you looked at an LLRnet server and the results that it puts out on both the client and server side? Those are almost exactly what we need.

We need the test time in the prpclient.log file and preferably in the equivalent file on the server side.

I have a question for you. If you are running a client and you don't have access to the server/database, how would you get the test time? You couldn't without jumping through hoops to calculate it based on the difference in the time of day from the last test. How is a person with 40-50 clients supposed to see how long their tests take?

Let me spell this out in detail. Here is a cut-and-paste of what LLRnet gives us:

Client lresults.txt file:
2959*2^522293-1 is not prime. Res64: 31FE419D5EEE8F44 Time : 1800.608 sec.
Result 2959/522293 succesfully sent to the server.

Server results.txt file:
user=gd_barnes
[2010-01-24 00:01:02]
2757*2^526348-1 is not prime. Res64: 4EE56376A6B5239E Time : 2305.0 sec.


Now, you see the test times of 1800 secs. and 2305 secs.? That's exactly what we need.

I'll take it one step further. The above is ALL that we need in the prpclient.log file.

We don't need info. about a candidate being sent or one being received. That should be optional info. and it should be shown in another file.

In other words, we need specific info. about the test in an "official" results file. All of the other info. about sending and receiving and other server messages/errors, IMHO, should be in another file.

Would it make sense for you to look at and run an LLRnet server? Since that is what you're wanting to accomplish, that is having PRPnet replace LLRnet, then I think that is what should be done. If you look at one, I think you'll see why the transition has been so extremely difficult for us. The file names in LLRnet are clear and concise but I've never gotten a warm fuzzy about what the PRPnet server file names mean. But what LLRnet is missing is the flexibility, a newer version of LLR, and the detailed server messages that we need. That is why we are excited about PRPnet. We'd really rather have those server messages separate from the results.

That said, at this point, we're OK if you want to leave all of that info. in the prpclient.log file now. But we must have the testing time right there.

I hope this finally puts this issue to rest and clarifies exactly what we need.


Thank you,
Gary
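As an illustration of why a per-test time in a permanent results file is useful, here is a short Python sketch that pulls the candidate, residue and test time out of lines shaped like the LLRnet examples above and averages the times. The regular expression is written only against those two sample lines (it does not try to handle "is prime" lines), and `parse_results` is a hypothetical helper name.

```python
import re

# Shaped after the "is not prime" result lines quoted above.
RESULT_RE = re.compile(
    r"^(\S+) is not prime\.\s+Res64: ([0-9A-F]{16})\s+Time : ([0-9.]+) sec\."
)


def parse_results(lines):
    """Return (candidate, res64, seconds) for every matching result line."""
    parsed = []
    for line in lines:
        match = RESULT_RE.match(line.strip())
        if match:
            candidate, res64, seconds = match.groups()
            parsed.append((candidate, res64, float(seconds)))
    return parsed


sample = [
    "2959*2^522293-1 is not prime. Res64: 31FE419D5EEE8F44 Time : 1800.608 sec.",
    "Result 2959/522293 succesfully sent to the server.",
    "2757*2^526348-1 is not prime. Res64: 4EE56376A6B5239E Time : 2305.0 sec.",
]
results = parse_results(sample)
average = sum(seconds for _, _, seconds in results) / len(results)
print(results)
print(round(average, 3))
```

With 40-50 clients, running something like this over each results file gives per-machine averages without needing access to the server database.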

rogue 2010-01-25 00:15

I have not run an LLRNet server. Although the goal is for PRPNet to replace LLRNet, it doesn't mean that PRPNet has to replicate every feature of LLRNet.

You stated that you wanted test time. I added it, but only as a data point collected by the server. I made that clear in the release notes. Nobody told me that I misinterpreted the requirement. PRPNet has never recorded completed tests on the client side, so adding test time only made sense (to me) on the server. If the client recorded tests, then I would have added it there as well because it would have been an easy thing to infer from the requirement. What I'm saying is that the real requirement here is that the client has to record tests locally and those tests need to include the test time. This is not how you stated the requirement.

Anyways, I will add the test time to the client, but I need to know what pieces of information you want logged. Unlike LLRNet, PRPNet supports multiple helper programs. The log on the client will need to use a consolidated format. I also need to know if the client wants to record the PRP test alone or both the PRP and primality tests. For NPLB, everything is base 2, so LLR is already doing a primality test. Other projects, such as those by PrimeGrid and CRUS, are more than just base 2, so this is an important question to answer.

kar_bon 2010-01-25 00:26

just as additional information: i've changed my LLRnet files 'llrnet.lua' and 'client.lua' to display the results in 'lresults.txt' like this:

[code]
[2010-01-24 03:31:11] 2445*2^526111-1 is not prime. Res64: 718DAA79CA386FD4 Time : 610.165 sec.
[2010-01-24 03:41:22] 2493*2^526111-1 is not prime. Res64: 330234CFD2F0B77B Time : 609.989 sec.
[2010-01-24 03:51:32] 2563*2^526111-1 is not prime. Res64: 837A3BF4E9B5FB2F Time : 610.112 sec.
[2010-01-24 04:01:42] 2781*2^526111-1 is not prime. Res64: B481441102FA5DAE Time : 609.728 sec.
[2010-01-24 04:11:52] 2929*2^526111-1 is not prime. Res64: 3A54A9D73847E985 Time : 609.897 sec.
[2010-01-24 04:22:02] 2991*2^526111-1 is not prime. Res64: 3BB36A1F3EE4E26A Time : 609.961 sec.
[2010-01-24 04:32:12] 2043*2^526112-1 is not prime. Res64: F126A11EE5562927 Time : 609.768 sec.
[2010-01-24 04:42:22] 2117*2^526112-1 is not prime. Res64: D6F069913D1BD6E3 Time : 609.736 sec.
[/code]

the line with '... sucessfully sent to the server' omitted and time/dates in the file added.
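as a side note, the added timestamps also make it possible to estimate the test time from the gaps between consecutive lines, which is the workaround needed when no test time is logged. a small Python sketch, using the first three timestamps from the listing above (`gaps_seconds` is just an illustrative name):

```python
from datetime import datetime


def gaps_seconds(stamps):
    """Seconds elapsed between each pair of consecutive timestamps."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


stamps = [
    "2010-01-24 03:31:11",
    "2010-01-24 03:41:22",
    "2010-01-24 03:51:32",
]
gaps = gaps_seconds(stamps)
print(gaps)  # [611.0, 610.0]
```

the gaps land within a second or two of the logged test times (about 610 sec.), but they only work while the client runs without pauses, so a logged test time is still the more reliable number.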

all primes found will also be recorded in a separate file 'primes.txt' in the dir-level above the LLRnet-dir:

[code]
[2009-10-29 07:50:19] 741 724834
[2009-12-02 08:25:37] 699 736845
[2009-12-23 10:16:49] 1895 665256
[/code]

so only a wink of an eye to see, if there's a new prime found!

perhaps a switch to turn off all server/client messages in the PRP-output will do it, too.
i think, there's no need for all those messages put in a file when all is working fine. if not switch on and record those texts.

for PRPnet with 4 cores (-> 4 dirs with different ini's) i use such script to determine the number of tests or primes found:

[code]
@echo off
if exist PRPnet_G3001\llr.prime type PRPnet_G3001\llr.prime
find /C "Residue" PRPnet_G3001\prpclient.log
if exist PRPnet_G3002\llr.prime type PRPnet_G3002\llr.prime
find /C "Residue" PRPnet_G3002\prpclient.log
if exist PRPnet_G3003\llr.prime type PRPnet_G3003\llr.prime
find /C "Residue" PRPnet_G3003\prpclient.log
if exist PRPnet_G3004\llr.prime type PRPnet_G3004\llr.prime
find /C "Residue" PRPnet_G3004\prpclient.log
pause
[/code]

which gives me this info:
[code]
---------- PRPNET_G3001\PRPCLIENT.LOG: 343
---------- PRPNET_G3002\PRPCLIENT.LOG: 341
---------- PRPNET_G3003\PRPCLIENT.LOG: 332
---------- PRPNET_G3004\PRPCLIENT.LOG: 331
[/code]
(no prime yet so all llr.prime do not exist)

Karsten
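For Linux users, the same check can be sketched in Python. It does what the batch script above does: count the lines containing "Residue" in each prpclient.log and show llr.prime if it exists. The helper name `summarize_clients` is made up, and the log lines written in the demo are placeholders rather than real prpclient.log output.

```python
import tempfile
from pathlib import Path


def summarize_clients(base_dir, client_dirs):
    """Map each client dir to (lines containing 'Residue', llr.prime lines)."""
    summary = {}
    for name in client_dirs:
        client = Path(base_dir) / name
        log = client / "prpclient.log"
        prime_file = client / "llr.prime"
        tests = 0
        if log.exists():
            with log.open(errors="replace") as handle:
                tests = sum(1 for line in handle if "Residue" in line)
        primes = []
        if prime_file.exists():
            primes = prime_file.read_text(errors="replace").splitlines()
        summary[name] = (tests, primes)
    return summary


# Demo in a throwaway directory; the log content is placeholder text.
with tempfile.TemporaryDirectory() as tmp:
    client = Path(tmp) / "PRPnet_G3001"
    client.mkdir()
    (client / "prpclient.log").write_text(
        "Residue A\nsome other message\nResidue B\n"
    )
    demo = summarize_clients(tmp, ["PRPnet_G3001", "PRPnet_G3002"])
print(demo)
```

A missing directory or log simply counts as zero tests, so the same list of directory names can be reused on machines with fewer cores.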

gd_barnes 2010-01-26 00:23

I agree with Karsten here about needing separate results and logging. But...I think Mark is giving us our wish with version 3.1.5. We've needed a separate results and logging file for a very long time now. I made the mistake in thinking that Max had brought this up to you at some point.

You said you need to know what pieces of information we want logged for the results on the client side. I gave that to you in detail, cut-and-pasted from an LLRnet server and client. Please reread my prior post. As for the server messages that have been in prpclient.log that need to be in a separate file from the results, I'll leave it up to Max on that. I think what we're getting now is fine.

As for how I stated the requirement: I stated it in a manner that assumed you knew what a results file looks like as written out by LLR, PFGW, Proth, LLRnet, etc. That's all we've needed this entire time. I had not realized that you had not run an LLRnet server before. I'm sorry if it came out in a confusing manner. Now that I know that, I'll know to give you specific examples cut-and-pasted from other software of what is needed instead of just stating what is needed.

Not to beat a dead horse here Mark, but I feel like it would be very good if you would download an LLRnet server, set it up, load it with pairs, and then put at least 2 quads worth of clients on it for several hours. I realize that it is somewhat klunky and inflexible and that we are only base 2 and that LLRnet is only intended for base 2. But the files that LLRnet writes out are a close approximation of what many different prime search projects need. That is results, pairs remaining, a joblist (explained below), and a rejected file. That's the main crux of the output from LLRnet on the server side: The joblist.txt, knpairs.txt, rejected.txt, and results.txt file. Each has a very specific purpose and is easy to wade through...no extraneous information.

What it doesn't have is PRPnet's flexibility in its programs used and logging of problems.

I promise you: LLRnet will enlighten you as to what many prime search projects need. One of its best features is the joblist.txt file. For pairs handed out and not processed, it's a file that shows exactly when they were handed out and who has them "reserved". What makes it easy is that that is ALL that it shows...no other information to filter through. It can help isolate problem users.

One more thought: Would it make sense to always test a release on at least 2-3 quads for at least an hour or so before rolling it out to the general public for testing? Set the batching to 1, load it with some very small k/n pairs, and let it rip for a good starting stress test. I'm fairly confident that even a Windows-only setup running ~10 cores would find some of these problems related to load on the server ahead of time. (Make sure the k/n pairs have an n-value around n=50000 for the best test.) Even without a Linux testing setup, I really feel that this will nip a lot of small problems in the bud and there will be fewer releases. (I don't know; maybe you're already doing that level of testing.)

What I'm trying to come up with is an ahead-of-time test plan that we can all live with. The idea here is to reduce the # of releases and total testing time by everyone.

This is only my two cents on what I feel would help everyone in testing. Please know that we are with you with the MySQL version of PRPnet. We really want to get it correct this time around.

Max, I'd like just you and me to test version 3.1.5. Mark has informed us at CRUS that he didn't test it so we need to do some serious testing on it. I don't want to coordinate any kind of testing problems with several different people here. The good news is that it is supposed to have separate results (with a testing time) and server logging messages. That's a huge deal to many of us "record keepers" and "work coordinators" here at NPLB and CRUS.


Gary
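To make the joblist idea concrete, here is a rough Python sketch of the kind of question such a list answers: which pairs have been reserved by whom for longer than some limit. The in-memory layout below is purely hypothetical and is not LLRnet's joblist.txt format; the user names are examples, and the 2-day limit is the expiry figure mentioned earlier in this thread.

```python
from datetime import datetime, timedelta


def stale_reservations(joblist, now, max_age):
    """Return sorted (pair, user) entries handed out longer ago than max_age."""
    return sorted(
        (pair, user)
        for pair, (user, handed_out) in joblist.items()
        if now - handed_out > max_age
    )


# Hypothetical joblist: (k, n) pair -> (user, time the pair was handed out).
joblist = {
    (2001, 148837): ("user_a", datetime(2010, 1, 24, 1, 46, 49)),
    (2959, 522293): ("user_b", datetime(2010, 1, 21, 12, 0, 0)),
}
now = datetime(2010, 1, 24, 12, 0, 0)
stale = stale_reservations(joblist, now, timedelta(days=2))
print(stale)
```

A list like this, grouped by user, is what makes it quick to spot a client that keeps taking pairs without returning them.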

mdettweiler 2010-01-26 00:33

[quote=gd_barnes;203220]Max, I'd like just you and me to test version 3.1.5. Mark has informed us at CRUS that he didn't test it so we need to do some serious testing on it. I don't want to coordinate any kind of testing problems with several different people here. The good news is that it is supposed to have separate results (with a testing time) and server logging messages. That's a huge deal to many of us "record keepers" and "work coordinators" here at NPLB and CRUS.[/quote]
Okay. Mark just sent me an email answering a question I had about building the 3.1.5 server for Linux, so I should be able to get it built and ready to roll shortly.

rogue 2010-01-26 02:05

[QUOTE=gd_barnes;203220]Not to beat a dead horse here Mark, but I feel like it would be very good if you would download an LLRnet server, set it up, load it with pairs, and then put at least 2 quads worth of clients on it for several hours. I realize that it is somewhat klunky and inflexible and that we are only base 2 and that LLRnet is only intended for base 2. But the files that LLRnet writes out are a close approximation of what many different prime search projects need. That is results, pairs remaining, a joblist (explained below), and a rejected file. That's the main crux of the output from LLRnet on the server side: The joblist.txt, knpairs.txt, rejected.txt, and results.txt file. Each has a very specific purpose and is easy to wade through...no extraneous information.[/quote]

IMO, just because LLRNet does something doesn't mean that PRPNet has to do it the same way, if at all. I created the other thread so that you and others could give me detailed requirements so that PRPNet could replace LLRNet. None of what you are listing here is in that thread. Note that PrimeGrid is using PRPNet on many diverse projects and nobody over there has asked for these things.

I agree that I need to do more testing, but I don't have any quads to do testing with. I also do not have Linux boxes to test with and Linux has been the source of most of the more difficult problems that PRPNet has run into. I have done a lot of testing on Mac OS X and Windows, and they have not exhibited any (and I mean any) of the socket or MySQL problems that Linux has exhibited. Many of the mods for PRPNet are specifically written to address the nuances of Linux.

gd_barnes 2010-01-26 04:45

[quote=rogue;203224]IMO, just because LLRNet does something doesn't mean that PRPNet has to do it the same way, if at all. I created the other thread so that you and others could give me detailed requirements so that PRPNet could replace LLRNet. None of what you are listing here is in that thread. Note that PrimeGrid is using PRPNet on many diverse projects and nobody over there has asked for these things.

I agree that I need to do more testing, but I don't have any quads to do testing with. I also do not have Linux boxes to test with and Linux has been the source of most of the more difficult problems that PRPNet has run into. I have done a lot of testing on Mac OS X and Windows, and they have not exhibited any (and I mean any) of the socket or MySQL problems that Linux has exhibited. Many of the mods for PRPNet are specifically written to address the nuances of Linux.[/quote]

I'm through with this conversation.

[URL="http://www.mersenneforum.org/showpost.php?p=199647&postcount=5"]This post[/URL] says what we wanted related to the testing time; that is what side it needed to be on (client side), yet somehow we weren't specific enough. Did we need to say exactly what file it needed to be in? If so, sorry. I'll do that in the future.

You don't need "linux boxes". You can run Linux Ubuntu on a Windows machine.

The problems are load related, not Linux related.

Just because LLRnet has it doesn't mean that PRPnet should NOT have it.

etc. etc.

I could say more but it's not worth it.

rogue 2010-01-26 13:55

[QUOTE=gd_barnes;203246]You don't need "linux boxes". You can run Linux Ubuntu on a Windows machine.

The problems are load related, not Linux related.[/QUOTE]

I would get into quite a bit of trouble "borrowing" my wife's PC to install Linux. I get into enough trouble running other stuff on it. My other windows PC is from work and I definitely cannot touch that one. In other words, I don't have a Windows box that I am free to install Linux on.

Depending upon the problem, a number of them have been Linux only. All of the issues I've had with buffering of data in sockets can only be reproduced on Linux. The duplicate keys problem existed in an early version of 3.x, but was fixed when changing the engine to InnoDB and making a code change on the server. I pounded my own server harder than you guys are pounding yours (small workunits and 5 clients) and I did not see the duplicate key issue after those changes. From the logs I saw you quickly got duplicate key issues with only 4 clients and medium sized workunits.

I don't have enough information yet regarding the "No candidates available on server" message. This is the most troublesome issue (to me) that I would like to get resolved.

vaughan 2010-01-27 08:19

[QUOTE=rogue;203292]I would get into quite a bit of trouble "borrowing" my wife's PC to install Linux. I get into enough trouble running other stuff on it. ..[/QUOTE]
Yeah I get that scolding too. :smile:

It runs something like ... "What's this cr@p running on MY computer. Get it off there now."

My solution is to run projects that allow it, e.g. the old D2OL project and currently DPAD/Muon, from a USB memory stick. When I hear her car in the driveway I quickly hit Ctrl-C, Y and safely remove the flash drive. A quick shutdown of Windows completes the task. :innocent:

henryzz 2010-01-27 16:37

[quote=rogue;203292]I would get into quite a bit of trouble "borrowing" my wife's PC to install Linux. I get into enough trouble running other stuff on it. My other windows PC is from work and I definitely cannot touch that one. In other words, I don't have a Windows box that I am free to install Linux on.[/quote]
What about a virtual machine? Virtualbox should do exactly what you need.

rogue 2010-01-27 16:49

[QUOTE=henryzz;203424]What about a virtual machine? Virtualbox should do exactly what you need.[/QUOTE]

The issue is that I use her PC and put other software on it. I don't see how a virtual environment solves the problem.

henryzz 2010-01-27 17:23

[quote=rogue;203427]The issue is that I use her PC and put other software on it. I don't see how a virtual environment solves the problem.[/quote]
If you use a virtual environment then Linux will effectively be like a program instead of having to create a partition, boot into it etc. and yet it will still have all the functionality of normal Linux.

mdettweiler 2010-01-28 00:22

Actually, what might be even better is for me to set up an account on the server itself for Mark. That way he could debug and test the software directly on the machine we're intending to use it on. Thoughts?

rogue 2010-01-28 00:48

[QUOTE=mdettweiler;203518]Actually, what might be even better is for me to set up an account on the server itself for Mark. That way he could debug and test the software directly on the machine we're intending to use it on. Thoughts?[/QUOTE]

That might work.

mdettweiler 2010-01-28 23:12

Windows and Linux client binaries have been posted for version 3.1.5. The main difference between this and 3.1.4 is that now the client outputs a separate log file listing just its results, along with the time it took to test each one.

Since there were no changes made to prpclient.ini in this version, those upgrading existing clients can just swap out the prpclient binary and leave the rest intact.

Note: despite the fact that this client version is not compatible with our older 2.4.6 servers, I've left those servers configured in prpclient.ini since we'll probably be upgrading those sometime soon (assuming that 3.1.5 checks out in further testing). Until then, the only server these will work with is the G7465 beta test server, which is on 3.1.5. I'd recommend commenting out the other lines for now so that the client doesn't try to fall back on them in case it can't reach port 7465, since that will cause it to waste its time crunching tests that it can't send back to the older server.

gd_barnes 2010-01-29 12:54

[quote=mdettweiler;203518]Actually, what might be even better is for me to set up an account on the server itself for Mark. That way he could debug and test the software directly on the machine we're intending to use it on. Thoughts?[/quote]

Max and I discussed this and I thought it would be a very good idea. Fire away when ready Max.

mdettweiler 2010-02-09 21:31

Hi all,

I've now posted client packages for PRPnet 3.2.0 in the first post of this thread. Note that they are not compatible with 2.4 servers, which includes all of our public servers right now since we're still beta-testing 3.x. This is despite the fact that I included preconfigured server= lines for those servers in the 3.2.0 client packages' prpclient.ini's; I did that so that they'd be all ready to go when we do upgrade our public servers to 3.2 later on. In the meantime, stick with a 2.4 client on those servers; I've added links to download 2.4.6 clients in the first post. (I know 2.4.7 is a little newer, but as I recall I had trouble building Linux binaries for that and ended up "skipping" that version since the changes since 2.4.6 in the client don't really affect NPLB or CRUS. If you want a 2.4.7 client, you can find client packages for it over at the Prime Sierpinski Project forum.)

One other significant change with this release is that I've included the new LLR 3.8.0 in the PRPnet 3.2.0 client packages. (It's not in the 2.4.6 client packages; just swap in the latest binaries if you'd like.) LLR 3.8.0, like the latest PFGW and Prime95, uses gwnum 25.13, which gives it a huge speed improvement over earlier versions on non-base-2 numbers, and a smaller but nonetheless non-negligible improvement for base 2. (I don't recall the exact figures off the top of my head.) LLR 3.8.0's input and output follow the same conventions as 3.7.1c, so it's completely cross-compatible as far as PRPnet is concerned.

I've also upgraded the G7465 beta-test server to 3.2.0. Per Gary's request, we'd like to do the first phase of testing it "in-house", with just Gary's clients hammering it, to avoid spreading around any problems if they show up. :smile: I'll let you guys know when we're done with that and are ready to open it up to the public.

Max :smile:


All times are UTC.