Replacing Entropia's primenet server
 2002-10-04, 03:30 #34 Prime Monster Aug 2002 104₁₆ Posts
TNGG (The Next Generation of GIMPS)

Let's get back to basics here. How we do it and where we do it is not important at the moment. I am very serious: I could quite easily handle the network bandwidth requirement as it is today - personally :mrgreen: .

What I am trying to find out is what we would like to see in The Next Generation of GIMPS (TNGG). Talking about SourceForge (or any similar solution) is a bit premature at the moment. This is not to say that this type of solution will not be needed in the future.

dswanson is on the right track: what would he like to see in TNGG, and he gives his answer. I am only asking: what would you like to see in TNGG? I am not really concerned about feasibility, or 1st, 2nd, 3rd or 4th generation functionality. If you do not make a well-designed DB, then you can forget about g2, g3, g4 etc., so I assume that is done reasonably well.

What I am asking about is what you want from TNGG. If we have or get a good database design, then it is a piece of cake to provide any report you can think of - the web programs might take a bit of time to implement, but that is another issue.

Let's get the requirements and wishlist in place first; then we, George, PrimeNet, and other parties will have a chance to give their opinions about the list. Then we design, and after that we implement.

We need someone to volunteer to be "project manager" for the "discovery" process, i.e. for identifying and documenting all requirements. I suggest Xyzzy, not only because he runs this bulletin board and therefore is available, but also because I believe he would not jeopardize the project and would only think in terms of doing the best for TNGG.

Alf / Prime Monster / heretic / etc :D
Quote:
 Originally Posted by Prime Monster PS: Remember we are dealing with 30-40.000 users and hopefully growing. This is not a system you run on a win95 box
I wouldn't want to see GIMPS using a win95 box as a server... but let's look at that "30-40,000" figure for a moment. That's not 40,000 permanent connections to the server; that's 40,000 boxes which connect to the server maybe once a week. That means an average of about four hits per minute.

phpBB tells me that this page took 0.574978 seconds to generate (which says something about the sluggishness of php, but let's not get sidetracked); if it takes more than half that time to assign an exponent to a machine, something is severely broken. Slashdot normally handles 25 hits per second with 10 machines, and their pages are far more complicated than anything GIMPS-server would be doing.
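
As a rough check of the figures above, here is a minimal sketch. The function name is hypothetical, and the inputs are only the numbers quoted in this post (40,000 machines checking in about once a week, Slashdot at 25 hits per second over 10 machines), not measured PrimeNet data.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes


def hits_per_minute(machines, checkins_per_week=1.0):
    """Average server hits per minute if check-ins are spread evenly over the week."""
    return machines * checkins_per_week / MINUTES_PER_WEEK


# 40,000 machines, each contacting the server about once a week.
gimps_rate = hits_per_minute(40_000)

# Slashdot figure from the post: 25 hits/second shared by 10 machines,
# converted to hits per minute per machine.
slashdot_per_machine = 25 * 60 / 10

print(f"GIMPS: {gimps_rate:.2f} hits/minute")
print(f"Slashdot: {slashdot_per_machine:.0f} hits/minute per machine")
```

Real traffic would be burstier than an even spread, so this is an average and not a peak estimate.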

As for talk of proxies... there are two purposes behind proxies, and both are unnecessary. The first is to handle load; but the load which GIMPS would take is easily small enough for a single box to handle. The second is to hide transient server failures; a much better solution is to simply have prime95 not complain as much if it can't contact the server.
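
One way a client could "not complain as much" is to retry quietly with increasing delays and only warn after several consecutive failures. The sketch below illustrates that idea only; it is not how prime95 works, and every name and default in it (`contact_with_backoff`, `warn_after`, the 60-second base delay) is a hypothetical choice.

```python
import time


def contact_with_backoff(send, max_attempts=5, base_delay=60.0,
                         warn_after=3, sleep=time.sleep, warn=print):
    """Call send() until it succeeds or max_attempts is reached.

    Returns send()'s result, or None if every attempt failed.
    Stays silent until `warn_after` consecutive failures.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except OSError:  # includes ConnectionError and timeouts
            if attempt == warn_after:
                warn("Server unreachable; will keep retrying quietly.")
            if attempt < max_attempts:
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * 2 ** (attempt - 1))
    return None
```

The `sleep` and `warn` parameters are injectable so the behaviour can be exercised without actually waiting.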

It's been a week since George has posted here; perhaps we should put off our discussions until he clarifies further what he wants to do?

Quote:
 Originally Posted by cperciva As for talk of proxies... there are two purposes behind proxies, and both are unnecessary. The first is to handle load; but the load which GIMPS would take is easily small enough for a single box to handle. The second is to hide transient server failures; a much better solution is to simply have prime95 not complain as much if it can't contact the server. It's been a week since George has posted here; perhaps we should put off our discussions until he clarifies further what he wants to do?
What I want to do??? I want my house construction nightmare to end!

Seriously, this discussion has been useful. I had not considered designing in proxies. I'm not sure how that would work or be implemented. It is true that one server can handle the current load and small outages are a nuisance. But wouldn't proxies serve as insurance against catastrophic failure? A serious hardware failure could take a few days to repair. A hurricane could knock out power for a week. Also, redundancy helps keep our data safe. Are these advantages worth the headache of solving several technical hurdles? I don't know.

I agree with previous posters that there are two major hurdles to solve. One is cost. I think enough interest has been expressed here that the roughly $100 / month cost could be handled. Even an initial server outlay may not be a problem. The second hurdle is design and implementation. Coming up with a grand plan and implementing in stages seems reasonable. Has xyzzy expressed interest in coordinating a planning document?

 2002-10-04, 20:34 #37 QuintLeo Oct 2002 Lost in the hills of Iowa 2⁶×7 Posts
Re: why not use distributed.net

Quote:
 Originally Posted by cperciva why not "transfer" this project to distributed.net?

> 1. Culture clash. People running d.net generally don't have the patience to run LL tests taking weeks or months to complete

Some might not. A lot would - and they've got easily 10 TIMES as many users at any given point. A LOT of us d.net folks have been active in d.net for 3+ years - over 5 in my case.

> 2. Architecture. The problems which d.net was designed to attack have lots of tiny blocks which are simply "done" or "not done".

Not true for OGR. OGR does in fact pass an answer back - the length and designation of the smallest ruler found in a given "block". The answers that Prime passes back shouldn't be any longer - I think they would be shorter. The data Prime pulls FROM the server is definitely shorter, though not by a lot. The assigned "problems" are a lot shorter, I grant.

> 3. Independence. This discussion arose partly out of a desire to make GIMPS independent of Entropia; going from being dependent upon Entropia to being dependent upon United Devices

distributed.net IS NOT PART OF UNITED DEVICES. They do have quite a few (NOT a majority) of the major "names" working for UD. That doesn't invalidate the rest of your point here, though - it *is* still a change of dependency, not going "independent".
 2002-10-05, 02:05 #38 trif Aug 2002 2·101 Posts
With only 58 MB per day of bandwidth (what does this spike to on heavy days, like last year when M39 was discovered?), I was thinking this would be trivial to host on a $30/month account at pair.com (this forum is hosted there), but when George said the reports can gobble 100% CPU for several minutes, that went out the window. You can't run long-running, CPU-intensive processes on their shared machines. You'd need one of their dedicated servers, the cheapest being $250/month.

The only problem is that's a lot of money for a lot more bang than the project needs in terms of bandwidth (1 GB per day on the lowest level). But it might be possible to share that with others. For instance, I need to find hosting for my daughter's school, and I've made inquiries with pair.com. In terms of space and bandwidth, we also need about what we can get from one of those $30 accounts, but the restrictions on what you can do with the shared machines (to ensure you don't step on everybody else's toes) make that solution less than desirable. We don't need CPU; we need scripting flexibility for small mailing lists and we want to run a news server, so cohabiting with a site that hogs the CPU for several minutes once an hour is not onerous as long as the web site stays responsive.

I would also like to open a small site for myself; I'd probably go for the $18 a month option. If we get enough people who want a share of a dedicated server, it could fly.

 2002-10-05, 03:26 #39 Lumly Aug 2002 Quebec, Canada 29 Posts
I must point out that if someone implements good stats served from the, well, server :) then the bandwidth usage will probably shoot through the roof.
 2002-10-05, 03:53 #40 Xyzzy "Mike" Aug 2002 17740₈ Posts
Quote:
 Originally Posted by trif In terms of space and bandwidth, we also need about what we can get from one of those $30 accounts, but the restrictions on what you can do with the shared machines (to ensure you don't step on everybody else's toes) make that solution less than desirable. We don't need CPU, we need scripting flexibility for small mailing lists and we want to run a newsserver, so cohabiting with a site that hogs the CPU for several minutes once an hour is not onerous as long as the web site stays responsive. I myself would like to open up a small site for myself, I'd probably go for the $18 a month option. If we get enough people that want a share of a dedicated server, it could fly.
All of TPR, this BBS, and my many other sites are on one of Pair's $30 a month plans... Just the other day all of my stuff used 1.6GB in one day and the server just motored on...

The only thing I do not get is CPU time... Remember when this BBS had those awful times pulling pages? That was because someone else on the shared server was nuking it with a runaway process... The server will kill it eventually, but in the meantime, the runaway process kills interactive performance...

And you know human nature requires that if it don't work the first time, we must click the button again...

Personally, for a project like the stats, I'd prefer to have a bit more control over the system... Pair is nice, but they severely limit what I can do... I'm only here for the uptime...

Quote:
 Originally Posted by Prime95 Has xyzzy expressed interest in coordinating a planning document?
I have absolutely no experience with doing that, but that never stopped me before...

I would suggest that if there is someone more qualified than me then they should do this... But I am willing to do whatever you all want me to... (I suppose I might have to find out what a "planning document" is!) :)

Quote:
 Originally Posted by Prime95 Seriously, this discussion has been useful. I had not considered designing in proxies. I'm not sure how that would work or be implemented. It is true that one server can handle the current load and small outages are a nuisance. But wouldn't proxies serve as insurance against catastrophic failure? A serious hardware failure could take a few days to repair. A hurricane could knock out power for a week.
If that is a real concern, I'd suggest getting a dedicated server from somewhere like rackspace.com; given that they advertise 99.999% reliability (and have, at least in the past, provided it), I think getting a server from them would mean that hardware issues are practically solved. (Of course, it would be more expensive on a monthly basis; on the other hand, it wouldn't require the initial purchase of a server.)

But I don't really think it is a real concern. I always kept a couple weeks of work queued up; as long as any downtime doesn't exceed the duration for which people have their machines queuing work, there should be no damage (apart from perhaps scaring people with error messages).
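
To put numbers on both points, here is a minimal sketch. The function names are hypothetical, the 99.999% figure is the advertised one quoted above, and the two-week queue is just the habit described in this post, not a measured average across users.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability):
    """Downtime per year implied by an availability fraction (e.g. 0.99999)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def outage_is_harmless(outage_days, queued_days):
    """True if clients still have queued work for the whole outage."""
    return outage_days <= queued_days


# "Five nines" works out to a little over five minutes per year.
print(f"{allowed_downtime_minutes(0.99999):.2f} minutes/year")

# A week-long outage against two weeks of queued work.
print(outage_is_harmless(outage_days=7, queued_days=14))
```

By this rough measure, the week-long hurricane outage mentioned earlier stays inside a two-week work queue, while anything longer than the queue would leave machines idle.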

 2002-10-05, 13:02 #43 Prime Monster Aug 2002 2²·5·13 Posts
Xyzzy,

If you need any help :D I will be willing to help. This is the easy phase: just document the ideas and requirements people come up with. It might be a good idea to post something to the mailing list about this as well.

The next phase is to turn all the ideas and requirements into a design and specification document, and that is somewhat harder. Not difficult, but you have to be very cynical here, because what you are producing should be possible to implement.

As for the web server itself: with the current load it can run nearly anywhere. It is the database server that is the difficult part. It will need to be a high-performance server, and as far as I know most service companies charge a large amount of dosh for those (if they provide them at all).

As Lumly pointed out, if the server starts to provide better stats, then the load will go up quite a lot. But again, those stats will, to a large extent, be provided by the database server.

Alf
 2002-10-05, 18:25 #44 trif Aug 2002 2×101 Posts
Quote:
 If that is a real concern, I'd suggest getting a dedicated server from somewhere like rackspace.com; given that they advertise 99.999% reliability (and have, at least in the past, provided it), I think getting a server from them would mean that hardware issues are practically solved. (Of course, it would be more expensive on a monthly basis; on the other hand, it wouldn't require the initial purchase of a server.)
I cannot recommend Rackspace. They harbor spammers who pay a hefty premium, and as a result a considerable amount of their IP space is in SPEWS (a public list of spam sources which many sites, including some US government sites, use to block spam), and their entire IP space is in many private blocking lists. GIMPS does need to send out mail now and then, even if the mailing list isn't moved to the new hosting (and I don't see why not). A site that can't deliver a large portion of that mail is much less valuable.

This is one of the reasons why I recommended pair.com: they are similarly reliable and don't take pink contracts from spammers, so their IPs are not blocked. However, I've looked more closely at Pair's dedicated servers, and they don't allow multiple clients to share one dedicated server, and they don't allow "resale" out of their dedicated servers. But I still think this would be the way to go; we just need to find a reliable, non-spam-friendly provider.

