mersenneforum.org  

Old 2003-01-22, 01:42   #166
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7,537 Posts
Default Re: view from 50,000 ft

Quote:
Originally Posted by aga
I'm not sure about the old RPC interface. Is there still a noticeable number of RPC-communication clients?
v22 does not support RPCs. v21 did but had crash bugs.
Old 2003-01-22, 02:09   #167
aga
 
Oct 2002

2⁵ Posts
Default Re: availability requirements

Quote:
Originally Posted by Old man PrimeNet
(a) those wanting to see immediate or frequent web-site feedback on their account's progress or the overall progress of the search, or
After years of GIMPSing I still feel very unhappy if I decide to browse the stats and see an ISAPI error instead.

(I should admit that after observing the errors multiple times I still survived. :) )

Quote:
Originally Posted by Old man PrimeNet
(c) outages that last longer than the average work-to-do queue.
This does not sound good at all. If it gets to the point where the 'average work-to-do queue' is exhausted, it already means that half of the GIMPS clients are sitting idle. And I bet those are mostly the fastest computers.

Also, add (d): if someone has just downloaded the GIMPS client and cannot get it to do any work because PrimeNet is down until Monday, that is effectively a lost participant.

Quote:
Originally Posted by Old man PrimeNet
3. ISP line out / long-term power failure 5%
Hmm, 95% availability? That sounds pretty low. I feel unhappy if my servers experience less than 99.95% availability. This is the level that goes unnoticed (and/or forgiven) almost always.

(I realize that 95% is not quite fair here, since the 5% is a share of the downtime, not of total time; but if availability were 99.95% it would not be worth mentioning, and you did mention it. Even the thought of 99% uptime concerns me.)
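
For scale, here is a minimal sketch that converts an availability percentage into hours of downtime per year. It assumes a 365-day year, and the function name is made up for illustration; it says nothing about PrimeNet's actual figures.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of downtime per year implied by an availability percentage."""
    return HOURS_PER_YEAR * (100.0 - availability_percent) / 100.0


for pct in (95.0, 99.0, 99.95):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% availability -> {hours:.1f} hours down per year")
```

Under these assumptions 95% works out to about 438 hours (roughly 18 days) a year, 99% to about 88 hours, and 99.95% to a little over 4 hours.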

Quote:
Originally Posted by Old man PrimeNet
4. DoS attacks / Internet 'storms' 5%
Things like PrimeNet are indeed attractive targets for DoS attacks. Here failover servers are pretty useless: one can knock out one server, then another. But if there are several mutually replicating servers at different geographical locations, each behind a fat link... well, good luck.

Also, replicating servers run less risk of becoming the target of a DoS attack. If mounting a successful DoS attack is going to be tedious and largely pointless, trying it at all looks much less attractive.

Note that data corruption / loss is not even listed (though the v1 & v2 servers had this issue). My point here is that perhaps data integrity via replication is not the main concern to focus upon. Important to manage as a risk factor, but not a burning hot spot.

Quote:
Originally Posted by Old man PrimeNet
[Sidebar - Note item (5) has been a point of some headaches in the past, too - it seems impossible to provide a web form for manual database use w/o some party eventually writing a script that irresponsibly 'mines' PrimeNet's database; this issue will reappear among the open-source requirements issues to resolve.]
I think it's reasonable to allow any given IP address to contact the server at most 100 times per hour. Any further attempts will not be handled; an instant error message is returned instead. This is going to be pretty easy to implement, and as we are not going to be restarting servers every now and then, tracking IP addresses can be done in RAM, causing no I/O at all.

The thing that worries me is a hypothetical computer farm behind a gateway/firewall, all doing trial factoring. That could generate more than 100 valid requests per hour, especially at the beginning of the business day when the hardware gets turned on. OK, let's set the limit at 150 per server and use 4 servers. If we resort to a failover scheme, the single active server will not be as well protected.
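
The per-IP limit described above could be sketched roughly as follows. This is only an illustration under assumptions: the class name, the sliding-window policy, and the injectable clock are not from the thread; only the idea of counting requests per IP in RAM and refusing the excess is.

```python
import time
from collections import defaultdict, deque


class PerIpRateLimiter:
    """In-memory sliding window: at most `limit` requests per `window` seconds per IP."""

    def __init__(self, limit=100, window=3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip):
        """Return True if the request may proceed, False if it should be refused."""
        now = self.clock()
        hits = self._hits[ip]
        # Forget accepted requests that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # caller answers with an instant error and does no further work
        hits.append(now)
        return True
```

A front end would call `allow(client_ip)` before touching the database. A farm behind one gateway shares a single counter, which is exactly the concern raised above; raising `limit` to 150 is a one-argument change. Note that this sketch never discards idle IPs, so a long-running server would also want periodic cleanup.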

Quote:
Originally Posted by Old man PrimeNet
My second observation is that about 4 in 5 outages is due to failure that could be fixed by our efforts here. Note also that except for disaster risk and control being centralized, a fixed-up single server could avoid 85% of the outages (items 1,2,5).
But no one can devote a lot of time to a project over a prolonged period, can they? So half a year later your business will again start taking up a lot of your time, and the 4 in 5 outages will return? I would feel much safer with 2 or 3 operators supervising PrimeNet. This almost ensures that at any time at least one server has a human around, and thus the number of live servers will never reach zero.

Quote:
Originally Posted by Old man PrimeNet
Execute arbitrary app code?
You mean client-side? Why not. It sounds attractive to bring all the powerful non-x86 Unix and Mac boxes into GIMPS. That will require distributing the client-side source code widely, and that in turn will require much higher robustness of the server(s).
Old 2003-01-22, 02:45   #168
aga
 
Oct 2002

2⁵ Posts
Default

Quote:
Originally Posted by Prime95
I think we should aim for a server that accepts v22 and earlier CGI requests.
With a JSP/Servlets web application one can emulate the CGI interface pretty easily. That is an extremely flexible environment (and of course Servlets provide much better performance than CGI).

Quote:
Originally Posted by Prime95
The new server will require a new client with new or improved messages. One particularly painful migration will be from the current userid scheme to the new not-yet-designed userid/team scheme.
It is not evident why the protocol should change. Small improvements should suffice. (And if we nevertheless go changing everything at once, I'd like to see an XML-based protocol, to ensure that this is the very last large change.)

In particular, why change client-server protocol adding teams there? Let's put all new users into an 'unassigned' team and give the user a web form where he/she can switch teams at any moment (client-side software could also connect to the web page to submit a new team; but I believe doing too much GUI stuff in client-side software is not worth the time, and will burden PrimeNet with supporting the auxiliary pages forever).

Quote:
Originally Posted by Prime95
Note that v22 contacts mersenne.org directly whereas v21 goes through entropia.com.
I think that for the new server(s) there should be a subdomain such as v5.mersenne.org or the like. This removes the requirement of hosting all versions on a single server (or set of servers), though it can still be done that way. If for some reason the next major prime95/mprime release ships before the new server is up, I suggest setting v4.mersenne.org as the default server.

Quote:
Originally Posted by Prime95
I would like to add ECM factoring to the server. It would be nice if other small math projects could be added easily. The entire stats engine should work without modification for these other projects.
Hmm, in fact additional Java web applications can be deployed pretty easily. Or the additional code might ship as a package; then it is possible to write a class loader that gives such packages a jailed, restricted environment. But I would prefer delaying things like that until v6.
I mean, P-1, ECM etc. should indeed be in v5 (could you give a full list of the currently appealing algorithms?), but as an integral part of the web application, not as pluggable modules. The latter might turn into a whole ocean of work (which might be interesting, though, after more important concerns are resolved).

Not sure about other small projects. Database structure is going to be very mersenne-centric (or otherwise it will be fat and slow). But with a sufficiently generic messaging layer, cloning the software for other distributed projects should be pretty easy.
Old 2003-01-22, 02:49   #169
gowen72
 
Aug 2002

31₁₆ Posts
Default Re: availability requirements

Quote:
Originally Posted by aga
Quote:
Originally Posted by Old man PrimeNet
Execute arbitrary app code?
You mean client-side? Why not. It sounds attractive to bring all the powerful non-x86 Unix and Mac boxes into GIMPS. That will require distributing the client-side source code widely, and that in turn will require much higher robustness of the server(s).
I don't know if that is what he means, but the ability for glucas and mlucas clients to connect directly to the server for work assignment and sending of results would be great.
Old 2003-01-22, 04:01   #170
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7,537 Posts
Default

Quote:
Originally Posted by aga
It is not evident why the protocol should change. Small improvements should suffice.
In particular, why change client-server protocol adding teams there?
V22 Prime95 only has a concept of userid (a team is really just a userid with lots of computers). V23 will have a userid and an optional teamid.

This is one of many v23 changes that will affect the messages passed to the primenet server. Maybe we're just having a terminology problem: I'm not suggesting changing the http/cgi protocol, just changing the messages sent using that protocol.
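
To illustrate the distinction between the transport and the messages, here is a minimal sketch of a message carried as ordinary CGI-style query parameters, where a newer client adds an optional team field and an older client simply omits it. The field names `uid`, `cid` and `tid` are invented for this example and are not the real PrimeNet parameters.

```python
from urllib.parse import parse_qs, urlencode


def build_message(userid, computer_id, teamid=None):
    """Encode a client message as a query string (field names are hypothetical)."""
    fields = {"uid": userid, "cid": computer_id}
    if teamid is not None:
        fields["tid"] = teamid  # only newer clients would send a team
    return urlencode(fields)


def parse_message(query):
    """Server side: accept messages with or without the optional team field."""
    parsed = parse_qs(query)
    return {
        "userid": parsed["uid"][0],
        "computer_id": parsed["cid"][0],
        "teamid": parsed.get("tid", [None])[0],
    }
```

The HTTP/CGI layer is untouched in this sketch; only the set of fields grows, and a server written this way keeps accepting the older, team-less form.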

Quote:
Originally Posted by aga
Not sure about other small projects. Database structure is going to be very mersenne-centric (or otherwise it will be fat and slow).
The reservations and results will be Mersenne-centric. The teams / users / computers / stats databases should not be.
Old 2003-01-22, 04:10   #171
aga
 
Oct 2002

40₈ Posts
Default LL handling

I am thinking about two things:

First, v5 should allow running the test and the double-check at the same time, and thus assign exponents for testing and double-checking at approximately the same time. Of course, the client-side software will not show whether it is the first or the second run; also:

- faster computers will in effect do the first run, and slower ones will double-check; in turn, timed-out exponents will be assigned to faster computers, making the chances of finding a prime more even.
- milestones will be pretty orderly and predictable
- servers will need to handle only exponents within a relatively short range, which [might] allow some optimizations
- handling all LL assignments the same way will simplify the server code, and will make it unnecessary to write special anti-poaching code.

The disadvantage is that slower computers will always do double-checks, which will annoy their owners. But maybe that is better for GIMPS as a whole?

Another idea is running triple-checks too:

- that can be used as final testing of new program code. The results of the testing will not be entirely useless: it would be interesting to find out whether both the test and the double-check are wrong for some old exponent.
- new computers can be assigned a triple-check the very first time. First, it will improve the experience for new GIMPSers, as the exponent will be tested in a matter of days or hours instead of weeks or months. Second, it will allow the server to recognize computers with faulty hardware before they are assigned real work. Third, it will give the server a better initial estimate of the actual computer speed (including the hours/day it is on).

The disadvantage is of course that the processing power used for triple-checking will not be used for testing larger exponents. This is pretty serious.

As a related idea, it might be reasonable to assign factoring work to first-timers; new computers will have a low initial rating anyway, so small LL tests will not be available to them; but if they successfully run a few factoring assignments, that should raise their rating quickly, and then the computer will start receiving 'good' LL assignments.
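
One way to read the first-timer idea above is as a small rating rule. The sketch below is only an illustration: the starting rating, the threshold, and the one-point-per-result rule are all assumptions, not anything specified in this thread.

```python
class Computer:
    """Tracks a hypothetical reliability rating for one client machine."""

    def __init__(self, rating=0):
        self.rating = rating  # new computers start low (assumed starting value)

    def record_verified_result(self):
        # Assumed rule: every verified result raises the rating by one point.
        self.rating += 1


def next_work_type(computer, ll_threshold=3):
    """Hand out factoring work until the rating reaches the (assumed) LL threshold."""
    return "LL" if computer.rating >= ll_threshold else "factoring"
```

With these made-up numbers, a new machine would receive factoring work three times and then become eligible for LL assignments.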
Old 2003-01-22, 04:29   #172
aga
 
Oct 2002

2⁵ Posts
Default

Quote:
Originally Posted by Prime95
Quote:
Originally Posted by aga
why change client-server protocol adding teams there?
V22 Prime95 only has a concept of userid (a team is really just a userid with lots of computers). V23 will have a userid and an optional teamid.

Maybe we're just having a terminology problem: I'm not suggesting changing the http/cgi protocol, just changing the messages sent using that protocol.
No, it is not a terminology problem in this case. What I mean is how dnet currently works: with the dnet client, only a userid (email address) is configured, and the dnet client knows nothing about teams; if you want to join a team, you can do it on the dnetc web site (on the project stats pages). That is what I propose GIMPS adopt; and overall, don't complicate the client and the client-server protocol with things that can be implemented entirely on the server side.

After all, it's much easier to update server-side software than thousands of client-side installations; also, if certain things are implemented on the server side, the client-side communication module becomes smaller, which encourages PrimeNet support in other programs.
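
A rough sketch of what keeping teams entirely on the server side might look like: clients report only a userid, and a server-side directory decides which team that user's credit counts toward. The class and method names are hypothetical, and a real server would keep this in its database rather than in a dict.

```python
UNASSIGNED = "unassigned"


class TeamDirectory:
    """Server-side mapping from userid to team; clients never send a team."""

    def __init__(self):
        self._team_of = {}

    def team_of(self, userid):
        # Users who never picked a team fall into the default team.
        return self._team_of.get(userid, UNASSIGNED)

    def join(self, userid, team):
        """Would be called by the web form handler when a user switches team."""
        self._team_of[userid] = team

    def team_totals(self, credit_by_user):
        """Sum per-user credit into per-team credit."""
        totals = {}
        for userid, credit in credit_by_user.items():
            team = self.team_of(userid)
            totals[team] = totals.get(team, 0.0) + credit
        return totals
```

Switching teams is then a single server-side update, with no change to any client or to the messages it sends.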
Old 2003-01-22, 05:00   #173
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

1D71₁₆ Posts
Default Re: LL handling

Quote:
Originally Posted by aga
First, v5 should allow running the test and the double-check at the same time
This is a non-starter. Our goal is to keep GIMPS fun for the participants, not to make life easier for server development or to produce more orderly milestones. A big joy for GIMPS users is getting an exclusive assignment. You and only you have a chance to make a big find. Even relatively slow computers get this chance.

In general, we let the user decide how he wants to participate (within limits) rather than have the server impose a rigid testing procedure.
Old 2003-01-22, 05:09   #174
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7,537 Posts
Default

Quote:
Originally Posted by aga
What I mean is how dnet currently works: with the dnet client, only a userid (email address) is configured, and the dnet client knows nothing about teams; if you want to join a team, you can do it on the dnetc web site (on the project stats pages).
This is a good question for the users - configure everything in one place (the client) vs. two places (the client and the server). Obviously, either method works.

OTOH, we may be delving a little too deep into the details with this issue...
Old 2003-01-22, 16:55   #175
Old man PrimeNet
 
Jan 2003
Altitude>12,500 MSL

1100101₂ Posts
Default

Migration: I saw the v22 source pointing to mersenne.org - I agree that was a necessary change.

> ... a server that accepts v22 and earlier CGI requests.
> The protocol is known and shouldn't be hard to support.

It would be a 'compatibility proxy' server for the 'old guard' GIMPS machines, v16 thru v21 (v14s & v15s ran until late 1998). An evil task, hacking at old code.

> ... will require a new client with new or improved messages.

Undoubtedly.

> One particularly painful migration will be from the current
> userid scheme to the new not-yet-designed userid/team scheme.

The sort order used now is binary byte... does that help or worsen things?


Outages: hopefully soon moot. We should have several admin-like folks.

> Multiple servers / Mirrored servers:

> Even though we only need a single server we must design
> the server code with multiple servers in mind.

OK. If we can mirror databases, shouldn't we be able to build snapshots from replayed transaction logs to run stats for a web server? How often and how large would a log file segment be? Can it be configured for hourly new log file segments that can be quickly post-processed (elsewhere) and then published on a web site (perhaps elsewhere again)?
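
As a rough illustration of the replay idea, the sketch below folds hourly log segments into a running stats snapshot away from the live database. The record layout (a userid plus a credit amount) and the sample data are invented for the example; a real transaction log would carry much more.

```python
from collections import Counter


def replay(snapshot, log_segment):
    """Apply one log segment to a stats snapshot and return the new snapshot."""
    updated = Counter(snapshot)  # copy, so the previously published snapshot stays intact
    for userid, credit in log_segment:
        updated[userid] += credit
    return updated


# Two made-up hourly segments, each a list of (userid, credit) records.
hourly_segments = [
    [("alice", 1.5), ("bob", 0.5)],
    [("alice", 2.0)],
]

snapshot = Counter()
for segment in hourly_segments:
    snapshot = replay(snapshot, segment)  # each result could be published as it is produced
```

Because each snapshot depends only on the previous snapshot and one segment, the post-processing can run on a separate machine, as the questions above suggest.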

If we stick with a stateless front end design, we would be able to multiply them as fan-outs from the database server for scalability. Also suggests that a machine-unique ID is passed with every server interaction.

I also recommend we use a different base URL for each version of Prime95 v23+, different in DNS name, path or executable name, to simplify the management of multiple client versions and increase our ability to route specific ones where we want. For example, v23primenet.mersenne.org/v23 or v24primenet.mersenne.org/v24, etc. Any drawback to putting out a new DNS name each build? We would have the flexibility of making the stateless front end routing each version via a different server and/or different URL path. Or the same server. In the initial case, they can share the same server or even the www server, but later we can move these components around as needed.
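
A minimal sketch of how a stateless front end might route by the per-version DNS name suggested above. The hostname pattern follows the v23primenet.mersenne.org example; the backend names and the function names are placeholders.

```python
import re

# Matches names shaped like the example above, e.g. "v23primenet.mersenne.org".
_HOST_RE = re.compile(r"^v(\d+)primenet\.mersenne\.org$")


def client_version_from_host(host):
    """Return the client version encoded in the host name, or None if absent."""
    match = _HOST_RE.match(host.lower())
    return int(match.group(1)) if match else None


def pick_backend(host, backends, default):
    """Choose a backend for a request based only on the host name it arrived on."""
    version = client_version_from_host(host)
    return backends.get(version, default)
```

For example, `pick_backend("v24primenet.mersenne.org", {23: "server-a", 24: "server-b"}, "legacy")` selects `"server-b"`, and pointing both versions at the same backend covers the initial shared-server case.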


> Arbitrary apps:

I think you mean, 'add to the server at the code level'. I was thinking of binary modules encapsulating various API-sandboxed apps. I will think more on the source being modular instead...

Separate team ID: thank goodness!

Mostly I'm thinking about protocol, database and boundary changes. What does the database need this time to act upon stateless HTTP parameter-only hits?

Security - How about using HTTPS with HTTP fall-back? Or did the visibility of the HTTP data help GIMPS stay low-profile/harmless? What about trusted source builds vs. public open source builds accessing the network?
Old 2003-01-22, 18:22   #176
Old man PrimeNet
 
Jan 2003
Altitude>12,500 MSL

65₁₆ Posts
Default team managed at server idea

I like the team managed at server/web-site idea. This is what we did at Entropia for two other distributed computing projects. It avoids the whole merge/defect statistics issue by letting the database manage teams as loose federations.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.