Thread: Get-er-done!
Old 2005-07-11, 11:02   #5
Peter Nelson
 
 
Oct 2004

232 Posts

Ndpowell, you are not entirely wrong. You need only have offered HALF your apology.

It is true that GIMPS work via primenet is not time-critical: as long as a result can be accepted when the server comes back online, no work is lost, and the delay has not disadvantaged the project as a whole. But that is only half the story.

In my opinion the main problem is that the link to the back-end database fails, which means the web server process is still up and serving but cannot access the database.

Implications:

Firstly, there are sometimes extended periods when participants cannot access their individual account report (which may discourage participation).

Secondly, this may create (and has created) a problem for some NEW participants who want to join while the "server" is having problems, e.g. they cannot obtain an assignment, log in to see their stats, etc. These newbies tend to post asking "what am I doing wrong?" when it is not THEIR fault at all. Others might simply not bother and leave the project.

Thirdly, don't assume that work IS processed correctly.
On several occasions when my machine has tried to submit results, things have not gone smoothly even after the server came back to normal, e.g.:
exponents remaining in my account report even though they had finished;
not getting primenet credit assigned to me for particular work I had done.
The only way to resolve this is to email George, who will then helpfully put things right. HE SHOULD NOT NEED TO DO THIS manual fixing, and would not if the server were reliable.

You are not the only IT admin here, and there are others of us who would not regard the current situation as acceptable in a commercial application, nor should we in a math project.

There are other things that need changing in the server-side software too, e.g. properly reporting which client version was used on a client machine, and being able to deal with new processor types.

Current situation: It works, most of the time, but only just, and from the outside it gives the impression the insides are held together with bits of sticky tape and elastic bands. LOL.

I have already suggested at the very least that:

a) if the server side cannot easily be changed quickly, the client should be augmented so that it tests whether the server is up when submitting work, and reports intelligent messages to the user, i.e. the client should be designed to accommodate downtime rather than assume the server is always up.

b) a (simple) monitoring system should be put in place. When the server goes "down" it might restart the back-end database process.
At a minimum it should email or page George etc. to say that there is a problem requiring their urgent intervention.
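
To make (a) and (b) a bit more concrete, here is a rough Python sketch of the sort of check I mean. It is only an illustration, not how primenet or the client actually works: the state names, the "sql error" marker text, the messages, and the restart_database / notify_admin hooks are all my own assumptions. The same probe could serve the client (to choose a sensible message) and a monitor (to decide whether to restart or alert).

```python
import urllib.error
import urllib.request
from enum import Enum
from typing import Callable, Tuple


class ServerState(Enum):
    OK = "ok"
    BACKEND_FAILING = "backend_failing"  # web server answers, database does not
    UNREACHABLE = "unreachable"          # nothing answers at all


# Assumption: a failed database link shows up as this text in the page body.
DB_ERROR_MARKERS = ("sql error",)

# Hypothetical wording a client could show instead of a raw error.
MESSAGES = {
    ServerState.OK: "Server is up.",
    ServerState.BACKEND_FAILING: "The server is having database problems. "
                                 "Your work is safe and will be sent later.",
    ServerState.UNREACHABLE: "The server cannot be reached right now. "
                             "Your work is safe and will be sent later.",
}

Fetch = Callable[[], Tuple[int, str]]


def http_fetch(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch a page and return (status, body); network failures raise OSError."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # The web server answered, but with an error status.
        return err.code, err.read().decode("utf-8", "replace")


def probe(fetch: Fetch) -> ServerState:
    """Classify the server by looking at one response."""
    try:
        status, body = fetch()
    except OSError:
        return ServerState.UNREACHABLE
    lowered = body.lower()
    # Assumption: a 5xx status or a marker in the body means the back end failed.
    if status >= 500 or any(marker in lowered for marker in DB_ERROR_MARKERS):
        return ServerState.BACKEND_FAILING
    return ServerState.OK


def check_and_react(fetch: Fetch,
                    restart_database: Callable[[], None],
                    notify_admin: Callable[[str], None]) -> ServerState:
    """One monitoring pass: try a restart, then alert if still unhealthy."""
    state = probe(fetch)
    if state is ServerState.BACKEND_FAILING:
        restart_database()
        state = probe(fetch)  # check whether the restart helped
    if state is not ServerState.OK:
        notify_admin(MESSAGES[state])
    return state
```

A monitor would call check_and_react every few minutes with a fetch built from http_fetch and a real status page URL; the client would call probe before submitting and show MESSAGES[state] if the result is not OK.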

If the web server part could be modified to handle errors properly, it might at least give a message understandable to a novice rather than the "cryptic" SQL error message.
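
As a sketch of what I mean by handling errors properly (again only an illustration: I do not know what the real server is written in, sqlite3.Error simply stands in for whatever exception its database layer raises, and load_report is a hypothetical function), the page handler would catch the database failure, log the cryptic detail for the admins, and show the visitor something readable:

```python
import logging
import sqlite3
from typing import Callable, Tuple

log = logging.getLogger("report_page")

# Hypothetical wording; the point is that it contains no SQL detail.
FRIENDLY_MESSAGE = (
    "Sorry, account reports are temporarily unavailable because of a "
    "problem on the server. This is not caused by anything you did. "
    "Please try again later."
)


def render_account_report(load_report: Callable[[], str]) -> Tuple[int, str]:
    """Return (HTTP status, page body), hiding raw database errors."""
    try:
        return 200, load_report()
    except sqlite3.Error as err:
        # Keep the technical detail in the server log, not on the page.
        log.error("database failure while building report: %s", err)
        return 503, FRIENDLY_MESSAGE
```

Returning a 503 status rather than a normal page also lets a monitor or the client tell a temporary outage from a real answer.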

I think it can only benefit the project if the system reliability can be improved.