#12 |
|
Apr 2003
2²·193 Posts |
The complete domain, including ping/ssh/..., is unreachable.
I will have to talk to the ISP about their quality of service. (It is 2 in the morning, so there is not much I can do now.) Lars |
|
|
|
|
|
#13 |
|
Apr 2003
2²×193 Posts |
The complete cluster server on which my virtual server is running is offline.
I could not get the hotline to give an estimate of when the problem will be solved. I suspect that many of the servers are offline, as the hotline person knew the time of failure before I could state it. I will keep you informed. Lars |
|
|
|
|
|
#14 |
|
Apr 2003
2²×193 Posts |
System is reachable again since 09:11 CET.
It looks to me like a problem with the network and not with the server itself. I will confirm this in the evening, but from all the data I can see at the moment, it looks like the server itself never went down. Lars |
|
|
|
|
|
#15 |
|
Jun 2003
11675₈ Posts |
Down again
EDIT: Never mind. It's up again! Last fiddled with by axn on 2006-11-17 at 13:06 |
|
|
|
|
|
#16 |
|
Apr 2003
1100000100₂ Posts |
Checked the logs. None of the servers had any downtime last night. Data was coming in from other participants the whole time.
So this time I guess it was on your side and (for the first time) not on the server. Cheers, Lars |
|
|
|
|
|
#17 |
|
Oct 2006
7×37 Posts |
again :( riesel side ...
Not reachable since 5:30 am European time. Last fiddled with by tnerual on 2007-10-23 at 15:17 |
|
|
|
|
|
#18 |
|
Apr 2003
1100000100₂ Posts |
Service stopped and started again.
I have no clue what happened. The application was still running but no longer reacted to requests. That is why my cronjob was not able to react: it can only check whether the app is still available, not whether it is doing useful work. |
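For anyone running a similar setup, the gap between "still available" and "doing useful work" can be narrowed by having the check send a real request and wait for a reply with a timeout. The following is only a minimal sketch, not the check used on this server: the `check_service` name, the `PING` probe, and the idea that the service answers a plain TCP request are all assumptions made for illustration.

```python
import socket


def check_service(host, port, probe=b"PING\n", timeout=5.0):
    """Classify a TCP service as 'down', 'hung', or 'ok'.

    'down': the port does not accept connections at all.
    'hung': the port accepts the connection but sends no reply in time.
    'ok':   the service answered the probe.
    The probe bytes are hypothetical; a real check would send whatever
    request the application normally understands.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "down"
    with conn:
        try:
            conn.settimeout(timeout)
            conn.sendall(probe)
            reply = conn.recv(1024)
        except OSError:
            # Includes the timeout case: connected, but no answer.
            return "hung"
    # An empty reply means the peer closed without answering.
    return "ok" if reply else "hung"
```

A cron job could restart the application on both `'down'` and `'hung'`, so that a process that is alive but no longer answering is caught as well.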
|
|
|
|
|
#19 |
|
Oct 2006
103₁₆ Posts |
server stopped again, riesel side ...
|
|
|
|
|
|
#20 |
|
Apr 2003
2²×193 Posts |
I am at work at the moment. Will look into it when I am home.
If it is the same behaviour as last time, I will do some DB reorganisation, as there could be a timeout problem with the server if the DB response is too slow. I hope to have everything up again by 21:00 CET at the latest. Last fiddled with by ltd on 2007-11-05 at 14:10 |
|
|
|
|
|
#21 |
|
Apr 2003
2²×193 Posts |
Automatic restart of the server should be done within the next 15 minutes.
I moved several entries from one of the transfer tables into an archive table. I think we had a response timeout problem. We only had problems with the Riesel side of the project, so I compared the transfer tables of the different projects:
Riesel5: > 250000 datasets
sier5: > 150000 datasets
PSP: > 35000 datasets
After the cleanup, all the tables are down to less than 10000 datasets. I hope we are stable again. If this is the case, I will create an automatic script that takes care of archiving on a regular basis. Sorry for the outage. |
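As a rough illustration of what such a periodic archiving job could look like, here is a minimal sketch. It is not the script used on this server: the database engine (SQLite), the table names `transfer` and `transfer_archive`, the integer `id` column, and the `archive_old_rows` helper are all assumptions made for the example.

```python
import sqlite3


def archive_old_rows(conn, keep=10000):
    """Move all but the newest `keep` rows from transfer to transfer_archive.

    Assumes both tables have identical columns and an increasing integer
    `id`, so that "newest" means "largest id". Returns the number of rows
    moved.
    """
    with conn:  # one transaction: copy and delete succeed or fail together
        # Skip the newest `keep` rows; the next one is the newest row to archive.
        row = conn.execute(
            "SELECT id FROM transfer ORDER BY id DESC LIMIT 1 OFFSET ?",
            (keep,),
        ).fetchone()
        if row is None:
            return 0  # table already holds `keep` rows or fewer
        cutoff = row[0]
        conn.execute(
            "INSERT INTO transfer_archive SELECT * FROM transfer WHERE id <= ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM transfer WHERE id <= ?", (cutoff,))
        return cur.rowcount
```

Run from cron, a job like this would keep each transfer table near the chosen size, so lookups stay fast and the server is less likely to hit a response timeout.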
|
|
|
|
|
#22 |
|
Apr 2003
2²×193 Posts |
Data is handed out again. Keep your fingers crossed that I found the problem.
|
|
|
|