#1 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
2×13²×19 Posts |
The Haswell machine that I bought in June 2013 has crashed (dropped off the network; when you go to look at it, the power light is on but there's nothing displaying on the screen) twice in the last week. It was running linear algebra at the time, but it's been running linear algebra for months and up to now it's been very reliable. Any idea how to investigate?
#2 |
|
"/X\(‘-‘)/X\"
Jan 2013
2×5×293 Posts |
My first guess is overheating. When was the last time you dusted out the CPU cooler and power supply?
Last fiddled with by Mark Rose on 2015-04-22 at 23:42 |
#3 | |
|
∂²ω=0
Sep 2002
República de California
5·17·137 Posts |
Quote:
Note: Properly reseating the heatsink is a bit tricky, because those split-end plastic connectors are prone to having their ends bent rather than slipping into the MoBo holes. The same issue arises on a new install, but it's worse when reseating things after dedusting, because the connector ends have been spread apart by the coaxial push-to-lock mechanism, and one may need to physically squeeze them together with pliers to recover something approaching the when-new state. |
#4 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
2·13²·19 Posts |
I brought the machine in and opened it up; it looks pretty clean inside - dust is basically a product of human activity, and it lives in an outbuilding with a concrete floor which humans go into only when one of the servers crashes.
#5 |
|
Sep 2002
Database er0rr
5·751 Posts |
Assuming dust is not the problem, fire up the BIOS and have a look at temperatures, voltages and fan speeds. If these are all right, run memtest for a while, followed by the mprime/prime95 torture test. You might need to increase the CPU voltage a smidgen. HTH
PS. Check all plugs and sockets, like the power cables, disk cables and cards, by reseating them. Last fiddled with by paulunderwood on 2015-04-23 at 12:02 |
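Since the box hangs hard, the BIOS readings only tell you what things look like at idle. A rough sketch of logging the same numbers while the linear algebra is running, so the last line written before a hang shows the state at that moment. This assumes the machine runs Linux and exposes sensors through the kernel's hwmon sysfs tree under /sys/class/hwmon (usually needs the coretemp and motherboard sensor drivers loaded); the function names and the hwmon.log filename are made up for illustration.

```python
import glob
import os
import time


def read_text(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def read_hwmon(root="/sys/class/hwmon"):
    """Collect temperatures (deg C), fan speeds (RPM) and voltages (V)."""
    readings = {}
    for chip_dir in sorted(glob.glob(os.path.join(root, "hwmon*"))):
        # Each chip directory normally has a 'name' file, e.g. 'coretemp'.
        chip = read_text(os.path.join(chip_dir, "name")) or os.path.basename(chip_dir)
        for path in sorted(glob.glob(os.path.join(chip_dir, "*_input"))):
            sensor = os.path.basename(path)[: -len("_input")]
            raw = read_text(path)
            try:
                value = int(raw)
            except (TypeError, ValueError):
                continue  # unreadable or non-numeric sensor, skip it
            if sensor.startswith("temp"):
                value = value / 1000.0  # millidegrees Celsius -> degrees
            elif sensor.startswith("in"):
                value = value / 1000.0  # millivolts -> volts
            elif not sensor.startswith("fan"):
                continue  # ignore power, current, etc. in this sketch
            readings[f"{chip}/{sensor}"] = value
    return readings


def format_line(readings, timestamp):
    """Render one log line: timestamp followed by sorted key=value pairs."""
    parts = [f"{key}={readings[key]}" for key in sorted(readings)]
    return " ".join([timestamp] + parts)


def log_readings(logfile="hwmon.log", interval=10, samples=None,
                 root="/sys/class/hwmon"):
    """Append one line per interval; samples=None means run until killed."""
    count = 0
    while samples is None or count < samples:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(logfile, "a") as f:
            f.write(format_line(read_hwmon(root), stamp) + "\n")
            f.flush()
            os.fsync(f.fileno())  # push to disk so a hard hang loses little
        count += 1
        time.sleep(interval)
```

Calling `log_readings("hwmon.log")` in a background session and reading the tail of the file after the next crash should show whether temperatures were climbing or a voltage rail was sagging just before the hang. If the last few lines look normal, that points more towards memory, PSU or a bad contact than towards cooling.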
#6 | |
|
"/X\(‘-‘)/X\"
Jan 2013
2×5×293 Posts |
Quote:
After that, it's anyone's guess what the faulty component is. I would start with the power supply, as in my experience they're the first to go from a power surge, but if you have a spare component you can swap in, start with that as it's free. |
#7 |
|
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
3·17·97 Posts |
Only blow things out with compressed air if the air is dry.
#8 |
|
Jan 2015
11·23 Posts |
I'm not positive, but look for swollen capacitors too. Sometimes the cheap ones fail, or maybe your surge suppressor has exhausted its protection ability, which happens over time.
#9 | |
|
"Kieren"
Jul 2011
In My Own Galaxy!
27AE₁₆ Posts |
When it cackles, does it lay eggs, golden or otherwise?
Quote:
Last fiddled with by kladner on 2015-05-08 at 01:41 |
#10 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
1100100010110₂ Posts |
The machine has run happily for the last week, sitting in my study inside. I shook it and something which I suspect was an extremely well-dried dead slug fell out of one of the PCI slots, which I'm sure helped.
#11 |
|
"Kieren"
Jul 2011
In My Own Galaxy!
2×3×1,693 Posts |