#1 |
Dec 2010
Monticello
1795₁₀ Posts |
My six-core beast is ill...it seems that six cores of work after six months has given it indigestion.
System, roughly:
Eric-AMD-6-Core: AMD Phenom II x6 1090T
ASRock 880GM-LE Mobo (SB 710 Chipset, factory clocking)
GTX440 GPU
8 Gig of RAM in 2 sticks
Antec Earthwatts Green 380W power supply
200 W at full load
Xubuntu
Runs fine, mfaktc, no mprime.

Eric-AMD-6-Core Crash
1 month ago: was crashing regularly and a crack was observed.
Run mprime about 1 hour, 5 cores. Kill-a-watt reports 200W used. Resets suddenly without warning.
Run sensors, run mprime, about 20 minutes. No temperatures near limits, voltages seem OK, but there are some peculiarities -- like minima above maxima...
Get text console with saned disabled: edit /etc/default/saned

[OK]
[1045.373055] [Hardware Error]: CPU 4: Machine Check Exception: 4 Bank 0: b62bc000ea000135
[1045.373285][Hardware Error]: TSC 31cd301d4d2 ADDR 1c9973b00
[1045.373425][Hardware Error]:Processor 2: 100fa0 TIME 1322374257 SOCKET 0 APIC 4
[1045.373556][Hardware Error]:MC0_STATUS[-|UE|-|PCC|AddrV|CECC]: 0xb62bc000ea000135
[1045.373692][Hardware Error]:Data Cache Error: Data/Tag DRD error.
[1045.373810][Hardware Error]:cache Level: L1, tx: DATA, mem-tx: DRD
[1045.373927][Hardware Error]:Machine-Check: Processor context corrupt
[1045.374044] Kernel Panic - not synching: Fatal machine check on current CPU
[1045.374159]Pid: 2580, comm: mprime Tainted: P M 3.0.0-13-generic#22-Ubuntu
[1045.374295]Call Trace:
[1045.374338]<#MC> [..... ...
[1045.374828]<<EOE>>
[1045.374873] panic occurred, switching back to text console
Clearly, between power supply, mobo, RAM, and CPU I have a major issue. I don't really want to end up with a lot of spare parts...but am considering a second/upgraded system, probably running Sandy Bridge.
0) I suppose step 1 is to run memtest86...I gave 6 Gig to P-1...
1) The system has always been prone to crashing when the lights jump with the voltage in the house. Could I have damaged the power supply with the intermittent connection in the power strip?
2) A small regret with this system is that the power supply is a bit small to run a truly high-end GPU. Is it worth buying a 600W or 800W power supply as an upgrade to see if that fixes the problem?
3) Should I go ahead and invest in a regulating UPS as a test, with enough capacity to run both systems?
Am I likely to get anywhere fiddling with the overclock settings?
#2 | |
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
All those hardware errors seem to point to the CPU, specifically core 2 and/or 4 (or it might be 3 and/or 5). It seems to me that Ubuntu Forums would be a better place to take this, with that output. Specifically:
Quote:
Also, what do you mean by "a crack was observed"?
Last fiddled with by Dubslow on 2011-11-29 at 02:13
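For anyone who wants to check the kernel's decode by hand, the MC0_STATUS value from the first post can be unpacked with a few shifts and masks. This is only a minimal sketch: bits 63 through 57 follow the architectural x86 MCi_STATUS layout, while the CECC/UECC positions (bits 46 and 45) and the lookup tables are assumptions taken from AMD family 10h conventions, and the names `STATUS_FLAGS` and `decode_mc_status` are made up for illustration.

```python
# Sketch: unpack the MC0_STATUS value quoted in the first post.
# Bits 63..57 are the architectural MCi_STATUS flags; bits 46/45 (CECC/UECC)
# are an ASSUMPTION based on AMD family 10h conventions.
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # reporting enabled for this error
    59: "MISCV",  # MISC register valid
    58: "ADDRV",  # ADDR register valid
    57: "PCC",    # processor context corrupt
    46: "CECC",   # assumed position: correctable ECC
    45: "UECC",   # assumed position: uncorrectable ECC
}

# Field encodings for memory-hierarchy error codes: 0000 0001 RRRR TTLL
MEM_TX = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PREFETCH", "EVICT", "SNOOP"]
TX_TYPE = ["INSN", "DATA", "GEN", "RESV"]
CACHE_LEVEL = ["L0", "L1", "L2", "LG"]


def decode_mc_status(status):
    """Return the set flags and, for cache errors, the decoded error code."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    code = status & 0xFFFF
    result = {"flags": flags, "error_code": code}
    if code >> 8 == 0x01:  # memory-hierarchy (cache) error format
        rrrr = (code >> 4) & 0xF
        result["mem_tx"] = MEM_TX[rrrr] if rrrr < len(MEM_TX) else "RESV"
        result["tx"] = TX_TYPE[(code >> 2) & 0x3]
        result["level"] = CACHE_LEVEL[code & 0x3]
    return result


if __name__ == "__main__":
    print(decode_mc_status(0xB62BC000EA000135))
```

The decoded fields should line up with what the kernel already printed (L1, DATA, DRD, with PCC set), so this adds nothing new to the diagnosis. It just shows where the fields in the log come from.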
#3 |
Oct 2011
Maryland
2×5×29 Posts |
I want to start by saying that all I have is guesses. There are people who would know better than me.
Anyway, my initial thoughts:
- 200W at full load seems low. Those Phenoms are power hogs at full load, plus HDDs, RAM, and that video card. I don't know what a 440 should draw (I know much more about AMD's cards), but it has to be higher than the 50-70W your numbers suggest. So you get 100% GFX and CPU utilization at 200W?
- With power supplies generally, 12V amperage is more important than overall wattage, so I would check that. Garbage PSUs claim high wattage and throw it all somewhere worthless like the 5V line. Antecs are generally very high quality, though, so that probably isn't the issue.
- I agree: always check RAM first. RAM and HDDs are the two things most likely to be corrupt in my experience, as long as you run at normal temps.
- A quality PSU and a quality surge protector should protect all your parts from damage relating to power fluctuations. Rebooting unexpectedly from power drops shouldn't hurt anything too badly, in my opinion. Power spikes would be the larger concern.
- Even though you report normal temps, if you have a stock AMD cooler I would still suspect that cooling could be an issue (I have had faulty gauges before). So lowering/removing the overclocks could be fruitful, in my opinion.
Again, just some random opinions. Someone who knows more about Linux than I do can probably shed more light on your specific error messages. I was getting hard crashes in Debian (which Ubuntu is built on) on my box if I ran mprime on more than two cores (I was showing low 80s) with no overclock. I replaced the stock cooler with an aftermarket one and that issue completely went away.
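To put rough numbers on the wattage-versus-12V point, here is a back-of-the-envelope sketch. The 200 W wall reading and the 380 W label come from the first post. The 80% efficiency and the 17 A figure for the 12 V rail are purely assumed placeholders, not values from the thread or from any Antec data sheet, so substitute the numbers printed on the actual PSU label.

```python
# Sketch only: ASSUMED_EFFICIENCY and ASSUMED_12V_AMPS are hypothetical
# placeholders, not figures from the thread or from a PSU data sheet.


def dc_load_watts(wall_watts, efficiency):
    """DC power delivered to the components for a given wall reading."""
    return wall_watts * efficiency


def rail_capacity_watts(volts, amps):
    """Power a single rail can deliver: P = V * I."""
    return volts * amps


def headroom_fraction(load_watts, capacity_watts):
    """Fraction of the capacity left unused at this load."""
    return 1.0 - load_watts / capacity_watts


WALL_WATTS = 200.0         # Kill-a-watt reading reported in the first post
PSU_RATED_WATTS = 380.0    # label rating reported in the first post
ASSUMED_EFFICIENCY = 0.80  # assumption: typical for a decent PSU
ASSUMED_12V_AMPS = 17.0    # assumption: read the real value off the label

dc_load = dc_load_watts(WALL_WATTS, ASSUMED_EFFICIENCY)
rail_12v = rail_capacity_watts(12.0, ASSUMED_12V_AMPS)

if __name__ == "__main__":
    print(f"Estimated DC load: {dc_load:.0f} W")
    print(f"Headroom vs. label: {headroom_fraction(dc_load, PSU_RATED_WATTS):.0%}")
    print(f"Assumed 12 V rail capacity: {rail_12v:.0f} W")
```

Under those assumptions the total DC load sits well below the label rating. What the sketch cannot tell you is how much of that load lands on the 12 V rail, which is why the rail's own amperage is the number to check.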
#4 | |
"Kieren"
Jul 2011
In My Own Galaxy!
2·3·1,693 Posts |
Quote:
This is different from your load conditions, but it does make a 380 watt supply seem a bit below optimum, though a good PSU might hold up under that kind of load.
Last fiddled with by kladner on 2011-11-29 at 03:34
#5 |
Dec 2010
Monticello
5·359 Posts |
I mean that when I flexed the surge protector/power strip, the computer would crash, and you could hear the connection being made or broken, due to the internal arc. This continued after plugging in the Kill-a-watt device; even at no load, the Kill-a-watt would come and go. I probably should have taken it back to Staples and claimed it damaged my equipment....I garbaged it instead...
As for the power usage...recall that it's only a GT440 (not the world's fastest GPU beast, just enough performance to make it interesting) and that I'm only running one HDD and not overclocking at all that I know of....and I have on-board AMD graphics, too, but I'm not really pushing performance except with mprime and mfaktc...
****************
So, ramtest first...
Remove heatsink and replace with aftermarket cooler second (in my parts kit; don't forget about the bent and partially unbent pin 997 on the CPU)...use the good (Arctic Silver) heatsink paste in case the original factory stuff has dried out and the thermostat is wrong...wonder if a hot spot under that is possible?
Upgrade PS third...or get regulating UPS instead? -- more $$, but I want one anyway
Is it worth a lapping kit if I get a CPU? Should I get a cheap (4-core) for testing, 8-core upgrade for real use?
********
Last fiddled with by Christenson on 2011-11-29 at 03:43
#6 | |
"Kieren"
Jul 2011
In My Own Galaxy!
2×3×1,693 Posts |
Quote:
Also, your load is substantially less than mine. In addition to the things I listed above, I have 4 HDDs. So I really don't know what to suggest.
#7 | |
Oct 2011
7×97 Posts |
Quote:
Last fiddled with by bcp19 on 2011-11-29 at 15:36
#8 |
"GIMFS"
Sep 2002
Oeiras, Portugal
2·5·157 Posts |
I wouldn't be surprised at all if the power supply turns out to be responsible for the crashes reported.
380 W is a low value. That is the overall rated power; you may be stressing the PSU too much on some of the lines. And since you have reported some primary power instability, that may have caused some damage to the PSU. If you have the chance, try replacing the PSU with a more powerful one for a start. The next thing to check, if the problem doesn't go away, is the memory. In any case, a regulating UPS is a good safeguard against power fluctuations, and you should get one.
#9 |
Dec 2010
Monticello
5×359 Posts |
The UPS, from Cyberpower, at newegg.com, rated for PFC and regulated, is on its way...Model CP1500PFCLCD. I was impressed by Cyberpower's willingness to help when their stuff wasn't working, and to change the way they did stuff when it was causing problems, like setting units on top of cords in shipping.
I probably spent more than absolutely necessary...$219...maybe....I didn't quite see what an extra $100 bought for the fat, squat models, but I am considering a second system. Newegg had a stepped-sine-wave output model on sale for $149...decided I didn't want to fool with that. I'm seriously considering a high-end (800W or more) Antec, as I could use that on system #2 and/or upgrade the GPU. Suggestions? Are there better brands without getting tremendously more pricey? And while I'm at it, what's the best way to mount a small fan for spot cooling inside a case, preferably without drilling extra mounting holes? (The north bridge chip fins run a tad warm, and the engineer wants to direct some cooling air at them.)
#10 | |
Oct 2011
Maryland
2×5×29 Posts |
Quote:
Good PSUs are significantly more expensive than cheap ones. With that said, that is the one part of the computer that you absolutely positively don't want to go cheap with, in my opinion. The well-being of every component relies on it. Plus it will save you money over the long run if you get one with a decent energy rating.
Last fiddled with by KyleAskine on 2011-11-30 at 03:10
#11 |
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
1C35₁₆ Posts |
[tangent]
How about Rosewill? Would you say they're a good company? My brother's looking into a new comp, and money isn't exactly easy to come by... 1000W PSUs. He'd decided on this: http://www.newegg.com/Product/Produc...82E16817171056
Which seems good quality, and has 5 eggs. But this is cheaper (by a lot): http://www.newegg.com/Product/Produc...82E16817182188
So I'm wondering: would the cheaper one be a safe buy, if we had to shave some money off?
Thanks.
[/tangent]