mersenneforum.org  

2005-03-25, 02:15   #1
thermalMan

Mar 2005
Ottawa, Canada, ON

52 Posts
Isolated the problem, now what?

Here's what I've done so far:

1. Ran Memtest twice, going through all 9 steps. It took more than an hour and found no problems (although to be more conclusive, it should have run much longer).
2. Ran Prime95 Torture Test - Blend test: hardware error message after < 1 min
3. Ran Prime95 Torture Test - Small FFTs: hardware error message after < 13 min

Thinking that one of my Samsung PC2700 512MB RAM sticks might be damaged, I decided to test them separately.

1. DIMM 1: ran Blend for 12 minutes without any problems (beating the 1-minute record with both DIMMs in).
2. DIMM 2: ran Blend for 8 minutes without any problems (beating the 1-minute record with both DIMMs in).
3. Put both DIMMs back in, this time in slots 2 and 3 (they were in slots 1 and 3 before the individual tests).
4. Ran Prime95 Blend and got a hardware error after < 2 min

By default, my motherboard runs my two matched DIMMs in dual channel mode, whereas a single DIMM runs in single channel mode. I'm thinking that it's dual channel mode that is making Prime95 display the hardware error message.

My system specs:
Athlon XP 3000+ (stock speeds)
Asus A7N8X Deluxe BIOS 1007
2x512 MB PC2700 Samsung RAM
Geforce 3 Ti 200
Soundblaster Live value
NEC 3250A Dual Layer Burner
2x120GB 8MB 7200RPM SATA Maxtor HD
1x60GB 2MB 7200RPM Quantum LM HD

BIOS:
Optimal settings (vs. aggressive)
-FSB 167 MHz
-2.5-3-3-7 CAS Timings

Motherboard Monitor 5 screen capture

Do you think it's the dual channel mode that is causing the hardware error?
2005-03-25, 02:35   #2
moo

Jul 2004
Nowhere

809 Posts

Try setting the memory to single channel mode, and also relax the memory timings, then try again. Also try one stick, then the other.
2005-03-25, 02:54   #3
thermalMan

Mar 2005
Ottawa, Canada, ON

52 Posts

As I said, I did try the sticks individually, and each one worked. It was only when I put both of them in, in dual channel, that it failed. I apologize for not being clearer.

I'll have to check what is needed to run them in single channel; a BIOS setting, I figure. Could you suggest better timings, to give me a hint of what to change? (I'm a newbie at interpreting CAS timings.)

Thanks!
2005-03-25, 16:13   #4
Peter Nelson

Oct 2004

232 Posts

Memtest86's reported memory speed should indicate whether the memory is in a single or dual channel configuration, based on the throughput. From what you posted, I don't think you have yet tried BOTH memory modules in slots that still give single channel (as opposed to the particular physical slots you must pair up for dual channel). Does THAT work, i.e. both modules in single channel?
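
As a rough guide to the throughput check described above, here is a minimal sketch of the theoretical peak figures for DDR memory. It assumes a 64-bit (8-byte) bus per channel and two transfers per clock, which is how PC2700 gets its name; the function name is made up for illustration. Measured Memtest86 numbers will be lower than these peaks, so it is the relative difference between configurations that matters, not the absolute value.

```python
def ddr_peak_bandwidth_mb_s(clock_mhz: float, channels: int = 1, bus_bytes: int = 8) -> float:
    """Theoretical peak DDR bandwidth in MB/s (hypothetical helper).

    Assumes two transfers per clock cycle (double data rate) and a
    64-bit (8-byte) data bus per channel.
    """
    transfers_per_clock = 2
    return clock_mhz * transfers_per_clock * bus_bytes * channels


# PC2700 / DDR333 runs at a nominal 166.67 MHz memory clock.
single = ddr_peak_bandwidth_mb_s(166.67, channels=1)
dual = ddr_peak_bandwidth_mb_s(166.67, channels=2)

print(f"single channel peak: {single:.0f} MB/s")
print(f"dual channel peak:   {dual:.0f} MB/s")
```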

Also make sure ALL your RAM sockets are clean of dust, etc.

My suggestion would be to slightly increase your RAM voltage, if your mobo's BIOS settings support doing this.

The reason is that a given supply voltage may be fine with one module, but with two modules installed the load is greater, which may cause a very slight voltage drop. This could be enough to change operation from stable to borderline.

Last fiddled with by Peter Nelson on 2005-03-25 at 16:13
2005-03-25, 18:06   #5
thermalMan

Mar 2005
Ottawa, Canada, ON

52 Posts

Well, having 2 DIMMs in slots 1 and 3 or 2 and 3 runs the memory in dual channel, which is what Memtest ran under without a hitch. I never tried individual DIMMs in Memtest simply because both worked in dual channel.

Now I have them in slots 1 and 2, which runs them in single channel (the BIOS clearly indicates this, as does the manual).

Other info on my system:
CPUZ's CPU Info
CPUZ's Memory Info

I'll check for dust in the case and clean the slots with compressed air. I'll check the RAM voltage too. Thanks for the tip! So, would the new CPU draw more power than the old one, thus robbing the memory of some power?

I tried Prime95 with both DIMMs in single channel mode (slots 1 and 2) and it still gave me the error message. So having both DIMMs in, in either dual or single channel, seems to cause instability in my system...

Note: 3 times this morning, the case temps have gone from 26C to 106C? MM5 freaks out and stuff...

Last fiddled with by thermalMan on 2005-03-25 at 18:08
2005-03-25, 18:29   #6
moo

Jul 2004
Nowhere

809 Posts

Hmm, try turning off MBM5 for a bit and run Prime95. I wonder if it's causing the errors.
2005-03-25, 18:38   #7
thermalMan

Mar 2005
Ottawa, Canada, ON

52 Posts

Thanks for the hint! I just tried it without MM5 running:

FATAL ERROR: Resulting sum was -3.914402764883792e+028, expected: 9.6197848261e+016
Hardware failure detected, consult stress.txt file.
Torture Test ran 0 minutes - 1 error, 0 warnings.
Execution halted.

This is the error message that I get across the board in all my tests... just with different numbers, maybe.
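
Since the numbers differ from run to run, it may help to pull them out of the message and compare them across tests. The following is only a sketch: the pattern is written against the one line quoted above, not against any documented Prime95 log format, and the function name is hypothetical.

```python
import re

# Matches lines shaped like the error quoted above. Written against that
# single example; other Prime95 versions may word the message differently.
PATTERN = re.compile(
    r"FATAL ERROR: Resulting \w+ was (?P<got>[-+0-9.eE]+), "
    r"expected: (?P<want>[-+0-9.eE]+)"
)


def parse_torture_error(line: str):
    """Return (resulting, expected) as floats, or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return float(m.group("got")), float(m.group("want"))


line = ("FATAL ERROR: Resulting sum was -3.914402764883792e+028, "
        "expected: 9.6197848261e+016")
got, want = parse_torture_error(line)

# How far off the result is, relative to the expected value.
relative_error = abs(got - want) / abs(want)
print(f"resulting={got:.6e} expected={want:.6e} relative error={relative_error:.3e}")
```

In the quoted run the result has the wrong sign and is many orders of magnitude away from the expected value, so this is not a small rounding difference.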

Last fiddled with by thermalMan on 2005-03-25 at 18:38
2005-03-25, 18:57   #8
thermalMan

Mar 2005
Ottawa, Canada, ON

1916 Posts

I increased the "DDR Reference Voltage" from 2.6V to 2.7V, then booted and ran Prime95. As soon as I got into the app, my whole system rebooted. I guess that wasn't it?
2005-03-26, 10:52   #9
Kosmaj

Nov 2003

2·1,811 Posts

Do you have an adequate power supply? I think you need at least a 300 W unit.

Is your system stable when not running GIMPS?

What kind of heat-sink do you use? What's your CPU temperature under full load? The 52C appearing in your MBM screen capture sounds too low for a full load temp...
2005-03-26, 12:51   #10
dsouza123

Sep 2002

2·331 Posts

Try underclocking your CPU to half speed and running the tests.
This will reduce heat and power usage by the CPU.
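
The reasoning behind the underclocking suggestion can be sketched with the usual first-order model for dynamic power in CMOS logic, P ≈ C·V²·f. This is a simplification that ignores static leakage, and the function name is made up for illustration, so treat the numbers as rough proportions rather than real wattages.

```python
def relative_dynamic_power(freq_ratio: float, voltage_ratio: float = 1.0) -> float:
    """Dynamic power relative to stock, using the first-order model P ~ C * V^2 * f.

    freq_ratio    -- new clock divided by stock clock (0.5 = half speed)
    voltage_ratio -- new core voltage divided by stock core voltage
    Static (leakage) power is ignored in this sketch.
    """
    return freq_ratio * voltage_ratio ** 2


# Half speed at unchanged core voltage.
print(relative_dynamic_power(0.5))

# Half speed with the core voltage also lowered by 10% (a hypothetical setting).
print(relative_dynamic_power(0.5, 0.9))
```

If the errors disappear at half speed, heat or power delivery becomes the more likely suspect; if they persist, the memory is still the first thing to look at.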
2005-03-26, 16:50   #11
db597

Jan 2003

7·29 Posts

Try using slower memory timings, perhaps 3-4-4-8, and see if it stabilises.
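
To see what relaxing the timings means in practice, here is a small sketch that converts timings from clock cycles to nanoseconds. It assumes the four numbers are in the conventional CL-tRCD-tRP-tRAS order (the thread does not say) and that the memory clock matches the 167 MHz FSB, i.e. the nominal 166.67 MHz of PC2700. The helper names are hypothetical.

```python
def cycles_to_ns(cycles: float, clock_mhz: float) -> float:
    """Convert a delay given in clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz


def timings_in_ns(timings, clock_mhz):
    """Map a (CL, tRCD, tRP, tRAS) tuple in cycles to delays in ns.

    The CL-tRCD-tRP-tRAS ordering is an assumption; check the BIOS labels.
    """
    names = ("CL", "tRCD", "tRP", "tRAS")
    return {name: round(cycles_to_ns(c, clock_mhz), 1) for name, c in zip(names, timings)}


MEM_CLOCK_MHZ = 500 / 3  # 166.67 MHz nominal for PC2700

print("current 2.5-3-3-7:", timings_in_ns((2.5, 3, 3, 7), MEM_CLOCK_MHZ))
print("relaxed 3-4-4-8:  ", timings_in_ns((3, 4, 4, 8), MEM_CLOCK_MHZ))
```

Larger cycle counts give the memory chips more time to complete each operation, which costs a little performance but can make marginal modules stable.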

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.