#12 |
|
Mar 2003
Melbourne
203₁₆ Posts |
I tried lower exponents and did "stuff" to try to induce the error. I ran a double check of 10000139, which takes under 2 hours, while doing things like starting my washing machine and exhausting memory to force pagefile usage. But I couldn't cause an error: the residue matched.
So I decided to test M39295301 again, but this time with "InterimFiles=600000" in prime.txt, so I could go back in time for troubleshooting purposes, i.e. do disastrous activity 'x', see what happens, and go back to a prior residue. Now I have something 'weird': the interim residues went to zero. Code:
[Sat Jul 30 16:21:06 2016]
M39295301 interim We4 residue CC3C48495000037E at iteration 12000000
M39295301 interim We4 residue 3C49659327DD0526 at iteration 12000001
M39295301 interim We4 residue F44162741F0EE10A at iteration 12000002
[Sat Jul 30 16:48:48 2016]
M39295301 interim We4 residue 0000000000000000 at iteration 12600000
M39295301 interim We4 residue 0000000000000000 at iteration 12600001
M39295301 interim We4 residue 0000000000000000 at iteration 12600002
etc...
Each residue thereafter was all 0s. Code:
$ ls -lart
-rwxrwx---+ 1 4911984 Jul 30 16:21 p2E95301.020
-rwxrwx---+ 1 3733828 Jul 30 16:48 p2E95301.021
So I ran the same test again from the "16:21 / *.020" save. This time the .021 save came out the same size as the .020 save, with non-zero residues. Code:
[Sat Jul 30 20:12:53 2016]
M39295301 interim We4 residue 64DD890177223A30 at iteration 12600000
M39295301 interim We4 residue E9CF3C57BEBBD866 at iteration 12600001
M39295301 interim We4 residue FED40900A4BD82C6 at iteration 12600002
Code:
$ ls -alrt p2E95301.021
-rwxrwx---+ 1 4911984 Jul 30 20:12 p2E95301.021
Code:
Jul 30 16:27:06> ./CUDAPm1-v0.20.exe
Still doing more testing. |
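For anyone wanting to catch this symptom automatically, here is a minimal Python sketch that scans log text for interim residue lines of the shape quoted above and reports the ones that are all zeros. The regular expression is inferred only from the lines in this post, and the names `INTERIM_RE`, `find_zero_residues` and `sample` are hypothetical; it is not part of Prime95.

```python
import re

# Assumed line shape, taken from the log excerpt in this post:
#   M39295301 interim We4 residue 0000000000000000 at iteration 12600000
INTERIM_RE = re.compile(
    r"M(?P<exponent>\d+) interim \w+ residue (?P<residue>[0-9A-Fa-f]{16}) "
    r"at iteration (?P<iteration>\d+)"
)


def find_zero_residues(log_text):
    """Return (exponent, iteration) pairs whose interim residue is all zeros."""
    hits = []
    for match in INTERIM_RE.finditer(log_text):
        if int(match.group("residue"), 16) == 0:
            hits.append((int(match.group("exponent")),
                         int(match.group("iteration"))))
    return hits


# Sample lines copied from the log excerpt above.
sample = """\
[Sat Jul 30 16:21:06 2016]
M39295301 interim We4 residue CC3C48495000037E at iteration 12000000
M39295301 interim We4 residue 3C49659327DD0526 at iteration 12000001
[Sat Jul 30 16:48:48 2016]
M39295301 interim We4 residue 0000000000000000 at iteration 12600000
M39295301 interim We4 residue 0000000000000000 at iteration 12600001
"""

for exponent, iteration in find_zero_residues(sample):
    print(f"M{exponent}: zero residue at iteration {iteration}")
```

Run against the full results log, any hit marks the last good interim save as the one just before the first reported iteration.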
#13 |
|
Mar 2003
Melbourne
5×103 Posts |
I think I'm happy it's fixed now.
The machine has done 5 matching double checks in a row: 3 new exponents and 2 exponents done previously. That is a longer run than anything it managed before.
Resolution? Given the randomness of the issue, it's probably not just one of the items below. But here's what I did:
- Removed and reinstalled the SATA AHCI drivers
- Removed the old GPU drivers and reinstalled the nvidia drivers
- Changed the user temp directory and Windows\Temp directory variables to point to a fresh directory, then rebooted (not all services pick up the change)
Nvidia temp files: Code:
Directory of U:\TEMP\NVIDIA Corporation\NV_Cache
05/08/2016 08:01 PM <DIR> .
05/08/2016 08:01 PM <DIR> ..
02/08/2016 10:43 PM 16,384 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_15f74c7777689be5_0_0.bin
02/08/2016 10:43 PM 4,096 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_15f74c7777689be5_0_0.toc
02/08/2016 10:43 PM 16,384 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_15f74c7777689be5_1_0.bin
02/08/2016 10:43 PM 4,096 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_15f74c7777689be5_1_0.toc
03/08/2016 01:57 AM 1,048,576 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_15f74c7777689be5_1_1.bin
04/08/2016 09:11 PM 262,144 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_15f74c7777689be5_1_1.toc
03/08/2016 01:27 PM 16,384 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_3b623872478f08e_0_0.bin
03/08/2016 01:27 PM 4,096 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_3b623872478f08e_0_0.toc
05/08/2016 08:01 PM 16,384 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_f3279b66e87c6f22_0_0.bin
05/08/2016 08:01 PM 4,096 22d784da9a2078597920020ef1ee250e_fce8395c8fd8a85e_f3279b66e87c6f22_0_0.toc
03/08/2016 01:28 PM 16,384 9542647a3283db9e625cd51c00efe7d_fce8395c8fd8a85e_b82f6581efe09e9e_0_0.bin
03/08/2016 01:28 PM 4,096 9542647a3283db9e625cd51c00efe7d_fce8395c8fd8a85e_b82f6581efe09e9e_0_0.toc
03/08/2016 01:28 PM 1,048,576 9542647a3283db9e625cd51c00efe7d_fce8395c8fd8a85e_b82f6581efe09e9e_0_1.bin
03/08/2016 01:28 PM 262,144 9542647a3283db9e625cd51c00efe7d_fce8395c8fd8a85e_b82f6581efe09e9e_0_1.toc
03/08/2016 01:32 PM 16,777,216 9542647a3283db9e625cd51c00efe7d_fce8395c8fd8a85e_b82f6581efe09e9e_0_2.bin
03/08/2016 01:34 PM 4,194,304 9542647a3283db9e625cd51c00efe7d_fce8395c8fd8a85e_b82f6581efe09e9e_0_2.toc
03/08/2016 01:35 PM 16,777,216 9542647a3283db9e625cd51c00efe7d_fce8395c8fd8a85e_b82f6581efe09e9e_0_3.bin
17 File(s) 40,472,576 bytes
2 Dir(s) 8,476,454,912 bytes free
Code:
C:\>set | find "U:"
TEMP=U:\TEMP
TMP=U:\TEMP
-- Craig
Last fiddled with by nucleon on 2016-08-07 at 02:37 Reason: typo |
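Since not every process picks up the changed variables, the same check as the `set | find` command above can be done from inside a process. A minimal Python sketch, assuming the variables are named TEMP and TMP as shown; the function name `temp_dirs_match` is hypothetical:

```python
import os


def temp_dirs_match(expected, environ=None):
    """Report, per variable, whether TEMP and TMP point at `expected`.

    Comparison ignores case and trailing slashes, since Windows paths
    are case-insensitive. Pass `environ` to check a mapping other than
    this process's own environment.
    """
    env = os.environ if environ is None else environ
    wanted = expected.rstrip("\\/").lower()
    result = {}
    for name in ("TEMP", "TMP"):
        value = env.get(name)
        result[name] = (
            value is not None and value.rstrip("\\/").lower() == wanted
        )
    return result


# Check the environment this process actually sees.
print(temp_dirs_match(r"U:\TEMP"))
```

A `False` for either variable would suggest the process was started before the change took effect.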
#14 |
|
Mar 2003
Melbourne
515₁₀ Posts |
I feel like an idiot. Pretty much ignore the above.
My GPU blew up a few days ago. The magic smoke was released; I could smell it. My guess is that something blew or shorted in the power regulation circuitry on the GPU board. I tried powering up the PC again, but it seems the PSU's over-voltage or over-current protection kicked in and didn't allow bootup.
As soon as I replaced the GPU, all was OK. I'm back up, with no errors since. So I'd say the power components on the GPU board had been playing up for a while, possibly since my interstate relocation in March this year.
-- Craig |
#15 |
|
Romulan Interpreter
Jun 2011
Thailand
7²×197 Posts |
Ha! The famous "mosfet bug" hit you too! Those cards have a design error: the mosfet that powers the memory, on the back side, is not properly dimensioned or not properly covered by the cooler, and it burns when you stress the memory. That is why LL/cudaLucas can kill the card but TF will not, as mfaktc does not do many memory transfers.
I started to "collect" Titan cards with this issue, hoping to be able to repair them. Three of them will be sent to me soon by airsquirrels, adding to my existing "stock" (another two damaged, one repaired). So, if you decide to send it to the rubbish bin, better send it to me, and I may pay you the postal fees, provided you don't try to skin me (sometimes the postal fees are more than the new goods, hehe, and I have never "imported" electronics from Australia). If I can repair it, I may give it a new life, producing for GIMPS.
Last fiddled with by LaurV on 2016-08-24 at 07:45 |