That's interesting. On my card, errors never come at positions = 2 mod 3. Those positions have small data values, ~10^(-6) or smaller. I thought that had something to do with it, but I guess it's just a specific card idiosyncrasy.
And as an aside, I thought some more (too late as usual) and realized that certain read errors will get misinterpreted as write errors. Not a problem in your test yet. |
1 Attachment(s)
[QUOTE=chalsall;339277]I will give you the full data-set once the run is complete...[/QUOTE]
I grew bored... Does the test ever finish? I'm now rerunning the test with 1.8 GB of vRAM. Will report.... |
It was 3/4 done. The positions go from 0 to size - 1. Not as bad as mine, but bad enough.
[CODE] Position 37, Iteration 1000, Total Errors: read 2781, write 4894 [/CODE] |
[QUOTE=owftheevil;339282]It was 3/4 done. The positions go from 0 to size - 1. Not as bad as mine, but bad enough.
[CODE] Position 37, Iteration 1000, Total Errors: read 2781, write 4894 [/CODE][/QUOTE] OK, I'll be more patient... Currently running ./memtest 75 1000 1 So far... [CODE]Position 2, Iteration 120, Total Errors: read 0, write 0 Position 2, Iteration 130, Total Errors: read 0, write 0 Position 2, Iteration 140, Total Errors: read 0, write 0[/CODE] |
[QUOTE=chalsall;339283]So far... [/QUOTE]
Further so far... [CODE]Position 5, Iteration 630, Total Errors: read 0, write 0
Position 5, Iteration 640, Total Errors: read 0, write 0
Position 5, Iteration 650, Total Errors: read 0, write 0
Position 5, Iteration 660, Total Errors: read 0, write 0
Position 5, Iteration 670, Total Errors: read 0, write 0
Position 5, Iteration 680, Total Errors: read 0, write 0
Position 5, Iteration 690, Total Errors: read 0, write 0
Position 5, Iteration 700, Total Errors: read 0, write 0
Position 5, Iteration 710, Total Errors: read 0, write 0
Position 5, Iteration 720, Total Errors: read 0, write 0
Position 5, Iteration 730, Total Errors: read 0, write 0
Position 5, Iteration 740, Total Errors: read 0, write 0
[/CODE] [CODE]Every 2.0s: nvidia-smi                                  Sat May  4 20:30:48 2013

+------------------------------------------------------+
| NVIDIA-SMI 4.313.30   Driver Version: 313.30         |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 560          | 0000:03:00.0     N/A |                  N/A |
| 75%   85C  N/A     N/A /  N/A | 96%  1963MB / 2047MB |  N/A         Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+[/CODE] And this is while the CPU is using all four hyper-threaded cores, and 2,982MB of memory, doing a P-1 job.... |
1 Attachment(s)
Results from the 570:
[CODE] Position 37, Iteration 1000, Total Errors: write 0, read 0[/CODE] And preliminary results from version 0.1 which more accurately counts and categorizes the errors, and gives progress reports. [CODE]Position 9, Iteration 140, Total Errors: read 58, write 0, completed 23.44%[/CODE] |
I am about to start an LL test on a big exponent (200M+), and need advice on the FFT size.
Preliminary benchmarks show that a good FFT size may shave a couple of days off the work, and as the exponent grows, so does the gain from a surgically picked FFT size. |
1 Attachment(s)
[QUOTE=owftheevil;339286]And preliminary results from version 0.1 which more accurately counts and categorizes the errors, and gives progress reports.[/QUOTE]
Not sure if this is still useful to you, but attached are three runs of your memory test on my EVGA GTX560 SC 2048MB card. The last run is using your V0.1. Thank you for providing this. It might explain why my card passed every other non-GIMPS test out there -- it appears to be *just* borderline unstable. Perhaps a new GIMPS slogan: "Our software tests hardware like no other!".... :smile:
Thanks for the data. Version 0.1 does fewer reads than version 0.0 to help distinguish read and write errors, which is probably why there were fewer errors on that test. I was worried that on borderline cards, it might give false positives.
|
[QUOTE=owftheevil;339374]I was worried that on borderline cards, it might give false positives.[/QUOTE]
Please forgive me if I'm about to demonstrate my ignorance, but is there not a race condition... [CODE]void test (int n, int s, int iter, int pos)
{
  int compare, i, j, k;

  for(k = 1; k <= iter; k++)
    {
      /* Copy data from pos to all other chunks */
      for(i = 0; i < s; i++)
        copy_kernel <<<n / 512, 512 >>> (&g_ttp[i * n], &g_ttp[pos * n]);

      [COLOR="Red"]/* ...right about here? Shouldn't there be a sync here? */[/COLOR]

      /* Compare data from pos with all other chunks */
      for(i = 0; i < 10; i++)
        {
          for(j = 0; j < s; j++)
            if(j != pos)
              compare_kernel<<<n / 512, 512>>> (&g_ttp[j * n], &g_ttp[pos * n], g_compare);
          cutilSafeThreadSync(); [COLOR="Red"]/* This shouldn't be needed, and may be masking the bug. */[/COLOR]
        }

      if(k % 10 == 0)
        {
          cutilSafeCall (cudaMemcpy (&compare, g_compare, sizeof (int), cudaMemcpyDeviceToHost));
          cutilSafeCall (cudaMemset (g_compare, 0, sizeof (int)));
          write_total += compare / 10;
          read_total += compare % 10;
          printf("Position %d, Iteration %d, Total Errors: read %d, write %d\n", pos, k, read_total, write_total);
        }
    }
}[/CODE] |
Kernels launched on the same stream are serialized on the device, so the copy kernels finish before the compare kernels run and no sync is needed between them. The cutilSafeThreadSync call is there so the CPU doesn't busy-wait and eat up an entire core.
|