mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   The P-1 factoring CUDA program (https://www.mersenneforum.org/showthread.php?t=17835)

ET_ 2013-11-22 18:12

[QUOTE=James Heinrich;360002]Note that the "<gpu> fft.txt" and "<gpu> threads.txt" files are distinct from each other.

<gpu> fft.txt should look something like[code]Device GeForce GTX 670
Compatibility 3.0
clockRate (MHz) 980
memClockRate (MHz) 3004

fft max exp ms/iter
4 85933 0.0697
16 333803 0.1153
32 657719 0.1306
36 738083 0.1618
48 978041 0.1635
... skip a whole bunch of fft lines ...
28800 511382147 76.5273
32768 580225813 79.6749[/code]Whereas "<gpu> threads.txt" should be quite short (and more cryptic), mine looks like:[code]17496 256 64 512 45.9160
3456 256 128 32 8.0790[/code]
I suspect it didn't make a "<gpu> threads.txt" file for you because it appears to have failed partway through the process:[/QUOTE]

Thanks James. :bow:

From what you said, I assume that there should be two distinct files: the first created by -cufftbench 1 8192 1, the second by -cufftbench 4096 4096 4.

I'll try to modify the [COLOR="Red"]r[/COLOR] parameter of the second bench run and see if it suffices.

Luigi
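
The "<gpu> fft.txt" table quoted above pairs each FFT length with the largest exponent listed for it and a time per iteration. Below is a minimal sketch in C of how such a table could be used to pick an FFT length for a given exponent. It assumes the first column is the FFT length in units of K and that rows are sorted by increasing length. The names fft_row, sample_rows and pick_fft are hypothetical and are not taken from the CUDAPm1 source.

```c
#include <stddef.h>

/* One row of the table: "fft  max exp  ms/iter". */
typedef struct {
    int fft_k;          /* FFT length (assumed to be in units of K) */
    long max_exp;       /* largest exponent listed for this length */
    double ms_per_iter; /* benchmarked time per iteration */
} fft_row;

/* Rows copied from the quoted GTX 670 listing; the skipped rows are left out. */
static const fft_row sample_rows[] = {
    {4, 85933, 0.0697},
    {16, 333803, 0.1153},
    {32, 657719, 0.1306},
    {36, 738083, 0.1618},
    {48, 978041, 0.1635},
};

/* Return the first row whose max_exp covers the exponent, or NULL if no
   row does. Assumes rows are sorted by increasing FFT length. */
const fft_row *pick_fft(const fft_row *rows, size_t n, long exponent) {
    for (size_t i = 0; i < n; i++) {
        if (rows[i].max_exp >= exponent) {
            return &rows[i];
        }
    }
    return NULL;
}
```

With the sample rows, pick_fft(sample_rows, 5, 700000) returns the 36K row, since 700000 exceeds the 32K limit of 657719 but not the 36K limit of 738083. A real selector might also compare ms/iter, because a larger length is not always slower than a smaller one.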

ET_ 2013-11-22 18:26

[QUOTE=ET_;360003]Thanks James. :bow:

From what you said, I assume that there should be two distinct files: the first created by -cufftbench 1 8192 1, the second by -cufftbench 4096 4096 4.

I'll try to modify the [COLOR="Red"]r[/COLOR] parameter of the second bench run and see if it suffices.

Luigi[/QUOTE]

Sadly, I always get "[FONT="Courier New"][COLOR="Red"]CUDAPm1.cu(2163) : cufftSafeCall() CUFFT error 6: CUFFT_EXEC_FAILED[/COLOR][/FONT]" with r between 1 and 5 and Threads=128 or 256.

Hints?

Luigi

owftheevil 2013-11-22 19:57

Does it always fail at the same place in the test?

Also try putting

[CODE]cutilSafeThreadSync();[/CODE]

after the cufft call on line 2161 and after the square call on 2162. That will at least tell us what is failing.

ET_ 2013-11-22 20:05

[QUOTE=owftheevil;360013]Does it always fail at the same place in the test?

Also try putting

[CODE]cutilSafeThreadSync();[/CODE]

after the cufft call on line 2161 and after the square call on 2162. That will at least tell us what is failing.[/QUOTE]

Yes, it always fails at the same place.

Added the line in the 2 places you asked. A new result:

[code]
CUDAPm1.cu(2165) : cufftSafeCall() CUFFT error 6: CUFFT_EXEC_FAILED
[/code]

Added a new sync after line 2165: same error.

Luigi

owftheevil 2013-11-22 20:10

Sorry, I jumped too quickly on the safecall stuff. More is needed. Let me think a bit.

ET_ 2013-11-22 20:14

[QUOTE=owftheevil;360017]Sorry, I jumped too quickly on the safecall stuff. More is needed. Let me think a bit.[/QUOTE]

No hurry. I'm currently running with Threads=128 and the program works: I was just trying to squeeze some more juice out of it.

I'll be quietly waiting for your thoughts, thank you.

Luigi :smile:

owftheevil 2013-11-22 20:38

Could you try this little snippet after the square call on 2162?

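[CODE]cudaThreadSynchronize();  // wait for the square kernel so its error surfaces here
{
    cudaError_t error = cudaGetLastError();
    if (error != cudaSuccess)
    {
        // print the CUDA runtime's own description of the failure
        printf("CUDA error: %s\n", cudaGetErrorString(error));
        exit(2);
    }
}[/CODE]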

ET_ 2013-11-22 21:13

[QUOTE=owftheevil;360021]Could you try this little snippet after the square call on 2162?

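[CODE]cudaThreadSynchronize();  // wait for the square kernel so its error surfaces here
{
    cudaError_t error = cudaGetLastError();
    if (error != cudaSuccess)
    {
        // print the CUDA runtime's own description of the failure
        printf("CUDA error: %s\n", cudaGetErrorString(error));
        exit(2);
    }
}[/CODE][/QUOTE]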

The error is:

[code]
CUDA error: too many resources requested for launch
[/code]

while the environment is:

[code]
------- DEVICE 0 -------
name GeForce GTX 580
Compatibility 2.0
clockRate (MHz) 1594
memClockRate (MHz) 2025
totalGlobalMem 1610285056
totalConstMem 65536
l2CacheSize 786432
sharedMemPerBlock 49152
regsPerBlock 32768
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 1536
multiProcessorCount 16
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 65535,65535,65535
textureAlignment 512
deviceOverlap 1
[/code]

HTH... :smile: thanks.

Luigi

ET_ 2013-11-22 21:15

Sorry for the delay... I was having dinner.

owftheevil 2013-11-22 22:42

Thanks for getting back with that. The only thing I can think of right now is that somehow either t2 or the threads array has messed-up values. I'll look at it over the weekend and get back on Monday.
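
For context, "too many resources requested for launch" is what the CUDA runtime reports when a kernel launch asks for more threads, registers or shared memory per block than the device allows. Below is a rough C sketch of that arithmetic against the GTX 580 limits listed earlier in the thread. The types and the check_launch function are hypothetical, the per-thread register counts are made-up inputs (real ones would come from the compiler or from cudaFuncGetAttributes), and register allocation granularity is ignored, so this is an approximation only.

```c
/* Per-block limits, as printed in the device listing. */
typedef struct {
    int max_threads_per_block;  /* maxThreadsPerBlock */
    int regs_per_block;         /* regsPerBlock */
    int shared_mem_per_block;   /* sharedMemPerBlock, in bytes */
} device_limits;

/* Hypothetical description of one kernel launch. */
typedef struct {
    int threads_per_block;
    int regs_per_thread;
    int shared_mem_bytes;
} launch_request;

/* Return 0 if the request fits, otherwise the first limit exceeded:
   1 = thread count, 2 = registers, 3 = shared memory. */
int check_launch(const device_limits *d, const launch_request *r) {
    if (r->threads_per_block <= 0 ||
        r->threads_per_block > d->max_threads_per_block) {
        return 1;
    }
    /* Total registers needed by one block (simplified, no rounding). */
    if ((long)r->threads_per_block * r->regs_per_thread > d->regs_per_block) {
        return 2;
    }
    if (r->shared_mem_bytes > d->shared_mem_per_block) {
        return 3;
    }
    return 0;
}
```

With the GTX 580 values (1024 threads, 32768 registers, 49152 bytes of shared memory per block), a block of 1024 threads at 40 registers each would need 40960 registers and be rejected, while a garbage thread count such as 0 or 2048 would fail the first test. A corrupted entry in the threads array would most likely show up as the latter case.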

ET_ 2013-11-23 11:03

[QUOTE=owftheevil;360032]Thanks for getting back with that. The only thing I can think of right now is that somehow either t2 or the threads array has messed-up values. I'll look at it over the weekend and get back on Monday.[/QUOTE]

Thanks :bow:

I should add that I am using Linux_64, driver 304.88.

CUDA version info:

[code]
CUDA version info
binary compiled for CUDA 4.10
CUDA runtime version 4.10
CUDA driver version 5.0
[/code]

Luigi
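
A note on reading the version lines above: cudaRuntimeGetVersion() and cudaDriverGetVersion() return a single integer encoded as 1000*major + 10*minor, so CUDA 4.1 is reported as 4010. The "4.10" in the output is therefore most likely CUDA 4.1, printed with the remainder modulo 100 as the minor part; that is an assumption and has not been checked against the CUDAPm1 source. Below is a small C sketch of the conventional decoding. The type and function names are hypothetical.

```c
/* Decoded CUDA version number. */
typedef struct {
    int major;
    int minor;
} cuda_version;

/* Decode an integer as returned by cudaRuntimeGetVersion() or
   cudaDriverGetVersion(), which encode 1000*major + 10*minor. */
cuda_version decode_cuda_version(int v) {
    cuda_version out;
    out.major = v / 1000;
    out.minor = (v % 1000) / 10;
    return out;
}
```

Under this decoding, 4010 gives major 4 and minor 1, and 5000 gives major 5 and minor 0, which would match a 4.1 runtime running on a driver that supports CUDA 5.0.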


All times are UTC.
