mersenneforum.org

-   -   CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

mognuts 2013-12-16 19:25

[QUOTE=owftheevil;362199]Crashes how?[/QUOTE]

This is the console output:[CODE]
C:\Users\John\Desktop\cudalucas>CUDALucas_205Beta_x64_r52 -cufftbench 2048 1 5
------- DEVICE 0 -------
name GeForce GTX 460
Compatibility 2.1
clockRate (MHz) 1430
memClockRate (MHz) 1800
totalGlobalMem 1073741824
totalConstMem 65536
l2CacheSize 524288
sharedMemPerBlock 49152
regsPerBlock 32768
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 1536
multiProcessorCount 7
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 65535,65535,65535
textureAlignment 512
deviceOverlap 1
Thread bench, testing various thread sizes for ffts 1K to 2048K, doing 5 passes.
fft size = 1K, ave time = 6.5100 msec, Norm1 threads 32, Norm2 threads 32
fft size = 1K, ave time = 6.5094 msec, Norm1 threads 32, Norm2 threads 64
fft size = 1K, ave time = 6.5098 msec, Norm1 threads 32, Norm2 threads 128
fft size = 1K, ave time = 6.5084 msec, Norm1 threads 32, Norm2 threads 256
fft size = 1K, ave time = 6.5089 msec, Norm1 threads 32, Norm2 threads 512
fft size = 1K, ave time = 6.5085 msec, Norm1 threads 32, Norm2 threads 1024
fft size = 1K, ave time = 6.5084 msec, Norm1 threads 64, Norm2 threads 32
fft size = 1K, ave time = 6.5088 msec, Norm1 threads 64, Norm2 threads 64
fft size = 1K, ave time = 6.5085 msec, Norm1 threads 64, Norm2 threads 128
fft size = 1K, ave time = 6.5087 msec, Norm1 threads 64, Norm2 threads 256
fft size = 1K, ave time = 6.5080 msec, Norm1 threads 64, Norm2 threads 512
fft size = 1K, ave time = 6.5087 msec, Norm1 threads 64, Norm2 threads 1024
fft size = 1K, ave time = 6.5084 msec, Norm1 threads 128, Norm2 threads 32
fft size = 1K, ave time = 6.5080 msec, Norm1 threads 128, Norm2 threads 64
fft size = 1K, ave time = 6.5082 msec, Norm1 threads 128, Norm2 threads 128
fft size = 1K, ave time = 6.5079 msec, Norm1 threads 128, Norm2 threads 256
fft size = 1K, ave time = 6.5082 msec, Norm1 threads 128, Norm2 threads 512
fft size = 1K, ave time = 6.5072 msec, Norm1 threads 128, Norm2 threads 1024
fft size = 1K, ave time = 6.5090 msec, Norm1 threads 256, Norm2 threads 32
fft size = 1K, ave time = 6.5091 msec, Norm1 threads 256, Norm2 threads 64
fft size = 1K, ave time = 6.5085 msec, Norm1 threads 256, Norm2 threads 128
fft size = 1K, ave time = 6.5088 msec, Norm1 threads 256, Norm2 threads 256
fft size = 1K, ave time = 6.5078 msec, Norm1 threads 256, Norm2 threads 512
fft size = 1K, ave time = 6.5085 msec, Norm1 threads 256, Norm2 threads 1024
fft size = 1K, ave time = 6.5099 msec, Norm1 threads 512, Norm2 threads 32
fft size = 1K, ave time = 6.5098 msec, Norm1 threads 512, Norm2 threads 64
fft size = 1K, ave time = 6.5093 msec, Norm1 threads 512, Norm2 threads 128
fft size = 1K, ave time = 6.5098 msec, Norm1 threads 512, Norm2 threads 256
fft size = 1K, ave time = 6.5096 msec, Norm1 threads 512, Norm2 threads 512
fft size = 1K, ave time = 6.5099 msec, Norm1 threads 512, Norm2 threads 1024
fft size = 1K, ave time = 5.9309 msec, Norm1 threads 128, Mult threads 32, Norm2 threads 1024
fft size = 1K, ave time = 5.9307 msec, Norm1 threads 128, Mult threads 64, Norm2 threads 1024
fft size = 1K, ave time = 5.9311 msec, Norm1 threads 128, Mult threads 128, Norm2 threads 1024
fft size = 1K, ave time = 5.9318 msec, Norm1 threads 128, Mult threads 256, Norm2 threads 1024
Best time for fft = 1K, time: 5.9307, t0 = 128, t1 = 64, t2 = 1024
[/CODE] Followed by a dialogue box containing the following text:

[B]CUDALucas_205Beta_x64_r52.exe has stopped working[/B]
A problem caused the program to stop working correctly. Windows will close the program and notify you if a solution is available.

This happens regardless of the parameters used for -cufftbench.
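For anyone comparing bench logs by hand, the following is a minimal sketch of how the fastest configuration can be picked out of output like the above. It is not CUDALucas code. The function name `best_config` is hypothetical, and the mapping of t0, t1, t2 to the Norm1, Mult and Norm2 thread counts is inferred from the "Best time" line in the log, not confirmed from the source.

```python
import re

# Matches the timing lines printed by the thread bench. The "Mult threads"
# field is optional because only the later lines in the log print it.
LINE_RE = re.compile(
    r"fft size = (?P<size>\d+)K, ave time = (?P<time>[\d.]+) msec, "
    r"Norm1 threads (?P<t0>\d+), (?:Mult threads (?P<t1>\d+), )?"
    r"Norm2 threads (?P<t2>\d+)"
)


def best_config(log_text):
    """Return (time, t0, t1, t2) for the fastest line that reports all three
    thread counts, or None if no such line is present."""
    best = None
    for m in LINE_RE.finditer(log_text):
        if m.group("t1") is None:
            continue  # line without a Mult threads value
        row = (
            float(m.group("time")),
            int(m.group("t0")),
            int(m.group("t1")),
            int(m.group("t2")),
        )
        if best is None or row[0] < best[0]:
            best = row
    return best


# Lines copied from the console output above.
SAMPLE = """\
fft size = 1K, ave time = 6.5072 msec, Norm1 threads 128, Norm2 threads 1024
fft size = 1K, ave time = 5.9309 msec, Norm1 threads 128, Mult threads 32, Norm2 threads 1024
fft size = 1K, ave time = 5.9307 msec, Norm1 threads 128, Mult threads 64, Norm2 threads 1024
fft size = 1K, ave time = 5.9311 msec, Norm1 threads 128, Mult threads 128, Norm2 threads 1024
fft size = 1K, ave time = 5.9318 msec, Norm1 threads 128, Mult threads 256, Norm2 threads 1024
"""

print(best_config(SAMPLE))  # → (5.9307, 128, 64, 1024)
```

The result agrees with the last line printed before the crash (time 5.9307, t0 = 128, t1 = 64, t2 = 1024), which suggests the 1K bench itself completed and the failure came afterwards.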

mognuts 2013-12-16 21:24

On a more positive note, r52 correctly found 3 known primes. :showoff:

M( 11213 )P, n = 1K, CUDALucas v2.05 Beta
M( 1257787 )P, n = 64K, CUDALucas v2.05 Beta
M( 2976221 )P, n = 256K, CUDALucas v2.05 Beta
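For readers new to the thread: the check behind these results is the Lucas-Lehmer test. Below is a minimal plain-integer sketch of that test. It only illustrates the recurrence; it is not how CUDALucas implements it (CUDALucas does the squaring with FFTs on the GPU), and the function name is just for illustration.

```python
def lucas_lehmer(p):
    """Return True if the Mersenne number 2**p - 1 is prime.

    Assumes p itself is prime. Uses the recurrence s -> s*s - 2 (mod 2**p - 1)
    starting from s = 4, repeated p - 2 times.
    """
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the recurrence needs odd p
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


for p in (3, 5, 7, 11, 13):
    print(p, lucas_lehmer(p))
```

The same function can be called with 11213, the smallest exponent listed above, though plain Python integers become impractical well before exponents such as 1257787.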

owftheevil 2013-12-17 15:04

R53 is up, fixing the sparse <gpu> fft.txt file issue, the uninitialized pointer causing mismatched residues in the self-test, an incorrect fft length in the threads bench and a bad boundary case condition in the fft initialization.

@mognuts: I could not reproduce the behaviour your 460 showed, so I don't know whether the problem is fixed or not.

Windows version is not up yet.

ET_ 2013-12-17 15:29

[QUOTE=owftheevil;362287]R53 is up, fixing the sparse <gpu> fft.txt file issue, the uninitialized pointer causing mismatched residues in the self-test, an incorrect fft length in the threads bench and a bad boundary case condition in the fft initialization.

@mognuts: I could not reproduce the behaviour your 460 showed, so I don't know whether the problem is fixed or not.

Windows version is not up yet.[/QUOTE]

You are referring to CUDALucas, not CUDAPm1 issues, aren't you?

Luigi

flashjh 2013-12-17 15:29

r53 is on [URL="https://sourceforge.net/projects/cudalucas/files/2.05%20Beta/"]SourceForge[/URL].

[B].ini file is updated, please re-download.[/B]

Output formatting can now be customized.

Please run the tests in this [URL="http://www.mersenneforum.org/showthread.php?p=360992#post360992"]post[/URL] and continue to post any issues or bugs.

Thanks!

[QUOTE=ET_;362291]You are referring to CUDALucas, not CUDAPm1 issues, aren't you?

Luigi[/QUOTE]
Yes

flashjh 2013-12-17 16:42

Posted Win32 .exe files on SourceForge - first time I've built Win32 with 2.05 Beta, please test accordingly.

flashjh 2013-12-18 02:10

[URL="http://www.mersenneforum.org/26926727"]Successful[/URL] test of Win32 version of r53

I am now able to build against CUDA 4.0 and up, 64-bit only. If anyone needs a particular version, let me know.

flashjh 2013-12-19 01:42

[URL="https://sourceforge.net/projects/cudalucas/files/2.05%20Beta/"]SourceForge[/URL] updated with the latest commit, currently r55. Minor formatting changes, plus an updated makefile.win file to allow Win32 or x64 compiles with CUDA 4.0 up to 5.5.

If anyone wants help compiling with make or in MSVS, let me know.

Had another successful DC with the Win32 version. With the help of petrw1 I have 23/24 good DCs. The bad one was probably caused by all my stopping/starting while compiling, etc. Nonetheless, that's why we DC.

flashjh 2013-12-21 03:39

1 Attachment(s)
[QUOTE=mognuts;362210]This is the console output:[CODE]<snip>
[/CODE] Followed by a dialogue box containing the following text:

[B]CUDALucas_205Beta_x64_r52.exe has stopped working[/B]
A problem caused the program to stop working correctly. Windows will close the program and notify you if a solution is available.

This happens regardless of the parameters used for -cufftbench.[/QUOTE]
I am running tests to try to trigger the NVIDIA Windows Kernel Mode Driver failure, testing all NVIDIA WHQL driver versions since 296.10. Those results will come later...

@mognuts, I was able to (accidentally) reproduce the results you experienced.

@owftheevil

- Any time I run -cufftbench fft# [B]smallerfft# [/B]1, CUDALucas crashes the way mognuts described
- When I run -cufftbench fft# fft# any#, it skips [U]some[/U] of the fft tests completely

See the attached file for a screenshot and the bench.txt output for the skipped tests. I included the .exe file I'm using for testing. I'm currently on 314.22, but it doesn't seem to matter what driver I use.
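One possible guard against the first case is to validate and normalize the three values before any benchmarking starts. The sketch below is only an illustration of that idea. The helper name is hypothetical, and the reading of the three values as two FFT sizes in K plus a pass count is an assumption based on the bench banner quoted earlier in the thread ("ffts 1K to 2048K, doing 5 passes"), not on the CUDALucas source.

```python
def parse_cufftbench_args(args):
    """Validate the three values given after a -cufftbench style option.

    Hypothetical helper. Assumes the values are two FFT sizes (in K) and a
    pass count. Returns (low, high, passes) with low <= high, so a caller
    never receives a reversed range.
    """
    if len(args) != 3:
        raise ValueError("expected three values: size size passes")
    try:
        a, b, passes = (int(x) for x in args)
    except ValueError:
        raise ValueError("all three values must be integers") from None
    if min(a, b) < 1 or passes < 1:
        raise ValueError("sizes and pass count must be positive")
    return min(a, b), max(a, b), passes


print(parse_cufftbench_args(["2048", "1", "5"]))  # → (1, 2048, 5)
```

Rejecting bad input with a clear message, instead of passing it on to the bench loop, would at least turn a crash into a readable error while the real cause is tracked down.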

owftheevil 2013-12-21 21:53

I'll take a look.

New commit r56 fixes a regression concerning command line input. Try to specify a nonstandard fft like 3150k and you'll see what I'm talking about.
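As a rough illustration of what handling a length like 3150k involves, here is a small sketch that parses the suffix and checks whether the resulting length factors into 2, 3, 5 and 7 only, which are the sizes cuFFT is documented to handle with its optimized code paths. Both function names are hypothetical, and treating "k" as a multiplier of 1024 is an assumption. This is not the CUDALucas parser.

```python
def parse_fft_length(text):
    """Turn a string such as '3150k' into a length in elements.

    Assumes a trailing 'k' or 'K' means a multiplier of 1024.
    """
    t = text.strip().lower()
    if t.endswith("k"):
        return int(t[:-1]) * 1024
    return int(t)


def is_7_smooth(n):
    """Return True if n is positive and has no prime factor larger than 7."""
    if n < 1:
        return False
    for p in (2, 3, 5, 7):
        while n % p == 0:
            n //= p
    return n == 1


n = parse_fft_length("3150k")
print(n, is_7_smooth(n))  # → 3225600 True
```

Here 3150 = 2 * 3^2 * 5^2 * 7, so 3150k has only small prime factors even though it is not a power of two.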

flashjh 2013-12-22 19:44

Windows r56 executables posted to [URL="https://sourceforge.net/projects/cudalucas/files/2.05%20Beta/"]SourceForge[/URL]


All times are UTC. The time now is 23:09.
