mersenneforum.org  

Old 2012-06-10, 02:05   #1420
LaurV
Romulan Interpreter
Jun 2011
Thailand
22665₈ Posts

Quote:
Originally Posted by Dubslow
What's sfv?

Google is your friend :D (in this case Wikipedia)

edit: cuffts are on the way. What's the drop limit? (The zip file is 33 megs and I wonder if I should compress it harder or split it in two. I already tried two times, and yer friend's site says I/O error and crashes at about half. Is it my net, bad as usual on a Sunday morning, or does the site have a limit of 15 megs?)

edit 2: It went through. Here is the link. The .sfv is attached to this post. BTW, yer linux should have this built in by default, no need for any tools like Total Commander, which I am using in windoze. Ye should know it... WinSFV is very convenient: just double-click on any sfv file and it tells you if any of the files checksummed inside was changed. Of course you may need to rename it back (delete the .txt extension, which I added for forum reasons).
Attached Files
File Type: txt cuffts.sfv.txt (181 Bytes, 574 views)

Last fiddled with by LaurV on 2012-06-10 at 02:17
Old 2012-06-10, 02:12   #1421
Dubslow
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts

Theoretically it should handle it, but with files that large you never know. Email it to me.

Edit: Wow, I'm amazed. I managed to DL the file in 2 seconds flat. FS must have a server nearby to me.

(Edit2: No, it wasn't installed by default. MD5 is much more common. Addendum:
Quote:
Originally Posted by man cksfv
cksfv is a tool for verifying CRC32 checksums of files. CRC32 checksums are used to verify that files are not corrupted. The algorithm is cryptographically crippled so it can not be used for security purposes. md5sum (1) or sha1sum (1) are much better tools for checksuming files. cksfv should only be used for compatibility with other systems.
)
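
For anyone curious what an .sfv entry actually checks: each line of the file is a file name followed by that file's CRC-32 written as eight hex digits. Below is a minimal sketch of that checksum in C, assuming the usual zlib/PKZIP CRC-32 (reflected polynomial 0xEDB88320). The function names crc32_update and crc32_file are made up for illustration and are not taken from cksfv or WinSFV. As the man page says, this detects corruption, not tampering.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
   Table-free and slow, but short enough to read in one go.
   Pass 0 as crc for the first chunk, then feed the previous
   return value back in for each following chunk. */
uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, else zero */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}

/* CRC-32 of a whole file. Returns 0 on success, -1 on I/O error. */
int crc32_file(const char *path, uint32_t *out)
{
    unsigned char buf[65536];
    size_t n;
    uint32_t crc = 0;
    int failed;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        crc = crc32_update(crc, buf, n);
    failed = ferror(fp);
    fclose(fp);
    if (failed)
        return -1;
    *out = crc;
    return 0;
}
```

To check one entry by hand, call crc32_file() on the named file and compare the result, printed with "%08X", against the hex value on that line of the .sfv.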

Last fiddled with by Dubslow on 2012-06-10 at 02:25
Old 2012-06-10, 03:12   #1422
Dubslow

LaurV, for printing total time, does that need to appear in the results file, or just on the screen?

Edit: I've found a bug in 2.03; however, it will only manifest itself if you screw up the formatting of the SaveFolder, ResultsFile, or WorkFile options in the ini file.

PS @LaurV:
Code:
bill@Gravemind:~/CUDALucas∰∂ ./new.CUDALucas 132049 -f 0

Warning: Couldn't parse ini file option ResultsFile; using default "results.txt"
Starting M132049 fft length = 7168
Iteration 30000 M( 132049 )C, 0xbcd4392925c8b6c9, n = 7K, CUDALucas v2.04 Alpha err = 0.0103 (0:03 real, 0.0892 ms/iter, ETA 0:08)
^C
SIGINT caught. Writing checkpoint.

bill@Gravemind:~/CUDALucas∰∂ ./new.CUDALucas 132049 -f 0

Warning: Couldn't parse ini file option ResultsFile; using default "results.txt"
Continuing work from a partial result of M132049 fft length = 7168 iteration = 40129
Iteration 60000 M( 132049 )C, 0x1a3c4b80c267f04f, n = 7K, CUDALucas v2.04 Alpha err = 0.0097 (0:02 real, 0.0578 ms/iter, ETA 0:03)
Iteration 90000 M( 132049 )C, 0x28ecbb0541f5ec16, n = 7K, CUDALucas v2.04 Alpha err = 0.0097 (0:03 real, 0.0873 ms/iter, ETA 0:02)
Iteration 120000 M( 132049 )C, 0x816902f6d3a9764a, n = 7K, CUDALucas v2.04 Alpha err = 0.0103 (0:02 real, 0.0872 ms/iter, ETA 0:00)
M( 132049 )P, n = 7K, CUDALucas v2.04 Alpha. Estimated total time: 0:11

bill@Gravemind:~/CUDALucas∰∂

The best part is that it can still read the old checkpoints, realize it's reading old checkpoints and then not print the time.

Last fiddled with by Dubslow on 2012-06-10 at 03:59
Old 2012-06-11, 20:12   #1423
flashjh
"Jerry"
Nov 2011
Vancouver, WA
10001100011₂ Posts

Quote:
Originally Posted by Dubslow
After much fighting with SF Beta and SVN, there is now a CUDALucas SourceForge page.

In addition, timeval.c is gone entirely, not to mention various changes to the defines and includes. Any binaries compiled should be identical, but flash should also test and make sure this newer stuff does actually compile.

Does anybody feel like writing a README?

(I've marked it as GPL, but that's certainly open to discussion.)

msft and flash, please report your SF usernames so I can add you. Anyone else is welcome to join as a "Member". (Or Developer if you ask nicely.)

Ok, SVN was a pretty big learning curve

I updated the files per the comments to rev20

@Dubslow, can you recompile and test in Linux? I changed the IniGetStr function and added a custom sprintf_s for MSVS that should only affect writing results.txt files, but I need you to test compile/run again.

Otherwise, everything seems to be working well. After you test compile/run, I'll recompile and post to SourceForge.
Old 2012-06-11, 20:54   #1424
Dubslow

Quote:
Originally Posted by flashjh
Ok, SVN was a pretty big learning curve

Yeah, me too. Check out the comments for r19.

Quote:
Originally Posted by flashjh
I updated the files per the comments to rev20

@Dubslow, can you recompile and test in Linux? I changed the IniGetStr function and added a custom sprintf_s for MSVS that should only affect writing results.txt files, but I need you to test compile/run again.

Otherwise, everything seems to be working well. After you test compile/run, I'll recompile and post to SourceForge.

That was indeed the bug I was referring to. I'll test it, but the code looks good. (spritf?)

I wasn't intending to post executables to SF until it moves to at least Beta. I'm only about halfway through the changes, not done yet. (Among other things, results file locking isn't implemented yet.) If you want, feel free to post executables here, but I do warn everyone, it's still in Alpha.

In particular, flash, you can compile that "test" version I mentioned; use /DTEST, as defined by the makefile rule "test". I was waiting to ask you to build it until I had written the Python I need locally to interpret the results, but you can build it now if you want.
Old 2012-06-11, 21:56   #1425
Dubslow

Edit (I wish): Yes, it does compile fine.

In other news, I suddenly can't commit to svn/tags.
Code:
bill@Gravemind:~/CUDALucas∰∂ svn commit --username=dubslow tags/v2.03-final/
svn: Commit failed (details follow):
svn: Server sent unexpected return value (403 Forbidden) in response to MKACTIVITY request for '/p/cudalucas/code/!svn/act/2c8fdab0-c8b5-425a-b904-ad76555a2b37'
svn: Your commit message was left in a temporary file:
svn:    '/home/bill/CUDALucas/tags/svn-commit.tmp'
bill@Gravemind:~/CUDALucas∰∂ svn commit --username dubslow tags/v2.03-final/
svn: Commit failed (details follow):
svn: Server sent unexpected return value (403 Forbidden) in response to MKACTIVITY request for '/p/cudalucas/code/!svn/act/56dab32b-3b9e-4bfe-877f-0520584bae98'
svn: Your commit message was left in a temporary file:
svn:    '/home/bill/CUDALucas/tags/svn-commit.2.tmp'
bill@Gravemind:~/CUDALucas∰∂
It didn't even give me a chance to enter my password.
Old 2012-06-12, 01:53   #1426
Dubslow

Quote:
Originally Posted by Dubslow
I'm only about halfway through the changes, not done yet

I got off my butt and coded some of them. A silly bug took 20 minutes of my time, but the results are pretty.

Code:
bill@Gravemind:~/CUDALucas/test∰∂ cat worktodo.txt 
DoubleCheck=N/A,216091,24K,1,69
Test=12K,N/A,69,216091
Test=86243,4K
DoubleCheck=CA9CAECD26710FC828DFBBB8________,26458577,69,1
bill@Gravemind:~/CUDALucas/test∰∂ ./new.CUDALucas 

Starting M216091 fft length = 24K
Iteration 10000 M( 216091 )C, 0x30247786758b8792, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1734 ms/iter, ETA 0:34)
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1752 ms/iter, ETA 0:33)
Iteration 30000 M( 216091 )C, 0x540772c2abb7833a, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1708 ms/iter, ETA 0:30)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1750 ms/iter, ETA 0:29)
Iteration 50000 M( 216091 )C, 0x95ce3ff44abdd1e5, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1723 ms/iter, ETA 0:27)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1713 ms/iter, ETA 0:25)
Iteration 70000 M( 216091 )C, 0x505d249be3145893, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1716 ms/iter, ETA 0:24)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1703 ms/iter, ETA 0:22)
Iteration 90000 M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1747 ms/iter, ETA 0:20)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1702 ms/iter, ETA 0:18)
Iteration 110000 M( 216091 )C, 0x10aa3286c0b03369, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1694 ms/iter, ETA 0:16)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1718 ms/iter, ETA 0:15)
Iteration 130000 M( 216091 )C, 0x80438af231f8fccd, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1734 ms/iter, ETA 0:13)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1744 ms/iter, ETA 0:12)
Iteration 150000 M( 216091 )C, 0x1b73cb121df7d6fa, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1715 ms/iter, ETA 0:10)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1710 ms/iter, ETA 0:08)
Iteration 170000 M( 216091 )C, 0x04055d84a77be1d8, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1709 ms/iter, ETA 0:06)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1711 ms/iter, ETA 0:05)
Iteration 190000 M( 216091 )C, 0x54b2a8b9cb149f9f, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1713 ms/iter, ETA 0:03)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1708 ms/iter, ETA 0:01)
Iteration 210000 M( 216091 )C, 0xcfe091c8f59f8a7b, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1700 ms/iter, ETA 0:00)
M( 216091 )P, n = 24K, CUDALucas v2.04 Alpha. Estimated total time: 0:38

Starting M216091 fft length = 12K
Iteration 10000 M( 216091 )C, 0x30247786758b8792, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1511 ms/iter, ETA 0:30)
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1476 ms/iter, ETA 0:28)
Iteration 30000 M( 216091 )C, 0x540772c2abb7833a, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1479 ms/iter, ETA 0:26)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1494 ms/iter, ETA 0:25)
Iteration 50000 M( 216091 )C, 0x95ce3ff44abdd1e5, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1464 ms/iter, ETA 0:23)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1490 ms/iter, ETA 0:22)
Iteration 70000 M( 216091 )C, 0x505d249be3145893, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1551 ms/iter, ETA 0:21)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1506 ms/iter, ETA 0:19)
Iteration 90000 M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1480 ms/iter, ETA 0:17)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1499 ms/iter, ETA 0:16)
Iteration 110000 M( 216091 )C, 0x10aa3286c0b03369, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1463 ms/iter, ETA 0:14)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1481 ms/iter, ETA 0:13)
Iteration 130000 M( 216091 )C, 0x80438af231f8fccd, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1458 ms/iter, ETA 0:11)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1460 ms/iter, ETA 0:10)
Iteration 150000 M( 216091 )C, 0x1b73cb121df7d6fa, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1463 ms/iter, ETA 0:08)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1469 ms/iter, ETA 0:07)
Iteration 170000 M( 216091 )C, 0x04055d84a77be1d8, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1470 ms/iter, ETA 0:05)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1449 ms/iter, ETA 0:04)
Iteration 190000 M( 216091 )C, 0x54b2a8b9cb149f9f, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1469 ms/iter, ETA 0:02)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1482 ms/iter, ETA 0:01)
Iteration 210000 M( 216091 )C, 0xcfe091c8f59f8a7b, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1485 ms/iter, ETA 0:00)
M( 216091 )P, n = 12K, CUDALucas v2.04 Alpha. Estimated total time: 0:32

Starting M86243 fft length = 4K
Iteration 10000 M( 86243 )C, 0x26d11035920b3773, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1160 ms/iter, ETA 0:08)
Iteration 20000 M( 86243 )C, 0x233a5255467a4c6e, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1137 ms/iter, ETA 0:06)
Iteration 30000 M( 86243 )C, 0x88e3195a12367bb8, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1141 ms/iter, ETA 0:05)
Iteration 40000 M( 86243 )C, 0x70b63ef639328851, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1153 ms/iter, ETA 0:04)
Iteration 50000 M( 86243 )C, 0x0ff1f54cfeeb4909, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1136 ms/iter, ETA 0:03)
Iteration 60000 M( 86243 )C, 0x25a4a96c66e7f897, n = 4K, CUDALucas v2.04 Alpha err = 0.2812 (0:02 real, 0.1183 ms/iter, ETA 0:02)
Iteration 70000 M( 86243 )C, 0xb639453c818baba2, n = 4K, CUDALucas v2.04 Alpha err = 0.2812 (0:01 real, 0.1139 ms/iter, ETA 0:01)
Iteration 80000 M( 86243 )C, 0xdd477c413184da18, n = 4K, CUDALucas v2.04 Alpha err = 0.2812 (0:01 real, 0.1089 ms/iter, ETA 0:00)
M( 86243 )C, 0x2de7056ebffee28b, n = 4K, CUDALucas v2.04 Alpha. Estimated total time: 0:09

Starting M26458577 fft length = 1536K
^C
SIGINT caught. Writing checkpoint.

bill@Gravemind:~/CUDALucas/test∰∂
(It can't actually handle underscores, but I edited the output for obvious reasons.)


PS Would any code gurus be willing to examine parse_worktodo_line() starting from line 317 and check for any stupids?
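
For readers without the repository at hand, here is a sketch of order-insensitive parsing that reproduces the four results in the run above. It is not the parse_worktodo_line() from the source tree; the rules are inferred from the output only and are assumptions: a token of digits ending in "K" is an FFT length in units of 1024, a token of exactly 32 hex digits is the assignment key, the largest plain integer is the exponent, and anything else (such as "N/A") is skipped. The names struct work_item and parse_worktodo_sketch are hypothetical, and the key used in the test below is a dummy.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; field names are not taken from CUDALucas. */
struct work_item {
    long exponent;    /* largest plain integer on the line, 0 if none */
    long fft_length;  /* in elements; 0 means "let the program choose" */
    char key[33];     /* 32 hex digits, or an empty string if absent */
};

/* Nonzero if s[0..n-1] is non-empty and every character satisfies pred. */
static int all_chars(const char *s, size_t n, int (*pred)(int))
{
    for (size_t i = 0; i < n; i++)
        if (!pred((unsigned char)s[i]))
            return 0;
    return n > 0;
}

/* Returns 0 on success, -1 if the line is not a Test=/DoubleCheck=
   line, is too long, or contains no exponent. */
int parse_worktodo_sketch(const char *line, struct work_item *out)
{
    char copy[256];
    const char *eq = strchr(line, '=');
    size_t kind_len;

    memset(out, 0, sizeof *out);
    if (eq == NULL)
        return -1;
    kind_len = (size_t)(eq - line);
    if (!((kind_len == 4 && strncmp(line, "Test", 4) == 0) ||
          (kind_len == 11 && strncmp(line, "DoubleCheck", 11) == 0)))
        return -1;
    if (strlen(eq + 1) >= sizeof copy)
        return -1;
    strcpy(copy, eq + 1);
    copy[strcspn(copy, "\r\n")] = '\0';   /* drop a trailing newline */

    for (char *tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ",")) {
        size_t len = strlen(tok);

        if (len == 32 && all_chars(tok, len, isxdigit)) {
            strcpy(out->key, tok);                         /* assignment key */
        } else if (len > 1 && tok[len - 1] == 'K' &&
                   all_chars(tok, len - 1, isdigit)) {
            out->fft_length = strtol(tok, NULL, 10) * 1024; /* e.g. 24K */
        } else if (all_chars(tok, len, isdigit)) {
            long v = strtol(tok, NULL, 10);
            if (v > out->exponent)     /* assumption: exponent is the largest */
                out->exponent = v;
        }
        /* any other token, such as "N/A", is ignored */
    }
    return out->exponent > 0 ? 0 : -1;
}
```

The "largest integer wins" rule is the weakest assumption: it works here because the other numbers on these lines (69 and 1) are small, but a real parser would want a stricter way to tell the fields apart.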

Last fiddled with by Dubslow on 2012-06-12 at 02:01
Old 2012-06-17, 05:31   #1427
Dubslow
Reproducible error in cufftbench

@msft: I've stumbled across this seemingly reproducible error in cufftbench().
Code:
bill@Gravemind:~/CUDALucas∰∂ CUDALucas -threads 128 -cufftbench 5881856 5914624 64

CUFFT bench start = 5881856 end = 5914624 distance = 64
CUFFT_Z2Z size= 5881856 time= 986.398254 msec
CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED
CUFFT_INVALID_TYPE
CUFFT_INVALID_VALUE
CUFFT_INTERNAL_ERROR
CUFFT_EXEC_FAILED
CUFFT_SETUP_FAILED
CUFFT_INVALID_SIZE
CUFFT_UNALIGNED_DATA
CUFFT Unknown error code
bill@Gravemind:~/CUDALucas∰∂ CUDALucas -cufftbench 5881856 5914624 64

CUFFT bench start = 5881856 end = 5914624 distance = 64
CUFFT_Z2Z size= 5881856 time= 986.098572 msec
CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED
CUFFT_INVALID_TYPE
CUFFT_INVALID_VALUE
CUFFT_INTERNAL_ERROR
CUFFT_EXEC_FAILED
CUFFT_SETUP_FAILED
CUFFT_INVALID_SIZE
CUFFT_UNALIGNED_DATA
CUFFT Unknown error code
bill@Gravemind:~/CUDALucas∰∂
(This is with v2.03, although it also occurs in v2.04_test. In the latter case, it continued to test more lengths, but it did stop before it was supposed to.)



Also, as I previously reported, cufftbench() still uses 1-2 full cores. Is that a bug or the nature of the function?
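
A side note on the error text in the logs above: after "CUFFT error 2: CUFFT_ALLOC_FAILED", every later error name is printed as well, all the way down to "CUFFT Unknown error code". That is what a switch statement with no break (or return) after each case would print. The cufftSafeCall() source isn't shown in this thread, so this is a guess from the output alone. A minimal sketch of a lookup that cannot fall through, using the cufftResult values 0 through 9 as plain integers so that it compiles without cufft.h; the function name cufft_result_name is made up for illustration:

```c
/* Map a cufftResult value to its name. Every case returns, so only
   one name can come back for a given code. */
const char *cufft_result_name(int code)
{
    switch (code) {
    case 0:  return "CUFFT_SUCCESS";
    case 1:  return "CUFFT_INVALID_PLAN";
    case 2:  return "CUFFT_ALLOC_FAILED";
    case 3:  return "CUFFT_INVALID_TYPE";
    case 4:  return "CUFFT_INVALID_VALUE";
    case 5:  return "CUFFT_INTERNAL_ERROR";
    case 6:  return "CUFFT_EXEC_FAILED";
    case 7:  return "CUFFT_SETUP_FAILED";
    case 8:  return "CUFFT_INVALID_SIZE";
    case 9:  return "CUFFT_UNALIGNED_DATA";
    default: return "CUFFT Unknown error code";
    }
}
```

With something like this, the message for error 2 would be the single line CUFFT_ALLOC_FAILED, which makes the log a lot easier to read.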
Old 2012-06-18, 05:06   #1428
Dubslow
More data on cufft crash

Code:
bill@Gravemind:~/CUDALucas∰∂ CUDALucas -cufftbench $((256*128)) $((65535*128)) $((256*128))

CUFFT bench start = 32768 end = 8388480 distance = 32768
<good output snipped>
CUFFT_Z2Z size= 6815744 time= 17.126163 msec
CUFFT_Z2Z size= 6848512 time= 21.510880 msec
CUFFT_Z2Z size= 6881280 time= 13.638905 msec
CUFFT_Z2Z size= 6914048 time= 699.387634 msec
CUFFT_Z2Z size= 6946816 time= 22.775032 msec
CUFFT_Z2Z size= 6979584 time= 30.465769 msec
CUFFT_Z2Z size= 7012352 time= 37.825619 msec
CUFFT_Z2Z size= 7045120 time= 20.284300 msec
CUFFT_Z2Z size= 7077888 time= 12.884492 msec
CUFFT_Z2Z size= 7110656 time= 18.780321 msec
CUFFT_Z2Z size= 7143424 time= 39.204491 msec
CUFFT_Z2Z size= 7176192 time= 31.473606 msec
CUFFT_Z2Z size= 7208960 time= 18.138344 msec
CUFFT_Z2Z size= 7241728 time= 23.035593 msec
CUFFT_Z2Z size= 7274496 time= 22.267868 msec
CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED
CUFFT_INVALID_TYPE
CUFFT_INVALID_VALUE
CUFFT_INTERNAL_ERROR
CUFFT_EXEC_FAILED
CUFFT_SETUP_FAILED
CUFFT_INVALID_SIZE
CUFFT_UNALIGNED_DATA
CUFFT Unknown error code
It's a different size this time.

Last fiddled with by Dubslow on 2012-06-18 at 05:06
Old 2012-06-19, 00:28   #1429
msft
Jul 2009
Tokyo
2×5×61 Posts

Hi, Dubslow
I believe you can read source code.
Old 2012-06-19, 05:06   #1430
Dubslow

Quote:
Originally Posted by msft
Hi, Dubslow
I believe you can read source code.

Yes I can, but I don't have the first clue about CUDA in general or cufft in particular. Just in case, I did look through it and I see the line that's causing the issue, but I have no idea what's wrong or how to fix it.

Code:
void
cufftbench (int cufftbench_s, int cufftbench_e, int cufftbench_d)
{
  cudaEvent_t start, stop;
  double *x;
  float outerTime;
  int i, j;
  printf ("CUFFT bench start = %d end = %d distance = %d\n", cufftbench_s,
	  cufftbench_e, cufftbench_d);

  cutilSafeCall (cudaMalloc ((void **) &g_x, sizeof (double) * cufftbench_e));
  /* cufftbench_e + 1 elements: the loop below also writes x[cufftbench_e] */
  x = ((double *) malloc (sizeof (double) * (cufftbench_e + 1)));
  for (i = 0; i <= cufftbench_e; i++)
    x[i] = 0;
  cutilSafeCall (cudaMemcpy
		 (g_x, x, sizeof (double) * cufftbench_e,
		  cudaMemcpyHostToDevice));
  cutilSafeCall (cudaEventCreate (&start));
  cutilSafeCall (cudaEventCreate (&stop));
  for (j = cufftbench_s; j <= cufftbench_e; j += cufftbench_d)
    {
      cufftSafeCall (cufftPlan1d (&plan, j / 2, CUFFT_Z2Z, 1));
      cufftSafeCall (cufftExecZ2Z
		     (plan, (cufftDoubleComplex *) g_x,
		      (cufftDoubleComplex *) g_x, CUFFT_INVERSE));
      cutilSafeCall (cudaEventRecord (start, 0));
      for (i = 0; i < 100; i++)
	cufftSafeCall (cufftExecZ2Z
		       (plan, (cufftDoubleComplex *) g_x,
			(cufftDoubleComplex *) g_x, CUFFT_INVERSE));
      cutilSafeCall (cudaEventRecord (stop, 0));
      cutilSafeCall (cudaEventSynchronize (stop));
      cutilSafeCall (cudaEventElapsedTime (&outerTime, start, stop));
      printf ("CUFFT_Z2Z size= %d time= %f msec\n", j, outerTime / 100);
      cufftSafeCall (cufftDestroy (plan));
    }
  cutilSafeCall (cudaFree ((char *) g_x));
  cutilSafeCall (cudaEventDestroy (start));
  cutilSafeCall (cudaEventDestroy (stop));
  free ((char *) x);
}
The cufftPlan1d() line is the one that's barfing. (I do recognize that it's the line that sets up the FFT, and it's the next line and inner loop that actually execute the FFT.)
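
One pattern in the timings posted in this thread: the outliers line up with plan lengths (j / 2) that have a large prime factor. 5881856 / 2 = 2^13 · 359 took 986 ms and 6914048 / 2 = 2^14 · 211 took 699 ms, while 6881280 / 2 = 2^15 · 3 · 5 · 7 took 13.6 ms. If the failing size is the one right after the last line printed, the failures were at 5881920 / 2 (largest prime factor 557) and 7307264 / 2 = 2^14 · 223. NVIDIA's cuFFT documentation recommends lengths of the form 2^a · 3^b · 5^c · 7^d. Whether large prime factors are the actual cause of CUFFT_ALLOC_FAILED is not established here. The helpers below are only a sketch for filtering benchmark lengths, and their names are made up for illustration.

```c
/* Largest prime factor of n by trial division.
   Returns n itself when n < 2. */
long largest_prime_factor(long n)
{
    long largest = 1;

    if (n < 2)
        return n;
    for (long p = 2; p * p <= n; p++) {
        while (n % p == 0) {
            largest = p;
            n /= p;
        }
    }
    if (n > 1)          /* whatever is left is a prime larger than any p tried */
        largest = n;
    return largest;
}

/* Nonzero if n has no prime factor larger than 7. */
int is_7_smooth(long n)
{
    return n >= 1 && largest_prime_factor(n) <= 7;
}
```

In the benchmark loop this would amount to skipping any j for which is_7_smooth(j / 2) is false before calling cufftPlan1d(), which would at least show whether the crash follows the awkward lengths.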

Last fiddled with by Dubslow on 2012-06-19 at 05:08
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.