mersenneforum.org GPU Computing Cheat Sheet (a.k.a. GPU Computing Guide)

 2012-05-04, 08:16 #1 Brain     Dec 2009 Peine, Germany 331 Posts GPU Computing Cheat Sheet (a.k.a. GPU Computing Guide) Hi, here is the latest version of the PDF known as the "GPU Computing Cheat Sheet". It distills many GPU Computing thread posts onto a single sheet of paper. The current version is 1.04: GIMPS GPU Computing Cheat Sheet latest (pdf); All files. Bye, Brain
 2012-06-07, 20:00 #2 Brain     Dec 2009 Peine, Germany 331 Posts GPU Computing Cheat Sheet Update to v1.01 Changes: mfakto 0.11 integrated. Please report errors or suggestions. Last fiddled with by Brain on 2012-08-05 at 09:54
 2012-08-01, 19:16 #3 Brain     Dec 2009 Peine, Germany 101001011₂ Posts GPU Computing Cheat Sheet Update to v1.02 Quick update. Changes: mfakto 0.12, CUDALucas 2.03. Last fiddled with by Brain on 2012-08-05 at 09:57
 2012-08-01, 22:10 #4 Dubslow Basketry That Evening!     "Bunslow the Bold" Jun 2011 40 In Test=,,[,] the AID can be "N/A" (actually, it can be anything; the AID isn't used anywhere in 2.03). Since sometime after 1.2 but before 2.00, save files should be O(n) in size, where n is the FFT length. A length of 1474560 (1440K, though 2.03 isn't that smart (2.04 is!)) should have a save file a bit under 1.5 MB. As for max FFT: threads is capped at 1024, and max FFT is capped at 64K*threads, i.e. 64M. That assumes, of course, that there is sufficient memory; that's an excellent point. I would add that if a user gets an "over specifications Grid" error (as you once did), the solution is either to increase threads or to decrease the FFT length (again assuming sufficient memory). (That help message is added in 2.04 as well.) Also, thanks for the links to the .dlls. Whenever I feel like cleaning up the SourceForge files page, I'll make use of those. (LaurV was able to provide some of them, but not all.)
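
The caps mentioned in the post above (threads at most 1024, FFT length at most 64K*threads) can be checked with a few lines of arithmetic. This is only a sketch of that arithmetic in Python: the names `MAX_THREADS`, `FFT_PER_THREAD`, `max_fft_length` and `fits` are made up for illustration and are not CUDALucas identifiers, and the idea that exceeding this cap is what triggers the "over specifications Grid" error is an inference from the post, not something confirmed from the source code. Available GPU memory is not modelled.

```python
# Caps as stated in the post; the names are illustrative, not CUDALucas identifiers.
MAX_THREADS = 1024          # stated upper limit for the threads setting
FFT_PER_THREAD = 64 * 1024  # stated cap: 64K FFT elements per thread


def max_fft_length(threads: int) -> int:
    """Largest FFT length allowed for a given threads value (64K * threads)."""
    if not 1 <= threads <= MAX_THREADS:
        raise ValueError(f"threads must be between 1 and {MAX_THREADS}")
    return FFT_PER_THREAD * threads


def fits(fft_length: int, threads: int) -> bool:
    """True if fft_length stays within the cap for this threads value."""
    return fft_length <= max_fft_length(threads)


if __name__ == "__main__":
    print(max_fft_length(1024))   # 64M elements at the maximum threads value
    print(fits(1474560, 256))     # the 1440K length with 256 threads
    print(fits(4 * 1024 * 1024, 32))  # a 4M length with only 32 threads
```

If `fits` returns False, the two remedies named in the post apply: raise threads or lower the FFT length.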
 2012-08-02, 03:38 #6 Dubslow Basketry That Evening!     "Bunslow the Bold" Jun 2011 40
 2012-08-02, 04:13 #7 LaurV Romulan Interpreter     Jun 2011 Thailand 2×5×859 Posts Edited to explain (the two "[edit...]" brackets). Added text and corrected grammar as much as I could spot. I woke up in the wrong posture today... I had better refrain from posting until after the fourth cup of coffee... As for installing cuda and msvs, well, I already did, but besides trying to recompile some of flashjh's older releases, I didn't do much. I can't really find the time for programming at home (at the office there is no way! plenty of little things and Thai minions are pissing me off every minute), and I still have a list of "things to program at home", including that P-1 stuff. I have not written a line of code for it in months, but for that project the story is different: besides scarce time, there is also scarce inspiration/knowledge. I am still playing with P-1 in pari/gp, trying to optimize it (from the theoretical point of view) as much as possible for Mersenne numbers and to make it as parallel as possible, but besides multiplying primes in pairs on different threads there is not much to optimize. I have learned a lot from this, but the magic spark is still missing. That may be a good reason why other (more clever) people haven't implemented P-1 on cuda so far. If a spark comes, I may return to writing "close to the metal" (i.e. cuda) again, but the chances are low for the time being.
 2012-08-02, 04:26 #8 Dubslow Basketry That Evening!     "Bunslow the Bold" Jun 2011 40
2012-08-02, 17:26   #9
Brain

Dec 2009
Peine, Germany

331 Posts
GPU Computing Cheat Sheet Update to v1.02a

Quote:
 Originally Posted by LaurV And by the way, trying not to be totally off-topic, the "B" in the FFT length [edit: in the PDF file] makes no sense and is technically incorrect. Please correct that. The FFT length is not measured in bytes. In fact, each FFT "element" has 8 bytes, and what Dubslow said [edit: about the length of the save files] is therefore wrong: the 1440K FFT size (or 1474560 FFT size, or 1.44M FFT size, but NOT 1440KB FFT, nor 1.44MB FFT, these are WRONG) produces a save file of exactly 11 megabytes, if you do not compress it with gzip or some other compression algorithm (compressing it is a bad idea, because the residue files can then no longer be compared directly in a binary editor/viewer; someone did this in former releases and people, including me, got mad about it).
Ah, a simple typo. Save file size should have suggested that 2M FFT length needs 16 MB disk space. ;-)
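
The sizing rule from the quote (8 bytes per FFT element, no compression) is easy to check by hand. A small sketch in Python, assuming the save file is nothing more than the raw elements; any header the real file may carry is ignored, and the names `BYTES_PER_ELEMENT`, `save_file_bytes` and `to_mib` are illustrative only:

```python
BYTES_PER_ELEMENT = 8  # per the quoted post: each FFT element has 8 bytes


def save_file_bytes(fft_length: int) -> int:
    """Raw, uncompressed size in bytes for a given FFT length (header ignored)."""
    return fft_length * BYTES_PER_ELEMENT


def to_mib(size_in_bytes: int) -> float:
    """Convert bytes to binary megabytes (1 MiB = 1024 * 1024 bytes)."""
    return size_in_bytes / (1024 * 1024)


if __name__ == "__main__":
    for label, fft_length in (("1440K", 1474560), ("2M", 2 * 1024 * 1024)):
        size = save_file_bytes(fft_length)
        print(f"{label} FFT: {size} bytes = {to_mib(size)} MiB")
```

Under these assumptions the 2M FFT length comes out at 16 MiB, matching the figure above, and the 1440K length at 11.25 MiB of raw data, which is in line with the roughly 11 megabytes in the quote.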

I assume v1.03 will come out soon because of the upcoming mfaktc/o kernels and CL 2.04.

File now here.

Last fiddled with by Brain on 2012-08-05 at 09:58

2012-08-02, 17:31   #10
Brain

Dec 2009
Peine, Germany

513₈ Posts

Quote:
 Originally Posted by Dubslow Some notes on CUDALucas 2.03: In many places, examples and instructions are for 2.01, not 2.03. All command line options are the same (except for -i, which prints device info); however, the work file requires pseudo-GIMPS format. Code: Test= Test=,,[,] The AID can be "N/A" (actually, it can be anything; the AID isn't used anywhere in 2.03). Also, thanks for the links to the .dlls. Whenever I feel like cleaning up the SourceForge files page, I'll make use of those. (LaurV was able to provide some of them, but not all.)
I will mention the new file format in the next release.

The web space I use is very limited. I will have to remove older versions to make room for new CUDA dlls. I think they have a nice home over there at SourceForge.

 2012-08-02, 17:35 #11 Dubslow Basketry That Evening!     "Bunslow the Bold" Jun 2011 40


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.