CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

kriesel 2017-04-02 07:49

most builds are missing the bad-intermediate-results patch
 
[CODE]2487: 2016/03/02
Correction made. It now aborts the test when a premature 0 residue occurs. It will take a few days to get sourceforge updated. (Extremely unreliable and slow internet connection.)
Thank you owftheevil, let us know when the new build is ready for download :-)
[/CODE]

Judging by file dates, this change apparently made it only into the Windows x64 CUDA 8.0 build on sourceforge (no other Windows builds, and no Linux builds). See the listings below.

[CODE]
Directory of W:\sources\mersennes\cudalucas\CUDALucas.2.05.1-CUDA4.2-CUDA8.0-linux
02/11/2015 05:27 PM 425,256 CUDALucas-2.05.1-CUDA4.2-linux-x86_64
02/11/2015 05:28 PM 478,576 CUDALucas-2.05.1-CUDA5.0-linux-x86_64
02/11/2015 05:38 PM 609,904 CUDALucas-2.05.1-CUDA5.5-linux-x86_64
02/11/2015 05:34 PM 749,040 CUDALucas-2.05.1-CUDA6.0-linux-x86_64
02/11/2015 05:36 PM 753,136 CUDALucas-2.05.1-CUDA6.5-linux-x86_64

Directory of W:\sources\mersennes\cudalucas\CUDALucas.2.05.1-CUDA4.2-CUDA8.0-Windows-32.64
10/01/2016 10:30 PM 76 cuda8.0win32notpossible.txt
02/10/2015 09:25 AM 491,520 CUDALucas2.05.1-CUDA4.2-Windows-WIN32.exe
02/10/2015 09:30 AM 553,984 CUDALucas2.05.1-CUDA4.2-Windows-x64.exe
02/10/2015 09:26 AM 549,888 CUDALucas2.05.1-CUDA5.0-Windows-WIN32.exe
02/10/2015 09:29 AM 606,208 CUDALucas2.05.1-CUDA5.0-Windows-x64.exe
02/10/2015 09:33 AM 730,112 CUDALucas2.05.1-CUDA5.5-Windows-WIN32.exe
02/10/2015 09:38 AM 756,224 CUDALucas2.05.1-CUDA5.5-Windows-x64.exe
02/10/2015 09:35 AM 963,584 CUDALucas2.05.1-CUDA6.0-Windows-WIN32.exe
02/10/2015 09:40 AM 1,014,784 CUDALucas2.05.1-CUDA6.0-Windows-x64.exe
02/10/2015 09:37 AM 1,090,560 CUDALucas2.05.1-CUDA6.5-Windows-WIN32.exe
02/10/2015 09:44 AM 1,159,168 CUDALucas2.05.1-CUDA6.5-Windows-x64.exe
10/01/2016 10:20 PM 1,178,624 CUDALucas2.05.1-CUDA8.0-Windows-x64.exe
[/CODE]

There does not appear to be a CUDA 8 version available for Linux.
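For anyone who wants to repeat the file-date check on their own download folder, here is a minimal sketch. It assumes the listing is Windows `dir` output with US-style MM/DD/YYYY dates, and it treats the date of post 2487 (2016/03/02) as the earliest possible build date for the fix. The function names are made up for illustration. A newer file date only shows a build could contain the fix, not that it does.

```python
from datetime import date, datetime

# Date of post 2487 announcing the fix; older builds cannot contain it.
PATCH_DATE = date(2016, 3, 2)

# A few lines copied from the listing above (assumed MM/DD/YYYY).
SAMPLE = r"""
Directory of W:\sources\mersennes\cudalucas\CUDALucas.2.05.1-CUDA4.2-CUDA8.0-Windows-32.64
10/01/2016 10:30 PM 76 cuda8.0win32notpossible.txt
02/10/2015 09:44 AM 1,159,168 CUDALucas2.05.1-CUDA6.5-Windows-x64.exe
10/01/2016 10:20 PM 1,178,624 CUDALucas2.05.1-CUDA8.0-Windows-x64.exe
"""

def parse_dir_line(line):
    """Return (date, size, name) for a file line, or None for anything else."""
    parts = line.split()
    if len(parts) < 5:
        return None  # blank line or "Directory of ..." header
    try:
        stamp = datetime.strptime(" ".join(parts[:3]), "%m/%d/%Y %I:%M %p")
        size = int(parts[3].replace(",", ""))
    except ValueError:
        return None
    return stamp.date(), size, " ".join(parts[4:])

def builds_possibly_patched(listing, cutoff=PATCH_DATE):
    """Names of CUDALucas binaries dated on or after the cutoff."""
    names = []
    for line in listing.splitlines():
        parsed = parse_dir_line(line)
        if parsed is None:
            continue
        file_date, _size, name = parsed
        if name.startswith("CUDALucas") and file_date >= cutoff:
            names.append(name)
    return names

print(builds_possibly_patched(SAMPLE))
```

On the sample lines this leaves only the CUDA8.0 Windows x64 executable, which matches the conclusion above.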

Would someone please update some or all of these to include the bad-intermediate-residue check?

Thanks!

kriesel 2017-04-02 17:30

CUDALucas v2.05.1 readme update draft
 
3 Attachment(s)
Hi,

This began as my editing the readme for my own use as I figured some things out. One thing led to another, so I thought I'd share it and ask for review for correctness and other input. It's a work in progress. Every time I think I've eliminated the misspellings I find another, for example.

I'd appreciate it if those who have modified the code would look it over for accuracy and provide feedback. Line-numbered output of the FC file compare utility is provided for convenience.

Thanks!

kriesel 2017-04-02 17:36

Easiest way to get cufft*.dll, cudart*.dll
 
What's the easiest way? I downloaded the developer's toolkit which contains them, but each separate version level seems to be a separate mammoth download, which is painfully slow on low speed DSL.
(Some are over 1GB.)

Thx!

kriesel 2017-04-02 17:57

benchmarking cards versus CUDA level
 
Hi,

Has anyone run comparative FFT benchmarks across a variety of CUDA levels on the same graphics card, holding other things constant, for older cards such as the GeForce GTX 480 or the Quadro 2000?

Thanks-

kriesel 2017-04-02 18:07

CUDALucas INI file questions
 
Hi,

Please note, all these are requests for clarification, not requests for features or changes. (Many values that might be legal may not be advisable, for performance reasons. That's not what I'm asking about. I'm trying to understand the program limits that would cause values to be rejected, cause errors, or crash the program.)

1. Are all numerical values in the ini file restricted to being integers? (No floats?)

2. Can workfile= include absolute or relative path?

3. Can resultsfile= include absolute or relative path?

4. SaveFolder= appears to specify a subfolder of the current folder. Can it specify an absolute path? Other forms of relative path?

5. Devicenumber: is there a program limit to the maximum number supported, and if so what is it?

6. Erroriterations I assume has a minimum value of 1. Is there a maximum? Is any arbitrary positive integer allowed?

7. Reportiterations I assume has a minimum value of 1. Is there a maximum? Is any arbitrary positive integer allowed?

8. Checkpointiterations I assume has a minimum value of 1. Is there a maximum? Is any arbitrary positive integer allowed?

9. PoliteValue I assume has a minimum value of 1. Is there a maximum? Is any arbitrary positive integer allowed?

10. Will entering floats, e.g. 83.33, instead of integers such as 85 for ErrorReset cause trouble? Or will the value be truncated to 83 and work?

11. Are there other parameters for which clarification of bounds or format would be useful?
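
To make question 10 concrete, here is a small sketch of two ways an integer setting might be read from a text file. Both are assumptions for illustration only; I have not confirmed which, if either, matches what CUDALucas does. The first rejects anything that is not a plain integer. The second imitates C's `atoi`, which reads the leading digits and ignores the rest.

```python
import re

def parse_strict(text):
    """Accept only a plain integer; raise ValueError on anything else."""
    return int(text.strip())

def parse_atoi_style(text):
    """Imitate C atoi: optional sign plus leading digits, rest ignored, 0 if none."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0

for raw in ("85", "83.33"):
    try:
        strict = parse_strict(raw)
    except ValueError:
        strict = "rejected"
    print(raw, "strict:", strict, "atoi-style:", parse_atoi_style(raw))
```

Under the second behavior 83.33 would silently become 83, while under the first it would be refused. Which one applies is exactly what I'm asking.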


(end)

flashjh 2017-04-02 19:13

[QUOTE=kriesel;456016]Would someone please update some or all of these to include the bad-intermediate-residue check?[/QUOTE]

Are you asking for updated Linux binaries?

Where is the post with the bad-intermediate discussion? When was the code updated?

kriesel 2017-04-02 19:15

relative timings vs cuda version
 
1 Attachment(s)
At message 2535: [QUOTE=ATH;444596]I did a -cufftbench 2592 8192 20 on the different versions (only the 64-bit versions), so it does 20 x 50 iterations on each FFT and takes the average.

In most cases 6.5 is fastest, but a few of them have 8.0 as the fastest (on a Titan Black, but this is probably GPU dependent).

CUDA 4.2 was quite a bit slower on all of them, so I left it out.


[CODE] 8.0 6.5 6.0 5.5 5.0
2592 48471289 1.6135 1.6683 1.6897 1.6145 1.6166
2744 51250889 2.0056 1.8606 1.8682 1.9980 1.8714
3136 58404433 2.0937 2.0480 2.0710 2.0201 2.0337
3200 59570449 2.4195 2.3907 2.4056 2.4150 2.4175
3240 60298969 2.4266 2.4388 2.4404 2.4435
3375 62756279 2.5147 2.5301
3888 72075517 2.5348 2.5584 2.4558
4000 74106457 2.4631 2.5498 2.5821 2.4590 2.4639
4096 75846319 2.5115 2.5800 2.5976 2.5200 2.5375
4320 79902611 3.2614 3.2573 3.2760 3.2685
4374 80879779 3.3753 3.2784 3.2924 3.2946 3.3003
4500 83158811 3.3535 3.3845
4536 83809729 3.3810 3.4006
5184 95507747 3.4279 3.3836 3.4208 3.2994 3.3273
5292 97454309 3.9568
5488 100984691 3.8843 3.8345 3.8336 3.9979 3.7484
5600 103000823 4.1630 4.3082 4.3430 4.1882 4.1943
5832 107174381 4.5283 4.3644 4.3842 4.3847 4.3903
6000 110194363 4.5328
6048 111056879 4.5338 4.5138 4.5275
6075 111541967 4.5586
6125 112440191 4.5530 4.5645 4.5780 4.5867
6144 112781477 4.6858
6250 114685037 4.8079 4.6418 4.6620 4.6714 4.6787
6272 115080019 4.6678 4.6820 4.6835 4.6908
6400 117377567 4.9088 4.7531 4.7721 4.7719 4.7809
6480 118813021 4.9438 4.8401 4.8670 4.8669
6561 120266023 5.1164 4.8818 4.9012 4.8968
6750 123654943 5.1792 5.0356 5.0432
6912 126558077 5.1957
7776 142017539 5.2343 5.0001 5.0441 4.8219
8000 146019329 5.2537 5.1762 5.2350 5.0527 5.0997
8192 149447533 5.3838 5.2219 5.2593 5.1617 5.2106[/CODE][/QUOTE]

If ATH would forward the CUDA 4.2 timings, I'd add them.

kriesel 2017-04-02 20:15

references for and timing of 2.05, 2.05.1, bad-intermediate-residues patch, and latest builds
 
[QUOTE=flashjh;456049]Are you asking for updated Linux binaries?

Where is the post with the bad-intermediate discussion? When was the code updated?[/QUOTE]

Hi,

Thanks for the quick reply. I am more interested in updated Windows binaries but would also like the Linux builds made current. I'll have some use for them later. (Of course I realize this is an all-volunteer operation and no one owes me anything. I think it's been an amazingly impressive amount of work by many code authors.)

The relative timing of the patch and updates surfaced for me because of binge-reading the thread in sequence and taking notes along the way. It took days to get through.

The following is a summary of a subset of the posts involved, selected to illustrate adequately what I think the timeline was without flooding you with volume.


post 2286: (page 208)
V2.05 released Feb 8 2015

post 2289: (page 209)
Feb 11 2015
CUDALucas 2.05.1 is posted to sourceforge. An error was discovered in the display output if ReportIterations=100 or 50 or 10. The error only caused the display 'Error' to stay at .25000. Actual results were not affected. I uploaded all windows versions as one file this time. You still need the .ini file.
If anything else is found, let us know.

2441: (prime95) (page 222)
Please add code to exit with an error message if, when you write a save file (or at any other convenient time), you find that the LL residue is zero or two.

2447: (page 223)
(msft:) Please teach me good err message.
(prime95:)
"Illegal residue: 0x0000000000000000. See mersenneforum.org for help."
or
"Illegal residue: 0x0000000000000002. See mersenneforum.org for help."

(I'd be inclined to change the message for the 0 case to explain that it's too early an iteration to indicate a prime. But hey, George is a Jedi master of code and organization, so maybe ignore this parenthetical.)
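
To illustrate what such a check does, here is a minimal Lucas-Lehmer sketch in plain Python. It is not the CUDALucas code and not msft's patch. The exception name and the `fault_at` parameter are made up for this example, and the iteration note appended to the message is my own addition. The idea is that a residue of 0 before the last iteration, or a residue of 2 (which squares back to 2 forever), can only mean something went wrong.

```python
class IllegalResidue(Exception):
    """Raised when the residue is 0 or 2 before the final iteration."""

def lucas_lehmer(p, fault_at=None):
    """Return True if 2**p - 1 is prime, for an odd prime p.

    fault_at is a hypothetical test hook: at that iteration the residue
    is forced to 0 to imitate a hardware or memory error.
    """
    m = (1 << p) - 1
    s = 4
    last = p - 2
    for i in range(1, last + 1):
        s = (s * s - 2) % m
        if i == fault_at:
            s = 0  # simulated fault, not part of the real algorithm
        # A real program would look at the low 64 bits; here s is small.
        if i < last and s in (0, 2):
            raise IllegalResidue(
                f"Illegal residue: 0x{s:016x}. See mersenneforum.org for help. "
                f"(iteration {i} of {last})"
            )
    return s == 0

print(lucas_lehmer(7))   # 2**7 - 1 = 127
print(lucas_lehmer(11))  # 2**11 - 1 = 2047 = 23 * 89
try:
    lucas_lehmer(13, fault_at=5)
except IllegalResidue as err:
    print(err)
```

Without the check, the faulted run would settle at residue 2 and report a composite result with nothing to show the test had gone bad.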

2454: (page 224)
LaurV's good post, omitted here for length

2464: (page 224)
msft wrote and shared bad-intermediate-residue check code for review (Feb 9 2016)

2487: Mar 2 2016 (page 227)
Owftheevil quoted: Correction made. It now aborts the test when a premature 0 residue occurs. It will take a few days to get sourceforge updated. (Extremely unreliable and slow internet connection.)
ET: Thank you owftheevil, let us know when the new build is ready for download :-)

(I found no indication of a version number change applied for the intermediate-residue checking change. Files contain 2.05.1 in their names.)

(end)

kriesel 2017-04-03 00:44

Found a link to the library files cufft*.dll, cudart*.dll on another thread.

[url]https://sourceforge.net/projects/cudalucas/files/CUDA%20Libs/[/url]

flashjh 2017-04-05 01:20

Hello all,

From what I can see the updates from post [URL="http://mersenneforum.org/showpost.php?p=425734&postcount=2464"]2464[/URL] were not incorporated on sourceforge, so they're not in any of the code now.

It's been over a year since all that discussion went on. Are there still issues with residues?

Either way, I have the code updated, along with some miscellaneous changes. The biggest change is that CUDA 8 supports neither 32-bit builds of CUDALucas nor compute capability below 2.0.

What versions is everyone using now? I don't mind getting set up to compile versions <8, but I don't want to do it if no one is using them anymore.

So, let me know what architecture everyone needs and I'll make it happen :smile:

LaurV 2017-04-05 13:37

Does it get any faster? Otherwise we are happy with the current version we use, and we don't want to fix it (i.e. upgrade) as long as it works... :razz:

