#1739 | |
|
Jun 2005
3·43 Posts |
Quote:
|
|
|
|
|
|
|
#1740 |
|
"Jerry"
Nov 2011
Vancouver, WA
1,123 Posts |
I'll try tonight.
|
|
|
|
|
|
#1741 |
|
Jul 2003
So Cal
2·3⁴·13 Posts |
Regarding FFT timings, I have brief access to a Tesla K20m. I ran a benchmark starting at 1440k going up in 16k increments, and here are the results. As usual, lengths slower than longer FFTs have been deleted. I'm surprised how relatively short this table is.
Code:
FFT length (k), ms/iteration
1568 0.508496
1600 0.596174
2000 0.645019
2048 0.655126
2592 0.820283
3136 1.123238
4000 1.256788
4096 1.304601
4320 1.804463
4608 1.876166
4704 1.910958
5120 2.120896
5488 2.136009
5600 2.270577
6000 2.438436
6048 2.448022
6144 2.480506
6272 2.526666
7776 2.620803
Code:
./CUDALucas -k 57885161
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3200K, CUDALucas v2.04 Beta err = 0.1328 (0:41 real, 4.0933 ms/iter, ETA 65:47:57)
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3200K, CUDALucas v2.04 Beta err = 0.1328 (0:41 real, 4.0613 ms/iter, ETA 65:16:29)
./CUDALucas -f 4000k -k 57885161
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 4000K, CUDALucas v2.04 Beta err = 0.0009 (0:42 real, 4.2419 ms/iter, ETA 68:11:21)
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 4000K, CUDALucas v2.04 Beta err = 0.0010 (0:42 real, 4.2032 ms/iter, ETA 67:33:15)
Last fiddled with by frmky on 2013-03-01 at 21:56 |
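For anyone who wants to reproduce the "delete lengths slower than longer FFTs" step on their own benchmark output, here is a minimal Python sketch. The function name `prune_dominated` and the sample numbers are made up for illustration; they are not CUDALucas code or real measurements.

```python
def prune_dominated(timings):
    """Keep only FFT lengths that are strictly faster than every longer length.

    timings: iterable of (length_in_k, ms_per_iteration) pairs.
    """
    kept = []
    best_longer = float("inf")  # fastest time seen among longer lengths so far
    # Walk from the longest length down to the shortest.
    for length, ms in sorted(timings, reverse=True):
        if ms < best_longer:
            kept.append((length, ms))
            best_longer = ms
    kept.reverse()  # back to ascending length order
    return kept


# Illustrative values only (length in k, ms/iteration); not benchmark data.
raw = [(1568, 0.51), (1584, 0.62), (1600, 0.60), (1616, 0.70), (2000, 0.65)]
pruned = prune_dominated(raw)
print(pruned)
```

In this made-up sample, 1584k and 1616k drop out because a longer length (1600k and 2000k respectively) runs faster.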
|
|
|
|
|
#1742 |
|
Banned
"Luigi"
Aug 2002
Team Italia
12CF₁₆ Posts |
Just to be sure...
I'm working fine with CUDALucas v2.01 under 64-bit Linux: the program automatically recognizes errors in the FFT computation and rolls back to a safer size, even if it is not the most efficient one.
1 - Is v2.04 available for Linux?
2 - Is it more reliable?
3 - Is it faster?
4 - Does it automagically choose the fastest FFT size?
Thank you for the info.
Luigi |
|
|
|
|
|
#1743 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
2.03 is stable, 2.04 Beta works (but was never pushed out of beta), 2.05 is under development, and all of them work on GNU/Linux.
None are safer or more reliable; the main differences from 2.01 to 2.04 are interface features (e.g. worktodo in version 2.03, other good stuff in 2.04). They all automagically choose a *roughly* good FFT size, but manual futzing can usually get an extra 5-10% performance boost. |
|
|
|
|
|
#1744 | |
|
Banned
"Luigi"
Aug 2002
Team Italia
3²×5×107 Posts |
Quote:
Luigi |
|
|
|
|
|
|
#1745 | ||
|
Dec 2009
Peine, Germany
101001011₂ Posts |
Quote:
Quote:
Thoughts:
- measure FFT length runtimes and adapt the formula to match ms/iter
- theoretical Gflops throughput as given by Nvidia for radix-2 to radix-7
- theoretical model with weighted scores for radix-2 to radix-7 and a penalty for more distinct prime factors used <-- my try, but the scores must be analysed
- ...
Suggestions? Or has this already been invented? |
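The third idea (weighted radix scores plus a penalty for distinct prime factors) could be prototyped roughly like this. Everything here is an assumption for illustration: the weights in `RADIX_COST`, the `MIXING_PENALTY` value, the function names, and the reading of "K" as 1024 elements. The weights in particular would have to be fitted against measured ms/iter before the score means anything.

```python
# Hypothetical per-stage cost for each radix; placeholders, not measured values.
RADIX_COST = {2: 1.0, 3: 1.2, 5: 1.5, 7: 1.9}
# Hypothetical penalty added for each distinct prime beyond the first.
MIXING_PENALTY = 0.5


def factor_smooth(n, primes=(2, 3, 5, 7)):
    """Return {prime: exponent} if n factors over `primes`, else None."""
    exps = {}
    for p in primes:
        while n % p == 0:
            n //= p
            exps[p] = exps.get(p, 0) + 1
    return exps if n == 1 else None


def stage_score(length_k):
    """Weighted stage count per element for an FFT of length_k * 1024 points.

    Returns None when the length has a prime factor larger than 7.
    A total-cost estimate would multiply this by the FFT length.
    """
    exps = factor_smooth(length_k * 1024)  # assumes K means 1024
    if exps is None:
        return None
    stages = sum(RADIX_COST[p] * e for p, e in exps.items())
    return stages + MIXING_PENALTY * (len(exps) - 1)


for k in (2048, 4000, 4096, 4320):
    print(k, factor_smooth(k * 1024), stage_score(k))
```

With placeholder weights the ranking is not meaningful yet; the point is only that the factorization and the penalty term are cheap to compute, so the weights can be tuned against a measured table.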
||
|
|
|
|
|
#1746 |
|
Dec 2009
Peine, Germany
331 Posts |
I think Aaronhaviland did this with the cufftbench option.
I will do the same and normalize the timings by FFT length / time. Raw data that will be used can be found in this Titan's thread post. |
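A minimal sketch of that normalization (FFT length divided by time), using a few rows of the Tesla K20m table from post #1741 as sample input. The function name `throughput` is made up for illustration, and the Titan data mentioned above is not reproduced here.

```python
def throughput(timings):
    """Return (length_k, k_per_ms) pairs sorted from most to least efficient."""
    normalized = [(length, length / ms) for length, ms in timings]
    return sorted(normalized, key=lambda row: row[1], reverse=True)


# (FFT length in k, ms/iteration), taken from the K20m table in post #1741.
k20m = [
    (1568, 0.508496),
    (2048, 0.655126),
    (2592, 0.820283),
    (4096, 1.304601),
    (4320, 1.804463),
]

ranked = throughput(k20m)
for length, rate in ranked:
    print(f"{length:5d}k  {rate:8.1f} k/ms")
```

Higher k/ms means more FFT length processed per unit time, which makes lengths of different sizes directly comparable.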
|
|
|
|
|
#1747 |
|
If I May
"Chris Halsall"
Sep 2002
Barbados
10011000000010₂ Posts |
Hey all. Looking for some guidance from the Gurus...
I offered to help alpha test owftheevil's new GPU P-1 program on my 2GB GTX560, and very quickly it started reporting round-off errors. owftheevil asked if I had run the CUDALucas self-test, which I had to admit I hadn't.
After receiving a couple of different versions of the source from owftheevil, and also downloading the code from GitHub, I found that it rarely passed the self-test (only once out of about ten different runs).
Concerned that my mfaktc work might be bad, I ran its deep self-test. 100% success. (I know, of course, that the two programs work in very different ways.)
I then reran the memory test program I used when I first bought the 560 -- http://wili.cc/blog/gpu-burn.html -- no errors. I downloaded and compiled the Open Source version of memtestG80 -- https://github.com/ihaque/memtestG80 -- after more than an hour, no errors.
I often use this GPU for a computer vision project I'm working on. This involves SIFTing large images, and then matching the descriptors. The former process uses about 90% of the card's memory. Shortly after purchasing the card I ran a sanity check experiment where I ran a >1000 image job on the GPU, and then the same job only on the CPU, and the results were almost identical (GPU SIFTing is known to have slightly different results).
Lastly, when I first bought the card I also ran several tests under WinBlows, including FurMark. Not a single reported error. (I can't immediately rerun those tests as the machine is in an office on the other side of the country (not really that far away).)
I am running the latest CUDA 5.0. The box is a hyper-threaded quad-core with 4GB of RAM, running CentOS 6.3 64-bit.
Any thoughts from anyone? I'd be happy to provide unprivileged SSH access to the box to any of the CUDALucas developers if it might help being "in situ". |
|
|
|
|
|
#1748 |
|
Jun 2003
13BB₁₆ Posts |
If you can, downclock the GPU memory and try again. It is almost certainly a hardware problem. You can also get GeneferCUDA and run a test with it. It should produce similar issues.
|
|
|
|
|
|
#1749 |
|
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2³×3×5×7² Posts |
It is not unknown for programs on this forum to find hardware faults that nothing else will. I would suggest reducing your memory clock and finding out what is stable.
|
|
|
|