#155 |
May 2010
7 Posts |
I get an infinite loop at err=2 with an exponent of 110503: it claims to increase n from 16384 but never actually changes n. I'm using a GTX280.
#156 |
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
16A2₁₆ Posts |
Does anyone feel like converting this to LLR? I am sure the writer of LLR wouldn't mind if you copied some of his code, with permission.
#157 |
Jul 2009
Tokyo
1142₈ Posts |
Hi, wavelet3000,
I need more information: which OS and which CUDA version are you using? Also, please test version "K".
#158 |
Jun 2003
2·3²·269 Posts |
Is there a Windows build available somewhere?
#159 |
Jul 2009
Tokyo
1001100010₂ Posts |
Hi,
Version "O" fixes the infinite loop.
#160 |
May 2010
7 Posts |
On version K, I get "too small exponent" for 110503.
Both versions N and O still go into an infinite loop, as follows:
Code:
Iteration 10000 M( 110503 )C, 0xffff0003ffffc07e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 20000 M( 110503 )C, 0x0000fe03f8003f7d, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 30000 M( 110503 )C, 0x1fc0fe0007e0007e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 40000 M( 110503 )C, 0xe03ffe03ffffc07e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 50000 M( 110503 )C, 0x1fc0fffc00003f7e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 60000 M( 110503 )C, 0x1fc0000007dfbf7e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 70000 M( 110503 )C, 0x0000fe0007dfbf7d, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 80000 M( 110503 )C, 0x1fc001fc07e0007e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 90000 M( 110503 )C, 0x1fff01fc0000007e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 100000 M( 110503 )C, 0x1fff0003ffe03f7d, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 110000 M( 110503 )C, 0xffff01fff800007e, n = 16384, MacLucasFFTW v8.1 Ballester
err = 2, increasing n from 16384
Iteration 10000 M( 110503 )C, 0xffff0003ffffc07e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 20000 M( 110503 )C, 0x0000fe03f8003f7d, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 30000 M( 110503 )C, 0x1fc0fe0007e0007e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 40000 M( 110503 )C, 0xe03ffe03ffffc07e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 50000 M( 110503 )C, 0x1fc0fffc00003f7e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 60000 M( 110503 )C, 0x1fc0000007dfbf7e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 70000 M( 110503 )C, 0x0000fe0007dfbf7d, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 80000 M( 110503 )C, 0x1fc001fc07e0007e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 90000 M( 110503 )C, 0x1fff01fc0000007e, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 100000 M( 110503 )C, 0x1fff0003ffe03f7d, n = 16384, MacLucasFFTW v8.1 Ballester
Iteration 110000 M( 110503 )C, 0xffff01fff800007e, n = 16384, MacLucasFFTW v8.1 Ballester
err = 2, increasing n from 16384
I also get warnings during compilation:
Code:
setup.h(191): warning: omission of exception specification is incompatible with previous function "rename"
/usr/include/stdio.h(159): here
setup.h(200): warning: omission of exception specification is incompatible with previous function "sscanf"
/usr/include/stdio.h(413): here
setup.h(275): warning: omission of exception specification is incompatible with previous function "setvbuf"
/usr/include/stdio.h(313): here
setup.cu(1): warning: variable "RCSsetup_c" was declared but never referenced
setup.cu(5): warning: variable "RCSsetup_h" was declared but never referenced
setup.h(191): warning: omission of exception specification is incompatible with previous function "rename"
/usr/include/stdio.h(159): here
setup.h(200): warning: omission of exception specification is incompatible with previous function "sscanf"
/usr/include/stdio.h(413): here
setup.h(275): warning: omission of exception specification is incompatible with previous function "setvbuf"
/usr/include/stdio.h(313): here
setup.cu(1): warning: variable "RCSsetup_c" was declared but never referenced
setup.cu(5): warning: variable "RCSsetup_h" was declared but never referenced
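For anyone trying to reason about the symptom in the log (the message says n is being increased, yet the next pass still reports n = 16384), here is a minimal Python sketch of how a restart loop of this kind is normally expected to terminate. This is not MacLucasFFTW's actual code: `run_with_retry`, `run_once`, `fake_run` and `max_n` are all hypothetical names made up for illustration, and the doubling policy is an assumption.

```python
def run_with_retry(p, n, run_once, max_n=1 << 24):
    """Call run_once(p, n) until it succeeds, doubling n after each failure.

    run_once is a hypothetical callback returning (ok, result);
    max_n is an assumed upper bound so the loop cannot run forever.
    """
    while n <= max_n:
        ok, result = run_once(p, n)
        if ok:
            return n, result
        new_n = n * 2
        print(f"error, increasing n from {n} to {new_n}")
        n = new_n  # if this update is lost, every pass reruns with the old n
    raise RuntimeError(f"no usable length up to {max_n} for exponent {p}")


def fake_run(p, n):
    # Stand-in for the real test: pretend every length below 65536 fails.
    return (n >= 65536, "residue-placeholder")
```

With the stand-in above, `run_with_retry(110503, 16384, fake_run)` retries at 32768 and 65536 and then returns. If the assignment to `n` were dropped, the output would repeat the same length on every pass, which is the pattern shown in the log.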
#161 |
May 2010
7 Posts |
Okay, it's definitely a 32-bit vs. 64-bit issue. When I installed the 32-bit CUDA toolkit and recompiled against it, the problem disappeared. As a matter of fact, I now notice that the 64-bit build, even when it didn't fall into the infinite-loop trap, was still incorrect: it declared, for example, M44497 composite.
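A quick way to cross-check a suspicious result on a small exponent is a plain big-integer Lucas-Lehmer test. The sketch below is only a slow reference in Python, not the FFT-based method the GPU code uses; the function name `lucas_lehmer` is my own.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for an odd prime exponent p."""
    m = (1 << p) - 1           # the Mersenne number M(p)
    s = 4                      # standard starting value
    for _ in range(p - 2):     # p - 2 squarings in total
        s = (s * s - 2) % m
    return s == 0              # M(p) is prime exactly when the residue is 0


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 521):
        print(p, lucas_lehmer(p))
```

M44497 is a known Mersenne prime, so `lucas_lehmer(44497)` should come back True; in pure Python that call takes a while, so small exponents are more convenient for a first sanity check.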
#162 |
Apr 2010
33 Posts |
Hello all, I'm getting closer to assembling my new hardware to test this. It will take about 2 more weeks, but there are some things I need cleared up first.
I intend to run this on Ubuntu Server Edition 9.10. Is there a benefit to using the 64-bit edition over the 32-bit one? Would some of the software instructions used execute faster under 64-bit? I'm only going to run LL tests. Also, do I need to install both the proprietary Nvidia drivers and the CUDA kit, or would the CUDA kit be enough? And if anyone is familiar with Ubuntu, what packages must I install to compile and run this software? Thanks.
#163 |
Jul 2009
Tokyo
262₁₆ Posts |
#164 | |
Jul 2009
Tokyo
1001100010₂ Posts |
You need the Developer Drivers for Linux, the CUDA Toolkit for Ubuntu Linux 9.04, the GPU Computing SDK code samples, and more.
#165 |
Jul 2003
So Cal
3·5·137 Posts |
The GTX480 arrived and is installed. However, a bug in the 64-bit CUDA 3.0 Linux driver prevented me from running 32-bit CUDA runtime binaries, and of course the code is still not 64-bit safe. So I converted verO from the runtime API to the driver API, now that you can use runtime-style kernel calls with the driver API. The converted MacLucasFFTW.cu file is attached. It will only work with CUDA 3.0, not with 2.3. Note that I had to remove a bunch of safe-call commands and didn't replace them with proper error checking, but I just wanted it to work!
Anyway, bottom line: the 2048K FFT runs at 5.47 ms/iteration and the 4096K FFT runs at 10.4 ms/iteration on a GTX 480. As expected, they have disabled DP units in the consumer card, so DP still runs at 1/8 the speed of SP. The speed increase is due to the larger number of compute cores. The Tesla version, when it is released, should run at about 4x this speed. As a check of both the code and the hardware, I'm going to run the test of 42643801 to completion. That should take a bit over 5 days.
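The "bit over 5 days" estimate can be reproduced with back-of-the-envelope arithmetic. The sketch below assumes the run uses the 4096K FFT at 10.4 ms/iteration (my inference from the estimate, not something stated in the post) and that a Lucas-Lehmer test of exponent p needs p - 2 iterations; `ll_days` is a made-up helper name.

```python
def ll_days(p, ms_per_iter):
    """Estimated wall-clock days for an LL test of exponent p."""
    iterations = p - 2                        # an LL test performs p - 2 squarings
    seconds = iterations * ms_per_iter / 1000.0
    return seconds / 86400.0                  # 86400 seconds per day


if __name__ == "__main__":
    # Timing taken from the post; the choice of FFT length is an assumption.
    print(f"{ll_days(42643801, 10.4):.2f} days at 10.4 ms/iteration")
```

At 10.4 ms/iteration this works out to roughly 5.1 days, in line with the figure quoted above.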