2009-11-24, 15:34  #100
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
254B_{16} Posts 

2009-11-24, 17:11  #101
Banned
"Luigi"
Aug 2002
Team Italia
4,813 Posts 

2009-11-24, 22:03  #102
Sep 2004
2·5·283 Posts 
Can you test 4 numbers in parallel on your Q8400 and the exact 4 in series on your GTX260?
Last fiddled with by em99010pepe on 2009-11-24 at 22:06

2009-11-25, 04:23  #103
Jul 2009
Tokyo
1142_{8} Posts 
Hi, Uncwilly
Quote:
Hi, ET_
http://www.nvidia.com/object/io_1258360868914.html
>Editors' note: As previously announced, the first Fermi-based consumer (GeForce) products are expected to be available first quarter 2010.
Hi, em99010pepe
Quote:
New GTX260 result. M22728263 Thank you, 

2009-11-26, 04:11  #104
Jul 2003
So Cal
2^{3}×3^{2}×29 Posts 
Three more double checks have finished. These are just under the 2048K/4096K boundary. Only one of the three matched the previous result. Time will tell if the other two are correct. I also have a fourth running, 36500117, but after about a million iterations the roundoff error grew above the limit and it switched over to a 4096K FFT. Therefore, it is only about halfway through the double-check.
36500089 36500111 36500119 
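For readers unfamiliar with the roundoff check mentioned above, here is a minimal sketch in Python. It is not the CUDA code discussed in this thread: it squares a number with a plain zero-padded floating-point FFT (the real programs use a more elaborate transform) and reports how far the transform outputs land from the nearest integer. The names `fft` and `square_with_roundoff` and the word sizes are assumptions made only for illustration.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def square_with_roundoff(x, bits_per_word, n_words):
    """Square x via a floating-point FFT.

    Returns (x squared, largest distance of any output from an integer).
    n_words must be a power of two and x must fit in
    bits_per_word * n_words bits.
    """
    mask = (1 << bits_per_word) - 1
    words = [(x >> (bits_per_word * i)) & mask for i in range(n_words)]
    # Zero-pad to twice the length so the cyclic convolution does not wrap.
    padded = [complex(w) for w in words] + [0j] * n_words
    spectrum = fft(padded)
    conv = fft([s * s for s in spectrum], invert=True)
    n = len(padded)
    result, max_err = 0, 0.0
    for i, c in enumerate(conv):
        value = c.real / n           # unnormalised inverse, so divide by n
        nearest = round(value)
        max_err = max(max_err, abs(value - nearest))
        result += nearest << (bits_per_word * i)  # big-int add handles carries
    return result, max_err
```

Packing more bits into each word (the same effect as using a smaller FFT for the same exponent) makes the convolution outputs larger and pushes the reported error up. If it gets near 0.5, rounding can pick the wrong integer, which is why a program would switch to a longer FFT once the error crosses its limit.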
2009-11-26, 09:05  #105
Jul 2003
So Cal
2^{3}·3^{2}·29 Posts 
On the Tesla C1060, the TESRA version C is much slower than version y, but the non-TESRA version is the fastest yet with the 4096K FFT timing at 0.025 sec/iteration. If the improvements cannot also be used to optimize the TESRA version, then the TESRA version can now be dropped.
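To put the 0.025 sec/iteration figure in perspective, here is a small back-of-the-envelope sketch. A Lucas-Lehmer test of exponent p takes p - 2 iterations, so the run time follows directly from the per-iteration timing. The function name `estimated_days` is hypothetical, and the example assumes the whole run stays at the 4096K FFT timing.

```python
def estimated_days(exponent, seconds_per_iteration):
    """Rough wall-clock estimate for one Lucas-Lehmer test, in days."""
    iterations = exponent - 2  # the LL test performs p - 2 squarings
    return iterations * seconds_per_iteration / 86400.0


# Assumption: exponent 36500117 run entirely at 0.025 sec/iteration.
print(round(estimated_days(36500117, 0.025), 2))
```

Under that assumption a single test of an exponent near 36.5M comes out at roughly ten and a half days.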

2009-11-26, 09:37  #106
"Tony Gott"
Aug 2002
Yell, Shetland, UK
3×107 Posts 
Quote:
Secondly, I did have problems compiling this version so wondered if you could refresh me on the steps I need to take.... Cheers 

2009-11-26, 09:51  #107
Jul 2009
Tokyo
2×5×61 Posts 
Hi, TheJudger
I wish you success with your programming. Fascinating, testing is very important; are overclockers contaminating the results?
Quote:
Thank you, 

2009-11-26, 09:53  #108
Jul 2009
Tokyo
2×5×61 Posts 

2009-11-26, 17:14  #110
"Oliver"
Mar 2005
Germany
11·101 Posts 
Hi msft/frmky,
just some ideas off the top of my head:
- Perhaps you should choose exponents which have already been verified? In this case you can tell immediately whether your results are OK or not.
- Choose some exponents which are not so close to the FFT limit. I didn't dive into the CUFFTW docs; perhaps the rounding is less accurate than in the CPU versions of MaclucasFFTW and you need to lower the FFT boundaries?
TheJudger
P.S. 22 million checks per second for TF :)
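The idea of testing against already verified exponents can be sketched as a small self-check. This is plain integer Python, not the GPU code from this thread; the function names and the choice of exponents are assumptions for illustration. For real double checks one would compare the reported residue against the previously verified one rather than just checking for zero.

```python
def lucas_lehmer_residue(p):
    """Return the final Lucas-Lehmer term modulo 2^p - 1 for an odd prime p.

    The result is 0 exactly when 2^p - 1 is prime.
    """
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s


# Small exponents whose status is long established.
KNOWN_PRIME_EXPONENTS = [3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127]
KNOWN_COMPOSITE_EXPONENTS = [11, 23, 29, 37, 41]


def self_check():
    """True if the implementation reproduces every known result."""
    primes_ok = all(lucas_lehmer_residue(p) == 0
                    for p in KNOWN_PRIME_EXPONENTS)
    composites_ok = all(lucas_lehmer_residue(p) != 0
                        for p in KNOWN_COMPOSITE_EXPONENTS)
    return primes_ok and composites_ok
```

A run that fails such a check points to a problem in the program or the hardware, not to an error in the earlier result.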