#56
Jul 2009
Tokyo
2·5·61 Posts
Hi,
New version: adds support for the -d option and prints the FFT error code.

#57
"Mr. Meeseeks"
Jan 2012
California, USA
37·59 Posts
This is what I get here. Maybe a clAmdFft library error?

#58
Jul 2009
Tokyo
2×5×61 Posts
http://devgurus.amd.com/thread/167302
AMD open sourcing APPML and creating clMath projects on GitHub

#59
Jul 2009
Tokyo
610 Posts
On Ubuntu 12.04 LTS 64-bit:
Code:
$ sudo apt-get install libboost-dev

From https://github.com/clMathLibraries/clFFT
Download ZIP

/opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop$ cmake src
-- The C compiler identification is GNU
...
/opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop$ make
Scanning dependencies of target clFFT
[ 7%] Building CXX object library/CMakeFiles/clFFT.dir/transform.cpp.o
....

/opt/AMDAPP/samples/opencl/lucas/0.60/0.59$ cat Makefile
include $(DEPTH)/make/openclsdkdefs.mk

####
#
# Targets
#
####

OPENCL = 1
SAMPLE_EXE = 1
EXE_TARGET = CUDALucas
EXE_TARGET_INSTALL = CUDALucas

####
#
# C/CPP files
#
####

FILES = CUDALucas
CLFILES = Kernels.cl
CFLAGS += -O3 -Wno-conversion-null -Wno-write-strings -Wno-pointer-arith -I /opt/AMDAPP/include/ -I /opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop/include -I /opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop/src/include
LLIBS += SDKUtil
LDFLAGS += -O3 /opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop/library/libclFFT.so -lOpenCL

include $(DEPTH)/make/openclsdkrules.mk
Code:
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop/library/
$ ./CUDALucas 216091
Platform :Advanced Micro Devices, Inc.
Device 0 : Capeverde

start M216091 fft length = 12288
Iteration 210000 M( 216091 )C, 0xcfe091c8f59f8a7b, n = 12288, CUDALucas v1.66 err = 0.005127 (0:05 real, 0.5255 ms/iter, ETA 0:00)
M( 216091 )P, n = 12288, CUDALucas v1.66
Code:
Iteration 10000 M( 4232233 )C, 0x569040f5d6a8ca8e, n = 221184, CUDALucas v1.66 err = 0.2227 (0:18 real, 1.8229 ms/iter, ETA 0:00)
Iteration 10000 M( 4257371 )C, 0x84a94daa63202f9d, n = 221184, CUDALucas v1.66 err = 0.2812 (0:19 real, 1.8701 ms/iter, ETA 0:00)
Iteration 10000 M( 4304389 )C, 0x69141b7797e8a54b, n = 221184, CUDALucas v1.66 err = 0.3438 (0:18 real, 1.8825 ms/iter, ETA 0:00)
Iteration 10000 M( 4321997 )C, 0x068f3e549762bc46, n = 245760, CUDALucas v1.66 err = 0.02832 (0:27 real, 2.6723 ms/iter, ETA 0:00)
Iteration 10000 M( 4341223 )C, 0xe74e92b818a8c889, n = 245760, CUDALucas v1.66 err = 0.0332 (0:27 real, 2.7186 ms/iter, ETA 0:00)
Iteration 10000 M( 4368059 )C, 0xc9453ef84d257afe, n = 245760, CUDALucas v1.66 err = 0.04102 (0:28 real, 2.7033 ms/iter, ETA 0:00)
Iteration 10000 M( 4492549 )C, 0x6756c880963c42fd, n = 245760, CUDALucas v1.66 err = 0.07422 (0:27 real, 2.7233 ms/iter, ETA 0:00)
Iteration 10000 M( 4492591 )C, 0x0cb002718b716b35, n = 245760, CUDALucas v1.66 err = 0.07617 (0:27 real, 2.7005 ms/iter, ETA 0:00)
Iteration 10000 M( 4554163 )C, 0x616f265edd81322b, n = 245760, CUDALucas v1.66 err = 0.1094 (0:26 real, 2.6463 ms/iter, ETA 0:00)
Iteration 10000 M( 4559299 )C, 0x8c6626b60314eae5, n = 245760, CUDALucas v1.66 err = 0.1094 (0:27 real, 2.7134 ms/iter, ETA 0:00)
Iteration 10000 M( 4621231 )C, 0x067b36acf2e29497, n = 245760, CUDALucas v1.66 err = 0.1641 (0:27 real, 2.7057 ms/iter, ETA 0:00)
Iteration 10000 M( 4645429 )C, 0x2de547228c1f3950, n = 245760, CUDALucas v1.66 err = 0.1738 (0:27 real, 2.6980 ms/iter, ETA 0:00)
Iteration 10000 M( 4765693 )C, 0x5c23f227c3ec765c, n = 262144, CUDALucas v1.66 err = 0.0625 (0:26 real, 2.5750 ms/iter, ETA 0:00)
Iteration 10000 M( 4803563 )C, 0xe09c697339e45f20, n = 262144, CUDALucas v1.66 err = 0.08398 (0:25 real, 2.5295 ms/iter, ETA 0:00)
Iteration 10000 M( 4811783 )C, 0x70cd04c5d07fa36a, n = 262144, CUDALucas v1.66 err = 0.08398 (0:26 real, 2.5597 ms/iter, ETA 0:00)
Iteration 10000 M( 4819853 )C, 0x54e5b2ff7b131792, n = 262144, CUDALucas v1.66 err = 0.08984 (0:26 real, 2.6090 ms/iter, ETA 0:00)
Iteration 10000 M( 4836521 )C, 0x9dd012c19ba2e06b, n = 262144, CUDALucas v1.66 err = 0.09375 (0:25 real, 2.5826 ms/iter, ETA 0:00)
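For anyone who wants to tabulate these runs, here is a minimal Python sketch that pulls the exponent, FFT length n, err and ms/iter out of the result lines and adds the exponent/n ratio (bits per FFT word). The field names and the regular expression are my own choices for this sketch, not part of CUDALucas, and I am reading err as the reported round-off error. In the three sample lines, err falls from 0.3438 to 0.02832 when n steps up from 221184 to 245760, which goes with the lower bits per word.

```python
import re

# Three result lines copied verbatim from the output above.
LOG = """\
Iteration 10000 M( 4304389 )C, 0x69141b7797e8a54b, n = 221184, CUDALucas v1.66 err = 0.3438 (0:18 real, 1.8825 ms/iter, ETA 0:00)
Iteration 10000 M( 4321997 )C, 0x068f3e549762bc46, n = 245760, CUDALucas v1.66 err = 0.02832 (0:27 real, 2.6723 ms/iter, ETA 0:00)
Iteration 10000 M( 4836521 )C, 0x9dd012c19ba2e06b, n = 262144, CUDALucas v1.66 err = 0.09375 (0:25 real, 2.5826 ms/iter, ETA 0:00)
"""

# Group names are labels chosen for this sketch (hypothetical, not CUDALucas names).
LINE_RE = re.compile(
    r"M\(\s*(?P<exponent>\d+)\s*\)C,\s*0x(?P<residue>[0-9a-f]{16}),\s*"
    r"n = (?P<n>\d+),.*?err = (?P<err>[\d.]+)\s*\(.*?(?P<ms>[\d.]+) ms/iter"
)


def parse(log):
    """Return one dict per result line found in `log`."""
    rows = []
    for m in LINE_RE.finditer(log):
        exponent, n = int(m["exponent"]), int(m["n"])
        rows.append({
            "exponent": exponent,
            "n": n,
            "err": float(m["err"]),
            "ms_per_iter": float(m["ms"]),
            # Average number of bits of the exponent carried by each FFT word.
            "bits_per_word": exponent / n,
        })
    return rows


rows = parse(LOG)
for r in rows:
    print(f"M{r['exponent']}: n={r['n']}, "
          f"{r['bits_per_word']:.2f} bits/word, err={r['err']}, "
          f"{r['ms_per_iter']} ms/iter")
```

Feeding it a whole log instead of `LOG` gives one row per result line, ready for sorting by n or err.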

#60
Romulan Interpreter
"name field"
Jun 2011
Thailand
10291 Posts
Shouldn't this be renamed "CLLucas", or something like that?
Any windoze builds?

#61
"Mr. Meeseeks"
Jan 2012
California, USA
37×59 Posts
Quote:

#62
Jul 2009
Tokyo
2·5·61 Posts
sin(-2.0*pi/256) = -0.02454122852291228803173
Code:
$ grep 2454122 clfft.kernel.Stockham1.cl
(double2)(0.0245412285229122638374743559097624, -0.9996988186962042499672520534659270),
clFFT-develop/src/library/generator.stockham.cpp:
double theta = TWO_PI * ((double)k)/((double)L);
for(size_t j=1; j<radix; j++)
{
    double c = cos(((double)j) * theta);
    double s = sin(((double)j) * theta);
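To put a number on the mismatch, here is a small Python sketch that computes sin(2*pi/256) to about 50 digits with the standard `decimal` module and measures how far the literal from the kernel file is from it. The Taylor-series helper `sin_taylor` is something I wrote for this check; it is not clFFT code, and the k, L and j that produced the literal are not given above, so the sketch only measures the gap rather than reproducing the generator. A plain double-precision `math.sin` of the same angle is included for comparison.

```python
from decimal import Decimal, getcontext
import math

getcontext().prec = 50

# pi to 50 significant digits (well-known constant).
PI = Decimal("3.1415926535897932384626433832795028841971693993751")


def sin_taylor(x, terms=30):
    """sin(x) from its Taylor series; converges quickly for small |x|."""
    total = Decimal(0)
    term = x  # x**(2k+1) / (2k+1)! for k = 0
    for k in range(terms):
        total += term
        term = -term * x * x / ((2 * k + 2) * (2 * k + 3))
    return total


reference = sin_taylor(2 * PI / 256)

# Literal copied from the grep output above.
kernel_literal = Decimal("0.0245412285229122638374743559097624")

# The same angle evaluated directly in double precision.
double_sin = Decimal(math.sin(2 * math.pi / 256))

print("reference     :", reference)
print("kernel literal:", kernel_literal, "gap", abs(kernel_literal - reference))
print("double sin()  :", double_sin, "gap", abs(double_sin - reference))
```

The kernel literal is off by roughly 2.4e-17, while the direct double-precision sine of the small angle lands within 1e-17 of the reference.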

#63
Jul 2009
Tokyo
2·5·61 Posts
Some results.
HD7750:
Code:
Iteration 10000 M( 22256453 )C, 0x3d9450d492b7e880, n = 1310720, CUDALucas v1.66 err = 0.0293 (3:16 real, 19.5456 ms/iter, ETA 0:00)
Iteration 10000 M( 24732709 )C, 0x81a12a304a754572, n = 1572864, CUDALucas v1.66 err = 0.006836 (4:23 real, 26.3195 ms/iter, ETA 0:00)
Iteration 10000 M( 29412433 )C, 0x27d7d112a73aa203, n = 1638400, CUDALucas v1.66 err = 0.1211 (6:36 real, 39.6089 ms/iter, ETA 0:00)
Iteration 10000 M( 30620113 )C, 0x212dca3cec0acde2, n = 1769472, CUDALucas v1.66 err = 0.0625 (8:50 real, 53.0814 ms/iter, ETA 0:00)
Iteration 10000 M( 32993419 )C, 0xcf86a69b844e35c0, n = 1966080, CUDALucas v1.66 err = 0.03882 (7:10 real, 43.0477 ms/iter, ETA 0:00)
Iteration 10000 M( 36418493 )C, 0x2f1388379572d5b4, n = 2097152, CUDALucas v1.66 err = 0.06885 (2:45 real, 16.4693 ms/iter, ETA 0:00)
Iteration 10000 M( 38955173 )C, 0x8a45e3bbd4e4fc9b, n = 2359296, CUDALucas v1.66 err = 0.02393 (6:49 real, 40.9371 ms/iter, ETA 0:00)
Iteration 10000 M( 43792559 )C, 0x7048d84bbfb0f810, n = 2621440, CUDALucas v1.66 err = 0.03418 (7:32 real, 45.2021 ms/iter, ETA 0:00)
Iteration 10000 M( 48375209 )C, 0xf957e240d591a99e, n = 3145728, CUDALucas v1.66 err = 0.006104 (9:56 real, 59.5094 ms/iter, ETA 0:00)
Iteration 10000 M( 57899201 )C, 0xa2ac01bbc76d92ee, n = 3276800, CUDALucas v1.66 err = 0.125 (13:56 real, 83.6109 ms/iter, ETA 0:00)
Iteration 10000 M( 60622229 )C, 0xd81c849f11fd1054, n = 3538944, CUDALucas v1.66 err = 0.06641 (19:33 real, 117.3318 ms/iter, ETA 0:00)
Iteration 10000 M( 65066623 )C, 0xde7aeb8cc7a2a826, n = 3932160, CUDALucas v1.66 err = 0.03711 (16:21 real, 98.0832 ms/iter, ETA 0:00)
Iteration 10000 M( 67662869 )C, 0xf854d1dee3fbb5d7, n = 3932160, CUDALucas v1.66 err = 0.08984 (16:19 real, 97.8780 ms/iter, ETA 0:00)
Iteration 10000 M( 72000007 )C, 0x404aa83a2e247882, n = 4194304, CUDALucas v1.66 err = 0.08105 (5:55 real, 35.4960 ms/iter, ETA 0:00)
Iteration 10000 M( 76722161 )C, 0x4b6ba0a6078e4bbb, n = 4718592, CUDALucas v1.66 err = 0.02734 (30:11 real, 181.0818 ms/iter, ETA 0:00)
Iteration 10000 M( 86109511 )C, 0x760de83047f1f7e9, n = 5242880, CUDALucas v1.66 err = 0.03369 (32:38 real, 195.8295 ms/iter, ETA 0:00)
Iteration 10000 M( 99763721 )C, 0x6c11619b0ca6efe7, n = 6291456, CUDALucas v1.66 err = 0.01904 (48:14 real, 289.4293 ms/iter, ETA 0:00)
Iteration 10000 M( 110125091 )C, 0x6b41b2b871780bdb, n = 6291456, CUDALucas v1.66 err = 0.1875 (48:08 real, 288.8099 ms/iter, ETA 0:00)
Iteration 10000 M( 115362647 )C, 0xe40c813286d939e3, n = 6553600, CUDALucas v1.66 err = 0.1719 (45:49 real, 274.8969 ms/iter, ETA 0:00)
Iteration 10000 M( 119591027 )C, 0x5998495f1a051165, n = 7077888, CUDALucas v1.66 err = 0.07812 (47:52 real, 287.1596 ms/iter, ETA 0:00)
Iteration 10000 M( 145538021 )C, 0xa24220af7e2155f4, n = 8388608, CUDALucas v1.66 err = 0.1562 (11:54 real, 71.3361 ms/iter, ETA 0:00)
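One thing that stands out in these numbers is that the pure power-of-two lengths (2097152, 4194304, 8388608) run several times faster per iteration than their neighbours; the thread does not say why. The Python sketch below factors each FFT length into powers of 2, 3 and 5 and prints it next to its ms/iter. The helper name `smooth_factor` and the choice of rows are mine; the (n, ms/iter) pairs are copied from the results above.

```python
def smooth_factor(n):
    """Return (a, b, c) with n == 2**a * 3**b * 5**c, or None if n has other factors."""
    exps = []
    for p in (2, 3, 5):
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    return tuple(exps) if n == 1 else None


# (FFT length, ms/iter) pairs copied from the HD7750 results above.
TIMINGS = [
    (1966080, 43.0477),
    (2097152, 16.4693),
    (2359296, 40.9371),
    (3932160, 97.8780),
    (4194304, 35.4960),
    (4718592, 181.0818),
    (7077888, 287.1596),
    (8388608, 71.3361),
]

for n, ms in TIMINGS:
    a, b, c = smooth_factor(n)
    kind = "power of two" if b == 0 and c == 0 else "mixed radix"
    print(f"n={n:8d} = 2^{a} * 3^{b} * 5^{c}  {ms:9.4f} ms/iter  ({kind})")
```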

#64
Jul 2009
Tokyo
2·5·61 Posts
clFFT with 128-bit sin and cos.
Code:
Iteration 10000 M( 22256453 )C, 0x3d9450d492b7e880, n = 1179648, CUDALucas v1.66 err = 0.2734 (2:56 real, 17.6419 ms/iter, ETA 0:00)
Iteration 10000 M( 24732709 )C, 0x81a12a304a754572, n = 1310720, CUDALucas v1.66 err = 0.2812 (3:15 real, 19.5596 ms/iter, ETA 0:00)
Iteration 10000 M( 29412433 )C, 0x27d7d112a73aa203, n = 1572864, CUDALucas v1.66 err = 0.2812 (4:23 real, 26.3191 ms/iter, ETA 0:00)
Iteration 10000 M( 30620113 )C, 0x212dca3cec0acde2, n = 1638400, CUDALucas v1.66 err = 0.25 (6:20 real, 37.9453 ms/iter, ETA 0:00)
Iteration 10000 M( 32993419 )C, 0xcf86a69b844e35c0, n = 1769472, CUDALucas v1.66 err = 0.2812 (8:28 real, 50.8171 ms/iter, ETA 0:00)
Iteration 10000 M( 36418493 )C, 0x2f1388379572d5b4, n = 1966080, CUDALucas v1.66 err = 0.25 (7:10 real, 43.0098 ms/iter, ETA 0:00)
Iteration 10000 M( 38955173 )C, 0x8a45e3bbd4e4fc9b, n = 2097152, CUDALucas v1.66 err = 0.2812 (2:48 real, 16.8227 ms/iter, ETA 0:00)
Iteration 10000 M( 43792559 )C, 0x7048d84bbfb0f810, n = 2359296, CUDALucas v1.66 err = 0.2812 (6:52 real, 41.2492 ms/iter, ETA 0:00)
Iteration 10000 M( 48375209 )C, 0xf957e240d591a99e, n = 2621440, CUDALucas v1.66 err = 0.2188 (7:37 real, 45.7406 ms/iter, ETA 0:00)
Iteration 10000 M( 57899201 )C, 0xa2ac01bbc76d92ee, n = 3145728, CUDALucas v1.66 err = 0.2656 (9:59 real, 59.8521 ms/iter, ETA 0:00)
Iteration 10000 M( 60622229 )C, 0xd81c849f11fd1054, n = 3276800, CUDALucas v1.66 err = 0.2812 (14:03 real, 84.2392 ms/iter, ETA 0:00)
Iteration 10000 M( 65066623 )C, 0xde7aeb8cc7a2a826, n = 3538944, CUDALucas v1.66 err = 0.2812 (19:46 real, 118.6206 ms/iter, ETA 0:00)
Iteration 10000 M( 67662869 )C, 0xf854d1dee3fbb5d7, n = 3932160, CUDALucas v1.66 err = 0.05469 (16:38 real, 99.8373 ms/iter, ETA 0:00)
Iteration 10000 M( 72000007 )C, 0x404aa83a2e247882, n = 3932160, CUDALucas v1.66 err = 0.25 (16:41 real, 100.0775 ms/iter, ETA 0:00)
Iteration 10000 M( 76722161 )C, 0x4b6ba0a6078e4bbb, n = 4194304, CUDALucas v1.66 err = 0.2812 (6:03 real, 36.2961 ms/iter, ETA 0:00)
Iteration 10000 M( 86109511 )C, 0x760de83047f1f7e9, n = 4718592, CUDALucas v1.66 err = 0.25 (32:13 real, 193.3191 ms/iter, ETA 0:00)
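A quick way to read this run against the HD7750 results in post #63 is to line the two up by exponent. The Python sketch below does that for four exponents, with the (n, ms/iter) values copied from the two posts; the dictionary layout and the `compare` helper are just my choices for the sketch. Most exponents now run at a shorter FFT length, but a shorter length is not always faster: 36418493 and 72000007 move off a power-of-two length and take well over twice as long per iteration. The thread does not state why the lengths differ between the two runs.

```python
# exponent -> (FFT length n, ms/iter), copied from post #63 (HD7750 results).
POST_63 = {
    22256453: (1310720, 19.5456),
    36418493: (2097152, 16.4693),
    67662869: (3932160, 97.8780),
    72000007: (4194304, 35.4960),
}

# Same exponents, copied from the 128-bit sin/cos run above.
POST_64 = {
    22256453: (1179648, 17.6419),
    36418493: (1966080, 43.0098),
    67662869: (3932160, 99.8373),
    72000007: (3932160, 100.0775),
}


def compare(before, after):
    """Return (exponent, n_before, n_after, time_ratio) for exponents in both runs."""
    rows = []
    for exponent in sorted(before.keys() & after.keys()):
        n_before, ms_before = before[exponent]
        n_after, ms_after = after[exponent]
        rows.append((exponent, n_before, n_after, ms_after / ms_before))
    return rows


rows = compare(POST_63, POST_64)
for exponent, n_before, n_after, ratio in rows:
    print(f"M{exponent}: n {n_before} -> {n_after}, time x{ratio:.2f}")
```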

#65
"Mr. Meeseeks"
Jan 2012
California, USA
37×59 Posts
Quote: