mersenneforum.org  

2013-08-10, 15:08   #56
msft

Hi,
New version.
Supports the -d option.
Prints the FFT error code.
Attached file: 0.59.tar.bz2 (15.3 KB)

2013-08-10, 23:01   #57
kracker

This is what I get here. Maybe a clAmdFft library error?
[Attached image: error.jpg]

2013-08-14, 07:20   #58
msft

http://devgurus.amd.com/thread/167302

AMD open sourcing APPML and creating clMath projects on GitHub

2013-08-14, 11:09   #59
msft

On Ubuntu 12.04 LTS 64-bit:
Code:
$ sudo apt-get install libboost-dev
# From https://github.com/clMathLibraries/clFFT: Download ZIP

/opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop$ cmake src
-- The C compiler identification is GNU
...
/opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop$ make
Scanning dependencies of target clFFT
[  7%] Building CXX object library/CMakeFiles/clFFT.dir/transform.cpp.o
....

/opt/AMDAPP/samples/opencl/lucas/0.60/0.59$ cat Makefile
include $(DEPTH)/make/openclsdkdefs.mk

####
#
#  Targets
#
####

OPENCL                  = 1
SAMPLE_EXE              = 1
EXE_TARGET              = CUDALucas
EXE_TARGET_INSTALL      = CUDALucas

####
#
#  C/CPP files
#
####

FILES   = CUDALucas
CLFILES = Kernels.cl

CFLAGS  +=  -O3 -Wno-conversion-null -Wno-write-strings -Wno-pointer-arith -I /opt/AMDAPP/include/ -I /opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop/include -I /opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop/src/include

LLIBS   += SDKUtil
LDFLAGS         += -O3 /opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop/library/libclFFT.so -lOpenCL

include $(DEPTH)/make/openclsdkrules.mk
Code:
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/AMDAPP/samples/opencl/lucas/clFFT/clFFT-develop/library/
$ ./CUDALucas 216091
Platform :Advanced Micro Devices, Inc.
Device 0 : Capeverde


start M216091 fft length = 12288
Iteration 210000 M( 216091 )C, 0xcfe091c8f59f8a7b, n = 12288, CUDALucas v1.66 err = 0.005127 (0:05 real, 0.5255 ms/iter, ETA 0:00)
M( 216091 )P, n = 12288, CUDALucas v1.66
HD7750:
Code:
Iteration 10000 M( 4232233 )C, 0x569040f5d6a8ca8e, n = 221184, CUDALucas v1.66 err = 0.2227 (0:18 real, 1.8229 ms/iter, ETA 0:00)
Iteration 10000 M( 4257371 )C, 0x84a94daa63202f9d, n = 221184, CUDALucas v1.66 err = 0.2812 (0:19 real, 1.8701 ms/iter, ETA 0:00)
Iteration 10000 M( 4304389 )C, 0x69141b7797e8a54b, n = 221184, CUDALucas v1.66 err = 0.3438 (0:18 real, 1.8825 ms/iter, ETA 0:00)
Iteration 10000 M( 4321997 )C, 0x068f3e549762bc46, n = 245760, CUDALucas v1.66 err = 0.02832 (0:27 real, 2.6723 ms/iter, ETA 0:00)
Iteration 10000 M( 4341223 )C, 0xe74e92b818a8c889, n = 245760, CUDALucas v1.66 err = 0.0332 (0:27 real, 2.7186 ms/iter, ETA 0:00)
Iteration 10000 M( 4368059 )C, 0xc9453ef84d257afe, n = 245760, CUDALucas v1.66 err = 0.04102 (0:28 real, 2.7033 ms/iter, ETA 0:00)
Iteration 10000 M( 4492549 )C, 0x6756c880963c42fd, n = 245760, CUDALucas v1.66 err = 0.07422 (0:27 real, 2.7233 ms/iter, ETA 0:00)
Iteration 10000 M( 4492591 )C, 0x0cb002718b716b35, n = 245760, CUDALucas v1.66 err = 0.07617 (0:27 real, 2.7005 ms/iter, ETA 0:00)
Iteration 10000 M( 4554163 )C, 0x616f265edd81322b, n = 245760, CUDALucas v1.66 err = 0.1094 (0:26 real, 2.6463 ms/iter, ETA 0:00)
Iteration 10000 M( 4559299 )C, 0x8c6626b60314eae5, n = 245760, CUDALucas v1.66 err = 0.1094 (0:27 real, 2.7134 ms/iter, ETA 0:00)
Iteration 10000 M( 4621231 )C, 0x067b36acf2e29497, n = 245760, CUDALucas v1.66 err = 0.1641 (0:27 real, 2.7057 ms/iter, ETA 0:00)
Iteration 10000 M( 4645429 )C, 0x2de547228c1f3950, n = 245760, CUDALucas v1.66 err = 0.1738 (0:27 real, 2.6980 ms/iter, ETA 0:00)
Iteration 10000 M( 4765693 )C, 0x5c23f227c3ec765c, n = 262144, CUDALucas v1.66 err = 0.0625 (0:26 real, 2.5750 ms/iter, ETA 0:00)
Iteration 10000 M( 4803563 )C, 0xe09c697339e45f20, n = 262144, CUDALucas v1.66 err = 0.08398 (0:25 real, 2.5295 ms/iter, ETA 0:00)
Iteration 10000 M( 4811783 )C, 0x70cd04c5d07fa36a, n = 262144, CUDALucas v1.66 err = 0.08398 (0:26 real, 2.5597 ms/iter, ETA 0:00)
Iteration 10000 M( 4819853 )C, 0x54e5b2ff7b131792, n = 262144, CUDALucas v1.66 err = 0.08984 (0:26 real, 2.6090 ms/iter, ETA 0:00)
Iteration 10000 M( 4836521 )C, 0x9dd012c19ba2e06b, n = 262144, CUDALucas v1.66 err = 0.09375 (0:25 real, 2.5826 ms/iter, ETA 0:00)
Correct.
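
For readers new to the output format: CUDALucas runs the Lucas-Lehmer test, and the final `M( 216091 )P` line appears to report the exponent as prime (M216091 is a known Mersenne prime). The sketch below is a plain big-integer version of that test, for illustration only. It is not the program's FFT-based code, and the function name `lucas_lehmer` is made up here.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime. Valid for prime exponents p."""
    if p == 2:
        return True          # M2 = 3 is prime; the loop below needs p > 2
    m = (1 << p) - 1         # the Mersenne number M_p
    s = 4                    # standard starting value
    for _ in range(p - 2):   # p - 2 squarings in total
        s = (s * s - 2) % m  # one squaring step modulo M_p
    return s == 0


# Small exponents only; large ones need FFT multiplication to be practical.
for p in (7, 11, 13, 127):
    print(p, "P" if lucas_lehmer(p) else "C")
```

Each pass through the loop is one modular squaring, which is the work the `ms/iter` figures in the log measure.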

2013-08-14, 13:37   #60
LaurV

Shouldn't this be renamed "CLLucas", or something like that?
Any windoze builds?

2013-08-14, 14:31   #61
kracker

Quote:
Originally Posted by msft
http://devgurus.amd.com/thread/167302

AMD open sourcing APPML and creating clMath projects on GitHub

Hell yeah! Thanks for telling us. I will try to rebuild from scratch today when I have time.

2013-08-17, 09:43   #62
msft

sin(-2.0*pi/256) = -0.02454122852291228803173
Code:
$ grep 2454122 clfft.kernel.Stockham1.cl
(double2)(0.0245412285229122638374743559097624, -0.9996988186962042499672520534659270),

clFFT-develop/src/library/generator.stockham.cpp:   
                                        double theta = TWO_PI * ((double)k)/((double)L);
                                        for(size_t j=1; j<radix; j++)
                                        {
                                                double c = cos(((double)j) * theta);
                                                double s = sin(((double)j) * theta);
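FFTW uses 128-bit floating point to build its constant table. The clFFT constant above is computed in double precision and differs from the exact value from the 16th significant digit on.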
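
The size of that discrepancy can be checked with a short sketch. It uses Python's `decimal` module and a hand-written Taylor series (both are choices made here, not what clFFT or FFTW do) to get a 50-digit reference for sin(2π/256), then compares magnitudes against the constant quoted from the kernel. The last comparison assumes the quoted entry is a cosine evaluated near π/2, which its partner value −0.9997 suggests; that is an inference, not something stated in the post.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # 50 significant digits for the reference value

PI = Decimal("3.14159265358979323846264338327950288419716939937510")


def sin_decimal(x: Decimal) -> Decimal:
    """Taylor series for sin(x); adequate for the small angle used here."""
    term = x
    total = x
    k = 1
    while abs(term) > Decimal(10) ** -45:
        # next term: multiply by -x^2 / ((2k)(2k+1))
        term = -term * x * x / ((2 * k) * (2 * k + 1))
        total += term
        k += 1
    return total


exact = sin_decimal(2 * PI / 256)
kernel = Decimal("0.0245412285229122638374743559097624")  # value quoted above
direct = Decimal(math.sin(2 * math.pi / 256))             # double, small argument
near_half_pi = Decimal(math.cos(2 * math.pi * 63 / 256))  # same value, argument near pi/2

print("reference        :", exact)
print("kernel - ref     :", kernel - exact)
print("sin(x) - ref     :", direct - exact)
print("cos(63x) - ref   :", near_half_pi - exact)
```

The kernel constant is off by a few times 1e-17, which is several units in the last place for a double of this magnitude.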

2013-08-20, 15:41   #63
msft

Some results.
HD7750:
Code:
Iteration 10000 M( 22256453 )C, 0x3d9450d492b7e880, n = 1310720, CUDALucas v1.66 err = 0.0293 (3:16 real, 19.5456 ms/iter, ETA 0:00)
Iteration 10000 M( 24732709 )C, 0x81a12a304a754572, n = 1572864, CUDALucas v1.66 err = 0.006836 (4:23 real, 26.3195 ms/iter, ETA 0:00)
Iteration 10000 M( 29412433 )C, 0x27d7d112a73aa203, n = 1638400, CUDALucas v1.66 err = 0.1211 (6:36 real, 39.6089 ms/iter, ETA 0:00)
Iteration 10000 M( 30620113 )C, 0x212dca3cec0acde2, n = 1769472, CUDALucas v1.66 err = 0.0625 (8:50 real, 53.0814 ms/iter, ETA 0:00)
Iteration 10000 M( 32993419 )C, 0xcf86a69b844e35c0, n = 1966080, CUDALucas v1.66 err = 0.03882 (7:10 real, 43.0477 ms/iter, ETA 0:00)
Iteration 10000 M( 36418493 )C, 0x2f1388379572d5b4, n = 2097152, CUDALucas v1.66 err = 0.06885 (2:45 real, 16.4693 ms/iter, ETA 0:00)
Iteration 10000 M( 38955173 )C, 0x8a45e3bbd4e4fc9b, n = 2359296, CUDALucas v1.66 err = 0.02393 (6:49 real, 40.9371 ms/iter, ETA 0:00)
Iteration 10000 M( 43792559 )C, 0x7048d84bbfb0f810, n = 2621440, CUDALucas v1.66 err = 0.03418 (7:32 real, 45.2021 ms/iter, ETA 0:00)
Iteration 10000 M( 48375209 )C, 0xf957e240d591a99e, n = 3145728, CUDALucas v1.66 err = 0.006104 (9:56 real, 59.5094 ms/iter, ETA 0:00)
Iteration 10000 M( 57899201 )C, 0xa2ac01bbc76d92ee, n = 3276800, CUDALucas v1.66 err = 0.125 (13:56 real, 83.6109 ms/iter, ETA 0:00)
Iteration 10000 M( 60622229 )C, 0xd81c849f11fd1054, n = 3538944, CUDALucas v1.66 err = 0.06641 (19:33 real, 117.3318 ms/iter, ETA 0:00)
Iteration 10000 M( 65066623 )C, 0xde7aeb8cc7a2a826, n = 3932160, CUDALucas v1.66 err = 0.03711 (16:21 real, 98.0832 ms/iter, ETA 0:00)
Iteration 10000 M( 67662869 )C, 0xf854d1dee3fbb5d7, n = 3932160, CUDALucas v1.66 err = 0.08984 (16:19 real, 97.8780 ms/iter, ETA 0:00)
Iteration 10000 M( 72000007 )C, 0x404aa83a2e247882, n = 4194304, CUDALucas v1.66 err = 0.08105 (5:55 real, 35.4960 ms/iter, ETA 0:00)
Iteration 10000 M( 76722161 )C, 0x4b6ba0a6078e4bbb, n = 4718592, CUDALucas v1.66 err = 0.02734 (30:11 real, 181.0818 ms/iter, ETA 0:00)
Iteration 10000 M( 86109511 )C, 0x760de83047f1f7e9, n = 5242880, CUDALucas v1.66 err = 0.03369 (32:38 real, 195.8295 ms/iter, ETA 0:00)
Iteration 10000 M( 99763721 )C, 0x6c11619b0ca6efe7, n = 6291456, CUDALucas v1.66 err = 0.01904 (48:14 real, 289.4293 ms/iter, ETA 0:00)
Iteration 10000 M( 110125091 )C, 0x6b41b2b871780bdb, n = 6291456, CUDALucas v1.66 err = 0.1875 (48:08 real, 288.8099 ms/iter, ETA 0:00)
Iteration 10000 M( 115362647 )C, 0xe40c813286d939e3, n = 6553600, CUDALucas v1.66 err = 0.1719 (45:49 real, 274.8969 ms/iter, ETA 0:00)
Iteration 10000 M( 119591027 )C, 0x5998495f1a051165, n = 7077888, CUDALucas v1.66 err = 0.07812 (47:52 real, 287.1596 ms/iter, ETA 0:00)
Iteration 10000 M( 145538021 )C, 0xa24220af7e2155f4, n = 8388608, CUDALucas v1.66 err = 0.1562 (11:54 real, 71.3361 ms/iter, ETA 0:00)
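
The listing is easier to compare once it is parsed. The sketch below pulls the exponent, FFT length, error and ms/iter out of three of the lines above, marks power-of-two FFT lengths (which are clearly the fastest in this table), and projects a full test as roughly exponent × ms/iter. The regular expression is inferred only from the lines shown in this thread, so treat it as an assumption about the log format.

```python
import re

# Pattern inferred from the log lines in this thread (an assumption).
LINE = re.compile(
    r"M\(\s*(?P<p>\d+)\s*\)C,\s*0x(?P<res>[0-9a-f]{16}),\s*n = (?P<n>\d+),"
    r".*?err = (?P<err>[\d.]+)\s*\(.*?,\s*(?P<ms>[\d.]+) ms/iter"
)

SAMPLE = """\
Iteration 10000 M( 32993419 )C, 0xcf86a69b844e35c0, n = 1966080, CUDALucas v1.66 err = 0.03882 (7:10 real, 43.0477 ms/iter, ETA 0:00)
Iteration 10000 M( 36418493 )C, 0x2f1388379572d5b4, n = 2097152, CUDALucas v1.66 err = 0.06885 (2:45 real, 16.4693 ms/iter, ETA 0:00)
Iteration 10000 M( 38955173 )C, 0x8a45e3bbd4e4fc9b, n = 2359296, CUDALucas v1.66 err = 0.02393 (6:49 real, 40.9371 ms/iter, ETA 0:00)
"""


def parse(text):
    """Yield (exponent, fft_length, err, ms_per_iter) for each matching line."""
    for m in LINE.finditer(text):
        yield int(m["p"]), int(m["n"]), float(m["err"]), float(m["ms"])


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


rows = list(parse(SAMPLE))
for p, n, err, ms in rows:
    days = p * ms / 1000 / 86400  # a full test needs about p iterations
    tag = "2^k" if is_power_of_two(n) else "   "
    print(f"M{p}  n={n} {tag}  {ms:8.4f} ms/iter  ~{days:.1f} days")
```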

2013-08-21, 04:47   #64
msft

clFFT with 128-bit sin/cos:
Code:
Iteration 10000 M( 22256453 )C, 0x3d9450d492b7e880, n = 1179648, CUDALucas v1.66 err = 0.2734 (2:56 real, 17.6419 ms/iter, ETA 0:00)
Iteration 10000 M( 24732709 )C, 0x81a12a304a754572, n = 1310720, CUDALucas v1.66 err = 0.2812 (3:15 real, 19.5596 ms/iter, ETA 0:00)
Iteration 10000 M( 29412433 )C, 0x27d7d112a73aa203, n = 1572864, CUDALucas v1.66 err = 0.2812 (4:23 real, 26.3191 ms/iter, ETA 0:00)
Iteration 10000 M( 30620113 )C, 0x212dca3cec0acde2, n = 1638400, CUDALucas v1.66 err = 0.25 (6:20 real, 37.9453 ms/iter, ETA 0:00)
Iteration 10000 M( 32993419 )C, 0xcf86a69b844e35c0, n = 1769472, CUDALucas v1.66 err = 0.2812 (8:28 real, 50.8171 ms/iter, ETA 0:00)
Iteration 10000 M( 36418493 )C, 0x2f1388379572d5b4, n = 1966080, CUDALucas v1.66 err = 0.25 (7:10 real, 43.0098 ms/iter, ETA 0:00)
Iteration 10000 M( 38955173 )C, 0x8a45e3bbd4e4fc9b, n = 2097152, CUDALucas v1.66 err = 0.2812 (2:48 real, 16.8227 ms/iter, ETA 0:00)
Iteration 10000 M( 43792559 )C, 0x7048d84bbfb0f810, n = 2359296, CUDALucas v1.66 err = 0.2812 (6:52 real, 41.2492 ms/iter, ETA 0:00)
Iteration 10000 M( 48375209 )C, 0xf957e240d591a99e, n = 2621440, CUDALucas v1.66 err = 0.2188 (7:37 real, 45.7406 ms/iter, ETA 0:00)
Iteration 10000 M( 57899201 )C, 0xa2ac01bbc76d92ee, n = 3145728, CUDALucas v1.66 err = 0.2656 (9:59 real, 59.8521 ms/iter, ETA 0:00)
Iteration 10000 M( 60622229 )C, 0xd81c849f11fd1054, n = 3276800, CUDALucas v1.66 err = 0.2812 (14:03 real, 84.2392 ms/iter, ETA 0:00)
Iteration 10000 M( 65066623 )C, 0xde7aeb8cc7a2a826, n = 3538944, CUDALucas v1.66 err = 0.2812 (19:46 real, 118.6206 ms/iter, ETA 0:00)
Iteration 10000 M( 67662869 )C, 0xf854d1dee3fbb5d7, n = 3932160, CUDALucas v1.66 err = 0.05469 (16:38 real, 99.8373 ms/iter, ETA 0:00)
Iteration 10000 M( 72000007 )C, 0x404aa83a2e247882, n = 3932160, CUDALucas v1.66 err = 0.25 (16:41 real, 100.0775 ms/iter, ETA 0:00)
Iteration 10000 M( 76722161 )C, 0x4b6ba0a6078e4bbb, n = 4194304, CUDALucas v1.66 err = 0.2812 (6:03 real, 36.2961 ms/iter, ETA 0:00)
Iteration 10000 M( 86109511 )C, 0x760de83047f1f7e9, n = 4718592, CUDALucas v1.66 err = 0.25 (32:13 real, 193.3191 ms/iter, ETA 0:00)
Accuracy is improved: the residues are unchanged, and every exponent except M67662869 now runs at a shorter FFT length.
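
One way to quantify the change is the average number of bits of the Mersenne number carried per FFT element, exponent / n. The sketch below computes it for four exponents, with the FFT lengths copied from the earlier listing and from the 128-bit sin/cos listing. The helper name `bits_per_word` is made up for this illustration.

```python
# (exponent, FFT length in the earlier run, FFT length with 128-bit sin/cos),
# copied from the two result listings in this thread.
ROWS = [
    (22256453, 1310720, 1179648),
    (36418493, 2097152, 1966080),
    (67662869, 3932160, 3932160),
    (86109511, 5242880, 4718592),
]


def bits_per_word(exponent, fft_length):
    """Average number of bits carried by each FFT element."""
    return exponent / fft_length


for p, n_old, n_new in ROWS:
    old = bits_per_word(p, n_old)
    new = bits_per_word(p, n_new)
    print(f"M{p}: {old:.2f} -> {new:.2f} bits/word  (n {n_old} -> {n_new})")
```

A shorter length is not automatically faster on this card: M36418493 moves from the power-of-two length 2097152 (16.4693 ms/iter) to 1966080 (43.0098 ms/iter).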

2013-08-30, 15:53   #65
kracker

Quote:
Originally Posted by msft
sin(-2.0*pi/256) = -0.02454122852291228803173
Code:
$ grep 2454122 clfft.kernel.Stockham1.cl
(double2)(0.0245412285229122638374743559097624, -0.9996988186962042499672520534659270),

clFFT-develop/src/library/generator.stockham.cpp:   
                                        double theta = TWO_PI * ((double)k)/((double)L);
                                        for(size_t j=1; j<radix; j++)
                                        {
                                                double c = cos(((double)j) * theta);
                                                double s = sin(((double)j) * theta);
FFTW uses 128-bit floating point to build its constant table.

If you want, you can submit a request here for clFFT.

2013-08-31, 02:28   #66
msft

Quote:
Originally Posted by kracker
If you want, you can submit a request here for clFFT.

I fixed this issue.
Thank you for the information.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
