|
|
#67 |
|
Just call me Henry
"David"
Sep 2007
Liverpool (GMT/BST)
3×23×89 Posts |
Some of those FFT lengths are slower than larger ones. It might be worth finding out which ones they are.
Does CudaLucas use 128-bit sin/cos? |
|
|
|
|
|
#68 |
|
Jul 2009
Tokyo
2×5×61 Posts |
|
|
|
|
|
|
#69 |
|
Jul 2009
Tokyo
2×5×61 Posts |
I think I found a way to make do without 128-bit sin/cos.
It is elementary-school arithmetic. Code:
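double theta = TWO_PI * ((double)k)/((double)L);
const double pi = 3.14159265358979323846264338327950288419716939937510;
const double hpi = pi*0.5;
for(size_t j=1; j<radix; j++)
{
    /* Positive angle j*theta, assumed to lie in [0, 2*pi); the twiddle is exp(-i*jt). */
    double jt = j * theta;
    double s,c;
    /* s = -sin(jt), using quadrant identities to shrink the sin() argument */
    if(jt < hpi) s=-sin(jt);
    else if(jt < pi) s=-sin(pi-jt);
    else s=sin(jt-pi);
    /* c = cos(jt), written per quadrant */
    if(jt < hpi) c=sin(hpi-jt);
    else if(jt < pi) c=-cos(pi-jt);
    else if(jt < (pi+hpi)) c=sin(hpi-jt);
    else c=cos(jt);
    /* Exact values at a quarter and three quarters of a turn */
    double r = -((double)k/(double)L)*(double)j;
    if(r==-0.25){c=0;s=-1;}
    if(r==-0.75){c=0;s=1;}
    wc[nt] = c;
    ws[nt++] = s;
}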
|
|
|
|
|
|
#70 | |
|
"Mr. Meeseeks"
Jan 2012
California, USA
37×59 Posts |
Quote:
|
|
|
|
|
|
|
#71 |
|
Jul 2009
Tokyo
2×5×61 Posts |
|
|
|
|
|
|
#72 |
|
Just call me Henry
"David"
Sep 2007
Liverpool (GMT/BST)
3×23×89 Posts |
|
|
|
|
|
|
#73 |
|
Jul 2009
Tokyo
2×5×61 Posts |
|
|
|
|
|
|
#76 |
|
Jul 2009
Tokyo
26216 Posts |
Hi,
My system has a sin() issue. Could someone please report the result of running this on a Windows + VC system? My setup: Intel Celeron G465 @ 1.90GHz, Ubuntu 12.04 LTS 64-bit, gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5). Code:
$ cat sin.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
double sinsin(double a)
{ return sin(a);}
int main()
{
printf(" 64bit sin(%30.27lf) = %30.27lf\n",
0.509281621578032916985989687,
sinsin(0.509281621578032916985989687));
printf("128bit sin(%30.27lf) = %30.27lf\n",
0.509281621578032916985989687,
0.487550160148435940410394096);
}
$ cc -S sin.c
$ cat sin.s
.file "sin.c"
.text
.globl sinsin
.type sinsin, @function
sinsin:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movsd %xmm0, -8(%rbp)
movsd -8(%rbp), %xmm0
call sin
leave
...
$ cc sin.c -lm
$ ./a.out
64bit sin( 0.509281621578032916985989687) = 0.487550160148435995921545327
128bit sin( 0.509281621578032916985989687) = 0.487550160148435940410394096
Please ignore.
Last fiddled with by msft on 2013-09-01 at 14:05. Reason: Misconception. |
|
|
|
|
|
#77 |
|
"Mr. Meeseeks"
Jan 2012
California, USA
37×59 Posts |
Code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
double sinsin(double a)
{ return sin(a);}
int main()
{
printf(" 64bit sin(%30.27lf) = %30.27lf\n",
0.509281621578032916985989687,
sinsin(0.509281621578032916985989687));
printf("128bit sin(%30.27lf) = %30.27lf\n",
0.509281621578032916985989687,
0.487550160148435940410394096);
}
|
|
|
|