[QUOTE=Prime95;420992]Yes, I build with MSVC 2005.[/QUOTE]
And what compiler do you use to produce the Linux executables for mprime? It has been reported that this bug manifests under Linux as well as Windows (on some machines). |
[QUOTE=chalsall;421010]And what compiler do you use to produce the Linux executables for mprime?[/QUOTE]
GCC but all the critical FFT code is assembled in Windows using Masm. |
[QUOTE=Prime95;421024]GCC but all the critical FFT code is assembled in Windows using Masm.[/QUOTE]
Interesting... So while we thought we had eliminated a variable (it's OS independent) this might actually come down to focusing on Microsoft's assembler's interaction with the Skylake architecture. Or, maybe not... Solving intermittent problems is fun! Not easy, mind you, but rewarding.... |
[QUOTE=tha;420995]I will finish the current test, which will be another two hours. I will then start a new test with 8 threads working concurrently on the following worktodo.txt test case:
[/QUOTE] You might be better off just stopping what you're running now and doing the torture test at 768K since that's known to cause the problem on affected CPUs. Then if you get the roundoff errors you'll know your CPU is "in the club" and you can try to reproduce using a "real" exponent. My guess is that to replicate what the torture test is doing, you'd want to have all 8 (physical and HT) cores in a single worker. Not 8 separate workers doing 8 separate tests. It could be something specific to the threading code and combining the separate chunks from each large multiplication. |
Trying to interpret the results....
If I am making a FFT of type 2 - similar to gwsquare, NORMNUM = 1, NRMRTN = yi3eCORE (whatever this is, there's no source for it) for the data:
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, .... 0, 0 [10 ones then 768*1024 - 10 zeroes]
I get the following output:
1 2 4 6 8 10 12 14 16 18 18 16 14 12 10 8 6 4 2 0 0 ..... all zeroes till the end.
I was expecting:
1 2 3 4 5 6 7 8 9 10 9 8 7 6 5 4 3 2 1 0 .... zeroes till the end.
Or at least double this array, because this is the real square. Is the norm routine doing something to this small data?
[CODE]
/* Fill the FFT words: ten ones followed by zeroes. */
for (int i = 768 * 1024; --i >= 0; ) {
    *addr(gwdata, s, i) = i < 10 ? 1 : 0;
}
gw_fft(gwdata, asm_data);
/* Read the words back after the call. */
for (int i = 768 * 1024; --i >= 0; ) {
    output[i] = *addr(gwdata, s, i);
}
[/CODE]
The only explanation I can find is that gwsquare(p) actually computes 2 * p^2 - 2 * p + 1 instead of p^2. Is that right? |
[QUOTE=megabit8;421033]If I am making a FFT of type 2 - similar to gwsquare, NORMNUM = 1, NRMRTN = yi3eCORE (whatever this is, there's no source for it) for the data:
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, .... 0, 0 [10 ones then 768*1024 - 10 zeroes] I was expecting: 1 2 3 4 5 6 7 8 9 10 9 8 7 6 5 4 3 2 1 0 .... zeroes till the end. Or at least double of this array because this is the real square. Is the norm routine doing something to this small data ? [/QUOTE] Gwnum performs weighted transforms. The initial FFT data values are rarely integers. If you call set_fft_value it will apply the proper weighting factor. Similarly, get_fft_value will return the FFT value after removing the weighting factor. Source for yi3eCORE is in ymult3a.asm. You'll have to wade through a pile of nasty MASM macros to see the generated assembly code. yi3eCORE is the rounding-to-integer and carry propagation code. y=AVX, i=Irrational FFT, e=calc round off error, CORE=optimized for CORE architectures. |
[QUOTE=Madpoo;421030]You might be better off just stopping what you're running now and doing the torture test at 768K since that's known to cause the problem on affected CPUs.
Then if you get the roundoff errors you'll know your CPU is "in the club" and you can try to reproduce using a "real" exponent. My guess is that to replicate what the torture test is doing, you'd want to have all 8 (physical and HT) cores in a single worker. Not 8 separate workers doing 8 separate tests. It could be something specific to the threading code and combining the separate chunks from each large multiplication.[/QUOTE] Amen to part 1. As to part 2, a torture test is more like 8 workers running separate tests. |
[QUOTE=Prime95;421036]Gwnum performs weighted transforms. The initial FFT data values are rarely integers. If you call set_fft_value it will apply the proper weighting factor. Similarly, get_fft_value will return the FFT value after removing the weighting factor.
Source for yi3eCORE is in ymult3a.asm. You'll have to wade through a pile of nasty MASM macros to see the generated assembly code. yi3eCORE is the rounding-to-integer and carry propagation code. y=AVX, i=Irrational FFT, e=calc round off error, CORE=optimized for CORE architectures.[/QUOTE] Thank you for your prompt response. |
[QUOTE=Prime95;421036]Gwnum performs weighted transforms. The initial FFT data values are rarely integers. If you call set_fft_value it will apply the proper weighting factor. Similarly, get_fft_value will return the FFT value after removing the weighting factor.
[/QUOTE] Tried set_fft_value for the input:
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, .... 0
still the magic output appears with get_fft_value:
1 2 4 6 8 10 12 14 16 18 18 16 14 12 10 8 6 4 2 0 ... 0
Tried with 1, 3, 5, 7, 9, 0, 0, ...., 0
The output is: 1 6 28 74 152 248 278 252 162 0 .... 0
Instead of: 1 6 19 44 85 124 139 126 81 0 .... 0 |
[QUOTE=megabit8;421048]Tried set_fft_value for the input:
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, .... 0 still the magic output appears with get_fft_value: 1 2 4 6 8 10 12 14 16 18 18 16 14 12 10 8 6 4 2 0 ... 0 Tried with 1, 3, 5, 7, 9, 0, 0, ...., 0 The output is: 1 6 28 74 152 248 278 252 162 0 .... 0 Instead of: 1 6 19 44 85 124 139 126 81 0 .... 0[/QUOTE] That is OK. Gwnum stuffs a varying number of bits in each FFT word. In your case either the floor or the ceiling of 14942209 / 768K. |
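The doubled coefficients reported above are what a variable-width word layout would produce. Below is a minimal sketch in Python, assuming word i starts at bit ceil(p*i/N), the usual irrational-base convention. That is an assumption here and is not taken from the gwnum source. It uses p = 14942209 and N = 768K as in the reply above. Under that assumption word 0 holds 20 bits and every later word holds 19, so each word after the first sits one bit higher than a uniform 19-bit layout would put it. The product of two such words then lands one bit high in its target word, which doubles it. The helper names are hypothetical, and the sketch uses plain big-integer arithmetic rather than an FFT.

```python
# Model of the word layout only (assumed convention, not gwnum code).
P = 14942209        # exponent from the thread
N = 768 * 1024      # FFT length from the thread


def bit_offset(i):
    """Bit position where word i starts: ceil(P * i / N)."""
    return -(-P * i // N)


def pack(words):
    """Combine variable-width words into one integer."""
    return sum(w << bit_offset(i) for i, w in enumerate(words))


def unpack(value, count):
    """Split an integer back into the first `count` variable-width words."""
    out = []
    for i in range(count):
        width = bit_offset(i + 1) - bit_offset(i)
        out.append((value >> bit_offset(i)) & ((1 << width) - 1))
    return out


def square_words(words, count):
    """Square the packed number and read back `count` words.

    The inputs here are tiny, so the square stays far below 2**P - 1
    and no modular wraparound needs to be modelled.
    """
    return unpack(pack(words) ** 2, count)


print(square_words([1] * 10, 20))
# [1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0]
print(square_words([1, 3, 5, 7, 9], 10))
# [1, 6, 28, 74, 152, 248, 278, 252, 162, 0]
```

Both printed lists match the outputs reported in the thread. The model also agrees with the earlier 2 * p^2 - 2 * p + 1 guess: when the first word is 1, the result is 2(p-1)^2 + 2(p-1) + 1, which expands to the same expression.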
So, tomorrow most of the rest of the world gets back to work.
It might be interesting to see what happens, as we have done additional research (and engaged in heated debate) on this issue while others enjoyed their time off... Sorry for being a prick. It's in my training; and my general nature... Question everything, and take no offence at anything.... |