It's hard to remember all websites/sources.
|
[QUOTE=Aurum;420878]It's hard to remember all websites/sources.[/QUOTE]
Have you heard about Google? It's a little start-up which might help you find things you think you've read.... |
I even searched my bookmarks + history ... I don't even remember if it was a German or English website. This is by far not the only thing I'm working on.
|
[QUOTE=Aurum;420881]I even searched my bookmarks + history ... I don't even remember if it was a German or English website. This is by far not the only thing I'm working on.[/QUOTE]
OK... Understood. But please understand that making a claim, and then not being able to support it, doesn't go down well around here. |
The article also referred to the Sandy Bridge SATA bug and said that a minor design change takes 8 weeks. That's why the workarounds in the microcode are not fixed in the hardware itself. Maybe someone else knows the source I'm talking about.
|
[QUOTE=Aurum;420878]It's hard to remember all websites/sources.[/QUOTE]
<Never mind, posted without thought again.> Is it any of these? [url]https://www.google.ca/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=%2B%22hardware+errata%22+%2B+%22sata+bug%22[/url] |
[QUOTE=Aurum;420886]The article also referred to the Sandy Bridge SATA bug and said that a minor design change takes 8 weeks. That's why the workarounds in the microcode are not fixed in the hardware itself. Maybe someone else knows the source I'm talking about.[/QUOTE]
Anyone? Anyone at all... Please forgive me for this, but we often have people entering our space who try to distract rather than converge. It is important to have one's "signal to noise ratio" filter set to stun.... |
I found it in my history: [url]http://www.computerbase.de/2015-12/amd-zum-32c3-einblicke-in-die-komplexitaet-eines-x86-prozessordesigns/[/url]
|
[QUOTE=Aurum;420898]I found it in my history: [URL]http://www.computerbase.de/2015-12/amd-zum-32c3-einblicke-in-die-komplexitaet-eines-x86-prozessordesigns/[/URL][/QUOTE]
:tu: Warning! Google Translate link! [URL]https://translate.google.com/translate?sl=de&tl=en&js=y&prev=_t&hl=en&ie=UTF-8&u=http%3A%2F%2Fwww.computerbase.de%2F2015-12%2Famd-zum-32c3-einblicke-in-die-komplexitaet-eines-x86-prozessordesigns%2F&edit-text=[/URL] |
[QUOTE=Aurum;420898]I found it in my history: [url]http://www.computerbase.de/2015-12/amd-zum-32c3-einblicke-in-die-komplexitaet-eines-x86-prozessordesigns/[/url][/QUOTE]
OK... But that article is talking about AMD CPUs. We are talking about Intel CPUs here. Please do try to keep up.... (Just to be clear, this is intentionally confrontational. Deal with it.) |
I managed to make the problem reproduce more quickly
I had an idea to use the Prime95 v27.9 code to perform a 768K FFT per thread. The input of the FFT is always the same fixed pseudo-random data, so the hash of the transformed data should always be the same. And it is, most of the time, on Skylake. The exception: when the project is run from Visual Studio 2015 in Debug mode and I hit the "Pause" button, step through a few lines of code, and then continue, errors appear immediately. Remarkably, on Ivy Bridge the same pause-and-step sequence works fine and no error appears.
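The check described above (fixed input, transform, hash, compare against the first run) can be sketched without gwnum at all. This is only a minimal illustration under stated assumptions: [B]standInTransform[/B], [B]fillFixedData[/B], [B]fastHash[/B], [B]Hash256[/B] and [B]countMismatches[/B] are hypothetical names, and the stand-in transform is a trivial deterministic filter, not the Prime95 FFT. Unsigned 64-bit arithmetic is used so that the wraparound in the generator and the hash is well defined.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the FFT under test. Any deterministic in-place
// transform is enough to demonstrate the harness; this is NOT the gwnum FFT.
void standInTransform(std::vector<double>& data) {
    for (std::size_t i = 1; i < data.size(); ++i)
        data[i] = data[i] * 0.5 + data[i - 1] * 0.25;
}

// Write the same pseudo-random values on every call (filled from the end,
// as in the original loop). Unsigned math makes the overflow well defined.
void fillFixedData(std::vector<double>& data, std::uint64_t seed) {
    std::uint64_t v = seed;
    for (std::size_t i = data.size(); i-- > 0; ) {
        v = v * 0x2345987094395ULL + 1;
        data[i] = static_cast<double>(static_cast<std::int64_t>(v));
    }
}

// 32-byte hash made of four independent 64-bit lanes.
struct Hash256 {
    std::uint64_t w[4];
    bool operator==(const Hash256& other) const {
        return std::memcmp(w, other.w, sizeof(w)) == 0;
    }
};

// Each lane computes lane = lane * (1 + 2^shift) + bits, so the effective
// multipliers are 3, 5, 17 and 257.
Hash256 fastHash(const std::vector<double>& data) {
    Hash256 h = {{10000, 20000, 30000, 40000}};
    const int shifts[4] = {1, 2, 4, 8};
    for (std::size_t i = data.size(); i-- > 0; ) {
        std::uint64_t bits;
        std::memcpy(&bits, &data[i], sizeof(bits)); // raw bit pattern of the double
        std::uint64_t& lane = h.w[i & 3];
        lane += (lane << shifts[i & 3]) + bits;
    }
    return h;
}

// Repeat fill + transform + hash and count runs that differ from the first.
int countMismatches(std::size_t n, int runs) {
    std::vector<double> data(n);
    fillFixedData(data, 7);
    standInTransform(data);
    const Hash256 reference = fastHash(data);
    int mismatches = 0;
    for (int r = 1; r < runs; ++r) {
        fillFixedData(data, 7);
        standInTransform(data);
        if (!(fastHash(data) == reference)) ++mismatches;
    }
    return mismatches;
}
```

On a healthy machine, a call such as countMismatches(768 * 1024, 100) should return 0. Any non-zero count means that identical input produced a different output on some run.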
I have isolated the code below and exported it from a C++ DLL. I need some expert advice from the developers on whether the following code performs an FFT of size 768K, exactly as in the v27.9 torture test:
[CODE]
#define norm_routines 10
#define gw_fft(h,a) (*(h)->GWPROCPTRS[0])(a)

_declspec(dllexport) void* __cdecl AllocPrime95Handle()
{
    gwhandle *gwdata = new gwhandle();
    unsigned long fftlen = 768 * 1024;
    unsigned long p = 14942209; //does not matter, used just to initialize (see the Prime95FFT function).
    gwinit(gwdata);
    gwset_specific_fftlen(gwdata, fftlen);
    gwsetup(gwdata, 1.0, 2, p, -1);
    return gwdata;
}

_declspec(dllexport) void __cdecl Prime95FFT(void *handle, __int64 *fastHashOutput) //fastHashOutput has 32 bytes
{
    gwhandle *gwdata = (gwhandle*)handle;
    int seed = 7; //Use the same calculations.
    gwnum s = gwalloc(gwdata);
    struct gwasm_data *asm_data = (struct gwasm_data *) gwdata->asm_data;
    asm_data->NORMRTN = gwdata->GWPROCPTRS[norm_routines + gwdata->NORMNUM];
    asm_data->DESTARG = s;
    asm_data->DIST_TO_FFTSRCARG = 0;
    asm_data->DIST_TO_MULSRCARG = 0;
    asm_data->ffttype = 2; //type 2 = square.
    double *startAddress = addr(gwdata, s, 0);
    unsigned long dataSize = gwnum_datasize(gwdata);
    int n = dataSize / sizeof(double); //n = 808952
    __int64 v = (__int64)seed;
    for (int i = n; --i >= 0; )
    {
        v = v * 0x2345987094395 + 1;
        startAddress[i] = (double)v; //this is the fixed-random data that is written every time.
    }
    gw_fft(gwdata, asm_data);
    //sha1::calc(startAddress, dataSize, (unsigned char*)fastHashOutput);
    __int64 *startAddressInt64 = (__int64*)startAddress;
    __int64 hashResults[4] = { 10000, 20000, 30000, 40000 };
    int shifts[4] = { 1, 2, 4, 8 }; //for primes: 3, 5, 17 and 257 (each lane is multiplied by 1 + 2^shift).
    for (int i = n; --i >= 0; )
    {
        hashResults[i & 3] += (hashResults[i & 3] << shifts[i & 3]) + startAddressInt64[i];
    }
    memcpy(fastHashOutput, hashResults, sizeof(hashResults));
    gwfree(gwdata, s);
}
[/CODE]
I also need to know whether I am using the functions [B]addr[/B] and [B]gwnum_datasize[/B] correctly, so that this code does not touch memory outside the transform. I have linked gwnum64.lib (the non-debug library) into this project and included all the *.h files from the Prime95 v27.9 "gwnum" folder. Thank you! |