|
|
#1 |
|
2·37·43 Posts |
I am trying to come up with FFT sizes for a custom test that would let me fully test just the CPU. As I understand it, the Small FFTs test stresses the CPU a lot, but not fully, because cache sizes differ between CPUs (the numbers there haven't changed for years, so there is probably some room left).
I don't understand the FFT size thing, though. I read that the number is not the amount of memory the test would use. How does it work, then? Could anyone shed a bit of light on this, please?
Explanation: I am facing a weird problem where Small FFTs passes for hours without problems, and so does a somewhat customized Blend, but the In-place one fails anywhere from under five minutes to 30 minutes in. Thus I got the idea of trying to "isolate" the problem by making sure I really only test one thing at a time. |
|
|
|
#2 |
|
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17·251 Posts |
The FFT size indicates how many "words" are used in the FFT. 1 "word" =
Last fiddled with by Mini-Geek on 2013-03-01 at 00:31 |
|
|
|
|
|
#3 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
165468 Posts |
The FFT size does refer to how many "words" are used. A "word" is one double-precision float, or 8 bytes. However, the FFT also requires some sin/cos data as well as some other constants that are well beyond the scope of this conversation. Some memory is also used for the code.
Thus a 4K FFT will use 32KB of data plus some for sin/cos and other data. It would probably fit in 64KB of L2 cache. |
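The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the "words times 8 bytes" rule from this post; the function name `fft_data_bytes` is made up for the example, and the result covers the raw FFT data only, not the sin/cos tables, constants, or code.

```python
BYTES_PER_WORD = 8  # one double-precision float, as stated in the post above


def fft_data_bytes(fft_size_k):
    """Raw FFT data size in bytes for an FFT size given in K (1K = 1024 words).

    Hypothetical helper: excludes sin/cos data, constants, and code.
    """
    return fft_size_k * 1024 * BYTES_PER_WORD


for size_k in (4, 8, 16, 32):
    kb = fft_data_bytes(size_k) // 1024
    print(f"{size_k}K FFT: {kb} KB of data, plus sin/cos and other data")
```

For a 4K FFT this gives the 32KB mentioned above; the real footprint is somewhat larger by an amount this sketch does not estimate.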
|
|
|
|
|
#4 |
|
33×7×47 Posts |
Thanks a lot. Can you estimate the total memory usage of, say, a 4K FFT, then? I guess I can just scale that value as much as I like to check it against a particular CPU.
edit: Is this independent for each core? Meaning, should I choose an FFT size that just fits into the cache of one core and not take the number of cores into consideration? Also, does hyperthreading change this at all? Since it creates two threads per core, shouldn't I use half the FFT size instead so it still fits?
Last fiddled with by Octopuss on 2013-03-01 at 09:29 |
|
|
|
#5 |
|
24·3·7·19 Posts |
No answer? :(
|
|
|
|
#6 |
|
Jun 2003
49116 Posts |
As far as I'm aware, each physical core is fully independent of the others, so it has its own cache. Hyperthreaded cores share resources, including cache, with their respective physical core.
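To make the hyperthreading question concrete, here is a rough Python sketch. It assumes that each hyperthread runs its own FFT and that the two threads split the core's cache evenly, which is a simplification and not something confirmed in this thread. The 256 KB cache size and the name `max_fft_size_k` are hypothetical, chosen only for illustration.

```python
BYTES_PER_WORD = 8  # one double-precision float per FFT word


def max_fft_size_k(cache_kb, threads_sharing_cache=1):
    """Upper bound on the FFT size (in K words) whose raw data fits in cache.

    Assumption: the cache is split evenly between the threads sharing it.
    Sin/cos data and constants are ignored, so the practical limit is lower.
    """
    share_bytes = cache_kb * 1024 // threads_sharing_cache
    return share_bytes // (BYTES_PER_WORD * 1024)


# Hypothetical core with 256 KB of private cache.
print(max_fft_size_k(256))     # one thread per core
print(max_fft_size_k(256, 2))  # two hyperthreads sharing the same cache
```

Under these assumptions the bound halves when two threads share a core, which matches the intuition in the question; since the extra sin/cos data also needs room, a size below the bound is the safer pick.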
|
|
|
|
|
|
#7 |
|
"Bill Staffen"
Jan 2013
Pittsburgh, PA, USA
23×53 Posts |
Also, you may want to turn hyperthreading off for this. It doesn't really help, since any work a hyperthread does takes work away from the thread it shares a core with. It just makes things more complicated. Also, though the level 1 cache is not shared between cores, each pair of cores in an Intel setup does share its level 2 cache with its partner core.
Last fiddled with by Aramis Wyler on 2013-04-05 at 12:21 Reason: Core 2 duo architecture. |
|
|
|
|
|
#8 |
|
22·1,747 Posts |
I know it's not ideal that way, but I heard stability is vastly different with HT enabled and disabled. If that's true, testing with HT off would not help me at all when I normally do use it.
|
|
|
|
#9 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
Not really. The only difference is that HT can sometimes stress a core more than when it's off, but the gap is unlikely to matter. Stress test with HT on if you want to be extra sure.
|
|
|
|
|
|
#10 |
|
Oct 2008
Germany, Hamburg
5·13 Posts |
I'm running an i7-3770K. None of my tests showed any advantage from using more than 4 cores. It worked best for me to use cores 0, 2, 4, 6 and leave the others alone. The HT cores interfere too much with the real cores.
If you start 4 worker threads there isn't any problem, but as soon as you try to attach helper threads to them, you will normally end up using cores 1, 3, 5, 7, because prime95 simply uses the next one. To avoid this, I'm using the setting AffinityScramble2=02467531 in local.txt. With this mask set, prime95 will use cores 0, 2 for the first worker window and 4, 6 for the other, before the other cores. Cores 1, 3, 5, 7 are running mfaktc on my system, and (because of having an old card which needs to sieve on the GPU) it has nearly no impact on the prime95 instances.
Last fiddled with by Phantomas on 2013-04-06 at 10:00 |
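The effect of that mask, as described in this post, can be illustrated with a short Python sketch. It only mirrors the poster's explanation (cores are handed out in the order the digits appear); it is not taken from prime95's documentation, and the function name `worker_cores` is made up for the example.

```python
def worker_cores(scramble, threads_per_worker):
    """Group the digits of the mask into per-worker core lists.

    Assumption (from the post above): cores are assigned to threads in
    the order they appear in the AffinityScramble2 string.
    """
    order = [int(ch) for ch in scramble]
    return [
        order[i:i + threads_per_worker]
        for i in range(0, len(order), threads_per_worker)
    ]


# Two threads per worker: the first two groups are the cores used by prime95.
print(worker_cores("02467531", 2))  # [[0, 2], [4, 6], [7, 5], [3, 1]]
```

With two workers of two threads each, only the first two groups are used, leaving cores 1, 3, 5, 7 free for other programs, as described above.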
|
|
|
|
|
#11 |
|
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
23×3×5×72 Posts |
I think some people are talking at cross purposes here. Hyperthreading off is what most GIMPS people use, as it doesn't provide any real benefit for us (the code is so optimised) and we dislike having twice as many cores to handle, each working at half the speed, etc.
Hyperthreading does add a little more heat, power usage, etc., so I would recommend stress testing with it on if that is how you will be running. There isn't a huge difference on a lot of systems, but some systems that are stable without hyperthreading produce errors when it is turned on. |
|
|
|