mersenneforum.org  

Old 2013-02-28, 23:09   #1
Octopuss
 

2·37·43 Posts
FFT size explanation needed

I am trying to come up with numbers for a custom test that would let me fully test just the CPU. The way I understand it, the Small FFTs test stresses the CPU a lot, but not fully, because cache sizes differ (the numbers there haven't changed for years, so there is probably some room left).

I don't understand the FFT size thing though. I read the number is not the memory it would use. How does it work then? Could anyone shed a bit of light into this please?

Explanation: I am facing a weird problem where Small FFTs passes for some hours without problems, and so does a somewhat customized Blend, but the In-place test fails anywhere between under five and 30 minutes. Thus I got the idea of trying to kind of "isolate" the problem by making sure I really only test one thing at a time.
Old 2013-03-01, 00:15   #2
Mini-Geek
Account Deleted
 
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

The FFT size indicates how many "words" are used in the FFT. 1 "word" = 8 bytes. So if you are running a single thread of an in-place FFT of size 1024K, you can expect Prime95 to use 8MB or a little more of RAM for that FFT.

Last fiddled with by Mini-Geek on 2013-03-01 at 00:31
Old 2013-03-01, 00:21   #3
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

165468 Posts

The FFT size does refer to how many "words" are used. A "word" is one double precision float or 8 bytes. However, there is also some sin/cos data required by the FFT as well as some other constants that are way above the scope of this conversation. Also, there is some memory used for the code.

Thus a 4K FFT will use 32KB of data plus some for sin/cos and other data. It would probably fit in 64KB of L2 cache.
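As a rough illustration of the arithmetic above, here is a minimal Python sketch that converts an FFT size in K into the size of the FFT data alone. The function name is made up for this example, and it deliberately ignores the sin/cos tables, other constants, and code, whose sizes are not given in this thread.

```python
BYTES_PER_WORD = 8  # one double-precision float per FFT word, as stated above


def fft_data_bytes(fft_size_k):
    """Bytes of FFT data for a size given in K (1K = 1024 words).

    Hypothetical helper for illustration only: sin/cos tables, other
    constants, and program code are NOT included in this figure.
    """
    return fft_size_k * 1024 * BYTES_PER_WORD


for size_k in (4, 8, 1024):
    kb = fft_data_bytes(size_k) // 1024
    print(f"{size_k}K FFT: {kb} KB of FFT data (plus sin/cos and other data)")
```

For the 4K case this gives the 32KB quoted above, and for 1024K it gives 8192KB, i.e. the 8MB mentioned earlier in the thread.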
Old 2013-03-01, 09:22   #4
Octopuss
 

33×7×47 Posts

Thanks a lot. Can you estimate the total memory usage of, say, a 4K FFT then? I guess I can just scale that value as much as I like to check it against a particular CPU.

edit:
Is this independent for each core? Meaning, should I choose an FFT size that just fits into the cache of one core, and not take the number of cores into consideration?
Also, does hyperthreading change this at all? Since it creates two threads per core, shouldn't I use half the FFT size instead so it still fits?

Last fiddled with by Octopuss on 2013-03-01 at 09:29
Old 2013-04-05, 08:14   #5
Octopuss
 

24·3·7·19 Posts

No answer? :(
Old 2013-04-05, 08:57   #6
Mr. P-1
 
 
Jun 2003

49116 Posts

As far as I'm aware, each physical core is fully independent of the others, so has its own Cache. Hyperthreaded cores share resources including cache with their respective physical core.
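To make the sizing question concrete, here is a hedged Python sketch that picks the largest power-of-two FFT size whose estimated footprint fits in one core's cache, optionally split between hyperthreads. Everything here is an assumption for illustration: the function name, the restriction to power-of-two sizes, and especially `overhead_factor`, a guessed multiplier for sin/cos and other data chosen only so that a 4K FFT lands at the "probably fits in 64KB" figure given earlier. The real overhead is not stated in this thread.

```python
BYTES_PER_WORD = 8  # one double per FFT word, per the explanation earlier in the thread


def largest_fft_k(cache_kb, threads_sharing_cache=1, overhead_factor=2.0):
    """Largest power-of-two FFT size (in K) estimated to fit in a cache.

    cache_kb              -- cache available to one physical core, in KB
    threads_sharing_cache -- 2 if two hyperthreads each run their own FFT
    overhead_factor       -- ASSUMED multiplier for sin/cos and other data
    Returns 0 if not even a 1K FFT fits.
    """
    budget = cache_kb * 1024 / threads_sharing_cache
    best = 0
    candidate = 1
    while candidate * 1024 * BYTES_PER_WORD * overhead_factor <= budget:
        best = candidate
        candidate *= 2
    return best


print(largest_fft_k(64))                           # one thread, 64KB cache
print(largest_fft_k(64, threads_sharing_cache=2))  # two hyperthreads sharing it
```

Under these assumptions, halving the cache available per thread halves the FFT size that fits, which is the intuition behind the hyperthreading question above.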
Old 2013-04-05, 12:18   #7
Aramis Wyler
 
 
"Bill Staffen"
Jan 2013
Pittsburgh, PA, USA

23×53 Posts

Also, you may want to turn hyperthreading off for this. It doesn't really help, since any work a hyperthread does takes work away from the thread it's sharing a core with. It just makes things more complicated. Also, though the level 1 cache is not shared between cores, each pair of cores in an Intel setup does share its level 2 cache with its partner core.

Last fiddled with by Aramis Wyler on 2013-04-05 at 12:21 Reason: Core 2 duo architecture.
Old 2013-04-05, 19:33   #8
Octopuss
 

22·1,747 Posts

I know it's not ideal that way, but I heard stability is vastly different with HT enabled and disabled. If that's true, testing with HT off would not help me at all when I normally do use it.
Old 2013-04-06, 05:33   #9
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts

Quote:
Originally Posted by Octopuss
I know it's not ideal that way, but I heard stability is vastly different with HT enabled and disabled. If that's true, testing with HT off would not help me at all when I normally do use it.
Not really. The only difference is that HT can sometimes stress a core more than when it's off, but that is unlikely to change the outcome. Stress test with HT on if you want to be extra sure.
Old 2013-04-06, 09:56   #10
Phantomas
 
 
Oct 2008
Germany, Hamburg

5·13 Posts

I'm running an i7-3770K. None of my tests showed any advantage from using more than 4 cores. It worked best for me to use cores 0, 2, 4, 6 and leave the others alone. The HT cores interfere too much with the real cores.

If you start 4 worker threads there isn't any problem, but as soon as you try to attach helper threads to them, you will normally end up using cores 1, 3, 5, 7, because Prime95 simply uses the next one.

To avoid this I'm using this setting
AffinityScramble2=02467531
in local.txt

With this mask set, Prime95 will use cores 0 and 2 for the first worker window and 4 and 6 for the other, before the other cores.
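A small Python sketch of how that mask reads, under the assumption (taken only from the description in this post, not from Prime95's own parsing code) that each character is one logical CPU number and that threads are handed out in the order listed. The function name is hypothetical, and only single-digit CPU numbers are handled.

```python
def worker_cpus(mask, threads_per_worker):
    """Split a scramble string into per-worker lists of logical CPU numbers.

    ASSUMPTION: each character of `mask` is one logical CPU index (0-9),
    consumed left to right. This mirrors the description in the post and
    is not a reimplementation of Prime95's actual logic.
    """
    order = [int(ch) for ch in mask]
    return [
        order[i:i + threads_per_worker]
        for i in range(0, len(order), threads_per_worker)
    ]


groups = worker_cpus("02467531", 2)
for number, cpus in enumerate(groups, start=1):
    print(f"worker {number}: logical CPUs {cpus}")
```

With two threads per worker, the first two groups come out as 0,2 and 4,6, matching the behaviour described above; the odd-numbered logical CPUs are only reached afterwards.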

Core 1,3,5,7 are running mfaktc on my system and (because of having an old card which needs to sieve on the GPU) it has nearly no impact on the prime95 instances.

Last fiddled with by Phantomas on 2013-04-06 at 10:00
Old 2013-04-06, 11:57   #11
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Cambridge (GMT/BST)

23×3×5×72 Posts

I think some people are talking at cross purposes here. Most GIMPS people run with hyperthreading off, as it doesn't provide any real benefit for us (the code is so optimised), and we dislike having twice as many cores to handle, each working at half the speed, etc.
Hyperthreading does provide a little more heat, power usage etc so I would recommend stress testing with it if that is what you will be running with. There isn't a huge difference on a lot of systems but some systems that are stable without hyperthreading produce errors when it is turned on.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.