#199 |
"Curtis"
Feb 2005
Riverside, CA
17×271 Posts |
#200 | ||
Dec 2011
After milion nines:)
1389₁₀ Posts |
Quote:
Many users here will say: put one candidate per core and run it; it is optimal. I will say it is anything but optimal. Also, there was a post here or on PrimeGrid Quote:
#201 | |
Quasi Admin Thing
May 2005
953 Posts |
Quote:
With LLR 3.8.20, thanks to Batalov, LLR became multithreaded, which in reality means that we have ventured into a whole new area of unknowns. What is known is that most computers gain from using more than 1 core per client if the FFT length is large enough. What appears to make the big difference is that most of our clients still suffer from bottlenecks while the CPU waits for RAM to catch up. This bottleneck is severely reduced by running more cores per client.

You have to find out what works best at your test level on your machine by timing LLR. But most likely you are losing performance if you are running one core per client.

I can give you an example from my Sandy Bridge: it tested a base 16 number at around 3999 sec/test at n=2.4M (base 2); now, for the same k at n=2517108, it tests on 2 cores at around 1899 seconds. This gives a difference between the (pre-multithreading) most productive testing scheme and the current testing scheme that looks like this:

3 clients running 1 core each: 3*86400 s = 259200 client-seconds / 3999 sec/test = 64.82 tests/day
2 clients running 2 cores each: 2*86400 s = 172800 client-seconds / 1899 sec/test = 91.00 tests/day

So as you can see, even though I'm currently testing an n-value 5% larger than the one completing on a single core, I'm doing about 40% more work a day (in count of completed tests)... if you count the amount of completed bits, I'll have an even higher productivity gain.

But whether or not you gain from multithreading depends entirely on testing done locally on your own system.

Take care
KEP

Ps. The line in llr.ini you need to add is as follows:
ThreadsPerTest=
#202 |
A Sunny Moo
Aug 2007
USA (GMT-5)
1100001101001₂ Posts |
Got it. Probably worth trying, then, since (per pepi37's 8x rule for determining the working set of a given FFT) a 560K FFT x 8 = 4480 kB working set per core, i.e. 8960 kB for two cores. That is clearly much larger than my 4 MB L3 cache; even 1 core's working set would still be larger, but perhaps less memory bandwidth pressure would still be a good thing.
I'll try this as soon as I get the chance - thanks for all the info!
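As a rough illustration of the 8x rule of thumb mentioned above, the working-set estimate can be written out as follows. This is a sketch under the assumption that the rule is simply "FFT length in K times 8 gives kB per single-threaded client"; the function name `working_set_kb` is hypothetical, and the 560K FFT and 4 MB L3 figures are the ones from the post.

```python
L3_CACHE_KB = 4 * 1024  # 4 MB L3 cache, as quoted in the post


def working_set_kb(fft_length_k: int, clients: int = 1) -> int:
    """Estimated working set in kB via the 8x rule of thumb (assumed: FFT length in K * 8 per client)."""
    return fft_length_k * 8 * clients


for clients in (1, 2):
    ws = working_set_kb(560, clients)
    verdict = "fits in" if ws <= L3_CACHE_KB else "exceeds"
    print(f"{clients} single-threaded client(s): {ws} kB, {verdict} the {L3_CACHE_KB} kB L3 cache")
```

With these numbers even a single client exceeds the cache, which matches the reasoning in the post.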
#203 | |
A Sunny Moo
Aug 2007
USA (GMT-5)
3×2,083 Posts |
Quote:
I say "at least" because normal variation in test times due to background processes, etc. makes it difficult to get an exact figure. I took a conservative estimate by taking the longest -t2 test time I observed, multiplying it by 2, and comparing that with the shortest -t1 test time I had on record from the last few days. So I'm probably getting more than 8% improvement on the whole. It's tough to get a more accurate measurement on this computer because it's also used for "real work" and doesn't run PRPnet continuously (just when I'm not using it). Hopefully I'll be able to get some more accurate numbers from some of my other boxes that crunch (sort of) full time. |
#204 |
A Sunny Moo
Aug 2007
USA (GMT-5)
3·2,083 Posts |
After another day's worth of running with -t2, I have a better sample size of test timings to work with and it looks like I am getting a solid 16% reduction in average test times (averaged over the last 5 tests with -t2 compared with 5 tests on one of two single-threaded clients, normalized by multiplying the -t2 average time by 2).

Since even one client is still too big to fit in my 4 MB L3 cache (the x8 rule says that a 560K FFT = 4480 kB memory working set), it appears that this benefit is purely from reducing pressure on the memory bus. There should be a lot more to be gained for tests small enough to fit entirely within the cache when appropriately multithreaded. And, the benefit would be even greater for newer CPUs which are more prone to outrun their memory buses (my Sandy Bridge is relatively old at this point).

Given this, I can totally see where those 40% and 70% productivity increases KEP cites are coming from! Thanks guys for pointing this out to me - I can see why everyone's talking about it as a big revolution!

(And yes, this exchange should definitely be moved to the "Software/Instructions/Questions" thread....)

Last fiddled with by mdettweiler on 2017-06-05 at 07:34
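For anyone wanting to repeat the comparison described in this post, the normalization can be sketched as below. The timings are hypothetical placeholders chosen for illustration, NOT measurements from this thread, and the function name `normalized_reduction` is made up. The idea is that one multithreaded client occupies the cores that previously ran several single-threaded clients, so its average test time is multiplied by the thread count before comparing.

```python
from statistics import mean


def normalized_reduction(single_times, multi_times, threads):
    """Percent reduction in per-core test time when switching to `threads` threads per client."""
    baseline = mean(single_times)             # average time on one single-threaded client
    normalized = mean(multi_times) * threads  # multithreaded average, scaled by cores used
    return (1 - normalized / baseline) * 100


# Hypothetical timings in seconds (placeholders, not data from the thread).
t1_times = [1000, 1010, 990, 1005, 995]  # 5 tests on a single-threaded client
t2_times = [420, 425, 415, 422, 418]     # 5 tests on a 2-thread client

reduction = normalized_reduction(t1_times, t2_times, threads=2)
print(f"{reduction:.1f}% reduction in normalized average test time")
```

Averaging over several tests, as done in the post, smooths out the run-to-run variation caused by background processes.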
#205 |
Quasi Admin Thing
May 2005
953 Posts |
#206 |
Jul 2016
1 Posts |
Dear Mr. Barnes!
Is it possible to reserve the following interval:
Riesel b=247
k=1 to 469184 (all k; I want to start a new range)
n=1 to 2^12=4096
I have only one PC and I want to check this interval with the program “Mathematica”. Should I send you the results in this forum? I want to transfer the results to an Excel file for easier reading. Or do you prefer another program (not Mathematica)? If yes, how can I start it?
With regards!
#207 | |
Banned
"Luigi"
Aug 2002
Team Italia
2·5·479 Posts |
Quote:
Luigi |
#208 | |
"Mark"
Apr 2003
Between here and the
6164₁₀ Posts |
Quote:
Also, you need to reserve n to 10,000 at a minimum, but preferably to 25,000.
#209 |
May 2007
Kansas; USA
19×541 Posts |
Wikimax,
We cannot accept reservations for bases to n<10000. You will need to read our software thread. As discussed by others, you will need to use the appropriate software to test the bases. Mathematica will be very inefficient for doing these searches.
Gary