#23
"Robert Gerbicz"
Oct 2005
Hungary
2×7×103 Posts
Quote:
But in practice you could well be right; I haven't tried it. I have the non-K version of that processor and RAM of a different speed.

Last fiddled with by R. Gerbicz on 2017-10-12 at 18:12
#24 |
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2²·227 Posts
My previous work runs have all been with sb=24 on that machine, because way back I tested and it was the best. I should loop the test again with 24 and 26 once the new job finishes.

Actually, I should be able to compare against a long run with t=8 in the same 9e18 - 9.25e18 portion: I ran 4000-4500 with sb=24 and got 44.04e9, and I'm running 7300-7600 now with sb=25. Oh wait, sigh, the earlier one used gap10 and the new one gap11, so there are too many changes to compare properly. I'll just run the same test with different sb values.
#25 |
"Dana Jacobsen"
Feb 2011
Bangkok, TH
908₁₀ Posts
sb=26 with 1,2,3,4,6,8 threads:
sb-26/t1/nohup.out  7.78e9 n/sec.; time=11821 sec.
sb-26/t2/nohup.out 16.31e9 n/sec.; time=5641 sec.
sb-26/t3/nohup.out 26.83e9 n/sec.; time=3430 sec.
sb-26/t4/nohup.out 29.28e9 n/sec.; time=3142 sec.
sb-26/t6/nohup.out 34.80e9 n/sec.; time=2644 sec.
sb-26/t8/nohup.out 33.82e9 n/sec.; time=2721 sec.

sb=25 with 1,2,3,4,6,8 threads:
sb-25/t1/nohup.out 10.95e9 n/sec.; time=8404 sec.
sb-25/t2/nohup.out 21.07e9 n/sec.; time=4367 sec.
sb-25/t3/nohup.out 30.72e9 n/sec.; time=2995 sec.
sb-25/t4/nohup.out 38.68e9 n/sec.; time=2379 sec.
sb-25/t6/nohup.out 37.28e9 n/sec.; time=2468 sec.
sb-25/t8/nohup.out 43.73e9 n/sec.; time=2104 sec.

sb=24 with 1,2,3,4,6,8 threads:
sb-24/t1/nohup.out 11.03e9 n/sec.; time=8345 sec.
sb-24/t2/nohup.out 21.55e9 n/sec.; time=4269 sec.
sb-24/t3/nohup.out 31.54e9 n/sec.; time=2917 sec.
sb-24/t4/nohup.out 40.36e9 n/sec.; time=2280 sec.
sb-24/t6/nohup.out 35.93e9 n/sec.; time=2561 sec.
sb-24/t8/nohup.out 42.58e9 n/sec.; time=2161 sec.

gap11a, sb=24 with 1,2,3,4,6,8 threads (results match gap11 above):
sb-24-11a/t1/nohup.out 11.76e9 n/sec.; time=7825 sec.
sb-24-11a/t2/nohup.out 22.80e9 n/sec.; time=4035 sec.
sb-24-11a/t3/nohup.out 33.48e9 n/sec.; time=2748 sec.
sb-24-11a/t4/nohup.out 42.78e9 n/sec.; time=2151 sec.
sb-24-11a/t6/nohup.out 37.46e9 n/sec.; time=2456 sec.
sb-24-11a/t8/nohup.out 44.64e9 n/sec.; time=2061 sec.

All show some scaling dropoff from 1 to 4 threads, but sb=24 is much better. 8 threads is just a little faster than 4, but not by much. I'm not sure whether the 4-thread results would improve with hyperthreading disabled in the BIOS. I'd also wonder, if the machine were used for other things, whether it would be best to stay at 4 threads so other processing wouldn't interfere much.

gap11a is gap11 with new asm. I am using the full 64-bit mulredc unconditionally, as it doesn't seem worth conditionally choosing between the 63-bit and 64-bit versions given our range. There are small improvements to mulmod and addmod as well.
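A sweep like the one above can be scripted. A minimal sketch, assuming a `gap11` binary whose sieve-bits and thread-count options are spelled `-sb` and `-t` (hypothetical flag names; adjust to the real invocation). It prints one command per (sb, t) combination so you can review them before piping the output to `sh`:

```shell
#!/bin/sh
# Print one benchmark command per (sb, threads) combination, matching
# the sb-$sb/t$t directory layout used above. The "../../gap11" path
# and its -sb/-t flags are hypothetical stand-ins for the real client.
gen_runs() {
  for sb in 24 25 26; do
    for t in 1 2 3 4 6 8; do
      dir="sb-$sb/t$t"
      printf 'mkdir -p %s && (cd %s && nohup ../../gap11 -sb %s -t %s >nohup.out 2>&1 &)\n' \
        "$dir" "$dir" "$sb" "$t"
    done
  done
}
gen_runs    # pipe to sh to actually launch the runs
```

Each run then leaves its n/sec figure in its own nohup.out, which is how the listings above were collected.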
#26 |
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
5×7×139 Posts
Dana,

Have you tried running two instances of the client with 4 threads each, to see whether the overall output is higher than one instance with 8 threads?

Carlos
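One way to try that on Linux is to pin each instance to a disjoint set of logical CPUs so the two don't land on the same cores. A sketch using `taskset` from util-linux; `./gap11` and its `-t` thread flag are hypothetical stand-ins for the real client invocation. The function just prints the two launch commands so they can be reviewed or piped to `sh`:

```shell
#!/bin/sh
# Print launch commands for two 4-thread instances pinned to disjoint
# CPU sets (taskset is from util-linux). "./gap11" and its -t flag are
# hypothetical stand-ins; on a hyperthreaded machine, check which CPU
# numbers share a physical core before choosing the two sets.
pinned_pair() {
  echo 'taskset -c 0-3 nohup ./gap11 -t 4 >inst1.out 2>&1 &'
  echo 'taskset -c 4-7 nohup ./gap11 -t 4 >inst2.out 2>&1 &'
}
pinned_pair
```

Summing the two instances' n/sec then compares directly against the single 8-thread figure.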
#27
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2²×227 Posts
Quote:
Another interesting thing to try would be seeing whether Windows runs are any different from Linux. I was very disappointed with my laptop runs in a VM. It doesn't have Cygwin, so I couldn't easily run the pre-built Windows executables, and my Windows gcc was giving me issues with one of the timing headers gap11.c uses, so I didn't do anything with it.

At the moment it's looking like hyperthreading is getting us very little.

Something similar should be done for the large gap search.
#28 |
"Antonio Key"
Sep 2011
UK
3²×59 Posts
Dana,

Have you tried tuning bs for 8 threads? With hyperthreading on, you effectively have two logical cores (hyperthreads) sharing the same L2 cache, so reducing bs may help.
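One way to sanity-check that intuition: if bs is taken as the log2 of the sieve block size in bytes (an assumption about gap11's units; it could equally be bits or words), then with a 256 KiB per-core L2 (as on the Skylake i7-6700HQ mentioned later in the thread), bs=17 would be the largest setting where two hyperthreads' blocks still fit in one L2 together:

```shell
#!/bin/sh
# Check which block sizes leave room for two hyperthreads in one L2,
# assuming bs is the log2 of the block size in bytes (an assumption
# about gap11's units) and a 256 KiB L2 per physical core.
bs_fit() {
  l2=262144                       # 256 KiB L2 per physical core
  for bs in 13 14 15 16 17 18; do
    block=$((1 << bs))            # block size in bytes under the assumption
    if [ "$block" -le $((l2 / 2)) ]; then fit=yes; else fit=no; fi
    echo "bs=$bs bytes=$block two_blocks_fit_L2=$fit"
  done
}
bs_fit
```

If the unit assumption is right, this predicts a penalty starting at bs=18 under hyperthreading; only a real timing sweep can confirm it.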
#29 |
"Dana Jacobsen"
Feb 2011
Bangkok, TH
1110001100₂ Posts
I did tuning on both sb and bs back in the 4e18 ranges, maybe early 5e18, but not since. Once the next real range of work finishes in ~4 days, I'll try some more benchmarks on bs=...

For those looking at the trend above: I ran sb=23 with 8 threads and it came out slower than sb=24. Not unexpected, but I wanted to verify.
#30 |
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
5×7×139 Posts
Basically we need to buy more computers....
#31 |
Jun 2015
Vallejo, CA/.
7×139 Posts |
#32
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2²·227 Posts
Quote:
bs13/nohup.out: Search used 2014 sec. (Wall clock time), 15937.78 cpu sec.
bs14/nohup.out: Search used 1993 sec. (Wall clock time), 15811.31 cpu sec.
bs15/nohup.out: Search used 1992 sec. (Wall clock time), 15762.47 cpu sec.
bs16/nohup.out: Search used 1982 sec. (Wall clock time), 15724.18 cpu sec.
bs17/nohup.out: Search used 1980 sec. (Wall clock time), 15713.83 cpu sec.
bs18/nohup.out: Search used 2061 sec. (Wall clock time), 16284.61 cpu sec.

All with sb=24 and 8 threads. Looks like bs=17 is the winner. None of them are horribly off, but 4% faster is definitely worth it.
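For reference, the "4%" is bs=17 relative to the worst setting, bs=18; against bs=13 the gain is about 1.7%. A quick awk check from the wall-clock times above:

```shell
#!/bin/sh
# Percent slowdown of each bs setting relative to bs=17 (1980 s),
# computed from the wall-clock times quoted above.
rel_speed() {
  awk 'BEGIN {
    t["bs13"]=2014; t["bs14"]=1993; t["bs15"]=1992
    t["bs16"]=1982; t["bs17"]=1980; t["bs18"]=2061
    for (i = 13; i <= 18; i++) {
      k = "bs" i
      printf "%s: +%.1f%% vs bs17\n", k, (t[k] / 1980 - 1) * 100
    }
  }'
}
rel_speed
```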
#33 |
"Dana Jacobsen"
Feb 2011
Bangkok, TH
2²·227 Posts
I ran sets of 10 residues in the 6300-6400 range with different thread counts on my laptop: g12_haswell Windows executable, sb=24, bs=13, 10GB (the machine has 16GB; for these tests I checked there was plenty of free memory).

10.18 n/s  1 thread
18.17 n/s  2 threads
25.34 n/s  3 threads
30.38 n/s  4 threads
28.60 n/s  5 threads
30.86 n/s  6 threads
33.11 n/s  7 threads
35.03 n/s  8 threads

Lenovo Y700, i7-6700HQ 2.6GHz, dual-channel 2x8GB DDR4-2400.

In other news, with the latest range and g12:

26.57 n/s  AWS c4.2xlarge,  8 threads, about $50/month
51.25 n/s  AWS c4.4xlarge, 16 threads, about $100/month
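From those two AWS figures, throughput per dollar is nearly identical, with the smaller instance slightly ahead: two c4.2xlarge would give ~53.1 n/s for the same $100/month as one c4.4xlarge at 51.25. The arithmetic:

```shell
#!/bin/sh
# n/s per dollar-month for the two AWS instance types quoted above.
cost_eff() {
  awk 'BEGIN {
    printf "c4.2xlarge: %.4f n/s per $/month\n", 26.57 / 50
    printf "c4.4xlarge: %.4f n/s per $/month\n", 51.25 / 100
  }'
}
cost_eff
```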