Latest test from dbaugh:
[QUOTE=dbaugh] 2097152 FCs copied in 1.35 ms (6201.88 MB/s), proc'd in 3.59 ms (584.32 M/s)[/QUOTE]
Both the transfer rate of 6GB/s and the theoretical peak throughput of 584M/s are way beyond what I've seen so far with mfakto. It shows there is no immediate need for a GCN-specific version, since VectorSize=2 does the trick. His 7970 is overclocked from 925 to 1125MHz, but even at default clock this would be the fastest AMD GPU right now. Thanks for your tests, I hope you can get close to this throughput when running multiple instances ... |
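As a rough cross-check of the quoted numbers, the rates can be recomputed from the FC count and the two timings. This is only a sketch: the 4 bytes per FC and the linear scaling of throughput with core clock are assumptions, not something stated in the post, and the timings in the log line are rounded, so the results only approximately match the logged rates.

```python
# Figures taken from the quoted benchmark line.
FACTOR_CANDIDATES = 2_097_152   # FCs per block
COPY_MS = 1.35                  # copy time in ms (rounded in the log)
PROC_MS = 3.59                  # processing time in ms (rounded in the log)
LOGGED_PROC_M_S = 584.32        # M FCs/s as printed in the log

# ASSUMPTION: 4 bytes are transferred per FC (not stated in the post).
BYTES_PER_FC = 4

# Core clocks mentioned in the post, in MHz.
OVERCLOCK_MHZ = 1125
DEFAULT_MHZ = 925


def rate_per_second(count, millis):
    """Return count per second for an operation that took `millis` ms."""
    return count / (millis / 1000.0)


copy_mb_s = rate_per_second(FACTOR_CANDIDATES * BYTES_PER_FC, COPY_MS) / 1e6
proc_m_s = rate_per_second(FACTOR_CANDIDATES, PROC_MS) / 1e6

# ASSUMPTION: throughput scales linearly with the core clock.
default_clock_m_s = LOGGED_PROC_M_S * DEFAULT_MHZ / OVERCLOCK_MHZ

print(f"copy rate:        {copy_mb_s:.0f} MB/s")
print(f"processing rate:  {proc_m_s:.0f} M/s")
print(f"at default clock: {default_clock_m_s:.0f} M/s (estimate)")
```

Under these assumptions the default-clock estimate lands at roughly 480M/s, which is still well above the other figures mentioned in this thread.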
mfakto switches
Is there a switch or something that will cause mfakto to include in the results file the datetime of when the result was written? This would be very useful for calculating wall time for different efforts and CPU/GPU loads.
|
[QUOTE=dbaugh;304065]Is there a switch or something that will cause mfakto to include in the results file the datetime of when the result was written? This would be very useful for calculating wall time for different efforts and CPU/GPU loads.[/QUOTE]
Try adding this to mfakto.ini:
[code]TimeStampInResults=1[/code]
How many instances do you currently run on your 7970? I assume to get to SievePrimes > 20k you'll need at least 4? |
Thanks for the info. I am at 67% on my CPU running 2x mfakto and "other stuff". With hyperthreading 50% would mean no idle physical cores. Two instances of mfakto on the 7970 averages 91% GPU load. The sieve primes have settled to around 500 on each. Setting the memory clock to 975 instead of 1575 cost me 0.03% in performance and saves me 3% in power usage.
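To put the memory-clock trade-off into one number, here is a small sketch that turns the two percentages from the post into a throughput-per-watt change. The function name is made up for illustration, and it assumes both percentages are relative to the same baseline.

```python
def perf_per_watt_gain(perf_loss_pct, power_saving_pct):
    """Percent change in throughput per watt.

    ASSUMPTION: both percentages are measured against the same baseline run.
    """
    remaining_perf = 1.0 - perf_loss_pct / 100.0
    remaining_power = 1.0 - power_saving_pct / 100.0
    return (remaining_perf / remaining_power - 1.0) * 100.0


# 0.03% less performance for 3% less power, as reported above.
gain = perf_per_watt_gain(0.03, 3.0)
print(f"{gain:.2f}% more throughput per watt")
```

With these inputs the efficiency gain comes out at about 3%, so the lower memory clock is almost pure power saving.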
|
[QUOTE=dbaugh;304114]Thanks for the info. I am at 67% on my CPU running 2x mfakto and "other stuff". With hyperthreading 50% would mean no idle physical cores. Two instances of mfakto on the 7970 averages 91% GPU load. The sieve primes have settled to around 500 on each. Setting the memory clock to 975 instead of 1575 cost me 0.03% in performance and saves me 3% in power usage.[/QUOTE]
500 is probably too low to be maximally efficient. I think once you go below 5k you are probably checking way too many non primes on your gfx card. Try raising that and see how it affects your throughput. |
[QUOTE=dbaugh;304114]Thanks for the info. I am at 67% on my CPU running 2x mfakto and "other stuff". With hyperthreading 50% would mean no idle physical cores. Two instances of mfakto on the 7970 averages 91% GPU load. The sieve primes have settled to around 500 on each. Setting the memory clock to 975 instead of 1575 cost me 0.03% in performance and saves me 3% in power usage.[/QUOTE]
[QUOTE=KyleAskine;304174]500 is probably too low to be maximally efficient. I think once you go below 5k you are probably checking way too many non primes on your gfx card. Try raising that and see how it affects your throughput.[/QUOTE]
Yes, 500 probably makes it rather inefficient. If you're at 67% total CPU load, adding one or two more mfakto instances would probably increase the throughput, even if they shared a physical core. There are a couple of instructions that the hyper-threads of one core can really do in parallel, which can help here quite a bit. On the other hand, they share the L1 cache, which is heavily used when sieving. I'll see whether I can provide a binary that uses a little less cache and thus should work better with hyper-threads. |
Thanks Bdot, for your work on this!
[SIZE=1][COLOR=Gray](If only we had CLLucas.......... Just kidding :P)[/COLOR][/SIZE] |
1 Attachment(s)
(moving this over from the CUDALucas thread)
[QUOTE=kracker;306259]I would like to know how well the 7850 performs, since I have a 7770, (one step down), just curious how much of a performance increase going up to it (not that I'll get it, just curious):smile:[/QUOTE]
I picked this card because it is the biggest AMD card that can live on a single 6-pin power cable, so I don't need to upgrade my PSU. It's factory-OC'ed to 975MHz (from 860), resulting in 257M/s theoretical max throughput. At about the same power draw, my HD5770 delivered ~160M/s. Selftest OK, GPU temp 60C at 99% load. Regarding the comparison to the 7770, I've noted these figures:
[code]
                  // HD7770   HD7850 (975MHz)
BARRETT73_MUL15;  // 165M/s   258M/s
BARRETT79_MUL32;  // 137M/s   212M/s
BARRETT72_MUL24;  // 135M/s   209M/s
_71BIT_MUL24;     // 115M/s   178M/s
BARRETT92_MUL32;  // 106M/s   163M/s
[/code]
That is a bit more than a 50% speedup. At default clock, that would be 35%, matching the GFLOPS ratio of 1.375. And finally, the reason why my old 5770 lately ran as hot as 87C (see attachment). |
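The per-kernel speedups behind the "a bit more than 50%" remark can be recomputed from the figures in the post. This is just a sketch: scaling the HD7850 numbers down to its default clock assumes throughput is linear in core clock, which the post implies but does not measure.

```python
# Throughput in M/s per kernel: (HD7770, HD7850 at 975MHz), from the post.
KERNELS = {
    "BARRETT73_MUL15": (165, 258),
    "BARRETT79_MUL32": (137, 212),
    "BARRETT72_MUL24": (135, 209),
    "_71BIT_MUL24": (115, 178),
    "BARRETT92_MUL32": (106, 163),
}

OC_MHZ = 975        # factory overclock of the HD7850 in the post
DEFAULT_MHZ = 860   # default clock of the HD7850


def speedup(hd7770, hd7850):
    """Ratio of HD7850 throughput to HD7770 throughput."""
    return hd7850 / hd7770


def at_default_clock(ratio):
    """ASSUMPTION: throughput scales linearly with the core clock."""
    return ratio * DEFAULT_MHZ / OC_MHZ


ratios = {name: speedup(*rates) for name, rates in KERNELS.items()}
default_ratios = {name: at_default_clock(r) for name, r in ratios.items()}

for name in KERNELS:
    print(f"{name:16s} {ratios[name]:.2f}x overclocked, "
          f"{default_ratios[name]:.2f}x at default clock (estimate)")
```

All five kernels come out within a narrow band, so the speedup looks largely independent of which kernel is used.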
Nice... better performance than I thought.
My PSU only has 1 6pin, so... :D This is something I NEE*zip* I wish I had.
P.S.: Have you tried out getting the dust out of that fan? I had one of my computers in which my cpu fan was full of dust, it was at 75C on full load, dropped to about (right now it is 57C) |
[QUOTE=kracker;306292]P.S.: Have you tried out getting the dust out of that fan? I had one of my computers in which my cpu fan was full of dust, it was at 75C on full load, dropped to about (right now it is 57C)[/QUOTE]
Consider filtering the air intake fan(s) for the case, too. |
[QUOTE=kladner;306311]Consider filtering the air intake fan(s) for the case, too.[/QUOTE]
Ahh, that's probably what I should do w mine :) P.S.: I have a little problem... I usually find one factor a day at the range and depth I'm on now.... but for the last three days, NO FACTOR!! I don't know what's wrong... :no: ... ... ... ... :cmd::cmd: |