|
|
#1277 |
|
Nov 2010
Germany
3·199 Posts |
That is the new "GCN3" GPUType setting, coming with mfakto 0.15. In version 0.14, there is no way to let mfakto select your fastest kernel, cl_barrett32_76_gs, for 73- or 74-bit tests. By setting GPUType=VLIW5, you will get cl_barrett32_77_gs, which is already quite a bit faster than cl_barrett15_73_gs (for 73 bits) and a lot faster than cl_barrett15_82_gs (which would normally be used for 74-bit tests).
|
|
|
|
|
|
#1278 |
|
Mar 2014
2×5² Posts |
Bit of a noob question here.
I have just gotten a new computer at work. I was using mfaktc on my previous computer without any difficulty. The new computer has an AMD 7570 video card rather than an NVIDIA one (obviously it's an office machine, not a gaming machine; this is still their standard video card in new machines here), so I am using mfakto for the first time. I was able to start and configure it without any trouble, but GUI responsiveness is extremely bad. I changed NumStreams from 3 to 2 and then 1, and GridSize all the way down to 0 one step at a time, but it made very little difference: mfakto is still running almost as fast as before and the GUI is still very slow (~3 seconds to respond to a click, though the mouse position updates instantly). I experienced some modest lag before when using mfaktc, but it was usually bearable. This is not. Do I have any other configuration options to improve responsiveness, or do I just need to plan on only running mfakto at night? mfakto is using cl_barrett32_77_gs_4 and reporting 52 GHz/day progress on an exponent. |
|
|
|
|
|
#1279 | |
|
"Mike"
Aug 2002
2²·29·71 Posts |
Quote:
|
|
|
|
|
|
|
#1280 |
|
Mar 2014
2×5² Posts |
Reading that thread called my attention to changing GPUSieveSize, which was just what was needed. Thank you!
The throughput only changed by ~2% when I lowered it enough to regain control of the machine, so no need for me to actually use the switching program.
Last fiddled with by Siegmund on 2014-12-15 at 19:44 |
|
|
|
|
|
#1281 | |
|
Nov 2010
Germany
3·199 Posts |
Quote:
First of all, the settings you mentioned (NumStreams and GridSize) are ignored for the GPU sieve. When running the GPU sieve, you should tweak these parameters for better responsiveness: a low but non-zero FlushInterval (3, 2, 1), a lower GPUSieveSize and a lower GPUSieveProcessSize should each help. I'd try tweaking them in this order for the best ratio of responsiveness gain to performance loss.

On the other hand, depending on the CPU power you have available, you could also try switching to the CPU sieve (SieveOnGPU=0). If you have multiple CPU cores, you can run multiple mfakto instances, so that each of them can use a higher SievePrimes value, increasing the overall throughput. As most of the delays come from the GPU sieve kernel, this option may result in good responsiveness, even at higher NumStreams and GridSize values. Plus: the GPU sieve kernel is not very efficient on your GPU.

To run multiple instances, you either create copies in different directories (necessary if you use MISFIT), or only use separate worktodo and ini files (see the -i option).
Last fiddled with by Bdot on 2014-12-15 at 19:47 Reason: Plus: the GPU sieve kernel is not very efficient on your GPU. |
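The tuning order described above can be sketched as a small script that rewrites `Key=Value` lines in an ini file. This is only a sketch under assumptions: the helper `apply_overrides` is hypothetical and not part of mfakto, the sample file content is made up, and the numeric values for GPUSieveSize and GPUSieveProcessSize are placeholders (only the FlushInterval range of 3, 2, 1 comes from the post). Check the comments in your own mfakto.ini for the allowed values.

```python
import re

def apply_overrides(ini_text, overrides):
    """Return ini_text with matching Key=Value lines replaced; missing keys are appended."""
    remaining = dict(overrides)
    out = []
    for line in ini_text.splitlines():
        # Only plain "Key=Value" lines match; comment lines starting with '#' are kept as-is.
        m = re.match(r"\s*([A-Za-z_]\w*)\s*=", line)
        if m and m.group(1) in remaining:
            key = m.group(1)
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    for key, value in remaining.items():
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"

# Hypothetical excerpt; the starting values are placeholders, not mfakto defaults.
sample = """# excerpt of a hypothetical mfakto.ini
SieveOnGPU=1
FlushInterval=8
GPUSieveSize=64
GPUSieveProcessSize=24
"""

# Placeholder targets: lower each setting, in the order suggested in the post.
responsive = apply_overrides(
    sample,
    {"FlushInterval": 2, "GPUSieveSize": 16, "GPUSieveProcessSize": 8},
)
print(responsive)
```

Changing one setting at a time and re-measuring throughput after each step makes it easier to see which change buys the most responsiveness for the least loss.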
|
|
|
|
|
|
#1282 |
|
"Mr. Meeseeks"
Jan 2012
California, USA
23·271 Posts |
Hmm... with the latest 14.12(Omega) drivers I'm getting a few errors on -st2.
Code:
Selftest statistics
  number of tests 335478
  successful tests 335469
  no factor found 9
selftest FAILED!
ERROR: selftest failed, exiting.
Code:
no factor for M67094119 from 2^81 to 2^82 [mfakto 0.15pre5-MGW cl_barrett15_82_gs_2]
no factor for M45448679 from 2^81 to 2^82 [mfakto 0.15pre5-MGW cl_barrett15_82_gs_2]
no factor for M30568231 from 2^81 to 2^82 [mfakto 0.15pre5-MGW cl_barrett15_82_gs_2]
no factor for M71065531 from 2^81 to 2^82 [mfakto 0.15pre5-MGW cl_barrett15_82_gs_2]
no factor for M72067427 from 2^82 to 2^83 [mfakto 0.15pre5-MGW cl_barrett15_83_gs_2]
no factor for M52031087 from 2^82 to 2^83 [mfakto 0.15pre5-MGW cl_barrett15_83_gs_2]
no factor for M49346867 from 2^82 to 2^83 [mfakto 0.15pre5-MGW cl_barrett15_83_gs_2]
no factor for M45588523 from 2^87 to 2^88 [mfakto 0.15pre5-MGW cl_barrett15_88_gs_2]
no factor for M71115521 from 2^87 to 2^88 [mfakto 0.15pre5-MGW cl_barrett15_88_gs_2] |
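For readers unfamiliar with what a "no factor" line means, here is a minimal sketch of the textbook test behind trial factoring: q divides M_p = 2^p - 1 exactly when 2^p mod q equals 1, and for an odd prime p any factor has the form 2kp + 1 and is congruent to 1 or 7 modulo 8. This is not mfakto's kernel code; the function names are hypothetical, and the tiny exponent 11 is chosen only so the result is easy to check by hand (2047 = 23 · 89).

```python
def is_factor_of_mersenne(p, q):
    """True if q divides 2**p - 1, checked by modular exponentiation."""
    return pow(2, p, q) == 1

def candidates(p, k_max):
    """Yield q = 2*k*p + 1 with q % 8 in (1, 7), for k = 1..k_max."""
    for k in range(1, k_max + 1):
        q = 2 * k * p + 1
        if q % 8 in (1, 7):
            yield q

# M11 = 2047; scan the first few candidates of the form 2*k*11 + 1.
found = [q for q in candidates(11, 4) if is_factor_of_mersenne(11, q)]
print(found)  # [23, 89]
```

A selftest in this spirit runs the search over ranges where a factor is already known, so a "no factor" result there points at an arithmetic problem rather than at the exponent.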
|
|
|
|
|
#1283 | |
|
Nov 2010
Germany
1001010101₂ Posts |
Quote:
But I guess I will need to make it work sooner or later. Hopefully they did not reduce the precision. The failing kernels are the ones that are closest to the limits of what "float" gives you. Can you check whether forcing it to GCN2 makes it succeed? |
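As general background on the "limits of float" remark: an IEEE 754 single-precision value carries a 24-bit significand, so integers above 2^24 are no longer all representable. The sketch below only illustrates that format limit; how the kernels actually use float is not shown in this thread, and the helper `to_float32` is a hypothetical name.

```python
import struct

def to_float32(x):
    """Round a Python float (double) to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

exact = 2 ** 24  # 16777216: every integer up to this value is exact in float32

print(to_float32(float(exact)) == exact)          # True
print(to_float32(float(exact + 1)) == exact + 1)  # False: 2**24 + 1 rounds to a neighbour
```

Code that relies on the last bit or two of a single-precision result has no slack left, so even a small change in how a driver rounds such operations could plausibly flip a borderline result.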
|
|
|
|
|
|
#1284 | |
|
"Mr. Meeseeks"
Jan 2012
California, USA
23·271 Posts |
Quote:
|
|
|
|
|
|
|
#1285 |
|
Nov 2010
Germany
3·199 Posts |
Thank you for the test, it kind of confirms a change in AMD's single precision calculations. I will still debug it to see where exactly the error gets too big, but the solution seems to point to double precision for these kernels.
|
|
|
|
|
|
#1286 |
|
Nov 2010
Germany
3×199 Posts |
Did you check the performance of the 14.12 driver with mfakto 0.14?
My HD7950 dropped from 430 GHz with the older driver to 240 GHz after installing 14.12! Still 100% load, similar power consumption. The GCN binary shows slightly higher register usage (from 52 to 59). The new binary code is even a bit smaller and does not have bad memory access patterns. I'm completely puzzled about what's going on.

But most importantly, I cannot reproduce the -st2 failure! 0.14 as well as 0.15pre5 do find all factors! With that, I will make a change that I have wanted to make for a long time (some MODBASECASE check error that only occurs in these 6x15-bit kernels). When done, you will need to test for me ...

But I'll try to get rid of this driver as soon as possible! Maybe a little more checking with AMD's CodeXL, but if that cannot tell why the performance is so poor, I can only recommend staying below 14.12. |
|
|
|
|
|
#1287 |
|
Nov 2010
Germany
3·199 Posts |
Hmm. Weird. Now it is at 424 GHz ... I'll monitor it. Looks like something ran in the background ...
|
|
|
|