[QUOTE=Bdot;377901]Oh-oh, things like these happen when in a hurry without checking ... even the smallest fix can introduce new bugs :gah:[/QUOTE]
Well, a build checked out as of today with VS2013 x64, full optimizations, AVX and LTO works fine on R7 260Xs. I've been tempted to enable my motherboard's VirtuMVP just to add some Intel HD 4000 to the mix, but the free Virtu download from Asus isn't 8.1 compatible and I'm not buying the $30 real Virtu software. |
I've put the win-64 version of mfakto-0.15pre1 on the [URL="http://www.mersenneforum.org/mfakto/mfakto-0.15pre1/"]ftp[/URL]. It is [B]NOT YET FULLY TESTED FOR PRODUCTION[/B]!
This version should have all the fixes for IntelHD as suggested by George; however, lacking such a system, I could not test that. It comes with runtime-modifiable settings: press 'm' to see this menu:
[code]
Settings menu

Num  Setting             Current value  (shortcut outside of the menu for de-/increasing this setting)
  1  SievePrimes       = 97990  (-/+)
  2  SieveSize         = 35  (s/S)
  3  SieveProcessSize  = 35  (p/P)
  4  SievePrimesAdjust = 0  (a/A)
  5  FlushInterval     = 0  (f/F)
  6  Verbosity         = 1  (v/V)
  7  PrintMode         = 0  (r/R)
  8  Kernel            = cl_barrett15_73_2  (k/K)

  0  Done (continue factoring)
 -1  Exit mfakto  (q/Q)

Change setting number:
[/code]
Factoring is paused while the menu is shown. While in the menu, select by number. Outside the menu, pressing the keys in parentheses changes the respective value in steps without pausing TF. Keypresses are evaluated only between classes. Any required reinitializations are done automatically. Changing the kernel is not yet implemented.
This feature is intended to let you find the best settings much more easily. Please try to break it (and let me know what you did to break it). This includes messing up the settings while running the selftest - there must be no missed factors no matter what you try. Let me know if you see the need for other parameters to change at runtime (e.g. VectorSize, SieveOnGPU or MoreClasses - but they would require recompilation of the kernels, which I have not implemented yet). I'm not yet convinced of the usability of this feature - let me know if you have ideas on how to improve it.
And of course, this version should pass all self-tests and not be slower than 0.14 (but also not faster - the kernels are unchanged, apart from the INTEL definitions). |
It works for me now. I get around 18 GHz-days/day throughput. Is there any way to increase this, or does this sound about optimal?
|
Most likely, this is about the maximum you can get. You can try a different VectorSize: from George's results I understood that VectorSize=4 is fastest.
If all you care about is mfakto's speed, switch to CPU sieving (SieveOnGPU=0) and select a high SievePrimes (e.g. 200000). This will use a portion of a CPU core to help the HD4600. With GPU sieving, the other option is to adjust the settings while it is running; see my previous post. SievePrimes, SieveSize and SieveProcessSize are the adjustable values that affect performance, maybe also FlushInterval. Play around with them and tell us what the optimal settings are :smile:. |
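For anyone who would rather script the change than edit the file by hand, here is a minimal Python sketch. It assumes mfakto.ini is a plain text file with one Key=Value per line and # for comments, so check that against your own copy first. The function name apply_settings and the sample text are made up for illustration; only the setting names and values (SieveOnGPU=0, SievePrimes=200000, VectorSize=4) come from the post above.

```python
import re

# Matches an active "Key=Value" line; lines starting with '#' do not match.
_KEY_RE = re.compile(r"^\s*([A-Za-z_]\w*)\s*=")


def apply_settings(ini_text, settings):
    """Return ini_text with the given settings updated in place or appended.

    Hypothetical helper: assumes a simple Key=Value layout for mfakto.ini.
    """
    seen = set()
    out = []
    for line in ini_text.splitlines():
        match = _KEY_RE.match(line)
        if match and match.group(1) in settings:
            key = match.group(1)
            out.append(f"{key}={settings[key]}")
            seen.add(key)
        else:
            out.append(line)  # comments, blank lines and other keys stay untouched
    # Settings that had no active line yet are appended at the end.
    for key, value in settings.items():
        if key not in seen:
            out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up sample text, not the real default mfakto.ini.
    sample = "# sample settings\nSieveOnGPU=1\nSievePrimes=97990\n"
    wanted = {"SieveOnGPU": 0, "SievePrimes": 200000, "VectorSize": 4}
    print(apply_settings(sample, wanted), end="")
```

To use it on a real file, read the file into a string, pass it through apply_settings and write the result back, keeping a backup copy in case the layout assumption does not hold. The same approach should work for carrying tuned values from 0.15pre1 over to a 0.14 installation.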
1 Attachment(s)
I have no issue running the selftest now, though I still had to specify -d 11 for it to recognize the GPU. All tests passed. All tests still passed even when changing the various settings from the menu.
If I run with option --perftest, after a little bit of output the program generates a generic Windows error indicating that the program has stopped working. Attached are the selftest and perftest runs. |
Thanks a lot for your testing!
[QUOTE=potonono;378772]I have no issue running the selftest now, though I still had to specify -d 11 for it to recognize the GPU. All tests passed. All tests still passed even when changing the various settings from the menu.
[/QUOTE]
I'm making a few changes now; maybe -d 11 will no longer be needed with the next version.
[QUOTE=potonono;378772] If I run with option --perftest, after a little bit of output the program generates a generic Windows error indicating that the program has stopped working. Attached are the selftest and perftest runs.[/QUOTE]
Oh, right. George already reported that, but I have not acted on it yet ... maybe next version :smile:
If you have already tested some real trial factoring, could you please report what the best values for SievePrimes, SieveSize, SieveProcessSize and maybe VectorSize are? |
Started st2. :smile:
|
Did any of you try the new feature on a real exponent to find more efficient settings than the defaults? If you try, you'll notice that the best SievePrimes, SieveSize and SieveProcessSize may be different for different TF jobs ... and the good thing: any improvements you find by using this version can be applied to version 0.14 by writing them to the mfakto.ini file.
I'm interested to hear about any improvements and what you changed. |
[QUOTE=Bdot;378942]Did any of you try the new feature on a real exponent to find more efficient settings than the defaults? If you try, you'll notice that the best SievePrimes, SieveSize and SieveProcessSize may be different for different TF jobs ... and the good thing: any improvements you find by using this version can be applied to version 0.14 by writing them to the mfakto.ini file.
I'm interested to hear about any improvements and what you changed.[/QUOTE]
I'll do/try that. :smile:
On another note...
[code]
Selftest statistics
  number of tests           287351
  successful tests          287350
  no factor found           1

selftest FAILED!

ERROR: selftest failed, exiting.
[/code]
[code]
######### testcase 2584/32927 (M59000521[82-83]) #########
Starting trial factoring M59000521 from 2^82 to 2^83 (16600.99GHz-days)
Using GPU kernel "cl_barrett15_83_gs_4"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 22 22:31 |  3828  0.1% |  1.094    n.a. |      n.a.    81205    0.00%
no factor for M59000521 from 2^82 to 2^83 [mfakto 0.15pre1-Win cl_barrett15_83_gs_4]
ERROR: selftest failed for M59000521 (cl_barrett15_83_gs)
  no factor found
tf(): total time spent:  1.094s

Starting trial factoring M59000521 from 2^82 to 2^83 (16600.99GHz-days)
Using GPU kernel "cl_barrett15_88_gs_4"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 22 22:31 |  3828  0.1% |  1.221    n.a. |      n.a.    81205    0.00%
M59000521 has a factor: 6190124149267876918004257
found 1 factor for M59000521 from 2^82 to 2^83 [mfakto 0.15pre1-Win cl_barrett15_88_gs_4]
selftest for M59000521 passed (cl_barrett15_88_gs)!
tf(): total time spent:  1.221s

Starting trial factoring M59000521 from 2^82 to 2^83 (16600.99GHz-days)
Using GPU kernel "cl_barrett32_87_gs_4"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 22 22:31 |  3828  0.1% |  0.710    n.a. |      n.a.    81205    0.00%
M59000521 has a factor: 6190124149267876918004257
found 1 factor for M59000521 from 2^82 to 2^83 [mfakto 0.15pre1-Win cl_barrett32_87_gs_4]
selftest for M59000521 passed (cl_barrett32_87_gs)!
tf(): total time spent:  0.711s

Starting trial factoring M59000521 from 2^82 to 2^83 (16600.99GHz-days)
Using GPU kernel "cl_barrett32_88_gs_4"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 22 22:31 |  3828  0.1% |  0.730    n.a. |      n.a.    81205    0.00%
M59000521 has a factor: 6190124149267876918004257
found 1 factor for M59000521 from 2^82 to 2^83 [mfakto 0.15pre1-Win cl_barrett32_88_gs_4]
selftest for M59000521 passed (cl_barrett32_88_gs)!
tf(): total time spent:  0.730s

Starting trial factoring M59000521 from 2^82 to 2^83 (16600.99GHz-days)
Using GPU kernel "cl_barrett32_92_gs_4"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 22 22:31 |  3828  0.1% |  0.831    n.a. |      n.a.    81205    0.00%
M59000521 has a factor: 6190124149267876918004257
found 1 factor for M59000521 from 2^82 to 2^83 [mfakto 0.15pre1-Win cl_barrett32_92_gs_4]
selftest for M59000521 passed (cl_barrett32_92_gs)!
tf(): total time spent:  0.831s
[/code] |
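The factor the other four kernels report can be checked independently of mfakto, which helps tell a kernel that missed a real factor from a bad test case. A minimal Python sketch, using only the standard fact that f divides 2^p - 1 exactly when 2^p mod f equals 1; the function names divides_mersenne and in_bit_range are made up for illustration, and the exponent and factor are copied from the log above.

```python
def divides_mersenne(p, f):
    """True if f (> 1) divides M_p = 2**p - 1, i.e. 2**p is 1 modulo f."""
    # Three-argument pow does modular exponentiation without building 2**p.
    return pow(2, p, f) == 1


def in_bit_range(f, lo, hi):
    """True if 2**lo <= f < 2**hi, the range a 'from 2^lo to 2^hi' job covers."""
    return (1 << lo) <= f < (1 << hi)


if __name__ == "__main__":
    p = 59000521
    f = 6190124149267876918004257  # factor printed in the selftest log
    print("divides M%d:" % p, divides_mersenne(p, f))
    print("within 2^82..2^83:", in_bit_range(f, 82, 83))
```

If both checks come out true, the factor is genuine and lies in the tested range, which would point at cl_barrett15_83_gs itself rather than at the test case.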
I haven't had much chance to test yet, except I can confirm VectorSize=4 is best on mine too.
|
VectorSize=4 increases throughput by about 1 GHz-d/Day on my end as well.
|