#760
"Svein Johansen"
May 2013
Norway
C9₁₆ Posts
I started two instances of mfakto on my 6970 and doubled my GHz-days output. I tweaked the ini file a little, and now produce around 90 GHz-days/day with two instances on this card. It could be an old version of mfakto; I'll check when I get home.
#761
Nov 2010
Germany
255₁₆ Posts
Quote:
Thanks a lot for this test; it shows that we should not submit "no factor" results with this version yet. I will most likely have to create a special debugging version for you, as I cannot reproduce the error on my HW. It could be specific to Cayman or to the driver version (I will test that part next). It should not have any consequences for normal runs, as the 82-88 bit kernels will never be selected for testing 64-bit factor candidates, and the smaller kernels did find the factor. However, I need to understand what exactly went wrong.
Once again I'm really happy about the hugely extended -st2 selftest of almost 33k factors. Maybe I should keep them in the release versions ... some errors only show up under very special conditions, like the one you discovered here: one of 33k factors shows an error in 3 of 15 kernels ...
And please don't forget to send me the result of "Switch to CPU-sieve (SieveOnGPU=0) and run "mfakto-0.13pre4-pi-win64 -st > st-0.13pre4-pi-win64.log" on an otherwise idle machine", so I can optimize the kernel selection for Cayman. Currently it is entirely based on assumptions.
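For readers wondering what a selftest over known factors actually verifies: a candidate q divides the Mersenne number 2^p - 1 exactly when 2^p mod q equals 1. The Python sketch below shows that check on a tiny example (2^11 - 1 = 23 · 89). It is an illustration only, not mfakto's kernel code, and the function names are made up for this example.

```python
def divides_mersenne(p: int, q: int) -> bool:
    """Return True if q divides 2**p - 1, using modular exponentiation."""
    return pow(2, p, q) == 1


def candidate_has_valid_form(p: int, q: int) -> bool:
    """For an odd prime p, factors of 2**p - 1 satisfy q = 2*k*p + 1 and q % 8 in {1, 7}."""
    return (q - 1) % (2 * p) == 0 and q % 8 in (1, 7)


# Known factors of 2**11 - 1 = 2047 = 23 * 89 (illustrative data, not from the selftest)
known = [(11, 23), (11, 89)]
results = [divides_mersenne(p, q) and candidate_has_valid_form(p, q) for p, q in known]
```

A selftest in this spirit would flag any known pair for which the check comes back false.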
#762
Nov 2010
Germany
1125₈ Posts
Quote:
You should be running mfakto 0.12 if you want to report the results to PrimeNet and get credit for them. 0.12 still uses a lot of CPU power to prepare the GPU calculations. When 0.13 is ready, that part should be improved and you'll get well above 200 GHz (which is short for GHz-days/day) without a lot of CPU load, and with only one mfakto instance.
#763
Dec 2012
2·139 Posts
Is there any point in having a 0.13pre4-var version at this point? When I regularly run mfakto I run mfakto-var because I get more GHzD with a SieveSizeLimit of 130 or 154. I tried mfakto-64k and it wasn't as good. Hope I haven't been doing something stupid all this time.
I have an HD6410D APU which is pretty mediocre. For the moment it looks like GPU sieve won't be worth it for me, but maybe I am jumping the gun. Without GPU sieve it typically takes 30-40% of a single core to put the GPU to max. I will play around some more to find the ideal GPU sieve settings. I've been running st2 for the past 9 hours and it looks like it will take 20-36 hours total. Hard to tell at this point. I hope it's not for nothing.
Last fiddled with by Jayder on 2013-05-10 at 21:55
#764
Nov 2010
Germany
3·199 Posts
Quote:
You can Ctrl-C (once) the st2-selftest, and it will still print a summary. I need to find out about the problem Axelsson reported anyway; until then, 0.13pre4 should not be used to submit any "no factor found" results. If I need to provide a new version, then maybe you could spend a day on the selftest again ... Thanks a lot for your support.
#765
Nov 2010
Germany
597₁₀ Posts
Quote:
Result: my old HD5770 (main dev system) rounds floats a bit differently by default. Cayman (and now my GCN as well) sometimes give a different result in the last digit. I'll run all my tests tomorrow (and also check the results this time), and then provide a new beta version, maybe the last one.
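To illustrate why a difference in the last digit of a float can matter: a quotient estimated from a float reciprocal may come out one too small or one too large, so integer code that builds on it generally needs a correction step. The Python sketch below shows that general technique only; it is not mfakto's kernel code. The function names are hypothetical, and `math.nextafter` (Python 3.9+) is used here merely as a stand-in for hardware that rounds the reciprocal one unit differently.

```python
import math


def estimate_quotient(a: int, d: int, inv: float) -> int:
    """Estimate a // d from a precomputed float reciprocal of d."""
    return int(a * inv)


def corrected_quotient(a: int, d: int, inv: float) -> int:
    """Repair an estimate that may be off because inv was rounded."""
    q = estimate_quotient(a, d, inv)
    while q * d > a:           # estimate came out too large
        q -= 1
    while (q + 1) * d <= a:    # estimate came out too small
        q += 1
    return q


a, d = 10**9 + 7, 49
inv_nearest = 1.0 / d                          # reciprocal as rounded on this machine
inv_lower = math.nextafter(inv_nearest, 0.0)   # one unit in the last place smaller
inv_upper = math.nextafter(inv_nearest, 1.0)   # one unit in the last place larger

quotients = [corrected_quotient(a, d, inv) for inv in (inv_lower, inv_nearest, inv_upper)]
```

With the correction step, all three reciprocals yield the same quotient; without it, the raw estimates are not guaranteed to agree.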
#766
Jul 2012
Sweden
101010₂ Posts
Great! There is nothing worse than a non-reproducible bug.
Just give me another beta and I'll break that one too! I just wish some moderator would add that description to my user: breaker of programs... I like it!
/Göran
#767
"Mr. Meeseeks"
Jan 2012
California, USA
23·271 Posts
Quote:
#768
Nov 2010
Germany
3×199 Posts
Quote:
Anyway, I fixed the rounding issue and posted version 0.13pre5. This one finds all factors of the extended selftest (-st2) on both my card models, tested on Catalyst 13.1 and 13.4.
BTW, the high CPU load issue with 13.4 is Windows-specific. On Linux, mfakto stays at 0.1% CPU, but the screen becomes extremely laggy. I need to see if I can do something about that.
Thanks to Kyle, who was the first to send me performance data for a Cayman, I think I have the proper kernel selection for that platform as well. Cayman is very interesting: for almost all tests, VectorSize=4 leads by a big margin. However, if you plan to test anything with an upper bit level of 60 or below, then VectorSize=2 is about 35% faster.
I still need the performance info for high-end GCN (Tahiti), as I assume that they don't suffer so much from 32-bit integer multiplications. So if anyone has an HD7870 XT or 79xx, I'd appreciate the results of this test:
Let's see if Göran is again the first to break this version.
Last fiddled with by Bdot on 2013-05-15 at 13:45 Reason: typso
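The Cayman observation in this post can be condensed into a small rule of thumb. The Python helper below is purely illustrative: `suggest_vector_size` is a hypothetical name and not part of mfakto, and the 60-bit threshold and the two values come only from the Cayman measurements described above, so other GPUs may behave differently.

```python
def suggest_vector_size(upper_bit_level: int) -> int:
    """Hypothetical helper: VectorSize suggestion for Cayman per the post above.

    Upper bit level of 60 or below -> 2 (reported as about 35% faster),
    anything higher -> 4 (reported to lead by a big margin).
    """
    return 2 if upper_bit_level <= 60 else 4


suggestions = {bits: suggest_vector_size(bits) for bits in (58, 60, 61, 72)}
```

The returned value would then be set as VectorSize in the ini file for runs in that bit range.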
#769
"Mr. Meeseeks"
Jan 2012
California, USA
23·271 Posts
pre5 working good...
Anything else you want me to test?
#770
Nov 2010
Germany
3·199 Posts
Actually, no.
Once enough of you agree that this version is good (i.e. -st2 shows no errors, there are no other serious issues, and it is not slower than the previous version), I'll release it. In other words, it should be ready for production once I see a "good" message from a Cayman (to confirm my fix works there as well).