[QUOTE=KyleAskine;282848]What is the MUL24 kernel?
In .09 I was using the mfakto_cl_71 for my 6950s with the shaders unlocked (so basically 6970s). I was getting around 140 M/s. With mfakto_cl_barrett79 I was getting around 120 M/s, so barrett was around 15-20% slower. With .10 I seem to be getting around 120 M/s with both, so the mfakto_cl_71 seems to have gotten slower for me. The barrett kernel still runs at the same speed. I am installing something ATM, but I can check again when I am done if you would like. Let me know if there are any screenshots or output files I can send you if that would help. Edit: I just confirmed that at the same load, I now run around 20% slower with the new version.[/QUOTE]
The mul24 kernel is the kernel "mfakto_cl_71". What type of CPU do you have? I've reduced the sieve size to fit most CPUs' 32kB L1 cache. If you have a CPU with 64k L1 cache, then the siever might be slower ... I've lost my Phenom machine (again), so I could not test that. As most Intel CPUs have just 32k L1 data cache, I found the optimum sieve size to be ~24kB for those. If you have a 64k-L1-cache machine, I can send you a special version and make a note for the next version to either adjust that automatically or make it configurable. Also, for Bulldozer, I can create a 12kiB-siever version. Can you confirm that you still see the line Using GPU kernel "mfakto_cl_71" if you select that kernel to be run? And can you see a difference in GPU utilization? |
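As a rough illustration of the "adjust automatically or make it configurable" idea, here is a minimal sketch, not mfakto code. The helper name `pick_sieve_bytes`, the `override_bytes` parameter and the fixed 3/4 ratio are all assumptions; the ratio is simply read off the 24kB-for-32k figure above and would likely need tuning per CPU.

```python
# Hypothetical helper, not taken from mfakto: choose a CPU sieve size
# from the L1 data cache size, unless the user configured one explicitly.

KIB = 1024


def pick_sieve_bytes(l1d_bytes, override_bytes=None):
    """Return a sieve size in bytes.

    override_bytes models a user-configured value. Without it, use 3/4 of
    the L1 data cache (assumed ratio: 24 kB worked best on a 32 kB cache).
    """
    if override_bytes is not None:
        return override_bytes
    if not l1d_bytes or l1d_bytes <= 0:
        return 24 * KIB  # cache size unknown: fall back to the 24 kB figure
    return l1d_bytes * 3 // 4


if __name__ == "__main__":
    for l1_kib in (16, 32, 64):
        sieve_kib = pick_sieve_bytes(l1_kib * KIB) // KIB
        print(l1_kib, "KiB L1d ->", sieve_kib, "KiB sieve")
```

With this ratio a 16 KiB cache gives 12 KiB and a 32 KiB cache gives 24 KiB, which matches the sizes mentioned above; whether 3/4 is also the right choice for a 64k cache is untested.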
[QUOTE=therealwebs;282875]I'm running both mfakto win 0.09 and 0.10 on different PCs. I've noticed that mfakto 0.10 (x64) seems to crash fairly regularly. I'm using cat 11.12 with 2x5870s. Since I'm running remote, I haven't been able to monitor the circumstance of the crashes. Event viewer doesn't have anything helpful to add at the moment. I'll update if I can find a set of circumstances that cause the crash.[/QUOTE]
Make sure to not have AMD APP SDK 2.4 on your box. |
[QUOTE=Bdot;282880]Also, for bulldozer, I can create a 12kiB-siever-version.
[/QUOTE]
Well, I've built an mfaktc executable for nucleon's Bulldozer with a smaller sieve. It helps a little bit, but my sieve code really runs badly on Bulldozer. Per clock it manages something like 1/4 to 1/3 of a current Intel CPU. :sad: Oliver |
[QUOTE=Bdot;282880]mul24 kernel is the kernel "mfakto_cl_71".
What type of CPU do you have? I've reduced the sieve size to fit most CPUs' 32kB L1 cache. If you have a CPU with 64k L1 cache, then the siever might be slower ... I've lost my Phenom machine (again), so I could not test that. As most Intel CPUs have just 32k L1 data cache, I found the optimum sieve size to be ~24kB for those. If you have a 64k-L1-cache machine, I can send you a special version and make a note for the next version to either adjust that automatically or make it configurable. Also, for Bulldozer, I can create a 12kiB-siever version. Can you confirm that you still see the line Using GPU kernel "mfakto_cl_71" if you select that kernel to be run? And can you see a difference in GPU utilization?[/QUOTE]
I have an i5-2500k. I don't think it is a siever issue... my utilization is the same (around 90%) with both .09 and .10. I confirmed that it does say that it is using mfakto_cl_71. |
[QUOTE=TheJudger;282897]Well, I've built an mfaktc executable for nucleon's Bulldozer with a smaller sieve. It helps a little bit, but my sieve code really runs badly on Bulldozer. Per clock it manages something like 1/4 to 1/3 of a current Intel CPU. :sad:
Oliver[/QUOTE]
Yes, I've seen that reducing the sieve size any further dramatically reduces speed. By that logic, the Phenoms (64kiB L1) should be best at sieving, if they get a 60kiB siever ... |
[QUOTE=KyleAskine;282898]I have an i5-2500k.
I don't think it is a siever issue... my utilization is the same (around 90%) with both .09 and .10. I confirmed that it does say that it is using mfakto_cl_71.[/QUOTE]
That is really sad, and it seems to depend on your GPUs - mfakto_cl_71 v0.10 on my box is faster than v0.09 ... Can you please PM me your email address? I'd like to send you something to test ... |
yep, don't have APP SDK 2.4 installed AFAIK. i wanted to install 2.6, but the download link was corrupted so i'm using 2.5.
in terms of stability, mfakto hasn't crashed in the last 10 or so hours. this is coinciding with changing my usage pattern from 2 instances+1 instance to running only 1 instance on each card (so 1+1). from a resource standpoint, i'm using 3 cores of my i5 to feed the cards and 1 core to run prime95. if i allow 2 cores of primes to run, i get a major throughput hit in mfakto. thanks for this version! i didn't want to have to do a driver rollback to run this on my main machine :) |
[QUOTE=therealwebs;282946]yep, don't have APP SDK 2.4 installed AFAIK. i wanted to install 2.6, but the download link was corrupted so i'm using 2.5.
in terms of stability, mfakto hasn't crashed in the last 10 or so hours. this is coinciding with changing my usage pattern from 2 instances+1 instance to running only 1 instance on each card (so 1+1). from a resource standpoint, i'm using 3 cores of my i5 to feed the cards and 1 core to run prime95. if i allow 2 cores of primes to run, i get a major throughput hit in mfakto. thanks for this version! i didn't want to have to do a driver rollback to run this on my main machine :)[/QUOTE]
If you could enable userdumper or some other tool to get a crash dump when it aborts next time, that would be really helpful. But of course I hope it does not crash again ;-)
And another note: the aforementioned performance issue seems resolved. kyleaskine and flashjh are helping me test it, so I'll probably release a fix for it tomorrow - together with the Linux binary. |
When using CheckpointDelay=0 and PrintMode=1, the first column (class) of the output is always overwritten by the text 'CP written.', which makes it impossible to see which class is being tested.
|
1 Attachment(s)
[QUOTE=BigBrother;283030]When using CheckpointDelay=0 and PrintMode=1, the first column (class) of the output is always overwritten by the text 'CP written.', which makes it impossible to see which class is being tested.[/QUOTE]
Hehe, that's a use-case that was not intended ... now that mfakto can delay writing the checkpoints, the idea is that they are written only occasionally. I'll think of some better way to tell that a checkpoint was written. Thanks for the report. Here's the fix for the performance issues. It just contains 2 kernel files that need to replace original files from the 0.10 package. |
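To make the delayed-checkpoint idea concrete, here is a minimal sketch, assuming the delay is given in seconds. None of this is mfakto code: `CheckpointThrottle` and `status_line` are hypothetical names, and appending the notice after the class column is only one possible way to keep that column readable.

```python
import time


class CheckpointThrottle:
    """Decide whether a checkpoint is due (hypothetical, not mfakto code)."""

    def __init__(self, delay_seconds, clock=time.monotonic):
        self.delay = delay_seconds
        self.clock = clock  # injectable so the logic can be tested without waiting
        self.last_write = clock()

    def due(self):
        """Return True and restart the timer once `delay` seconds have passed."""
        now = self.clock()
        if now - self.last_write >= self.delay:
            self.last_write = now
            return True
        return False


def status_line(class_id, checkpoint_written):
    # Keep the class column intact and append the notice instead of
    # overwriting the first column.
    note = "  CP written" if checkpoint_written else ""
    return f"{class_id:>5}{note}"
```

With a delay of 0, `due()` returns True after every class, which is the situation from the report: the notice would appear on every line, so it has to live somewhere other than the class column.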
mfakto 0.10 - Linux version
1 Attachment(s)
Here comes the Linux version of mfakto 0.10. It has the performance issues resolved, but is otherwise unchanged (it also has the 32kiB sieve limit).
|