Thanks to all testers who have responded! I'm happy that no new bugs have been discovered so far (apart from a cooling issue with Axelsson's HD 6970, a limit that the CPU-sieve versions never reached :smile:).
kracker reported that even the GPU-sieve version would consume one CPU core. Could you all please have another look on your machines? For me, mfakto sits at 0.2% CPU. I think his Catalyst 1124.2 is the 13.3 beta driver (?), so maybe there is some issue with that, or with the two active GPUs (again, a driver issue).

Apart from that, everything seems to work well, but for many of you performance does not live up to expectations (5-10% slower than the CPU sieve, provided the CPU could saturate the GPU). Let's see. I also got word from AMD that the 13.4 Catalyst version will have a fix for a compiler bug. When I can remove the workaround for that bug, I expect a 5% speedup. The performance on VLIW4/5 is rather bad because only vector size 2 is working at the moment; the performance on GCN suffers from having to use the "second best" kernel. These are tradeoffs for the prototype, and there are still a few things left for me to do ...

Today I tested the Linux64 version, with results identical to Win64.

Edit: I just added VectorSize=4 on my HD 5770 (only for the TF kernel, not (yet) the sieve): 2.5% faster. |
1 Attachment(s)
[QUOTE=Bdot;336577]
kracker reported that even the GPU-sieve version would consume one CPU core. Could you all please have a look again on your machines - for me, mfakto sits at 0.2% CPU. I think his Catalyst 1124.2 is the 13.3 beta driver (?) -maybe some issue with that, or because of the two active GPUs (again, a driver issue).[/QUOTE]
Here is another screenshot. It is a quad core, so 25% means one full core (of course).
And yes, it is 13.3. I might try the stable driver later, when I have time. |
[QUOTE=kracker;336580]Here is another screenshot. It is a quad core so 25% 1 core(of course)
And yes, it is 13.3, I might try the stable driver later, when I have time.[/QUOTE] Hey, April 1st is long over. Try enabling GPU sieving :no: Edit: Now I also see it in your email-report: this was the CPU sieve, not GPU ... |
[QUOTE=Bdot;336584]Hey, April 1st is long over. Try enabling GPU sieving :no:
Edit: Now I also see it in your email-report: this was the CPU sieve, not GPU ...[/QUOTE]
No, it IS GPU sieving...
[code]
# The barrett15_75 kernel is 1-2% faster if we can limit the exponent to
# 2^29 and k<2^60, using this switch (no effect on other kernels). The default
# keeps the original limits of exp<2^32 and k<2^64
#
# Default: SmallExp=0

SmallExp=0

# move the sieving to the GPU. This will free most of the CPU resources
#
#
SieveOnGPU=1
[/code] |
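When it is unclear which sieve is configured, one option is to list only the settings that are not commented out. This is a minimal sketch, assuming mfakto.ini consists of plain `Key=Value` lines with `#` comments, as the excerpt suggests. The helper name `active_settings` and the sample text are made up for illustration and are not part of mfakto.

```python
def active_settings(ini_text):
    """Return the uncommented Key=Value pairs of an ini-style text."""
    settings = {}
    for raw in ini_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that carry no '=' at all
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical sample, modelled on the excerpt quoted in the thread.
SAMPLE = """\
# Default: SmallExp=0
SmallExp=0
# move the sieving to the GPU. This will free most of the CPU resources
SieveOnGPU=1
"""

if __name__ == "__main__":
    for key, value in active_settings(SAMPLE).items():
        print(key, "=", value)
```

A line such as `# SieveOnGPU=1` would not show up in the result, which makes a commented-out switch easy to spot.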
[QUOTE=kracker;336589]No, it IS GPU sieving...
[/QUOTE] Oh, you're right, I'm sorry. For the real tests (not the selftests), it reports "Using GPU kernel ..." and shows the kernel it would use for CPU sieving. Later, it switches to the hardcoded "cl_barrett32_77_gs" kernel. I tricked myself. However, it is really only [B]reporting[/B] the wrong kernel; it is using the correct one. So we're back at "potential driver issue" in your case.

Also, the fact that 12.10 did not work is something I should probably test as well, even if only to document that we now need 13.x. Does anyone have a Catalyst driver version below 13.1 working with the new prototype? mfakto reports the version like this:
device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)
Anyone below 1084.4? |
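For anyone comparing reports, the build number can be pulled out of that line and compared numerically instead of by eye. This is only a sketch: the function name `amd_app_build` is hypothetical, and the regular expression assumes the `AMD-APP (major.minor)` layout shown in the post.

```python
import re


def amd_app_build(version_line):
    """Extract the AMD-APP build as a (major, minor) tuple, or None if absent."""
    match = re.search(r"AMD-APP \((\d+)\.(\d+)\)", version_line)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


# Build number quoted in the post as the one to compare against.
REFERENCE_BUILD = (1084, 4)

if __name__ == "__main__":
    line = "device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)"
    build = amd_app_build(line)
    print("build:", build, "older than reference:", build < REFERENCE_BUILD)
```

Comparing tuples avoids the trap of treating the build as a decimal number.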
[QUOTE=Bdot;336625]Oh, you're right, I'm sorry. For the real tests (not any selftest), it is reporting "Using GPU kernel ..." and shows the kernel it would use for CPU sieving. Later, it switches to the "cl_barrett32_77_gs" kernel(hardcoded).
I tricked myself. However, it is really only [B]reporting[/B] the wrong kernel, it is using the correct one. So we're back at "potential driver issue" in your case. Also, that 12.10 did not work is something I likely should be testing as well. Even if it was just for documenting that we now need 13.x. Has anyone a driver (catalyst) version below 13.1 working with the new prototype? mfakto reports the version like this: device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4) Anyone below 1084.4?[/QUOTE] I removed the 13.3 beta drivers completely and installed the stable 13.1, mfakto wouldn't even load... it froze |
[QUOTE=kracker;336643]I removed the 13.3 beta drivers completely and installed the stable 13.1, mfakto wouldn't even load... it froze[/QUOTE]
At "Select Device -"? Does mfakto 0.12 still load? How about clinfo, does it freeze as well? Most likely, uninstalling 13.3 did not remove everything. The uninstaller often leaves amdocl[64].dll behind in the system32 folder. Try removing Catalyst again, then delete amdocl.dll and amdocl64.dll and reinstall Catalyst. AMD even provides an [URL="http://sites.amd.com/us/game/downloads/Pages/catalyst-uninstall-utility.aspx"]extra utility[/URL] for a clean uninstall in preparation for a downgrade. |
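To see whether the OpenCL runtime files were left behind after an uninstall, a small check like the following may help. It is a sketch under assumptions: the function name is made up, and it only looks in the folder it is given. On 64-bit Windows the 32-bit copy may sit in SysWOW64 rather than System32, so that folder may be worth checking too.

```python
import os
from pathlib import Path

# File names mentioned in the post as typical leftovers.
LEFTOVER_NAMES = ("amdocl.dll", "amdocl64.dll")


def find_leftover_opencl_dlls(directory):
    """Return the paths of leftover OpenCL runtime DLLs found in `directory`."""
    base = Path(directory)
    return [base / name for name in LEFTOVER_NAMES if (base / name).is_file()]


if __name__ == "__main__":
    # Assumption: a standard Windows layout; adjust the path if yours differs.
    system32 = Path(os.environ.get("SystemRoot", r"C:\Windows")) / "System32"
    leftovers = find_leftover_opencl_dlls(system32)
    if leftovers:
        for path in leftovers:
            print("left behind:", path)
    else:
        print("no leftover OpenCL DLLs found in", system32)
```

The script only reports; deleting the files is left to you, after Catalyst has been uninstalled.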
1 Attachment(s)
[QUOTE=Bdot;336669]At "Select Device -"? Does mfakto 0.12 still load? How about clinfo - also freezing? Most likely, the 13.3 deinstallation did not remove everything. Often, they forget to remove amdocl[64].dll from the system32 folder.
Try removing Catalyst again, then remove amdocl.dll and amdocl64.dll and reinstall Catalyst. AMD even provides an [URL="http://sites.amd.com/us/game/downloads/Pages/catalyst-uninstall-utility.aspx"]extra utility[/URL] to do a clean uninstall to prepare a downgrade.[/QUOTE]
What I meant is that with 13.3 mfakto was working, in a way, just with the CPU usage. When I downgraded to 13.1, it froze at "Select Device", but I'll try what you suggested above. :smile:

EDIT: Finally... it works! The CPU usage is gone too... thanks Bdot :smile: |
[QUOTE=Bdot;336625]Also, that 12.10 did not work is something I likely should be testing as well. Even if it was just for documenting that we now need 13.x.
Has anyone a driver (catalyst) version below 13.1 working with the new prototype? mfakto reports the version like this: device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4) Anyone below 1084.4?[/QUOTE]
Yes, I haven't upgraded to the new driver yet ... :blush:
[CODE]
OpenCL device info
  name                      Cayman (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 1.2 AMD-APP (1016.4) (1016.4 (VM))
  maximum threads per block 256
  maximum threads per grid  16777216
  number of multiprocessors 24 (1536 compute elements)
  clock rate                880MHz
[/CODE]
It works... but my machine responds slowly while the new version is running. I tried to see what was happening, but the task monitor slows down as well. :bangheadonwall:

If I get some free time, I'll try to upgrade the drivers and see if the CPU usage goes down.

But so far I like it a lot, even with my issues, and I would run it for production whenever I'm not using my computer, keeping the CPU sievers for when I am using it.

The most I got from my system before was 120 GHz-days/day when running four instances, and now it's doing 160 GHz-days/day straight out of the box... :tu:
[CODE]
running a simple selftest ...
got assignment: exp=66065887 bit_min=73 bit_max=74 (28.96 GHz-days)
Starting trial factoring M66065887 from 2^73 to 2^74 (28.96GHz-days)
Using GPU kernel "cl_barrett32_77"
No checkpoint file "M66065887.ckp" found.
Date   Time  | class  Pct  |  time    ETA   | GHz-d/day Sieve  Wait
Apr 13 01:30 | 4049  87.8% | 16.276  31m44s |   160.12  82485  0.00%
M66065887 has a factor: 17587853595837070511807
found 1 factor for M66065887 from 2^73 to 2^74 (partially tested) [mfaktc mfakto 0.13pre3-Win cl_barrett32_77]
tf(): total time spent: 3h 49m 36.682s (181.60 GHz-days / day)
[/CODE] |
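The GHz-d/day and ETA columns in that log can be roughly reproduced from the time per class. The sketch below assumes that 960 classes are tested per assignment; that figure is not stated anywhere in this thread and is my assumption about the default class setup, so treat it as a parameter. The function names are illustrative, not taken from mfakto.

```python
SECONDS_PER_DAY = 86400
# Assumption (not from the thread): 960 classes are tested per assignment.
CLASSES_TESTED = 960


def ghz_days_per_day(assignment_ghz_days, seconds_per_class, classes=CLASSES_TESTED):
    """Throughput implied by the time spent on a single class."""
    total_seconds = seconds_per_class * classes
    return assignment_ghz_days * SECONDS_PER_DAY / total_seconds


def remaining_seconds(fraction_done, seconds_per_class, classes=CLASSES_TESTED):
    """Rough ETA from the completed fraction and the time per class."""
    return (1.0 - fraction_done) * classes * seconds_per_class


if __name__ == "__main__":
    # Values taken from the log line: 28.96 GHz-days, 16.276 s per class, 87.8% done.
    rate = ghz_days_per_day(28.96, 16.276)
    eta = remaining_seconds(0.878, 16.276)
    print(f"throughput: {rate:.2f} GHz-days/day")
    print(f"ETA: {int(eta // 60)}m{int(eta % 60):02d}s")
```

Under that assumption both results land close to the logged 160.12 GHz-d/day and 31m44s; the small differences are plausibly due to rounding in the displayed values.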
Is that an HD 6970 (based on the 1536 cores)? Shouldn't it do more?
My 7770 does around 120 GHz-days/day, and it's a low-to-mid-range model... |
[QUOTE=kracker;337351]Is that a HD 6970? (based on 1536 cores) Shouldn't it do more?
My 7770 does around ~120 GHz a day and it's a low-med end model...[/QUOTE]
Well, Cayman has always been a challenge for computing. Its nominal power can hardly be used. The kernel that now works with the GPU sieve is about the worst case for it (lots of 32-bit multiplications, which basically make it a 384-core GPU). GCN-based cards also have a problem with these multiplications, but can handle other stuff far more efficiently.

Axelsson, regarding the slow response: try reducing GPUSieveSize and especially GPUSieveProcessSize in mfakto.ini. This should make the machine more responsive. And then, please, check whether it really is CPU usage by mfakto, or whether it is just the GPU at its limit (which will also lead to slow screen responses). |
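The "384-core" figure follows from simple arithmetic if one assumes that a 32-bit multiplication occupies a whole VLIW4 bundle on Cayman, so that only one in four compute elements contributes. That reading is my interpretation of the post, and the function below is only an illustration of it.

```python
def effective_units(compute_elements, vliw_width):
    """Number of VLIW bundles, i.e. units left when one op fills a whole bundle."""
    if compute_elements % vliw_width:
        raise ValueError("compute elements must be a multiple of the VLIW width")
    return compute_elements // vliw_width


if __name__ == "__main__":
    # 1536 compute elements as reported for the Cayman device; VLIW width 4.
    print(effective_units(1536, 4))
```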