mersenneforum.org

mfakto: an OpenCL program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=15646)

Bdot 2013-04-09 22:23

Thanks to all testers who have responded so far! I'm happy that no new bugs have been discovered (apart from a cooling issue on Axelsson's HD 6970 that the CPU-sieve versions never triggered :smile:).

kracker reported that even the GPU-sieve version consumes one CPU core. Could you all please check again on your machines? For me, mfakto sits at 0.2% CPU. I think his Catalyst 1124.2 is the 13.3 beta driver (?), so maybe there is some issue with that, or with the two active GPUs (again, a driver issue).

Apart from that, everything seems to work well, but for many testers performance does not keep up with expectations (5-10% slower than the CPU sieve, provided the CPU could saturate the GPU). Let's see. I also got word from AMD that Catalyst 13.4 will include a fix for a compiler bug. Once I can remove the workaround for that bug, I expect a 5% speedup.

Performance on VLIW4/5 is rather bad because only vector size 2 is working at the moment; performance on GCN suffers from having to use the "second best" kernel. These are tradeoffs for the prototype, and there are still a few things left for me to do ...

Today I tested the Linux64 version with identical results to Win64.

Edit: I just added VectorSize=4 on my HD 5770 (only for the TF kernel, not (yet) the sieve): 2.5% faster.

kracker 2013-04-09 23:12

1 Attachment(s)
[QUOTE=Bdot;336577]
kracker reported that even the GPU-sieve version consumes one CPU core. Could you all please check again on your machines? For me, mfakto sits at 0.2% CPU. I think his Catalyst 1124.2 is the 13.3 beta driver (?), so maybe there is some issue with that, or with the two active GPUs (again, a driver issue).[/QUOTE]

Here is another screenshot. It is a quad core, so 25% means one full core (of course).
And yes, it is 13.3, I might try the stable driver later, when I have time.

Bdot 2013-04-09 23:56

[QUOTE=kracker;336580]Here is another screenshot. It is a quad core, so 25% means one full core (of course).
And yes, it is 13.3, I might try the stable driver later, when I have time.[/QUOTE]
Hey, April 1st is long over. Try enabling GPU sieving :no:
Edit: Now I also see it in your email-report: this was the CPU sieve, not GPU ...

kracker 2013-04-10 00:33

[QUOTE=Bdot;336584]Hey, April 1st is long over. Try enabling GPU sieving :no:
Edit: Now I also see it in your email-report: this was the CPU sieve, not GPU ...[/QUOTE]

No, it IS GPU sieving...

[code]
# The barrett15_75 kernel is 1-2% faster if we can limit the exponent to
# 2^29 and k<2^60, using this switch (no effect on other kernels). The default
# keeps the original limits of exp<2^32 and k<2^64
#
# Default: SmallExp=0

SmallExp=0


# move the sieving to the GPU. This will free most of the CPU resources
#
#

SieveOnGPU=1
[/code]
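
Since the question here is whether GPU sieving is really switched on, one way to double-check the setting is to parse the ini text directly. The following is only a minimal sketch: it assumes mfakto.ini is a sectionless list of `key=value` lines with `#` comments (as in the excerpt above), and the helper name `parse_ini_text` as well as the "missing key means disabled" fallback are assumptions of this sketch, not mfakto behaviour.

```python
def parse_ini_text(text):
    """Parse sectionless key=value lines; lines starting with '#' are comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:  # only keep lines that actually contain '='
            settings[key.strip()] = value.strip()
    return settings


# Sample text copied from the ini excerpt posted above.
SAMPLE = """\
# Default: SmallExp=0

SmallExp=0

# move the sieving to the GPU. This will free most of the CPU resources
SieveOnGPU=1
"""

settings = parse_ini_text(SAMPLE)
# Assumption of this sketch: a missing SieveOnGPU key is treated as "off".
gpu_sieve_enabled = settings.get("SieveOnGPU", "0") == "1"
print(settings)
print("GPU sieving enabled:", gpu_sieve_enabled)
```

To check a real file, the same function could be fed with `open("mfakto.ini").read()`; the path is whatever your installation uses.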

Bdot 2013-04-10 08:57

[QUOTE=kracker;336589]No, it IS GPU sieving...
[/QUOTE]
Oh, you're right, I'm sorry. For the real tests (not any selftest), it is reporting "Using GPU kernel ..." and shows the kernel it would use for CPU sieving. Later, it switches to the "cl_barrett32_77_gs" kernel (hardcoded).

I tricked myself.

However, it is really only [B]reporting[/B] the wrong kernel, it is using the correct one. So we're back at "potential driver issue" in your case.

Also, that 12.10 did not work is something I likely should be testing as well. Even if it was just for documenting that we now need 13.x.

Has anyone a driver (catalyst) version below 13.1 working with the new prototype? mfakto reports the version like this:
device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)
Anyone below 1084.4?
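
For anyone comparing version lines, here is a rough sketch of how the build number could be pulled out of such a line and compared against 1084.4. It assumes the first parenthesised `number.number` pair is the relevant build; the function name `driver_build` and that interpretation are assumptions of this sketch, not something mfakto itself provides.

```python
import re


def driver_build(version_line):
    """Return the first parenthesised 'major.minor' build as a tuple of ints, or None."""
    match = re.search(r"\((\d+)\.(\d+)", version_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


REFERENCE = (1084, 4)  # the build quoted in this post

line = "device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)"
build = driver_build(line)
print(build, "older than reference:", build < REFERENCE)
```

Tuples compare element by element, so a hypothetical build such as (1016, 4) would sort below (1084, 4) without any string handling.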

kracker 2013-04-10 14:34

[QUOTE=Bdot;336625]Oh, you're right, I'm sorry. For the real tests (not any selftest), it is reporting "Using GPU kernel ..." and shows the kernel it would use for CPU sieving. Later, it switches to the "cl_barrett32_77_gs" kernel (hardcoded).

I tricked myself.

However, it is really only [B]reporting[/B] the wrong kernel, it is using the correct one. So we're back at "potential driver issue" in your case.

Also, that 12.10 did not work is something I likely should be testing as well. Even if it was just for documenting that we now need 13.x.

Has anyone a driver (catalyst) version below 13.1 working with the new prototype? mfakto reports the version like this:
device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)
Anyone below 1084.4?[/QUOTE]

I removed the 13.3 beta drivers completely and installed the stable 13.1. mfakto wouldn't even load... it froze.

Bdot 2013-04-10 22:33

[QUOTE=kracker;336643]I removed the 13.3 beta drivers completely and installed the stable 13.1. mfakto wouldn't even load... it froze.[/QUOTE]

At "Select Device -"? Does mfakto 0.12 still load? How about clinfo - also freezing? Most likely, the 13.3 deinstallation did not remove everything. Often, they forget to remove amdocl[64].dll from the system32 folder.

Try removing Catalyst again, then remove amdocl.dll and amdocl64.dll and reinstall Catalyst.

AMD even provides an [URL="http://sites.amd.com/us/game/downloads/Pages/catalyst-uninstall-utility.aspx"]extra utility[/URL] to do a clean uninstall to prepare a downgrade.

kracker 2013-04-11 01:40

1 Attachment(s)
[QUOTE=Bdot;336669]At "Select Device -"? Does mfakto 0.12 still load? How about clinfo - also freezing? Most likely, the 13.3 deinstallation did not remove everything. Often, they forget to remove amdocl[64].dll from the system32 folder.

Try removing Catalyst again, then remove amdocl.dll and amdocl64.dll and reinstall Catalyst.

AMD even provides an [URL="http://sites.amd.com/us/game/downloads/Pages/catalyst-uninstall-utility.aspx"]extra utility[/URL] to do a clean uninstall to prepare a downgrade.[/QUOTE]

What I mean is that mfakto was working in a way, with cpu usage. When I downgraded to 13.1 it froze at "Select Device", but I'll try what you suggested above. :smile:

EDIT: Finally... it works! The "cpu usage" is gone too... thanks Bdot :smile:

Axelsson 2013-04-16 22:44

[QUOTE=Bdot;336625]Also, that 12.10 did not work is something I likely should be testing as well. Even if it was just for documenting that we now need 13.x.

Has anyone a driver (catalyst) version below 13.1 working with the new prototype? mfakto reports the version like this:
device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)
Anyone below 1084.4?[/QUOTE]

Yes, I haven't upgraded to the new driver yet ... :blush:
[CODE]OpenCL device info
name Cayman (Advanced Micro Devices, Inc.)
device (driver) version OpenCL 1.2 AMD-APP (1016.4) (1016.4 (VM))
maximum threads per block 256
maximum threads per grid 16777216
number of multiprocessors 24 (1536 compute elements)
clock rate 880MHz[/CODE]
Works... but I get slow response from my machine while running the new version. I tried to see what happened, but the task monitor also slows down. :bangheadonwall:
If I get some free time I'll try to upgrade the drivers and see if the CPU usage goes down.

But so far I like it a lot even with my issues and I would run it for production whenever I'm not using my computer, keeping the CPU sievers for when I'm using it.

The most I got from my system before was 120 GHz-days/day when running four instances, and now it's doing 160 GHz-days/day straight out of the box... :tu:
[CODE]running a simple selftest ...
got assignment: exp=66065887 bit_min=73 bit_max=74 (28.96 GHz-days)
Starting trial factoring M66065887 from 2^73 to 2^74 (28.96GHz-days)
Using GPU kernel "cl_barrett32_77"
No checkpoint file "M66065887.ckp" found.
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Apr 13 01:30 | 4049 87.8% | 16.276 31m44s | 160.12 82485 0.00%
M66065887 has a factor: 17587853595837070511807

found 1 factor for M66065887 from 2^73 to 2^74 (partially tested) [mfaktc mfakto 0.13pre3-Win cl_barrett32_77]
tf(): total time spent: 3h 49m 36.682s (181.60 GHz-days / day)[/CODE]
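
The reported factor can be double-checked independently of mfakto. The sketch below uses the standard test that f divides M_p = 2^p - 1 exactly when 2^p mod f equals 1, plus the fact that any such factor has the form 2*k*p + 1. The function names are made up for this sketch, and reading the "class" column of the log as k mod 4620 is an assumption on my side (4620 classes, as in the usual mfaktc setup).

```python
def is_mersenne_factor(p, f):
    """True if f divides 2**p - 1, checked via modular exponentiation."""
    return pow(2, p, f) == 1


def k_of(p, f):
    """Return k with f == 2*k*p + 1, or None if f does not have that form."""
    k, remainder = divmod(f - 1, 2 * p)
    return k if remainder == 0 else None


p = 66065887                      # exponent from the selftest log above
f = 17587853595837070511807       # factor reported in the log above

print("divides M%d:" % p, is_mersenne_factor(p, f))
print("inside 2^73..2^74:", 2**73 < f < 2**74)

k = k_of(p, f)
NUM_CLASSES = 4620                # assumption: class = k mod 4620
print("k =", k, "class =", None if k is None else k % NUM_CLASSES)
```

Python's three-argument `pow` keeps the intermediate values below f, so the check runs instantly even for exponents of this size.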

kracker 2013-04-16 22:49

Is that an HD 6970 (based on the 1536 cores)? Shouldn't it do more?
My 7770 does around 120 GHz-days/day, and it's a low-to-mid-range model...

Bdot 2013-04-17 20:58

[QUOTE=kracker;337351]Is that an HD 6970 (based on the 1536 cores)? Shouldn't it do more?
My 7770 does around 120 GHz-days/day, and it's a low-to-mid-range model...[/QUOTE]
Well, Cayman has always been a challenge for computing. The nominal power can hardly be used. The kernel that is now working with the GPU sieve is about the worst case for it (lots of 32-bit multiplications which basically make it a 384-core GPU). GCN-based cards also have a problem with these multiplications, but can handle other stuff way more efficiently.

Axelsson, regarding the slow response: try reducing GPUSieveSize and especially GPUSieveProcessSize in mfakto.ini - this should make it more responsive. And then, please, have a look if it really is CPU usage by mfakto, or if it is just the GPU at its limit (which will also lead to slow screen responses).


All times are UTC.