mersenneforum.org  

Old 2013-04-09, 22:23   #716
Bdot
 
Nov 2010
Germany

1125₈ Posts

Thanks to all testers who have responded so far! I'm happy that no new bugs have been discovered (apart from a cooling issue on Axelsson's HD 6970, under a load that the CPU-sieve versions never reached).

kracker reported that even the GPU-sieve version consumes one CPU core. Could you all please check again on your machines? For me, mfakto sits at 0.2% CPU. I think his Catalyst 1124.2 is the 13.3 beta driver (?), so maybe there is some issue with that, or with the two active GPUs (again, a driver issue).

Apart from that, everything seems to work well, but for many testers performance does not live up to expectations (5-10% slower than the CPU sieve, if the CPU could saturate the GPU). Let's see. I also got word from AMD that the 13.4 Catalyst version will have a fix for a compiler bug. When I can remove the workaround for that bug, I expect a 5% speedup.

The performance on VLIW4/5 is rather bad because only vector size 2 is working at the moment; the performance on GCN suffers from having to use the "second best" kernel. These are tradeoffs for the prototype, and there are still a few things left for me to do ...

Today I tested the Linux64 version with identical results to Win64.

Edit: I just added VectorSize=4 on my HD 5770 (only for the TF kernel, not (yet) the sieve): 2.5% faster.

Last fiddled with by Bdot on 2013-04-09 at 22:40
Old 2013-04-09, 23:12   #717
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³·271 Posts

Quote:
Originally Posted by Bdot
kracker reported that even the GPU-sieve version consumes one CPU core. Could you all please check again on your machines? For me, mfakto sits at 0.2% CPU. I think his Catalyst 1124.2 is the 13.3 beta driver (?), so maybe there is some issue with that, or with the two active GPUs (again, a driver issue).
Here is another screenshot. It is a quad core, so 25% is one core (of course).
And yes, it is 13.3. I might try the stable driver later, when I have time.
Attached Thumbnails: usage.jpg (274.8 KB)
Old 2013-04-09, 23:56   #718
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by kracker
Here is another screenshot. It is a quad core, so 25% is one core (of course).
And yes, it is 13.3. I might try the stable driver later, when I have time.
Hey, April 1st is long over. Try enabling GPU sieving.
Edit: Now I also see it in your email report: this was the CPU sieve, not GPU ...

Last fiddled with by Bdot on 2013-04-09 at 23:59
Old 2013-04-10, 00:33   #719
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

878₁₆ Posts

Quote:
Originally Posted by Bdot
Hey, April 1st is long over. Try enabling GPU sieving.
Edit: Now I also see it in your email report: this was the CPU sieve, not GPU ...
No, it IS GPU sieving...

Code:
# The barrett15_75 kernel is 1-2% faster if we can limit the exponent to
# 2^29 and k<2^60, using this switch (no effect on other kernels). The default
# keeps the original limits of exp<2^32 and k<2^64
#
# Default: SmallExp=0

SmallExp=0


# move the sieving to the GPU. This will free most of the CPU resources
# 
# 

SieveOnGPU=1
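
To double-check which values are actually active in a settings file like the excerpt above, the file can be read programmatically. This is a minimal sketch, assuming the file consists only of `key=value` lines and `#` comments, as in the excerpt; `read_settings` is a hypothetical helper, not part of mfakto.

```python
def read_settings(text):
    """Parse 'key=value' lines, ignoring '#' comments and blank lines."""
    settings = {}
    for raw_line in text.splitlines():
        # Drop everything after a '#', then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Sample text modelled on the excerpt above (assumed format).
sample = """\
# Default: SmallExp=0
SmallExp=0

# move the sieving to the GPU.
SieveOnGPU=1
"""

settings = read_settings(sample)
print("GPU sieving enabled:", settings.get("SieveOnGPU") == "1")
```

A hand-written parser is used here because the excerpt shows no `[section]` headers, which Python's `configparser` would require.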
Old 2013-04-10, 08:57   #720
Bdot
 
Nov 2010
Germany

1001010101₂ Posts

Quote:
Originally Posted by kracker
No, it IS GPU sieving...
Oh, you're right, I'm sorry. For the real tests (not the selftests), it reports "Using GPU kernel ..." and shows the kernel it would use for CPU sieving. Later, it switches to the "cl_barrett32_77_gs" kernel (hardcoded).

I tricked myself.

However, it only reports the wrong kernel; it is using the correct one. So we're back at "potential driver issue" in your case.

Also, the fact that 12.10 did not work is something I should probably test as well, even if only to document that we now need 13.x.

Does anyone have a driver (Catalyst) version below 13.1 working with the new prototype? mfakto reports the version like this:
device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)
Anyone below 1084.4?
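
For anyone comparing driver builds, the build number can be pulled out of that version line and compared numerically. This is a minimal sketch, assuming the line always contains `AMD-APP (major.minor)` as in the example above; `parse_driver_build` and `REFERENCE_BUILD` are hypothetical names used only for illustration.

```python
import re

# Build asked about in the post above.
REFERENCE_BUILD = (1084, 4)


def parse_driver_build(version_line):
    """Return the AMD-APP build as (major, minor), or None if it is absent."""
    match = re.search(r"AMD-APP \((\d+)\.(\d+)\)", version_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = "device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)"
build = parse_driver_build(line)
# Tuples compare element by element, so (1016, 4) < (1084, 4) < (1124, 2).
print(build, build < REFERENCE_BUILD)  # prints: (1084, 4) False
```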
Old 2013-04-10, 14:34   #721
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³·271 Posts

Quote:
Originally Posted by Bdot
Oh, you're right, I'm sorry. For the real tests (not the selftests), it reports "Using GPU kernel ..." and shows the kernel it would use for CPU sieving. Later, it switches to the "cl_barrett32_77_gs" kernel (hardcoded).

I tricked myself.

However, it only reports the wrong kernel; it is using the correct one. So we're back at "potential driver issue" in your case.

Also, the fact that 12.10 did not work is something I should probably test as well, even if only to document that we now need 13.x.

Does anyone have a driver (Catalyst) version below 13.1 working with the new prototype? mfakto reports the version like this:
device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)
Anyone below 1084.4?
I removed the 13.3 beta drivers completely and installed the stable 13.1; mfakto wouldn't even load... it froze.
Old 2013-04-10, 22:33   #722
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by kracker
I removed the 13.3 beta drivers completely and installed the stable 13.1; mfakto wouldn't even load... it froze.
At "Select Device -"? Does mfakto 0.12 still load? How about clinfo, does it freeze as well? Most likely, the 13.3 uninstall did not remove everything. Often, it leaves amdocl[64].dll behind in the system32 folder.

Try removing Catalyst again, then remove amdocl.dll and amdocl64.dll and reinstall Catalyst.

AMD even provides an extra utility to do a clean uninstall to prepare a downgrade.
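
Before reinstalling, it may help to confirm whether the two DLLs named above are still present. This is a minimal sketch that only looks for files and deletes nothing; it assumes the Windows directory is given by the `SystemRoot` environment variable (falling back to `C:\Windows`), and `find_leftovers` is a hypothetical helper.

```python
import os
from pathlib import Path

# File names taken from the post above.
LEFTOVER_NAMES = ("amdocl.dll", "amdocl64.dll")


def find_leftovers(directory, names=LEFTOVER_NAMES):
    """Return the paths of the named files that exist in the given directory."""
    folder = Path(directory)
    return [folder / name for name in names if (folder / name).is_file()]


if __name__ == "__main__":
    # Assumption: the system32 folder lives under %SystemRoot%.
    system_root = os.environ.get("SystemRoot", r"C:\Windows")
    for path in find_leftovers(Path(system_root) / "System32"):
        print("still present:", path)
```

Run it after uninstalling Catalyst; any path it prints is a candidate for manual removal before the reinstall.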
Old 2013-04-11, 01:40   #723
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Quote:
Originally Posted by Bdot
At "Select Device -"? Does mfakto 0.12 still load? How about clinfo, does it freeze as well? Most likely, the 13.3 uninstall did not remove everything. Often, it leaves amdocl[64].dll behind in the system32 folder.

Try removing Catalyst again, then remove amdocl.dll and amdocl64.dll and reinstall Catalyst.

AMD even provides an extra utility to do a clean uninstall to prepare a downgrade.
What I mean is that mfakto was working after a fashion, but with the CPU usage. When I downgraded to 13.1, it froze at "Select Device", but I'll try what you suggested above.

EDIT: Finally... it works! The "cpu usage" is gone too... thanks, Bdot!
Attached Thumbnails: finally.jpg (125.2 KB)

Last fiddled with by kracker on 2013-04-11 at 02:11
Old 2013-04-16, 22:44   #724
Axelsson
 
Jul 2012
Sweden

2·3·7 Posts

Quote:
Originally Posted by Bdot
Also, the fact that 12.10 did not work is something I should probably test as well, even if only to document that we now need 13.x.

Does anyone have a driver (Catalyst) version below 13.1 working with the new prototype? mfakto reports the version like this:
device (driver) version OpenCL 1.2 AMD-APP (1084.4) (1084.4)
Anyone below 1084.4?
Yes, I haven't upgraded to the new driver yet ...
Code:
OpenCL device info
  name                      Cayman (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 1.2 AMD-APP (1016.4) (1016.4 (VM))
  maximum threads per block 256
  maximum threads per grid  16777216
  number of multiprocessors 24 (1536 compute elements)
  clock rate                880MHz
It works... but my machine responds slowly while the new version is running. I tried to see what was happening, but the task monitor slows down as well.
If I get some free time, I'll try to upgrade the drivers and see if the CPU usage goes down.

But so far I like it a lot, even with my issues, and I would run it for production whenever I'm not using my computer, keeping the CPU sievers for when I am using it.

The most I got from my system before was 120 GHz-days/day when running four instances, and now it's doing 160 GHz-days/day straight out of the box...
Code:
running a simple selftest ...
got assignment: exp=66065887 bit_min=73 bit_max=74 (28.96 GHz-days)
Starting trial factoring M66065887 from 2^73 to 2^74 (28.96GHz-days)
Using GPU kernel "cl_barrett32_77"
No checkpoint file "M66065887.ckp" found.
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Apr 13 01:30 | 4049  87.8% | 16.276  31m44s |    160.12    82485    0.00%
M66065887 has a factor: 17587853595837070511807

found 1 factor for M66065887 from 2^73 to 2^74 (partially tested) [mfaktc mfakto 0.13pre3-Win cl_barrett32_77]
tf(): total time spent:  3h 49m 36.682s (181.60 GHz-days / day)
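
The GHz-d/day column in the log above can be roughly reproduced from the assignment credit and the per-class time. This is a minimal sketch, assuming that 960 classes are actually tested per assignment (a figure borrowed from mfaktc-style class filtering; the log does not state it). The second function shows how a reported factor can be checked with modular exponentiation. Both function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def ghz_days_per_day(assignment_ghz_days, seconds_per_class, classes_to_test=960):
    """Estimate throughput from the assignment credit and the time per class."""
    total_seconds = seconds_per_class * classes_to_test
    return assignment_ghz_days * SECONDS_PER_DAY / total_seconds


def divides_mersenne(p, f):
    """Return True if f divides 2**p - 1, i.e. if 2**p is congruent to 1 mod f."""
    return pow(2, p, f) == 1


# Values from the log: 28.96 GHz-days credit, 16.276 s per class.
rate = ghz_days_per_day(28.96, 16.276)
print(f"{rate:.1f} GHz-days/day")  # close to the 160.12 shown in the log

# Small known case: 2**11 - 1 = 2047 = 23 * 89.
print(divides_mersenne(11, 23))

# Check of the factor reported in the log for M66065887.
print(divides_mersenne(66065887, 17587853595837070511807))
```

The small difference from the logged 160.12 is within what rounding of the displayed credit would explain.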
Old 2013-04-16, 22:49   #725
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2168₁₀ Posts

Is that an HD 6970 (based on the 1536 cores)? Shouldn't it do more?
My 7770 does around 120 GHz-days/day, and it's a low- to mid-range model...
Old 2013-04-17, 20:58   #726
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by kracker
Is that an HD 6970 (based on the 1536 cores)? Shouldn't it do more?
My 7770 does around 120 GHz-days/day, and it's a low- to mid-range model...
Well, Cayman has always been a challenge for computing. The nominal power can hardly be used. The kernel that is now working with the GPU sieve is about the worst case for it (lots of 32-bit multiplications which basically make it a 384-core GPU). GCN-based cards also have a problem with these multiplications, but can handle other stuff way more efficiently.
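
The "384-core" figure follows from simple arithmetic on the device info posted earlier. This is a minimal sketch, assuming that a 32-bit multiplication occupies all four lanes of a VLIW4 unit; that reading is an interpretation of the post, not something it states, and `effective_cores` is a hypothetical helper.

```python
def effective_cores(compute_elements, lanes_per_unit=4):
    """Units available when each operation occupies every lane of a unit."""
    return compute_elements // lanes_per_unit


# 1536 compute elements, as reported for Cayman in the device info.
print(effective_cores(1536))  # prints: 384
```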

Axelsson, regarding the slow response: try reducing GPUSieveSize and especially GPUSieveProcessSize in mfakto.ini; this should make it more responsive. And then, please check whether it really is CPU usage by mfakto, or just the GPU at its limit (which will also lead to slow screen responses).

All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.