mersenneforum.org  

Old 2012-01-23, 06:37   #353
RMAC9.5
 
Jun 2003

2318 Posts
ATI 6970 & Windows Server 2003 64 bit O/S

Hi Bdot,
Thanks for your quick reply. I have some progress to report.
I copied the MSVCR100 and MSVCP100 dlls from C:\ATI\Support\11-12_xp64_dd_ccc_ocl\Bin64 (default install folder) to the C:\Mfakto folder that I created for the mfakto-0.10p1 zip file that I downloaded. Mfakto-64.exe opened a black background message window and told me the following when I ran it:
  • select device -gpu not found, fall back to cpu
    then a simple test started from 1 - 15
    when it completed the black background message window closed and nothing else appeared to happen
Two more pieces of information that might be important:
  • The AMDDriverDownloader.exe that I downloaded from AMD's web site recommended the 11-12_xp64 file above as the best match for my Windows Server 2003 O/S.
    This PC is still using a plain vanilla VGA device driver for my CRT because it was NOT replaced when the CCC driver completed its successful install. My video resolution is fine because I don't game on this PC and I only installed my ATI video card and CCC driver for Folding or (my first love) GIMPS.
Old 2012-01-23, 19:10   #354
Bdot
 
Nov 2010
Germany

3·199 Posts

Quote:
Originally Posted by RMAC9.5 View Post
Hi Bdot,
Thanks for your quick reply. I have some progress to report.
I copied the MSVCR100 and MSVCP100 dlls from C:\ATI\Support\11-12_xp64_dd_ccc_ocl\Bin64 (default install folder) to the C:\Mfakto folder that I created for the mfakto-0.10p1 zip file that I downloaded. Mfakto-64.exe opened a black background message window and told me the following when I ran it:
  • select device -gpu not found, fall back to cpu
    then a simple test started from 1 - 15
    when it completed the black background message window closed and nothing else appeared to happen
This means that copying in the DLLs allowed mfakto to start. "GPU not found" is not what you want, and it will always happen if the AMD device driver is not used for the GPU. There can be other reasons, but as you already found out that the AMD driver is not in use, that needs to be resolved first.

Quote:
Originally Posted by RMAC9.5 View Post
Two more pieces of information that might be important:
  • The AMDDriverDownloader.exe that I downloaded from AMD's web site recommended the 11-12_xp64 file above as the best match for my Windows Server 2003 O/S.
    This PC is still using a plain vanilla VGA device driver for my CRT because it was NOT replaced when the CCC driver completed its successful install. My video resolution is fine because I don't game on this PC and I only installed my ATI video card and CCC driver for Folding or (my first love) GIMPS.
The choice of driver is correct: 2003 is the server OS matching XP. If this driver is not working, mfakto will not be able to use the GPU and will revert to the CPU. (Am I repeating myself?)
I suggest completely removing the whole CCC (via Add/Remove Programs) and reinstalling the MSI package. I don't have 2003 around, but maybe some forum will know if 64-bit 2003 is supposed to work with CCC (or vice versa).
Old 2012-01-23, 19:35   #355
Bdot
 
Nov 2010
Germany

3·199 Posts

While reading the Release Notes of the just-released AMD APP SDK 2.6, I found this:
Quote:
Async copies preview (set environment variable GPU_ASYNC_MEM_COPY=2 to enable).
As this is a runtime setting, I tested it on a Windows and a Linux box, and indeed: when using a single instance of mfakto, the memory transfer to the GPU is hidden. Performance gain: 10-20%. When using more than one mfakto instance, the gain is smaller, but still measurable on my systems. The gain mainly comes from higher GPU utilisation.

However, be cautious: AMD called it a "preview", so I guess it is not well-tested. If you decide to give it a try, please run the full selftest (mfakto -st2) before real trial-factoring. The prerequisite is Catalyst 11.12, which has the same OpenCL runtime as APP SDK 2.6.

(In case anyone is not sure how to set that on Windows: Either set it in Control Panel -> System -> Advanced -> Environment variables as a new System variable, and then restart all cmd-prompts, or in a cmd-prompt, run "set GPU_ASYNC_MEM_COPY=2" before starting mfakto.)
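
For anyone who would rather script the per-process variant than type it in a cmd-prompt, here is a minimal Python sketch. It copies the current environment, adds GPU_ASYNC_MEM_COPY=2, and builds the selftest command line. Only the variable name, its value and the -st2 switch come from this post; the executable name "mfakto" and the helper function names are assumptions made up for illustration.

```python
import os
import subprocess


def async_copy_env(base_env=None):
    """Return a copy of the environment with the async-copy preview enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Value quoted from the APP SDK 2.6 release notes.
    env["GPU_ASYNC_MEM_COPY"] = "2"
    return env


def selftest_command(exe="mfakto"):
    # "mfakto" is a placeholder (assumption); use the path of your own binary.
    # -st2 is the full selftest mentioned above.
    return [exe, "-st2"]


def run_selftest(exe="mfakto"):
    # The variable is only visible to the child process, much like running
    # "set GPU_ASYNC_MEM_COPY=2" in a single cmd-prompt.
    return subprocess.run(selftest_command(exe), env=async_copy_env()).returncode


if __name__ == "__main__":
    # Show what would be used, without actually starting mfakto.
    print(async_copy_env({})["GPU_ASYNC_MEM_COPY"])
    print(" ".join(selftest_command()))
```

Calling run_selftest("mfakto-64.exe") from the mfakto folder would then start the selftest with the variable set for that one process only, leaving the system-wide settings untouched.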

I'd be very interested to hear about your experiences should you give it a try. I think this 10-20% speed-up will also apply to entry-level cards. Is there anyone around who could test that on one of those?

Last fiddled with by Bdot on 2012-01-23 at 20:00
Old 2012-01-23, 20:15   #356
KyleAskine
 
Oct 2011
Maryland

2·5·29 Posts

Quote:
Originally Posted by Bdot View Post
While reading the Release Notes of the just-released AMD APP SDK 2.6 I found this here:
As this is a runtime thing, I tested it on a Windows and a Linux box, and indeed: When using a single instance of mfakto, the memory transfer to the GPU will be hidden. Performance gain: 10-20%. When using more than one mfakto instance, the gain is less, but still measurable on my systems. This is mainly caused by a higher GPU utilisation.

However, be cautious: AMD called it a "preview". I guess it is not well-tested. If you decide to give it a try, please run the full selftest (mfakto -st2) before real trial-factoring. Prerequisite is Catalyst 11.12 which has the same OpenCL runtime as APP SDK 2.6.

(In case anyone is not sure how to set that on Windows: Either set it in Control Panel -> System -> Advanced -> Environment variables as a new System variable, and then restart all cmd-prompts, or in a cmd-prompt, run "set GPU_ASYNC_MEM_COPY=2" before starting mfakto.)

I'd be very interested to hear about your experiences should you give it a try. I think, this 10-20% speed-up will also affect entry-level cards. Is there anyone around who could test that on one of those?
I upgraded to the 12.1 preview recently on my entry-level (6570) card, and this must have been set automatically, because my performance increased around 20% and my utilization went up around 7 percentage points (from around 88% to 95%). I tried setting the variable, and nothing changed, which is why I speculate that it gets set automatically in 12.1.
Old 2012-01-24, 03:00   #357
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

Quote:
Originally Posted by Bdot View Post
While reading the Release Notes of the just-released AMD APP SDK 2.6 I found this here:
As this is a runtime thing, I tested it on a Windows and a Linux box, and indeed: When using a single instance of mfakto, the memory transfer to the GPU will be hidden. Performance gain: 10-20%. When using more than one mfakto instance, the gain is less, but still measurable on my systems. This is mainly caused by a higher GPU utilisation.

However, be cautious: AMD called it a "preview". I guess it is not well-tested. If you decide to give it a try, please run the full selftest (mfakto -st2) before real trial-factoring. Prerequisite is Catalyst 11.12 which has the same OpenCL runtime as APP SDK 2.6.

(In case anyone is not sure how to set that on Windows: Either set it in Control Panel -> System -> Advanced -> Environment variables as a new System variable, and then restart all cmd-prompts, or in a cmd-prompt, run "set GPU_ASYNC_MEM_COPY=2" before starting mfakto.)

I'd be very interested to hear about your experiences should you give it a try. I think, this 10-20% speed-up will also affect entry-level cards. Is there anyone around who could test that on one of those?
Amazing increase in speed! I run two instances (one per GPU) on my test machine. Times dropped from 2.3 sec/iter to 1.5. It's definitely not as good with more than one instance per GPU. SievePrimes dropped to 5000, so I know it increased the GPU efficiency. I just wish it was as efficient with more than one instance.
Old 2012-01-24, 08:39   #358
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by flashjh View Post
Amazing increase in speed! I run two instances (one per GPU) on my test machine. Times dropped from 2.3 sec/iter to 1.5. It's definitely not as good with more than one instance per GPU. The SievePrimes dropped to 5000, so I know it increased the GPU efficiency. I just wish it was as efficient with more than one instance.
Hmm, what do you get with two instances? Due to better sieving it should be below 3 secs/class for each instance ... What is the CPU load with two instances? If two instances need to share a CPU core, then you will not see any additional throughput - maybe even a decrease as the async copy seems to require a little more CPU.
Old 2012-01-24, 08:42   #359
Bdot
 
Nov 2010
Germany

1001010101₂ Posts

Quote:
Originally Posted by KyleAskine View Post
I upgraded to 12.1 preview recently on my entry level (6570) card, and this must have automatically been set, because my performance increased around 20% and my utilization went up around 7% (from around 88% to 95%). I tried typing that, and it didn't change, which is why I speculate that this gets set automatically in 12.1.
Thanks, that is good to know - when that feature leaves "preview" status and is enabled everywhere, more people will benefit from it.
Old 2012-01-24, 13:44   #360
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

Quote:
Originally Posted by Bdot View Post
Hmm, what do you get with two instances? Due to better sieving it should be below 3 secs/class for each instance ... What is the CPU load with two instances? If two instances need to share a CPU core, then you will not see any additional throughput - maybe even a decrease as the async copy seems to require a little more CPU.
Yes it is, sorry for the unnecessarily discouraging report. I was getting about 2.3 sec/class when running two instances per card, which is very good! I stopped the extra instance for now because the other factoring on the computer slowed down too much. I'll start the other instances up again later.
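
To put those timings side by side: the seconds-per-class figure is per instance, so the combined throughput of a card is the number of instances divided by that time. A small Python sketch using the numbers reported in this thread (1.5 s with one instance per card, 2.3 s with two). It assumes that "sec/iter" and "sec/class" mean the same thing and that both figures were measured with the async copy enabled; the function name is made up for illustration.

```python
def classes_per_second(instances, secs_per_class):
    """Combined throughput of several instances that each need secs_per_class."""
    return instances / secs_per_class


one = classes_per_second(1, 1.5)  # single instance per card
two = classes_per_second(2, 2.3)  # two instances per card
gain = two / one - 1.0            # relative gain from running the second instance

print(f"one instance:  {one:.3f} classes/s")
print(f"two instances: {two:.3f} classes/s")
print(f"gain: {gain:.0%}")
```

Under those assumptions the second instance adds roughly 30% throughput on the card, at the cost of the extra CPU load mentioned above.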
Old 2012-01-28, 19:39   #361
Bdot
 
Nov 2010
Germany

1001010101₂ Posts
Catalyst 12.1 released, but wait!

Kyle has tried the freshly released Catalyst 12.1 drivers, and mfakto now aborts (SIGSEGV) on Linux. You may want to wait a little until I have had time to investigate ... probably during the coming week. I may have a chance to test 12.1 on W7 tomorrow.
Old 2012-01-28, 21:06   #362
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

463₁₆ Posts

Quote:
Originally Posted by Bdot View Post
Kyle has tried the freshly released Catalyst 12.1 drivers and now mfakto is aborting (SIGSEGV) in Linux. You may want to wait a little until I had time to investigate ... probably during coming week. I may have a chance to test 12.1 on W7 tomorrow.
I have been using 12.1 on WinXP 32 and Win 7 64 for several days with really good results.
Old 2012-01-29, 03:22   #363
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³·271 Posts

Quote:
Originally Posted by flashjh View Post
I have been using 12.1 on WinXP 32 and Win 7 64 for several days with really good results.
+1 Here also.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.