mersenneforum.org  

2015-08-24, 01:02   #1321
airsquirrels
"David"
Jul 2015
Ohio

11×47 Posts

Quote:
Originally Posted by axn
I guess it doesn't hurt to ask. Are you using GPU Sieve or CPU Sieve?
I am using the GPU sieve. It is possible this is an issue with the AMD Catalyst 15.7 beta driver; when I get a moment I will test with a Windows image for comparison.
2015-08-24, 21:25   #1322
Bdot
Nov 2010
Germany

255₁₆ Posts

The only thing I can imagine here is that the delay between the GPU sending the results for a block to the CPU and receiving the order for the next block increases a lot when the PCIe speed drops. As you are running very short assignments, this may have a big impact.

If that is the case, the GPU load should be rather low, scaled down in line with the GHzDays/day.
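The reasoning above can be sketched as a toy model. This is only an illustration under a strong assumption that is not from the thread: the GPU sits completely idle during each host round trip, with no overlap between computing and transfers. The function names and any numbers fed into them are hypothetical.

```python
def gpu_utilization(block_compute_s: float, round_trip_s: float) -> float:
    """Fraction of wall time the GPU spends computing, assuming every block
    is followed by an idle wait for the host round trip (no overlap)."""
    if block_compute_s <= 0:
        raise ValueError("block_compute_s must be positive")
    if round_trip_s < 0:
        raise ValueError("round_trip_s must not be negative")
    return block_compute_s / (block_compute_s + round_trip_s)


def effective_rate(peak_rate: float, block_compute_s: float, round_trip_s: float) -> float:
    """Throughput after scaling a peak rate by the utilization above."""
    return peak_rate * gpu_utilization(block_compute_s, round_trip_s)
```

In this model the same round-trip delay costs more the shorter each block is, which is why short assignments would suffer most, and the reported GPU load would fall together with the throughput.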

Could you please run two separate tests:
  1. Use longer tests, like 72 to 73 bits. This should reduce the effect (note that a different, slower kernel will be used, so you may not reach 1000 GHzDays/day even on the fast PCIe link).
  2. Run multiple instances of mfakto on the same device (e.g. from different directories, each with its own worktodo.txt file). Maybe in total you can reach 1000 GHzDays/day this way even at lower PCIe speeds.
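For the second test, a minimal sketch of how the per-instance directories could be prepared. The directory naming (instance_0, instance_1, ...) and the round-robin split are assumptions made for illustration; the script only writes worktodo.txt files and does not start mfakto, which would still need to be launched from each directory with its own copy of the program files.

```python
from pathlib import Path


def prepare_instances(base_dir, assignments, n_instances):
    """Create one directory per instance and deal the assignment lines out
    round-robin into each directory's worktodo.txt. Returns the directories."""
    if n_instances < 1:
        raise ValueError("n_instances must be at least 1")
    base = Path(base_dir)
    dirs = []
    for i in range(n_instances):
        inst = base / f"instance_{i}"  # hypothetical naming scheme
        inst.mkdir(parents=True, exist_ok=True)
        lines = assignments[i::n_instances]  # every n-th line, starting at i
        (inst / "worktodo.txt").write_text("".join(line + "\n" for line in lines))
        dirs.append(inst)
    return dirs
```

The assignment lines are passed through unchanged, so whatever is already in an existing worktodo.txt can be read, split with `prepare_instances(".", lines, 2)`, and run as two instances.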
BTW, the number of streams only plays a role when using the CPU sieve. When mfakto starts with GPU sieving, it displays the settings that apply in this mode.


I would also suggest playing around with FlushInterval: if the card is that fast, the queue may run empty. Setting higher values (or zero to disable chunking) should help.
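To try several FlushInterval values without editing the file by hand each time, a small helper along these lines could be used. It is a sketch that assumes the ini file consists of plain Key=Value lines with # comments; the helper name is hypothetical and the values shown are only examples, not recommended settings.

```python
def set_ini_value(text: str, key: str, value) -> str:
    """Return `text` with `key=value` set: the first active `key=` line is
    replaced, or a new line is appended if the key is not present."""
    out, done = [], False
    for line in text.splitlines():
        name = line.split("=", 1)[0].strip()
        is_comment = line.lstrip().startswith("#")
        if not done and "=" in line and not is_comment and name == key:
            out.append(f"{key}={value}")
            done = True
        else:
            out.append(line)  # keep comments and other settings untouched
    if not done:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"
```

For example, `set_ini_value(ini_text, "FlushInterval", 0)` would produce the variant with chunking disabled, which can then be written to the ini file of a test instance.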
2015-09-06, 01:08   #1323
UBR47K
Aug 2015

2²×17 Posts

Is the Fury X supported by the latest mfakto (0.15pre5)? Or on Linux with 0.14?
2015-09-06, 10:29   #1324
Bdot
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by UBR47K
Is the Fury X supported by the latest mfakto (0.15pre5)? Or on Linux with 0.14?
0.15 does not support anything yet - it's not ready to do real tasks.
0.14 can run on Fury X, though it will say that it does not know the chip. It will select GCN optimization which fits well (as far as I can tell - I have not yet had a chance to play around with that card).
2015-09-06, 12:35   #1325
UBR47K
Aug 2015

104₈ Posts

Quote:
Originally Posted by Bdot
0.15 does not support anything yet - it's not ready to do real tasks.
0.14 can run on Fury X, though it will say that it does not know the chip. It will select GCN optimization which fits well (as far as I can tell - I have not yet had a chance to play around with that card).
I can offer to run tests on it if needed.
2015-09-07, 21:37   #1326
VictordeHolland
"Victor de Hollander"
Aug 2011
the Netherlands

2³×3×7² Posts

After more than 1.5 years on AMD Catalyst 13.12, I finally updated my drivers to 15.7.1.
Win7 64bit HD7950 -st:
Code:
Selftest statistics
  number of tests           3092
  successful tests          3092

selftest PASSED!
Always a relief to see it working after an update.
2015-09-10, 20:16   #1327
Bdot
Nov 2010
Germany

1001010101₂ Posts

Quote:
Originally Posted by UBR47K
I can offer to run tests on it if needed.
That would indeed be very helpful. If you have Windows, please run the perftestmfakto.cmd from http://mersenneforum.org/mfakto/mfakto-0.15pre5/
If you have Linux, I'd need to prepare the binary for it first ...
2015-09-10, 20:20   #1328
Bdot
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by VictordeHolland
Always a relief to see it working after an update.
Yes, we've seen surprises.
I run a Windows box on 12.8 (the last version where I can use assembly language for the GPU), another on some 13.x, and a Linux box currently on the latest and greatest ...
2015-09-11, 01:44   #1329
UBR47K
Aug 2015

22×17 Posts
Default

Quote:
Originally Posted by Bdot
That would indeed be very helpful. If you have Windows, please run the perftestmfakto.cmd from http://mersenneforum.org/mfakto/mfakto-0.15pre5/
If you have Linux, I'd need to prepare the binary for it first ...
I am using Linux here.
2015-09-13, 23:59   #1330
airsquirrels
"David"
Jul 2015
Ohio

1000000101₂ Posts

I've been using a specific stable commit build of 0.15 on Fury X cards on Linux with great success. I ran it side by side with 0.14 and the 0.15pre5 build, and both pass the normal and extended self-tests and come pretty close to the expected found-factor percentages. 0.15 is definitely a bit faster though :)

I'm still having trouble on systems with fewer PCIe lanes, but I have been too busy with work to investigate further.
2015-09-30, 19:29   #1331
Bdot
Nov 2010
Germany

3×199 Posts

My adventure into GPU assembly programming is over before it really started. My 7950 died (the VRMs did, to be specific). As a replacement I ordered an R9 380, also as an incentive to do some model-refresh in mfakto. The new card, however, is not recognized by the ancient driver, so I moved to 15.7.1 as well - no bad surprises so far.

However, I cannot see the int32 improvements that some owners of an R9 285 (which should be the same "Tonga"-chip) have reported. For me, the usual GCN selection works well.

To be sure about that, I created a version that performance-tests each kernel for each TF job to find the fastest one for the exponent and bit level. I think I will keep this test as an option, and persist and re-use the results. That should allow mfakto to adapt to any upcoming development of APUs, GPUs, and whatever else, across vendors.
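The idea of testing each kernel once and re-using the result can be sketched roughly as follows. This is not mfakto's code: the function names, the JSON cache file, and the choice to key results on the exponent's bit length plus the bit level are all assumptions made for illustration. The `measure` callable is supplied by the caller and is expected to return the run time of a short test for the named kernel.

```python
import json
from pathlib import Path


def job_key(exponent: int, bit_max: int) -> str:
    """Cache key for a TF job: exponent size class plus target bit level
    (hypothetical bucketing, so that similar jobs share one result)."""
    return f"{exponent.bit_length()}:{bit_max}"


def choose_kernel(key, kernels, measure, cache):
    """Return the fastest kernel name for `key`; measure only on a cache miss."""
    if key in cache:
        return cache[key]
    if not kernels:
        raise ValueError("no kernels to test")
    timings = {name: measure(name) for name in kernels}
    best = min(timings, key=timings.get)  # smallest measured time wins
    cache[key] = best
    return best


def load_cache(path):
    """Read persisted results, or start empty if the file does not exist."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def save_cache(path, cache):
    """Persist results so later runs can skip the measurement."""
    Path(path).write_text(json.dumps(cache, indent=2, sort_keys=True))
```

Because the measurement is passed in as a function, the same selection logic would work for any device or vendor; only the timing routine has to know how to launch a kernel.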

BTW, the selftest failure of the latest code is caused by an incomplete merge from mfaktc. The test cases for very small exponents are in, but the code to handle them correctly is not yet.
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.