mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   mfakto: an OpenCL program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=15646)

glebm 2015-07-25 22:26

Fixing int sign change warnings had no effect. Since GPU sieve works, I set GWDEBUG in gpusieve.cl and got lots of output like this:
[CODE]sloppy_mod_p: p doesn't match pinv!! p = 713509, pinv = 6018 (should be 6019)[/CODE]
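For anyone wanting to double-check the number in that message, here is a minimal sketch. It assumes pinv is meant to be the 32-bit scaled reciprocal floor(2^32 / p); that definition is my assumption, chosen because it reproduces the "should be 6019" value from the log, and mfakto's actual definition in gpusieve.cl may differ. The function names are hypothetical and not taken from mfakto.

```python
# Hypothetical helper names; this is not code from mfakto.
# Assumption: pinv is the scaled reciprocal floor(2**32 / p).

def expected_pinv(p: int) -> int:
    """Return floor(2**32 / p) using exact integer arithmetic."""
    return (1 << 32) // p


def pinv_matches(p: int, pinv: int) -> bool:
    """True if the given pinv equals the assumed scaled reciprocal of p."""
    return pinv == expected_pinv(p)


if __name__ == "__main__":
    p, pinv = 713509, 6018          # values from the debug output above
    print(expected_pinv(p))         # 6019, as the debug message says
    print(pinv_matches(p, pinv))    # False: the reported pinv is one too low
```

Under that assumption the value seen on the device is off by one, which would point at a rounding or truncation difference when pinv is generated rather than at a completely wrong prime table.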

When running without any arguments (a simple selftest) I get:
[CODE]
ERROR: selftest failed for M1031831 (cl_barrett15_69_gs)
no factor found
Selftest statistics
number of tests 30
successful tests 29
no factor found 1

selftest FAILED![/CODE]

maxa 2015-08-07 04:09

Fiji GPU
 
When I try to run mfakto on an R9 Fury X it tells me:

[CODE]
Select device - Get device info - Loading binary kernel file mfakto_Kernels.elf
Compiling kernels.
WARNING: Unknown GPU name, assuming GCN. Please post the device name "Fiji (Advanced Micro Devices, Inc.)" to [URL]http://www.mersenneforum.org/showthread.php?t=15646[/URL] to have it added to mfakto. Set GPUType in mfakto1.ini to select a GPU type yourself to avoid this
warning.
OpenCL device info
name Fiji (Advanced Micro Devices, Inc.)
device (driver) version OpenCL 2.0 AMD-APP (1800.8) (1800.8 (VM))
maximum threads per block 256
maximum threads per grid 16777216
number of multiprocessors 64 (4096 compute elements)
clock rate 1050MHz
Automatic parameters
threads per grid 256
optimizing kernels for GCN
Started a simple selftest ...
######### testcase 4/17 (M50863909[69-70]) #########
mfakto will exit once the current test is finished.
[/CODE]and then hangs in testcase 4/17.
Is there any setting I can change to make it work or does it require an updated version of mfakto?

Bdot 2015-08-09 19:23

[QUOTE=glebm;406481]The latest source from GitHub, compiled with Visual Studio 2013 and App SDK 3.0 Beta and running on Tonga (R9 380), fails both selftests (0.14 works).
[/QUOTE]
Version 0.14 is currently the latest stable one. Many of the features for 0.15 are still unfinished; the version from GitHub cannot be used right now (and it's not because of the unsigned warnings).

I need to find some time to fix the new features ...

Bdot 2015-08-09 19:30

[QUOTE=maxa;407377]When I try to run mfakto on an R9 Fury X it [...] hangs in testcase 4/17.
Is there any setting I can change to make it work or does it require an updated version of mfakto?[/QUOTE]
I'll add Fiji to the list of known GCN chips. It would be good to have some performance test so I know how to best use the chip, but I guess for that the other issue needs to be fixed first:

Could you please run
[code]mfakto -i mfakto1.ini -st2[/code]This should show exactly which kernel it hangs in.

And maybe give the perftestmfakto.cmd from [URL]http://mersenneforum.org/mfakto/mfakto-0.15pre5/mfakto-0.15pre5.zip[/URL] a chance - if it does not stop right away, the output might be helpful.

Bdot 2015-08-09 19:54

[QUOTE=Ethan (EO);403703]I've pulled the card now, but did manage to run the 0.15pre5 benchmarking script first; results attached.[/QUOTE]
Thank you for the HD6950 Cayman results.
[code]Resulting speed for M66362159:
bit_min - bit_max   GHz-days/day   kernelname
     60 - 69        209.504        cl_barrett15_69_gs
     69 - 70        196.994        cl_barrett15_71_gs
     70 - 74        185.303        cl_barrett15_74_gs
     74 - 77        162.719        cl_barrett32_77_gs
     77 - 88        147.887        cl_barrett32_88_gs
     88 - 92        117.277        cl_barrett32_92_gs
[/code]They confirm that the burnt electricity is probably not worth it - there are way faster and more efficient cards these days. The 205 GHz-days/day that are listed on [url]http://www.mersenne.ca/mfaktc.php[/url] for the 6950 are probably a bit optimistic and can only be achieved when factoring up to 69 bits.
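To make the comparison with the listed figure concrete, here is a small sketch that takes the six rows above and expresses each one as a fraction of the 205 GHz-days/day listed for the 6950. The numbers are copied from the benchmark output; the variable and function names are only for illustration.

```python
# Benchmark rows from the HD6950 results: (bit_min, bit_max, GHz-days/day, kernel).
RESULTS = [
    (60, 69, 209.504, "cl_barrett15_69_gs"),
    (69, 70, 196.994, "cl_barrett15_71_gs"),
    (70, 74, 185.303, "cl_barrett15_74_gs"),
    (74, 77, 162.719, "cl_barrett32_77_gs"),
    (77, 88, 147.887, "cl_barrett32_88_gs"),
    (88, 92, 117.277, "cl_barrett32_92_gs"),
]

LISTED_RATE = 205.0  # GHz-days/day listed for the 6950


def kernels_reaching(listed: float) -> list:
    """Names of the kernels whose measured rate meets or exceeds the listed rate."""
    return [name for _, _, rate, name in RESULTS if rate >= listed]


if __name__ == "__main__":
    for bit_min, bit_max, rate, name in RESULTS:
        share = rate / LISTED_RATE
        print(f"{bit_min:2d} - {bit_max:2d}  {rate:8.3f}  {share:6.1%}  {name}")
    print("reaching the listed rate:", kernels_reaching(LISTED_RATE))
```

Only the 60 - 69 bit row reaches the listed rate; every deeper bit range falls below it.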

airsquirrels 2015-08-22 14:03

I am seeing something interesting and unexpected with mfakto and multiple GPUs. On one much older host system, the PCIe links drop to 5GT/s x16 for the primary card and 5GT/s x4 for the secondary. What is surprising is that this has a pretty massive effect on mfakto's performance. This is strange because mfakto should not need much bandwidth to the cards at all and, as I understand the code, should only be making one (trivial) call for each class.

These are Fury X cards, which normally get around 1000 GHz-days/day at 8GT/s x16 PCIe speeds.

I see a few distinct tiers of speed that correlate directly with the negotiated PCIe speed, all while factoring with cl_barrett15_73_gs_2.

PCIe speed - GHz-days/day
8GT/s x16 - 1000
5GT/s x16 - 890
5GT/s x8 - 690
5GT/s x4 - 480
2.5GT/s x4 - 190
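As a rough cross-check, the sketch below converts each negotiated link into a theoretical one-direction bandwidth and sets it next to the observed throughput. It assumes the standard PCIe line encodings (8b/10b at 2.5 and 5GT/s, 128b/130b at 8GT/s) and ignores protocol overhead, so the GB/s values are upper bounds and not measurements. The function and variable names are made up for this example.

```python
# Line-encoding efficiency per transfer rate (GT/s): 8b/10b for the two
# slower rates, 128b/130b for 8 GT/s.
ENCODING_EFFICIENCY = {2.5: 8 / 10, 5.0: 8 / 10, 8.0: 128 / 130}

# (GT/s, lanes, observed GHz-days/day) from the table above.
OBSERVED = [
    (8.0, 16, 1000),
    (5.0, 16, 890),
    (5.0, 8, 690),
    (5.0, 4, 480),
    (2.5, 4, 190),
]


def link_bandwidth_gbs(gt_per_s: float, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    gbit_per_s = gt_per_s * ENCODING_EFFICIENCY[gt_per_s] * lanes
    return gbit_per_s / 8  # 8 bits per byte


def compare(rows):
    """Return (GT/s, lanes, GB/s, relative bandwidth, relative throughput),
    with both relative values normalised to the first row."""
    base_bw = link_bandwidth_gbs(rows[0][0], rows[0][1])
    base_rate = rows[0][2]
    result = []
    for gt, lanes, rate in rows:
        bw = link_bandwidth_gbs(gt, lanes)
        result.append((gt, lanes, bw, bw / base_bw, rate / base_rate))
    return result


if __name__ == "__main__":
    for gt, lanes, bw, rel_bw, rel_rate in compare(OBSERVED):
        print(f"{gt}GT/s x{lanes:<2d}  {bw:6.2f} GB/s  "
              f"bandwidth {rel_bw:5.1%}  throughput {rel_rate:5.1%}")
```

Under these assumptions, going from x16 to x8 at 5GT/s halves the available bandwidth but costs only about a fifth of the throughput (890 to 690), so throughput tracks the link speed without being simply proportional to it.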

Increasing the number of streams did not seem to have any appreciable effect.

LaurV 2015-08-22 17:35

That is strange! I always have a mixture of x8 and x16 cards and don't see any difference in output for my gtx580s...
Are you sure you call mfaktX with appropriate "-d gpu_number"?

airsquirrels 2015-08-22 19:01

Yes, I can verify the correct card is initialized and the performance matches the PCIe lane speed even if the other card is idle. With my slower cards this is not noticeable until 5GT/s x4 or even 2.5GT/s x4, but the Fury X sees the penalty much earlier. I might try Windows instead of Debian to see whether the Catalyst driver version makes a difference.

I have switched the two cards and the performance definitely follows the slot, this is true in two different systems as well so that rules out the motherboard/chipset. I will say both of these systems are AMD CPUs, but my Intel systems are all much faster systems overall with plenty of PCIe lanes.

I did notice that we do actually do multiple kernel schedulings per class as it works through the K range, but adjusting the sieve size parameters changed performance an order of magnitude less than the PCIe slot.

chalsall 2015-08-22 19:21

[QUOTE=airsquirrels;408552]I have switched the two cards and the performance definitely follows the slot, this is true in two different systems as well so that rules out the motherboard/chipset.[/QUOTE]

"The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' but 'That's funny...'" -- Isaac Asimov

axn 2015-08-23 05:53

[QUOTE=airsquirrels;408552]Yes, I can verify the correct card is initialized and the performance matches the PCIe lane speed even if the other card is idle. With my slower cards this is not noticeable until 5GT/s x4 or even 2.5GT/s x4, but the Fury X sees the penalty much earlier. I might try Windows instead of Debian to see whether the Catalyst driver version makes a difference.

I have switched the two cards and the performance definitely follows the slot, this is true in two different systems as well so that rules out the motherboard/chipset. I will say both of these systems are AMD CPUs, but my Intel systems are all much faster systems overall with plenty of PCIe lanes.

I did notice that we do actually do multiple kernel schedulings per class as it works through the K range, but adjusting the sieve size parameters changed performance an order of magnitude less than the PCIe slot.[/QUOTE]

I guess it doesn't hurt to ask. Are you using GPU Sieve or CPU Sieve?

LaurV 2015-08-23 14:37

[QUOTE=axn;408581]I guess it doesn't hurt to ask. Are you using GPU Sieve or CPU Sieve?[/QUOTE]
Bingo, good question! (as usual axn to the point!).


All times are UTC.