mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   mfakto: an OpenCL program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=15646)

Bdot 2011-10-05 21:10

mfakto 0.09 - Linux version
 
1 Attachment(s)
Linux 64-bit

Bdot 2011-10-05 21:12

mfakto 0.09 - sources
 
1 Attachment(s)
... and the source code

KyleAskine 2011-10-27 00:05

Hi!

I am new here, and I might have missed a point of discussion earlier in the thread. If that is the case, I am sorry.

Anyway, I have two GPUs in my current PC (6950s flashed as 6970s), but it looks like mfakto only uses one of them. Is this a known issue, or could there be a problem with my setup?

Also, I have driver version 11.9, but it took two CPU cores to get one GPU to 90%.

Thanks for your help!

Bdot 2011-10-27 20:07

[QUOTE=KyleAskine;275878]
Anyway, I have two GPUs in my current PC (6950s flashed as 6970s), but it looks like mfakto only uses one of them.
[/QUOTE]
Did you already play around with the -d <dev-num> switch? This is supposed to let you decide which device an mfakto instance will use.

If that does not work, please send me the output of clinfo (found, e.g., at C:\Program Files (x86)\AMD APP\bin\x86_64\clinfo.exe).

[QUOTE=KyleAskine;275878]
Also, I have 11.9, but to get one GPU to 90%, it took me two cores.
[/QUOTE]

That is expected with higher-end cards. One mfakto process will always use only one device, so you may need 4 instances in total to get both GPUs to 90%.

KyleAskine 2011-10-28 12:01

[QUOTE=Bdot;276017]Did you already play around with the -d <dev-num> switch? This is supposed to let you decide which device an mfakto instance will use.

If that does not work, please send me the clinfo output (e.g. in C:\Program Files (x86)\AMD APP\bin\x86_64\clinfo.exe).

That is expected with higher-end cards. One mfakto process will always use only one device, so you may need 4 instances in total to get both GPUs to 90%.[/QUOTE]

Thanks for your answers! I will play around with it today!

Ethan (EO) 2011-11-07 19:19

Upgrading to the 11.10 driver on x64 Windows broke the mfakto 0.09 executable for me, because the kernel compiler in this driver version is hung up on calls to mad24 with mixed argument types.

Casting all of the integer constants in the mad24 calls to (uint) fixed this for me, i.e.
[CODE]
nn.d1 = mad24(mul_hi(n.d0, qi), (uint)256, tmp >> 24);
[/CODE]

Edit: No, it didn't. This lets the executable run, but it is failing the selftest. Unfortunately I've used up the time I can spend on this today, but there it is.

ReEdit: Only the 64-bit build is failing the selftest.

Also of note: I had to change the Platform Toolset setting from v100 to Windows7.1SDK to get this to build in Visual Studio Express.

Bdot 2011-11-07 19:47

[QUOTE=Ethan (EO);277464]Upgrading to the 11.10 driver on x64 Windows broke the mfakto 0.09 executable for me, because the kernel compiler in this driver version is hung up on calls to mad24 with mixed argument types.
[/QUOTE]

Uh-oh ... every new version adds new surprises ... With that I'm afraid to upgrade my drivers ;-)

I'll see if I can do something about it ...

Dubslow 2011-11-07 21:58

On another note, Bdot: the FAQ PDF available in the FAQ threads says not to report "no factor" results from mfakto. I don't know why it says that, but someone somewhere said it was because of the problem with factors < 2^48, which has been fixed. If that was the reason, please tell Brain to fix the PDF. I just hope we haven't lost too much work.

Brain 2011-11-08 06:12

The PDF is going to be updated. Do submit all results.

Bdot 2011-11-10 15:37

Fix for 11.10?
 
1 Attachment(s)
[QUOTE=Ethan (EO);277464]Upgrading to the 11.10 driver on x64 Windows broke the mfakto 0.09 executable for me.

[CODE]
nn.d1 = mad24(mul_hi(n.d0, qi), (uint)256, tmp >> 24);
[/CODE][/QUOTE]

I've replaced all of those constants with their uint equivalents (256 => 256u), and on my slow test box (W7-64) this seems to work: the small selftest succeeded, and the full selftest, which is still running, has found all factors so far.

[CODE]
nn.d1 = mad24(mul_hi(n.d0, qi), 256u, tmp >> 24);
[/CODE]

I've attached the kernel file. Could you please check if this one still fails the selftest on your machine?

Ethan (EO) 2011-11-10 22:41

I'm still failing about half of the selftests with that kernel file. I'm going to revert to the exact contents of your 0.09 src zip to make sure I haven't mucked anything up in the project settings.


Ethan

edit: No luck. I unzipped your src file directly, put the updated cl file in src, built Release/x64, and ran. There are no runtime cl compilation errors, but (aha) I just noticed that it passes the first test in each test case and then fails the rest:

[CODE]
########## testcase 6/1558 ##########
tf(53134687, 68, 69, ...);
k_min = 2999999998380 - k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_8"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
3120/4620 | 14.16M | 0.468s | 30.25M/s | 25000 | n.a. | 10798us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_8]
selftest for M53134687 passed (mfakto_cl_71_8)!
tf(): total time spent: 0.487s

tf(53134687, 68, 69, ...);
k_min = 2999999998380 - k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_4"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
3120/4620 | 14.16M | 0.215s | 65.84M/s | 25000 | n.a. | 0us
no factor for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_4]
ERROR: selftest failed for M53134687 (mfakto_cl_71_4)
no factor found
tf(): total time spent: 0.234s

tf(53134687, 68, 69, ...);
k_min = 2999999998380 - k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
3120/4620 | 14.16M | 0.214s | 66.15M/s | 25000 | n.a. | 0us
no factor for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
ERROR: selftest failed for M53134687 (mfakto_cl_barrett79)
no factor found
tf(): total time spent: 0.233s

tf(53134687, 68, 69, ...);
k_min = 2999999998380 - k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett92"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
3120/4620 | 14.16M | 0.214s | 66.15M/s | 25000 | n.a. | 0us
no factor for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett92]
ERROR: selftest failed for M53134687 (mfakto_cl_barrett92)
no factor found
tf(): total time spent: 0.232s
[/CODE]

And that's consistent across the testcases.

reedit: The same executable runs without error on the CPU:

[CODE]
########## testcase 6/1558 ##########
tf(53134687, 68, 69, ...);
k_min = 2999999998380 - k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_8"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
3120/4620 | 14.68M | 5.737s | 2.56M/s | 25000 | n.a. | 362964us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_8]
selftest for M53134687 passed (mfakto_cl_71_8)!
tf(): total time spent: 5.753s

tf(53134687, 68, 69, ...);
k_min = 2999999998380 - k_max = 3300000000000
Using GPU kernel "mfakto_cl_71_4"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
3120/4620 | 14.68M | 6.460s | 2.27M/s | 25000 | n.a. | 410991us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_71_4]
selftest for M53134687 passed (mfakto_cl_71_4)!
tf(): total time spent: 6.479s

tf(53134687, 68, 69, ...);
k_min = 2999999998380 - k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
3120/4620 | 14.68M | 4.439s | 3.31M/s | 25000 | n.a. | 276762us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
selftest for M53134687 passed (mfakto_cl_barrett79)!
tf(): total time spent: 4.459s

tf(53134687, 68, 69, ...);
k_min = 2999999998380 - k_max = 3300000000000
Using GPU kernel "mfakto_cl_barrett92"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
3120/4620 | 14.68M | 5.800s | 2.53M/s | 25000 | n.a. | 366906us
Result[00]: M53134687 has a factor: 337073926433410950601
found 1 factor(s) for M53134687 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett92]
selftest for M53134687 passed (mfakto_cl_barrett92)!
tf(): total time spent: 5.822s
[/CODE]

...and the 32-bit build runs fine on both CPU and GPU.

