mersenneforum.org

mfakto: an OpenCL program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=15646)

kracker 2013-04-25 18:19

[B][SIZE=2][URL="http://www.tomshardware.com/news/AMD-Radeon-Catalyst-Driver-WHQL,22251.html"]AMD Releases Catalyst 13.4 WHQL Drivers[/URL][/SIZE][/B]

[SIZE=2]@Bdot: Hope it fixes a few things you mentioned. :smile:[/SIZE]

Axelsson 2013-04-26 20:43

Hmmmm... interesting

[CODE] 93.2% | 0m28s | 48.74 | 0.429s | 37.75M | 87.99M/s | 1000 | 39.62%
93.3% | 0m31s | 42.67 | 0.490s | 37.75M | 77.04M/s | 1000 | 49.82%
93.4% | 0m31s | 42.50 | 0.492s | 37.75M | 76.73M/s | 1000 | 49.72%
93.5% | 0m30s | 43.75 | 0.478s | 37.75M | 78.97M/s | 1000 | 49.45%
Result[00]: M2010191 has a factor: 4610282327725138601
found 1 factor for M2010191 from 2^61 to 2^62 (partially tested) [mfakto 0.13pre1-Win barrett15_75_4]
tf(): total time spent: 10m 13.958s (32.70 GHz-days / day)[/CODE]
[CODE]got assignment: exp=2010191 bit_min=61 bit_max=63 (0.70 GHz-days)
Starting trial factoring M2010191 from 2^61 to 2^62 (0.23GHz-days)
Using GPU kernel "cl_barrett15_69"
No checkpoint file "M2010191.ckp" found.
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Apr 26 22:23 | 4608 100.0% | 0.209 0m00s | 100.05 82485 0.00%
no factor for M2010191 from 2^61 to 2^62 [mfaktc mfakto 0.13pre3-Win cl_barrett15_69]
tf(): total time spent: 3m 35.350s (93.22 GHz-days / day)[/CODE]
I told you I'm an expert in breaking things. :explode: :sorry:

/Göran
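Because the two runs disagree, the factor reported by the first run can be checked independently of mfakto: q divides M_p = 2^p - 1 exactly when 2^p leaves remainder 1 modulo q. A minimal Python sketch of that check follows; the function names are illustrative and not part of mfakto.

```python
def divides_mersenne(p: int, q: int) -> bool:
    """Return True if q divides 2**p - 1, via modular exponentiation."""
    return pow(2, p, q) == 1


def has_factor_form(p: int, q: int) -> bool:
    """Necessary conditions for a factor q of M_p (p an odd prime):
    q = 2*k*p + 1 for some integer k, and q is 1 or 7 modulo 8."""
    return (q - 1) % (2 * p) == 0 and q % 8 in (1, 7)


if __name__ == "__main__":
    # Exponent and candidate factor taken from the log above.
    p, q = 2010191, 4610282327725138601
    print("has factor form:", has_factor_form(p, q))
    print("divides M_p:    ", divides_mersenne(p, q))
```

The form check is only a cheap sanity test; the modular exponentiation is the actual divisibility test.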

Bdot 2013-04-26 21:49

[QUOTE=Axelsson;338433]Hmmmm... interesting

[CODE] 93.2% | 0m28s | 48.74 | 0.429s | 37.75M | 87.99M/s | 1000 | 39.62%
93.3% | 0m31s | 42.67 | 0.490s | 37.75M | 77.04M/s | 1000 | 49.82%
93.4% | 0m31s | 42.50 | 0.492s | 37.75M | 76.73M/s | 1000 | 49.72%
93.5% | 0m30s | 43.75 | 0.478s | 37.75M | 78.97M/s | 1000 | 49.45%
Result[00]: M2010191 has a factor: 4610282327725138601
found 1 factor for M2010191 from 2^61 to 2^62 (partially tested) [mfakto 0.13pre1-Win barrett15_75_4]
tf(): total time spent: 10m 13.958s (32.70 GHz-days / day)[/CODE]
[CODE]got assignment: exp=2010191 bit_min=61 bit_max=63 (0.70 GHz-days)
Starting trial factoring M2010191 from 2^61 to 2^62 (0.23GHz-days)
Using GPU kernel "cl_barrett15_69"
No checkpoint file "M2010191.ckp" found.
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Apr 26 22:23 | 4608 100.0% | 0.209 0m00s | 100.05 82485 0.00%
no factor for M2010191 from 2^61 to 2^62 [mfaktc mfakto 0.13pre3-Win cl_barrett15_69]
tf(): total time spent: 3m 35.350s (93.22 GHz-days / day)[/CODE]
I told you I'm an expert in breaking things. :explode: :sorry:

/Göran[/QUOTE]
Hi Göran,

The kernel it uses internally is not the cl_barrett15_69 it claims to use, and neither are the limits. The barrett32_77 kernel, which was the only GPU-sieve-enabled kernel, does not work on FCs below 2^64.

The program should have refused this test. It did not because I made a hack to get that kernel to run. That's fixed already for 0.13pre4:

[code]
Selftest statistics
number of tests 73
successful tests 73

selftest PASSED!

got assignment: exp=2010191 bit_min=61 bit_max=63 (0.70 GHz-days)
Starting trial factoring M2010191 from 2^61 to 2^62 (0.23GHz-days)
ERROR: No suitable kernel found for bit_min=61, bit_max=62.
[/code]The next beta will have kernels to handle anything above 2^60 to 2^92.
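The refusal described above amounts to picking a kernel by bit range and giving up when none fits. Here is a rough Python sketch of that idea, not mfakto's actual code. The only limit taken from this thread is that barrett32_77 does not work below 2^64; every other limit in the table, and the names Kernel, KERNELS and select_kernel, are assumptions for illustration.

```python
from typing import NamedTuple, Optional


class Kernel(NamedTuple):
    name: str
    min_bits: int    # smallest bit level the kernel can handle
    max_bits: int    # largest bit level the kernel can handle
    gpu_sieve: bool  # whether the kernel works with the GPU sieve


# Hypothetical table. Only the lower limit of barrett32_77 (2^64) comes
# from the thread; the other numbers are guesses based on kernel names.
KERNELS = [
    Kernel("barrett32_77", 64, 77, gpu_sieve=True),
    Kernel("barrett15_69", 60, 69, gpu_sieve=False),
]


def select_kernel(bit_min: int, bit_max: int,
                  need_gpu_sieve: bool) -> Optional[Kernel]:
    """Return the first kernel covering [bit_min, bit_max], or None."""
    for kernel in KERNELS:
        if need_gpu_sieve and not kernel.gpu_sieve:
            continue
        if kernel.min_bits <= bit_min and bit_max <= kernel.max_bits:
            return kernel
    return None


if __name__ == "__main__":
    kernel = select_kernel(61, 62, need_gpu_sieve=True)
    if kernel is None:
        print("ERROR: No suitable kernel found for bit_min=61, bit_max=62.")
    else:
        print(f'Using GPU kernel "{kernel.name}"')
```

Returning None instead of falling back to some other kernel is the point: a test the selected kernel cannot handle should be refused rather than run with wrong limits.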

Bdot 2013-04-26 21:54

[QUOTE=kracker;338294][B][SIZE=2][URL="http://www.tomshardware.com/news/AMD-Radeon-Catalyst-Driver-WHQL,22251.html"]AMD Releases Catalyst 13.4 WHQL Drivers[/URL][/SIZE][/B]

[SIZE=2]@Bdot: Hope it fixes a few things you mentioned. :smile:[/SIZE][/QUOTE]
Thanks. Yes, most likely they will. Do you think it is OK to require 13.4 for mfakto (at least for using the GPU sieve)?

kracker 2013-04-26 22:47

[QUOTE=Bdot;338442]Thanks. Yes, most likely they will. Do you think it is OK to require 13.4 for mfakto (at least for using the GPU sieve)?[/QUOTE]

I guess? Not sure if some laptops need a certain version, but I think it should be...

Axelsson 2013-04-27 10:06

[QUOTE=Bdot;338439]Hi Göran,

...

The next beta will have kernels to handle anything above 2^60 to 2^92.[/QUOTE]
Then I'll just try to wait patiently for the next beta. :sleep:

JoDu 2013-05-01 17:52

1 Attachment(s)
I tried using this package and it seems to work, but it fails to find some of the factors during the "-st" test.

With the default mfakto.ini it misses ~2/3 of them, but with these settings:

NumStreams=1
VectorSize=1
GridSize=1

it only misses ~1/11 of the factors. Partial output from mfakto.hd4000.exe -d 11 -st, along with a screenshot of GPU-Z, is in the attached zip.

[ATTACH]9706[/ATTACH]

JoDu 2013-05-01 18:07

1 Attachment(s)
I ran part of the "-st" test with mfakto.hd4000.exe and it failed to find a fair number of the factors. It missed ~2/3 with the default configuration and ~1/11 when I set:

NumStreams=1
VectorSize=1
GridSize=1

I attached a zip with a screenshot of GPU-Z and the output from the tests.

[ATTACH]9707[/ATTACH]

Bdot 2013-05-02 00:57

[QUOTE=JoDu;338960]I tried using this package and it seems to work, but it fails to find some of the factors during the "-st" test.

With the default mfakto.ini it misses ~2/3 of them, but with these settings:

NumStreams=1
VectorSize=1
GridSize=1

it only misses ~1/11 of the factors. Partial output from mfakto.hd4000.exe -d 11 -st, along with a screenshot of GPU-Z, is in the attached zip.

[ATTACH]9706[/ATTACH][/QUOTE]
Well, that is something interesting. My guess is that the difference comes from the VectorSize, as the other settings influence performance, but not the code. If Intel's OpenCL does not properly support vectors, then they have to do some homework :smile:

At least there is one kernel that never failed (barrett15_75). Its major difference from the other kernels is that it uses no 32-bit multiplication and no mul_hi.

I need to think about it ... maybe I can write some test code to see what exactly is failing here. Anyway, this shows how important the selftest is.

Thanks for your testing!
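One way such test code could work is to compare the device's 32-bit multiply results against a host-side reference. The Python sketch below assumes the device results are available as (lo, hi) pairs; how they are collected from the GPU is left out, and the helper names are made up for illustration. It relies on the OpenCL definition of mul_hi as the high half of the full product.

```python
MASK32 = 0xFFFFFFFF


def mul_lo_u32(a: int, b: int) -> int:
    """Low 32 bits of the product of two unsigned 32-bit values."""
    return ((a & MASK32) * (b & MASK32)) & MASK32


def mul_hi_u32(a: int, b: int) -> int:
    """High 32 bits of the 64-bit product (what OpenCL mul_hi returns for uint)."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def find_mismatches(cases, device_results):
    """Compare device (lo, hi) pairs against the reference.

    cases: list of (a, b) inputs; device_results: list of (lo, hi)
    as reported by the device (hypothetical input format).
    Returns the list of (a, b, expected, got) tuples that differ.
    """
    bad = []
    for (a, b), got in zip(cases, device_results):
        expected = (mul_lo_u32(a, b), mul_hi_u32(a, b))
        if tuple(got) != expected:
            bad.append((a, b, expected, tuple(got)))
    return bad


if __name__ == "__main__":
    cases = [(0xFFFFFFFF, 0xFFFFFFFF), (0x80000000, 2), (12345, 67890)]
    # Feed the reference back in as stand-in "device" output: no mismatches.
    fake_device = [(mul_lo_u32(a, b), mul_hi_u32(a, b)) for a, b in cases]
    print(find_mismatches(cases, fake_device))
```

Running the same inputs with different vector widths and grid sizes and diffing against this reference would show whether the multiply itself or something around it goes wrong.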

JoDu 2013-05-02 01:39

I did some more testing, varying one setting at a time, and it looks like GridSize is actually the culprit. I am currently running the whole battery with GridSize=0, and so far there have been no failures (26/1559).
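For anyone wanting to repeat this kind of isolation, here is a small Python sketch that generates one configuration per setting, each differing from a baseline in exactly one value. The baseline numbers are placeholders and not mfakto's real defaults, and the helper names are made up; only the setting names and the reduced values (all 1) come from this thread.

```python
def one_at_a_time(baseline: dict, candidate: dict) -> list:
    """Return (changed_key, config) pairs, each config equal to the
    baseline except for a single setting taken from candidate."""
    variants = []
    for key, value in candidate.items():
        if baseline.get(key) == value:
            continue  # nothing to isolate for this setting
        variant = dict(baseline)
        variant[key] = value
        variants.append((key, variant))
    return variants


def render_ini(settings: dict) -> str:
    """Format settings as key=value lines, as they appear in mfakto.ini."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    # Placeholder baseline; substitute the values from your own mfakto.ini.
    baseline = {"NumStreams": 3, "VectorSize": 4, "GridSize": 3}
    reduced = {"NumStreams": 1, "VectorSize": 1, "GridSize": 1}
    for changed, config in one_at_a_time(baseline, reduced):
        print(f"# only {changed} changed")
        print(render_ini(config))
        print()
```

Each printed block can be pasted into mfakto.ini before rerunning the "-st" test, so a change in the failure rate can be pinned to a single setting.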

JoDu 2013-05-02 02:19

With GridSize=0 mfakto.hd4000.exe -d 11 -st passes all 1559 tests.

