mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   mfakto: an OpenCL program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=15646)

flashjh 2013-01-30 05:22

[QUOTE=Koyaanisqatsi;326576]They do 5 ms/it on the current workload for CUDALucas.[/QUOTE]
I'm getting ~5 ms/it with a 580 now, and that's without tweaking the FFT. That doesn't seem very exciting :no:. I hope the nVidia Titans are not the same as a 580 for CUDA work. Especially since they are going to cost $899 US.

Do you have a K20X or access to one?

LaurV 2013-01-30 07:54

[QUOTE=Koyaanisqatsi;326576]They do 5 ms/it on the current workload for CUDALucas.[/QUOTE]

[QUOTE=flashjh;326610]I'm getting ~5 ms/it with a 580 now, and that's without tweaking the FFT. That doesn't seem very exciting :no:. I hope the nVidia Titans are not the same as a 580 for CUDA work. Especially since they are going to cost $899 US.

Do you have a K20X or access to one?[/QUOTE]
Wrong thread, this is mfaktO. I suspect some mod moved these posts here, since the discussion isn't connected to this thread either.

kladner 2013-01-30 23:35

[QUOTE=kracker;326491]I don't think it will let you even install the drivers without an AMD card... I think[/QUOTE]

YMMV, but on my Asus AMD board, if I run the chipset driver install with the ATI iGPU disabled in BIOS, the ATI graphics driver is not an available option.

EDIT: All my graphics and crunching run on nVidia. I have, however, been able to run graphics on the ATI iGPU in combination with crunching on a GTX 460. I doubt that the HD49xx (don't remember exactly) iGPU has the cojones to run mfakto. The problem with using it for display is that I then lose the discrete GPU's processing power for Photoshop/Bridge. Hence, I don't use it at all.

Bdot 2013-01-30 23:49

[QUOTE=sdbardwick;326540]Results after installing AMD Catalyst (it automatically installed the AMD APP SDK).
Any truncated output is due to crash.[/QUOTE]

Brilliant, thanks a lot for these tests!

[LIST]
[*]The first test (running on the CPU) shows that mfakto can compile the kernels even without the AMD SDK, but the kernels don't work when built with the Intel compiler (no factor found).
[*]After installing the AMD SDK (as part of Catalyst), you need to force mfakto to use the Intel platform (mfakto without the -d switch did not find anything).
[*]Now the CPU device is missing; instead, the HD4000 is available as the only usable device, but it supports fewer features (no surprise here). The -35 error is CL_INVALID_QUEUE_PROPERTIES. I can remove my request for out-of-order processing on the device, which should get around this error.
[*]Compilation of my OpenCL sources does not automatically propagate constants to vectors; I can change that in the code. The compiler also does not know about printf, which is easy to fix as well.
[/LIST]
I think I can provide a version of mfakto soon that solves these issues; I just don't have time right now ... and then we may face the not-working kernels problem again, this time on the GPU. And this one will be harder to tackle.

Rodrigo 2013-01-31 00:15

Bdot,

Thank you very much for continuing to follow up on this frustrating issue! :tu:

Rodrigo

Bdot 2013-01-31 00:21

[QUOTE=kladner;326755]YMMV, but on my Asus AMD board, if I run the chipset driver install with the ATI iGPU disabled in BIOS, the ATI graphics driver is not an available option.

EDIT: All my graphics and crunching run on nVidia. I have, however, been able to run graphics on the ATI iGPU in combination with crunching on a GTX 460. I doubt that the HD49xx (don't remember exactly) iGPU has the cojones to run mfakto. The problem with using it for display is that I then lose the discrete GPU's processing power for Photoshop/Bridge. Hence, I don't use it at all.[/QUOTE]

The iGPU of the AMD APUs is capable of running mfakto and delivers ~30 GHz-days/day (HD 6550D / A8 3850). You should not try the older iGPUs, though: anything HD4xxx and higher will work, but you may end up with only 3 or 5 GHz-days/day.

Bdot 2013-01-31 01:12

[QUOTE=Bdot;326759]I just don't have time right now ... [/QUOTE]

OK, you made me curious, and I can sleep later :smile:

I made the changes and created a special [URL="http://www.mersenneforum.org/mfakto/mfakto-0.12-hd4000/mfakto-hd4000.zip"]hd4000 package[/URL]. Would you give it a try? I hope I caught all the compile errors.

The mfakto.hd4000.exe binary is the ordinary one. If it succeeds in compiling the kernels with the -d 11 switch, you can then try 'mfakto.hd4000-pi.exe -d 11 -st' to get detailed performance numbers for the different kernels.

Rodrigo 2013-01-31 01:43

Hi Bdot,

I take it that that hd4000 package is intended specifically for sdbardwick to try on his machine, is that right?

Rodrigo

Bdot 2013-01-31 02:04

[QUOTE=Rodrigo;326777]Hi Bdot,

I take it that that hd4000 package is intended specifically for sdbardwick to try on his machine, is that right?

Rodrigo[/QUOTE]
Yes. And for others who see the HD4000 in the clinfo output and have the AMD SDK as well as the Intel SDK installed.

sdbardwick 2013-01-31 02:34

1 Attachment(s)
Progress...I think.
Hangs on selftest
[CODE]C:\hd>mfakto.hd4000 -d 11
mfakto 0.12-Win-HD4000 (64bit build)


Runtime options
Inifile mfakto.ini
WARNING: Cannot read SievePrimesMin from inifile, using default value (5000)
SievePrimesMin 5000
WARNING: Cannot read SievePrimesMax from inifile, using default value (1000000)
SievePrimesMax 1000000
WARNING: Cannot read SievePrimes from inifile, using default value (25000)
SievePrimes 25000
WARNING: Cannot read SievePrimesAdjust from inifile, using default value (0)
SievePrimesAdjust 0
WARNING: Cannot read NumStreams from inifile, using default value (3)
NumStreams 3
WARNING: Cannot read GridSize from inifile, using default value (3)
GridSize 3
WARNING: Cannot read WorkFile from inifile, using default (worktodo.txt)
WorkFile worktodo.txt
WARNING: Cannot read ResultsFile from inifile, using default (results.txt)
ResultsFile results.txt
WARNING: Cannot read Checkpoints from inifile, enabled by default
Checkpoints enabled
WARNING: Cannot read CheckpointDelay from inifile, set to 300s by default
CheckpointDelay 300s
WARNING: Cannot read Stages from inifile, enabled by default
Stages enabled
WARNING: Cannot read StopAfterFactor from inifile, set to 1 by default
StopAfterFactor bitlevel
WARNING: Cannot read PrintMode from inifile, set to 0 by default
PrintMode full
V5UserID none
ComputerID none
WARNING: Cannot read AllowSleep from inifile, set to 0 by default
AllowSleep no
TimeStampInResults no
WARNING: Cannot read VectorSize from inifile, set to 4 by default
VectorSize 4
WARNING: Cannot read GPUType from inifile, using default (AUTO)
GPUType AUTO
WARNING: Cannot read SieveOnGPU from inifile, set to 0 by default
SieveOnGPU no
WARNING: Cannot read SmallExp from inifile, set to 0 by default
SmallExp no
WARNING: Cannot read SieveCPUMask from inifile, set to 0 by default
SieveCPUMask 0
Compiletime options
SIEVE_SIZE_LIMIT 36kiB
SIEVE_SIZE 289731bits
SIEVE_SPLIT 250
MORE_CLASSES enabled
Select device - Get device info - Compiling kernels ..........
WARNING: Unknown GPU name, assuming VLIW5 type. Please post the device name "Int
el(R) HD Graphics 4000 (Intel(R) Corporation)" to http://www.mersenneforum.org/s
howthread.php?t=15646 to have it added to mfakto. Set GPUType in mfakto.ini to s
elect a GPU type yourself and avoid this warning.

OpenCL device info
name Intel(R) HD Graphics 4000 (Intel(R) Corporation)
device (driver) version OpenCL 1.1 (9.17.10.2932)
maximum threads per block 512
maximum threads per grid 134217728
number of multiprocessors 16 (1280 compute elements)
clock rate 350MHz

Automatic parameters
threads per grid 1048576
optimizing kernels for VLIW5

running a simple selftest ...
ERROR: selftest failed for M53015323 (mfakto_cl_barrett92)
no factor found
########## testcase 2/19 (#2598) ##########[/CODE]
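
Most of the log above is "Cannot read ... from inifile" warnings, which suggests the settings were not found in mfakto.ini and the defaults were used. A minimal Python sketch, assuming the warning wording stays exactly as printed above, that lists which setting names fell back to defaults (the helper name `missing_ini_keys` is made up for this illustration and is not part of mfakto):

```python
import re

# Matches the warning lines exactly as they appear in the mfakto 0.12 log above.
# Assumption: other mfakto versions may word these warnings differently.
WARNING_RE = re.compile(r"^WARNING: Cannot read (\w+) from inifile")


def missing_ini_keys(log_text):
    """Return the setting names mfakto could not read, in order of appearance."""
    keys = []
    for line in log_text.splitlines():
        match = WARNING_RE.match(line.strip())
        if match:
            keys.append(match.group(1))
    return keys


# A few lines copied from the log above.
sample = """\
WARNING: Cannot read SievePrimesMin from inifile, using default value (5000)
SievePrimesMin 5000
WARNING: Cannot read GPUType from inifile, using default (AUTO)
GPUType AUTO
WARNING: Cannot read VectorSize from inifile, set to 4 by default
VectorSize 4
"""

print(missing_ini_keys(sample))
```

Feeding it the complete console output would give the full list of settings to check in mfakto.ini, including GPUType, which the "Unknown GPU name" warning recommends setting by hand.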

[CODE]
C:\hd>mfakto.hd4000-pi -d 11 -st
mfakto 0.12-Win-HD4000 (64bit build)


Runtime options
Inifile mfakto.ini
WARNING: Cannot read SievePrimesMin from inifile, using default value (5000)
SievePrimesMin 5000
WARNING: Cannot read SievePrimesMax from inifile, using default value (1000000)
SievePrimesMax 1000000
WARNING: Cannot read SievePrimes from inifile, using default value (25000)
SievePrimes 25000
WARNING: Cannot read SievePrimesAdjust from inifile, using default value (0)
SievePrimesAdjust 0
WARNING: Cannot read NumStreams from inifile, using default value (3)
NumStreams 3
WARNING: Cannot read GridSize from inifile, using default value (3)
GridSize 3
WARNING: Cannot read WorkFile from inifile, using default (worktodo.txt)
WorkFile worktodo.txt
WARNING: Cannot read ResultsFile from inifile, using default (results.txt)
ResultsFile results.txt
WARNING: Cannot read Checkpoints from inifile, enabled by default
Checkpoints enabled
WARNING: Cannot read CheckpointDelay from inifile, set to 300s by default
CheckpointDelay 300s
WARNING: Cannot read Stages from inifile, enabled by default
Stages enabled
WARNING: Cannot read StopAfterFactor from inifile, set to 1 by default
StopAfterFactor bitlevel
WARNING: Cannot read PrintMode from inifile, set to 0 by default
PrintMode full
V5UserID none
ComputerID none
WARNING: Cannot read AllowSleep from inifile, set to 0 by default
AllowSleep no
TimeStampInResults no
WARNING: Cannot read VectorSize from inifile, set to 4 by default
VectorSize 4
WARNING: Cannot read GPUType from inifile, using default (AUTO)
GPUType AUTO
WARNING: Cannot read SieveOnGPU from inifile, set to 0 by default
SieveOnGPU no
WARNING: Cannot read SmallExp from inifile, set to 0 by default
SmallExp no
WARNING: Cannot read SieveCPUMask from inifile, set to 0 by default
SieveCPUMask 0
Compiletime options
SIEVE_SIZE_LIMIT 36kiB
SIEVE_SIZE 289731bits
SIEVE_SPLIT 250
MORE_CLASSES enabled
CL_PERFORMANCE_INFO enabled (DEBUG option)
Select device - Get device info - Compiling kernels ..........
WARNING: Unknown GPU name, assuming VLIW5 type. Please post the device name "Int
el(R) HD Graphics 4000 (Intel(R) Corporation)" to http://www.mersenneforum.org/s
howthread.php?t=15646 to have it added to mfakto. Set GPUType in mfakto.ini to s
elect a GPU type yourself and avoid this warning.

OpenCL device info
name Intel(R) HD Graphics 4000 (Intel(R) Corporation)
device (driver) version OpenCL 1.1 (9.17.10.2932)
maximum threads per block 512
maximum threads per grid 134217728
number of multiprocessors 16 (1280 compute elements)
clock rate 350MHz

Automatic parameters
threads per grid 1048576
optimizing kernels for VLIW5

########## testcase 1/1559 ##########
Starting trial factoring M50804297 from 2^67 to 2^68 (0.59GHz-days)
k_min = 1599999998520 - k_max = 1900000000000
Using GPU kernel "barrett15_75"
done | ETA | GHz |time/class| #FCs | avg. rate | SieveP. |CPU idle
1048576 FCs copied in 0.37 ms (11287.15 MB/s), proc'd in 167.74 ms (6.25 M/s)
[/CODE]
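
The numbers in the last lines of this log can be sanity-checked by hand. A small Python sketch, assuming the logged k values are the multipliers in the usual factor-candidate form q = 2kp + 1 for a Mersenne number 2^p - 1 (the log does not say so explicitly), with function names invented for the illustration:

```python
def candidate(p, k):
    """Factor candidate q = 2*k*p + 1 for the Mersenne number 2**p - 1."""
    return 2 * k * p + 1


def divides_mersenne(p, q):
    """True if q divides 2**p - 1, checked by modular exponentiation."""
    return pow(2, p, q) == 1


# Values taken from the log above.
p = 50804297
k_min, k_max = 1599999998520, 1900000000000

for k in (k_min, k_max):
    q = candidate(p, k)
    # Any integer in [2**67, 2**68) has a bit length of exactly 68.
    print(k, q.bit_length())

# Throughput: 1048576 factor candidates processed in 167.74 ms.
fcs, elapsed_ms = 1048576, 167.74
rate_millions_per_s = fcs / (elapsed_ms / 1000) / 1e6
print(round(rate_millions_per_s, 2))

# Toy check of the divisibility test: 2**11 - 1 = 2047 = 23 * 89.
print(divides_mersenne(11, candidate(11, 1)), divides_mersenne(11, candidate(11, 4)))
```

Under that assumption both ends of the k range land inside the 2^67 to 2^68 bit level, and the computed rate agrees with the 6.25 M/s printed by mfakto.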

UPDATED CLINFO (After Catalyst install)

Edit: Additional information for above -st run (let the program run after "hang")
[CODE]########## testcase 1/1559 ##########
Starting trial factoring M50804297 from 2^67 to 2^68 (0.59GHz-days)
k_min = 1599999998520 - k_max = 1900000000000
Using GPU kernel "barrett15_75"
done | ETA | GHz |time/class| #FCs | avg. rate | SieveP. |CPU idle
1048576 FCs copied in 0.37 ms (11287.15 MB/s), proc'd in 167.74 ms (6.25 M/s)
Error -5: Copying h_ktab(clEnqueueWriteBuffer)
ERROR from tf_class.
Error exit as selftest failed[/CODE]
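
For reading these numeric codes: the values come from the OpenCL header cl.h. A minimal Python lookup sketch covering only a handful of codes, including the two seen in this thread (-5 here and the -35 mentioned earlier); the table is deliberately incomplete and the helper name `cl_error_name` is made up for the example:

```python
# Small subset of the error codes defined in the OpenCL header cl.h.
# Only a few entries are listed; this is not a complete table.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -35: "CL_INVALID_QUEUE_PROPERTIES",
}


def cl_error_name(code):
    """Translate a numeric OpenCL status code into its symbolic name."""
    return CL_ERROR_NAMES.get(code, f"unknown OpenCL error ({code})")


print(cl_error_name(-5))
print(cl_error_name(-35))
```

So the "Error -5" reported for clEnqueueWriteBuffer corresponds to CL_OUT_OF_RESOURCES. The code alone does not say which resource ran out on the HD 4000.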

kracker 2013-01-31 02:38

1 Attachment(s)
[QUOTE=Bdot;326767]The iGPU of the AMD APUs is capable of running mfakto and delivers ~30 GHz-days/day (HD 6550D / A8 3850).[/QUOTE]

A bit different here. :smile:
Running two instances on integrated (6550D) to max it.


All times are UTC.