mersenneforum.org

mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

Prime95 2012-07-12 19:44

Build problems - help
 
I installed 64-bit Ubuntu 10.04, CUDA version 5, and mfaktc-0.18.tar.gz, then ran make. The resulting mfaktc binary fails all the self-tests. Where do I go from here?

BTW, I built CUDALucas 2.0.1 and it works just fine on my GTX 460.

Dubslow 2012-07-12 19:47

Isn't CUDA 5 still in preview? Perhaps there's something CUDA 5 doesn't like about mfaktc. (I assume CUDALucas was built with CUDA 5? I know that it automatically targets arch_1.3, which would force CUDA < 5, so that may be why it avoided issues.)

Prime95 2012-07-12 19:58

A more general question from a Linux and CUDA neophyte.

What is the recommended development environment for GPU programming? Is there a Linux IDE that is integrated with CUDA debugging and profiling tools?

chalsall 2012-07-12 20:18

[QUOTE=Prime95;304557]Is there a Linux IDE that is integrated with CUDA debugging and profiling tools?[/QUOTE]

[URL="http://ydl.net/eclipse_cuda_plugin/"]http://ydl.net/eclipse_cuda_plugin/[/URL]

For those who use IDEs, Eclipse is your (Open Source) friend.... :smile:

TheJudger 2012-07-13 08:36

[QUOTE=Prime95;304554]I installed 64-bit Ubuntu 10.04, CUDA version 5, and mfaktc-0.18.tar.gz, then ran make. The resulting mfaktc binary fails all the self-tests. Where do I go from here?

BTW, I built CUDALucas 2.0.1 and it works just fine on my GTX 460.[/QUOTE]

So it builds without many warnings (usually there are some warnings in sieve.c about signed/unsigned variable mismatches)?
I guess you're using CUDA 5.0.7 (preview release), right?
It fails all the selftests -> missing factors (no factor found) or wrong factors?
I guess you're using the default src/params.h and mfaktc.ini, right?

You can try to enable some debugging code in src/params.h:
[LIST]
[*]If you enable [I]RAW_GPU_BENCH[/I], you basically disable the sieve code (of course, this slows down the application).
[*]You can try enabling [I]CHECKS_MODBASECASE[/I] and [I]USE_DEVICE_PRINTF[/I]; this enables debug code in the long integer division code. If something is really screwed up, [I]USE_DEVICE_PRINTF[/I] will cause overflows because of too many printfs in GPU context.
[/LIST]
Another option to test: add [I]-malign-double[/I] to the CFLAGS in the Makefile (default in mfaktc-0.19...). There are some known CUDA/gcc issues with the alignment of 64-bit variables.
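
To illustrate the kind of alignment mismatch meant here, the following Python sketch uses the standard struct module to compare two layouts of a hypothetical record holding one 32-bit and one 64-bit unsigned field. The record and the helper names are assumptions for illustration only and are not taken from the mfaktc sources. If host and device code disagreed on such a layout, the 64-bit field would be read from the wrong offset.

```python
import struct

# Hypothetical record: one 32-bit field followed by one 64-bit field.
FIELDS = "IQ"

def layout_size(prefix):
    """Return the byte size of the record for a given struct-module alignment prefix."""
    return struct.calcsize(prefix + FIELDS)

def u64_offset(prefix):
    """Byte offset of the 64-bit field: total size minus the 8 bytes of the field itself."""
    return layout_size(prefix) - 8

packed_offset = u64_offset("=")   # no padding: the 64-bit field starts at byte 4
native_offset = u64_offset("@")   # native C alignment of the platform running this script

print("offset without padding:", packed_offset)
print("offset with native alignment:", native_offset)
```

On a typical 64-bit Linux system the native offset is expected to be 8, because 64-bit fields are already 8-byte aligned there. On 32-bit x86, gcc defaults to 4-byte alignment for such fields unless -malign-double is given.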

Can you provide me with some lines of output from [I]./mfaktc.exe -v 2 -st[/I]? Just start it, stop it after a few seconds by pressing <Ctrl>+C, and send me the screen output.

I'll try CUDA 5.0.7 later on my system.

Oliver

P.S. at least it is good to know that the selftest works...

Prime95 2012-07-13 15:33

[QUOTE=TheJudger;304626]So it builds without many warnings (usually there are some warnings in sieve.c about signed/unsigned variable mismatches)?
I guess you're using CUDA 5.0.7 (preview release), right?
It fails all the selftests -> missing factors (no factor found) or wrong factors?
I guess you're using the default src/params.h and mfaktc.ini, right?

Can you provide me with some lines of output from [I]./mfaktc.exe -v 2 -st[/I]? Just start it, stop it after a few seconds by pressing <Ctrl>+C, and send me the screen output.[/QUOTE]

Correct - a simple install and make without modifying anything. I'll try your debug suggestions later. Here is the output you wanted:

[CODE]mfaktc v0.18 (64bit built)

Compiletime options
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 193154bits
SIEVE_SPLIT 250
MORE_CLASSES enabled

Runtime options
SievePrimes 25000
SievePrimesAdjust 1
NumStreams 3
CPUStreams 3
GridSize 3
WorkFile worktodo.txt
Checkpoints enabled
CheckpointDelay 30s
Stages enabled
StopAfterFactor bitlevel
PrintMode full
AllowSleep no

CUDA version info
binary compiled for CUDA 5.0
CUDA runtime version 5.0
CUDA driver version 5.0

CUDA device info
name GeForce GTX 460
compute capability 2.1
maximum threads per block 1024
number of multiprocessors 7 (336 shader cores)
clock rate 1502MHz

Automatic parameters
threads per grid 917504

########## testcase 1/1557 ##########
Starting trial factoring M50804297 from 2^67 to 2^68
k_min = 1599999998520
k_max = 1900000000000
Using GPU kernel "71bit_mul24"
class | candidates | time | ETA | avg. rate | SievePrimes | CPU wait
3387/4620 | 14.68M | 0.213s | n.a. | 68.92M/s | 25000 | 13.45%
no factor for M50804297 from 2^67 to 2^68 [mfaktc 0.18 71bit_mul24]
ERROR: selftest failed for M50804297
no factor found
tf(): total time spent: 0.219s

Starting trial factoring M50804297 from 2^67 to 2^68
k_min = 1599999998520
k_max = 1900000000000
Using GPU kernel "75bit_mul32"
class | candidates | time | ETA | avg. rate | SievePrimes | CPU wait
3387/4620 | 14.68M | 0.145s | n.a. | 101.24M/s | 28125 | 0.38%
no factor for M50804297 from 2^67 to 2^68 [mfaktc 0.18 75bit_mul32]
ERROR: selftest failed for M50804297
no factor found
tf(): total time spent: 0.151s

Starting trial factoring M50804297 from 2^67 to 2^68
k_min = 1599999998520
k_max = 1900000000000
Using GPU kernel "95bit_mul32"
class | candidates | time | ETA | avg. rate | SievePrimes | CPU wait
3387/4620 | 14.68M | 0.164s | n.a. | 89.51M/s | 24609 | 0.34%
no factor for M50804297 from 2^67 to 2^68 [mfaktc 0.18 95bit_mul32]
ERROR: selftest failed for M50804297
no factor found
tf(): total time spent: 0.170s

Starting trial factoring M50804297 from 2^67 to 2^68
k_min = 1599999998520
k_max = 1900000000000
Using GPU kernel "barrett79_mul32"
class | candidates | time | ETA | avg. rate | SievePrimes | CPU wait
3387/4620 | 14.68M | 0.123s | n.a. | 119.35M/s | 21532 | 0.49%
no factor for M50804297 from 2^67 to 2^68 [mfaktc 0.18 barrett79_mul32]
ERROR: selftest failed for M50804297
no factor found
tf(): total time spent: 0.128s
[/CODE]
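
For readers following the log: each test case searches a range of k for a factor q = 2kp + 1 of M_p = 2^p - 1, and q divides M_p exactly when 2^p ≡ 1 (mod q). A minimal Python sketch of that check follows. The function names are made up for illustration, and the small example M11 = 23 × 89 is used because the expected factor of M50804297 is not shown in the output above.

```python
def candidate(p, k):
    """Candidate factor of M_p = 2**p - 1 for a given k: q = 2*k*p + 1."""
    return 2 * k * p + 1

def divides_mersenne(p, q):
    """True if q divides 2**p - 1, i.e. 2**p is congruent to 1 modulo q."""
    return pow(2, p, q) == 1

# Small known case: M11 = 2047 = 23 * 89, with k = 1 and k = 4.
for k in range(1, 5):
    q = candidate(11, k)
    print(k, q, divides_mersenne(11, q))

# The k range from the log maps to 68-bit candidates, matching "2^67 to 2^68".
P = 50804297
K_MIN, K_MAX = 1599999998520, 1900000000000
print(candidate(P, K_MIN).bit_length(), candidate(P, K_MAX).bit_length())
```

A selftest fails with "no factor found" when no candidate in the k range passes this check, even though one of them should.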

Prime95 2012-07-13 15:44

[QUOTE=TheJudger;304626][LIST]
[*]If you enable [I]RAW_GPU_BENCH[/I], you basically disable the sieve code (of course, this slows down the application).
[*]You can try enabling [I]CHECKS_MODBASECASE[/I] and [I]USE_DEVICE_PRINTF[/I]; this enables debug code in the long integer division code.
[/LIST]
Another option to test: add [I]-malign-double[/I] to the CFLAGS in the Makefile (default in mfaktc-0.19...).[/QUOTE]

None of these helped or diagnosed any problems.

TheJudger 2012-07-13 18:16

OK, next step: I'll try CUDA 5.0.7 on my system. I'm not sure if this will happen today.

Oliver

Prime95 2012-07-13 19:05

[QUOTE=TheJudger;304646]OK, next step: I'll try CUDA 5.0.7 on my system. I'm not sure if this will happen today.[/QUOTE]

Thanks. No rush, I'm primarily playing with gpusieve.

BTW, my bigger problem is that NVIDIA's developer web site is down thanks to hackers.

TheJudger 2012-07-14 10:39

I'm able to reproduce this issue with mfaktc 0.18 + CUDA toolkit 5.0.7-preview on openSUSE 12.1.

CUDA 5.0 driver + CUDA toolkit 4.2 -> OK
CUDA 5.0 driver + CUDA toolkit 5.0.7 -> fail

Oliver

TheJudger 2012-07-14 12:04

First impressions:
[LIST]
[*]The floating-point approximation for divisions seems to be OK.
[*]Data transfer seems to be OK.
[/LIST]
I see an issue with carries (using the carry flag) in e.g. tf_barrett96.cu: mod_simple_96(). WTF?
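
To make the carry issue concrete: a 96-bit value is held as three 32-bit words, and a subtraction has to pass a borrow from each word to the next, which on the GPU is done through the carry flag. The Python sketch below emulates that chain and shows what a lost borrow does to the result. The limb layout (least significant word first) and all function names are assumptions for illustration; this is not the code of mod_simple_96().

```python
MASK32 = 0xFFFFFFFF

def to_limbs(n):
    """Split a non-negative integer below 2**96 into three 32-bit limbs, least significant first."""
    return tuple((n >> (32 * i)) & MASK32 for i in range(3))

def from_limbs(limbs):
    """Recombine three 32-bit limbs into one integer."""
    return sum(d << (32 * i) for i, d in enumerate(limbs))

def sub_96(a, b, propagate=True):
    """Subtract limb tuple b from limb tuple a modulo 2**96.

    With propagate=False the borrow between limbs is dropped, which
    mimics a carry flag that never reaches the next instruction.
    """
    borrow = 0
    out = []
    for x, y in zip(a, b):
        diff = x - y - borrow
        borrow = 1 if (diff < 0 and propagate) else 0
        out.append(diff & MASK32)
    return tuple(out)

a, b = to_limbs(2**64), to_limbs(1)
good = from_limbs(sub_96(a, b))                   # borrow chain intact
bad = from_limbs(sub_96(a, b, propagate=False))   # borrow lost between limbs
print(hex(good))
print(hex(bad))
```

A wrong remainder of this kind would mean that no candidate ever tests as a factor, which would be consistent with the "no factor found" failures reported earlier in the thread.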

So for now I can only say to everyone: [COLOR="Red"][B]Don't use CUDA Toolkit 5.0.7 for mfaktc![/B][/COLOR]

Oliver


All times are UTC.