mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Msieve (https://www.mersenneforum.org/forumdisplay.php?f=83)
-   -   Msieve with GPU support (https://www.mersenneforum.org/showthread.php?t=12562)

smh 2010-03-10 14:46

[QUOTE=xilman;207747]Does sed exist in Windows?[/QUOTE][url]http://sourceforge.net/projects/unxutils/[/url]

sleigher 2010-03-10 17:31

building cuda on RHEL 5.3
 
Hi all, I am new here and have been reading the threads. I have tried a number of things to get this to compile but it just isn't happening. I have a Dell Precision T3500 with a 4-core Xeon at 2.93 GHz, 6 GB of RAM and an Nvidia Quadro NVS 295, running CentOS 5.3, which is the same as RHEL 5.3. I have installed the CUDA toolkit and set the environment variables:

[code]
CUDA_INC_DIR=/usr/local/cuda/include
CUDA_LIB_DIR=/usr/local/cuda/lib64
[/code]I have also tried what xilman posted, removing the () from the makefile at

[code]
CFLAGS += -I"$CUDA_INC_DIR" -DHAVE_CUDA
[/code]

So I guess I am wondering if this can be built against a Quadro card? I have the latest 64-bit Nvidia driver, 190.53, and the CUDA toolkit is 2.3. The software all seems right according to what I have read. I did notice a post saying I needed 2 files that were on someone's website, but when I tried to download them I got a 404 error.

This is where I am. I have searched my system for the file cuda.lib and I plainly do not have it. Is this one of the files I need from an earlier post?

[code]
gcc -D_FILE_OFFSET_BITS=64 -Wall -W -I. -Iinclude -Ignfs -Ignfs/poly -Ignfs/poly/stage1 -I"UDA_INC_DIR" -DHAVE_CUDA demo.c -o msieve \
libmsieve.a "/cuda.lib" -lgmp -lm -lpthread
gcc: /cuda.lib: No such file or directory
make: *** [x86_64] Error 1
[/code]Any clues for the newbie? Thanks in advance.....
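
A possible explanation for the `-I"UDA_INC_DIR"` in the gcc command above: in GNU make, a `$` followed by a character other than `(`, `{` or `$` names a single-character variable, so `$CUDA_INC_DIR` is read as `$(C)` followed by the literal text `UDA_INC_DIR`. The Python sketch below imitates only that one rule. The `expand` helper is hypothetical (it is not part of make or msieve) and it ignores everything else make does, such as functions, nested references and `$$`.

```python
import re

# Matches $(NAME), ${NAME}, or $X (one character), tried in that order.
_REF = re.compile(r"\$(?:\((\w+)\)|\{(\w+)\}|(\w))")

def expand(text, variables):
    """Expand make-style variable references; unset variables become ''."""
    def substitute(match):
        name = match.group(1) or match.group(2) or match.group(3)
        return variables.get(name, "")
    return _REF.sub(substitute, text)

env = {"CUDA_INC_DIR": "/usr/local/cuda/include"}

# Without parentheses only "C" is taken as the variable name.
print(expand('-I"$CUDA_INC_DIR"', env))
# With parentheses the whole name is looked up.
print(expand('-I"$(CUDA_INC_DIR)"', env))
```

The first call reproduces the mangled include path from the build log, which suggests the parentheses (or braces) need to stay in the makefile.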

sleigher 2010-03-10 18:18

So I tried linking /usr/lib64/libcuda.so.190.53 to /cuda.lib and I get a little further but it still errors out.

[code]
gcc -D_FILE_OFFSET_BITS=64 -Wall -W -I. -Iinclude -Ignfs -Ignfs/poly -Ignfs/poly/stage1 -I"UDA_INC_DIR" -DHAVE_CUDA demo.c -o msieve \
libmsieve.a "/cuda.lib" -lgmp -lm -lpthread
libmsieve.a(stage1_sieve_deg46_64.no): In function `sieve_lattice_batch':
stage1_sieve_deg46_64.c:(.text+0x8e4): undefined reference to `cuGetErrorMessage'
stage1_sieve_deg46_64.c:(.text+0x9cc): undefined reference to `cuGetErrorMessage'
stage1_sieve_deg46_64.c:(.text+0xa14): undefined reference to `cuGetErrorMessage'
stage1_sieve_deg46_64.c:(.text+0xaad): undefined reference to `cuGetErrorMessage'
stage1_sieve_deg46_64.c:(.text+0xbf5): undefined reference to `cuGetErrorMessage'
libmsieve.a(stage1_sieve_deg46_64.no):stage1_sieve_deg46_64.c:(.text+0xc3d): more undefined references to `cuGetErrorMessage' follow
collect2: ld returned 1 exit status
make: *** [x86_64] Error 1
[/code]Not sure what to do about that.....

sleigher 2010-03-10 18:35

Problem solved. After getting the failure, I had to run make clean and then make. I guess starting from the beginning with everything in place mattered.....

Thanks. At least this is posted so others will see it when building on Linux.

jasonp 2010-03-10 19:18

sleigher: search your system for 'libcuda.a' or 'libcuda64.a'; there is probably a preferred library that provides Nvidia's driver functions. Also, when switching from GPU to non-GPU builds, or vice versa, you have to 'make clean' because the list of functions to be compiled is different and they depend on different files; the makefile doesn't track those dependencies very well.
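
Because make decides what to rebuild from file timestamps, changing a flag such as -DHAVE_CUDA leaves the old object files looking up to date. A minimal sketch of one workaround, assuming a hypothetical stamp file (the name `.build_flags` is made up here and is not something the msieve makefile uses): remember the flags of the last build and run 'make clean' only when they differ.

```python
import tempfile
from pathlib import Path

def flags_changed(stamp, flags):
    """Return True if `flags` differ from those recorded in `stamp`
    (or nothing was recorded yet), then record the new flags."""
    previous = stamp.read_text() if stamp.exists() else None
    stamp.write_text(flags)
    return previous != flags

# Demonstration in a throwaway directory: non-GPU build, GPU build, GPU build again.
with tempfile.TemporaryDirectory() as tmp:
    stamp = Path(tmp) / ".build_flags"  # hypothetical stamp file
    for flags in ("", "-DHAVE_CUDA", "-DHAVE_CUDA"):
        action = "make clean && make" if flags_changed(stamp, flags) else "make"
        print(repr(flags), "->", action)
```

The first build is treated as "changed" on purpose, since nothing is known about any object files already lying around.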

Unfortunately I have limited means to test the GPU code on linux, though others here really pound on it.

Paul: others have reported that the E-value bound for very large jobs is too conservative; maybe you can suggest changes to the minimum E-value for the code to use.

sleigher 2010-03-10 19:20

So now that things are working and I can begin working on my problem, I have a few questions if you all don't mind.

In factsieve.py, where it asks NUM_GPU = 0, that means GPU number 0. This GPU has 8 cores in it. Will the software use all 8 cores? Or how does that work exactly?

The other question I have is: if I want to split steps 2 and 3 across 2 hosts with the same Nvidia card, will I need to run each step individually? Or can I run until I get polys selected, then just stop it and move half the data over there manually? What would the best process be?

I know these are basic questions and I am doing a lot of reading but thought I would ask those who have done it.

Thanks

--

jasonp 2010-03-10 20:35

If all the computations are going to a single card, msieve will automatically tune the amount of searching based on how many cores are present on the card. Splitting the poly selection over more than one card is something current software cannot do. You can give two instances of msieve different search ranges by adding an argument to '-np'; see the help screen for (slightly) more detail.
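
As a rough illustration of giving two msieve instances different search ranges, the sketch below splits one inclusive range into contiguous pieces, one per host. The `X,Y` form of the '-np' argument and the bounds used here are assumptions for illustration only; the help screen is the authority on the real syntax and on what the range means.

```python
def split_range(low, high, parts):
    """Split the inclusive range [low, high] into `parts` contiguous inclusive pieces."""
    total = high - low + 1
    if parts < 1 or total < parts:
        raise ValueError("need at least one value per part")
    base, extra = divmod(total, parts)
    pieces = []
    start = low
    for i in range(parts):
        # The first `extra` pieces absorb the remainder, one value each.
        size = base + (1 if i < extra else 0)
        pieces.append((start, start + size - 1))
        start += size
    return pieces

# Hypothetical bounds; pick real ones from your own job.
for host, (first, last) in enumerate(split_range(1, 1_000_000, 2)):
    print(f"host {host}: -np {first},{last}")
```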

sleigher 2010-03-10 21:08

I am going to run polynomial selection on one host specified by the -np option.

My command line looks like this:
[code]
/msieve/msieve_gpu -s /ggnfs/e/e.dat -l /ggnfs/e/e.log -i /ggnfs/e/e.ini -nf /ggnfs/e/e.fb -g 0 -v -np $BIGNUM
[/code]I ran this for about 2 hours on the card and ctrl-c'd out and saw this.

[code]
received signal 2; shutting down
poly 5 p 48469867 q 54312037 coeff 2632497209889079
poly 33 p 48928787 q 51860791 coeff 2537485596490517
poly 27 p 49027339 q 52446649 coeff 2571319639937011
polynomial selection complete
[B]error generating or reading NFS polynomials[/B]
p1 factor: 2
p4 factor: 9173
p10 factor: 1103183071
c141 factor: 289230314549809501047395118348441892369208029120962925491447201388439102812213920176212457929299531507641614235586535733203805146109254832227
elapsed time 01:17:43

current factorization was interrupted
[/code]Any clue as to why there is an error outputting polys? Is that supposed to happen? Did it not find any yet?

jrk 2010-03-10 21:18

[QUOTE=sleigher;208000]I ran this for about 2 hours on the card and ctrl-c'd out and saw this.

[code]
received signal 2; shutting down
poly 5 p 48469867 q 54312037 coeff 2632497209889079
poly 33 p 48928787 q 51860791 coeff 2537485596490517
poly 27 p 49027339 q 52446649 coeff 2571319639937011
polynomial selection complete
[B]error generating or reading NFS polynomials[/B]
p1 factor: 2
p4 factor: 9173
p10 factor: 1103183071
c141 factor: 289230314549809501047395118348441892369208029120962925491447201388439102812213920176212457929299531507641614235586535733203805146109254832227
elapsed time 01:17:43

current factorization was interrupted
[/code][/QUOTE]

Does it surprise you that some small factors were found in $BIGNUM? The number looks to be about the right size for an RSA key, so perhaps you've translated it incorrectly if that's the case.

[QUOTE=sleigher;208000]Any clue as to why there is an error outputting polys? Is that supposed to happen? Did it not find any yet? [/QUOTE]
Run the poly search for about 4 or 5 more days.
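
One way to check what msieve was actually given is to multiply the reported factors back together and compare the product with the modulus you meant to factor. A minimal sketch, assuming the `pNN factor:` / `cNN factor:` line format seen in the output above (the `prp` prefix is also allowed here as an assumption); `parse_factors` is a hypothetical helper, not part of msieve.

```python
import re
from math import prod

# Lines look like "p10 factor: 1103183071": a prefix, a digit count, then the value.
_FACTOR = re.compile(r"^(?:prp|p|c)(\d+) factor: (\d+)$", re.MULTILINE)

def parse_factors(log_text):
    """Return the factors listed in msieve-style output as integers."""
    factors = []
    for digits, value in _FACTOR.findall(log_text):
        # The label encodes the digit count, so use it as a copy/paste check.
        if len(value) != int(digits):
            raise ValueError(f"digit count mismatch for {value}")
        factors.append(int(value))
    return factors

sample = """p1 factor: 2
p4 factor: 9173
p10 factor: 1103183071
"""
factors = parse_factors(sample)
print(factors, prod(factors))
```

With the c141 line included as well, the product should equal the number msieve printed at startup; if that differs from the intended key, the conversion of $BIGNUM went wrong somewhere.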

xilman 2010-03-10 21:32

[quote=jasonp;207986]Paul:others have reported that the E-value bound for very large jobs is too conservative; maybe you can suggest changes to the minimum E-value for the code to use.[/quote]So far it's found 24 polynomials with E > 7e-14 for a c178 out of 522826 in total, the largest now being 7.607e-14. The corresponding norm is 1.233e-17 and alpha is -7.015. There are 3479 polynomials with E > 6e-14. Perhaps 6e-14 or 6.5e-14 might be a reasonable choice?

Note this is running on a relatively powerful GPU (240 cores clocked at 1.3GHz) so I may be getting a rather skewed view of what is to be expected. That said, the run is still less than 20% through its scheduled 300 hours, implying that much less powerful cards could easily make as much progress as I've seen so far.

Paul
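
To put those counts in proportion, here is a quick calculation using only the figures quoted in the post above:

```python
total = 522826            # polynomials found so far
above_7e_minus_14 = 24    # those with E > 7e-14
above_6e_minus_14 = 3479  # those with E > 6e-14

def share(count, total):
    """Fraction of `total` represented by `count`."""
    return count / total

for label, count in (("E > 7e-14", above_7e_minus_14), ("E > 6e-14", above_6e_minus_14)):
    print(f"{label}: {count} of {total} = {100 * share(count, total):.4f}%")

# "Less than 20% through its scheduled 300 hours" gives an upper bound on elapsed time.
max_hours_so_far = 300 * 20 // 100
print(f"elapsed: under {max_hours_so_far} hours")
```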

sleigher 2010-03-10 21:33

$BIGNUM is an RSA key. I think you're right about translating it incorrectly. I am running it as a hex number but I do notice it comes up with the wrong number of digits for the byte size.

A 64-byte hex number should be 155 digits, right? I get 154 digits when msieve starts..... Maybe that's right. I don't know.

[code]
Msieve v. 1.45
Wed Mar 10 13:12:11 2010
random seeds: 7e81ff42 55f99ffa
factoring 5853731358738835671377308437651441078912023204537001964649176763637237424878086557355026819252721668338725669518704934079163013348237168412858758307780482 (154 digits)
[/code]No idea why though.....

I have tried other ways of translating to decimal as well and always come up with 154 digits.
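
On the digit-count question: a 64-byte number whose top bit is set lies between 2^511 and 2^512 - 1, and that interval contains both 154-digit and 155-digit values, so 154 digits on its own is not proof of a bad conversion. Parity is a more telling check, since an RSA modulus is a product of two odd primes and must be odd. The sketch below uses the decimal value from the msieve output above; `digit_range` is a hypothetical helper name.

```python
def digit_range(num_bytes):
    """Decimal digit counts of the smallest and largest integers that
    fill `num_bytes` bytes with the top bit set."""
    bits = 8 * num_bytes
    return len(str(2 ** (bits - 1))), len(str(2 ** bits - 1))

print(digit_range(64))

# The number msieve reported it was factoring.
n = int(
    "5853731358738835671377308437651441078912023204537001964649176763637237424878"
    "086557355026819252721668338725669518704934079163013348237168412858758307780482"
)
parity = "even" if n % 2 == 0 else "odd"
print(len(str(n)), "digits,", n.bit_length(), "bits,", parity)
```

An even value cannot be an RSA modulus, which fits the small factors msieve reported and points at the conversion step rather than at the digit count.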


All times are UTC.