mersenneforum.org > Factoring Projects > Msieve
Old 2016-12-01, 15:34   #1
aein
 
error when running msieve 1.53 with CUDA

Hi,
I downloaded msieve 1.53 and built it with CUDA 7.5 on Ubuntu 14.04.
When I use this command to factor an ~85-digit integer I get this error:

Msieve v. 1.53 (SVN Unversioned directory)
random seeds: c56acd4b 31ff2790
factoring 1058279957272128717101231216785847994867516402276836134081961397119465210180855494891 (85 digits)
no P-1/P+1/ECM available, skipping
commencing number field sieve (85-digit input)
commencing number field sieve polynomial selection
polynomial degree: 4
max stage 1 norm: 1.91e+14
max stage 2 norm: 3.49e+13
min E-value: 7.07e-08
poly select deadline: 419
time limit set to 0.12 CPU-hours
expecting poly E from 1.32e-07 to > 1.51e-07
searching leading coefficients from 1 to 3349105
using GPU 1 (GeForce GTX 980 Ti)
selected card has CUDA arch 5.2
deadline: 5 CPU-seconds per coefficient
error (line 1116): CUDA_ERROR_FILE_NOT_FOUND

Any idea?
Which file doesn't exist?
Old 2016-12-02, 18:09   #2
jasonp
Tribal Bullet
 

Usually that's an error on CUDA startup. It's possible your card is too new for the GPU kernels that the sorting engine is configured with. Does it work any better if you change the last line of the makefile:
Code:
	cd cub && make WIN=$(WIN) WIN64=$(WIN64) sm=200,300,350 && cd ..
by adding '520' to the list after the sm, and rebuilding?
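The edit is a one-token change; here is a sketch of it applied with sed, demonstrated on a scratch copy of the line so the commands are runnable anywhere. In the real msieve-1.53 tree you would point sed at the top-level Makefile instead, then make clean and rebuild.

```shell
# Sketch of the suggested edit, shown on a scratch copy of the cub line.
# In the msieve-1.53 tree, run the same sed against the Makefile itself,
# then `make clean` and rebuild.
cat > /tmp/cub_line <<'EOF'
	cd cub && make WIN=$(WIN) WIN64=$(WIN64) sm=200,300,350 && cd ..
EOF
sed -i 's/sm=200,300,350/sm=200,300,350,520/' /tmp/cub_line
cat /tmp/cub_line
```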
Old 2016-12-07, 07:05   #3
aein
 

Quote:
Originally Posted by jasonp
Usually that's an error on CUDA startup. It's possible your card is too new for the GPU kernels that the sorting engine is configured with. Does it work any better if you change the last line of the makefile:
Code:
	cd cub && make WIN=$(WIN) WIN64=$(WIN64) sm=200,300,350 && cd ..
by adding '520' to the list after the sm, and rebuilding?

Hi,
Unfortunately it doesn't work. I'm using a GeForce GTX 980 Ti.

I have another question: why does it print "deadline: 5 CPU-seconds per coefficient" in the terminal?
How can I change it?
Old 2016-12-09, 02:07   #4
jasonp
Tribal Bullet
 

Don't worry about the deadline, that's a low-level choice the code makes.

After changing the makefile, did you first run make clean and then rebuild? Dependency tracking in the makefile is not that great at the moment.
Old 2016-12-09, 03:05   #5
jasonp
Tribal Bullet
 

Never mind, this was a change for the Visual Studio build that didn't get added to the unix makefile.

Try patching the makefile as follows and rebuilding:
Code:
===================================================================
--- Makefile	(revision 1008)
+++ Makefile	(working copy)
@@ -179,6 +179,7 @@
 	stage1_core_sm20.ptx \
 	stage1_core_sm30.ptx \
 	stage1_core_sm35.ptx \
+	stage1_core_sm50.ptx \
 	cub/built
 
 #---------------------------------- NFS file lists -------------------------
@@ -320,5 +321,8 @@
 stage1_core_sm35.ptx: $(NFS_GPU_HDR)
 	$(NVCC) -arch sm_35 -ptx -o $@ $<
 
+stage1_core_sm50.ptx: $(NFS_GPU_HDR)
+	$(NVCC) -arch sm_50 -ptx -o $@ $<
+
 cub/built:
-	cd cub && make WIN=$(WIN) WIN64=$(WIN64) sm=200,300,350 && cd ..
+	cd cub && make WIN=$(WIN) WIN64=$(WIN64) sm=200,300,350,520 && cd ..
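To apply a unified diff like the one above, save it to a file in the msieve-1.53 directory and feed it to patch(1). A runnable demonstration on a scratch file (file names here are placeholders; for the real fix you would run something like `patch -p0 < cuda_sm50.patch` in the source directory):

```shell
# Demonstration of applying a unified diff with patch(1), on a scratch file.
cd /tmp
printf 'sm=200,300,350\n' > demo.txt
cat > demo.patch <<'EOF'
--- demo.txt
+++ demo.txt
@@ -1 +1 @@
-sm=200,300,350
+sm=200,300,350,520
EOF
patch demo.txt < demo.patch
cat demo.txt
```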
Old 2016-12-10, 16:14   #6
aein
 
gpu cluster

Quote:
Originally Posted by jasonp
Never mind, this was a change for the Visual Studio build that didn't get added to the unix makefile.

Try patching the makefile as follows and rebuilding: [patch snipped — see post #5]
Thank you for your response! It works well!

Can I run msieve polynomial selection for a big number on a GPU cluster? If so, how?
Old 2016-12-10, 16:59   #7
VBCurtis
 

Folks new to our forum have widely varying concepts of "big number", so you ought to say just how big before you jump into a task that may need just one machine, or may not be feasible with current tools.

For numbers in the proper size range, you can assign each GPU a different range of a1 coefficients using "x,y" in the msieve invocation (e.g. "1000,500000" to the first GPU, "500000,15000000" to the second GPU, etc). EDIT: There is also a flag to assign msieve to a specific GPU; I don't know it offhand, but the help listing will show you.

The larger the input candidate, the more pieces msieve splits the search space into. If your cluster is a single-digit number of GPUs and your input is sufficiently large, you can just invoke msieve identically on each GPU and trust that there is little overlap among instances; you'll see in the screen output something like "searching range 7 of 41", where the "7" is chosen at random by each instance. So, if you have 6 GPUs, you would suffer less than 10% overlap from instances that happen to choose the same slice of the search space.
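As a sanity check of that sub-10% figure: assuming each instance picks its slice uniformly at random from the 41 (my back-of-envelope model, not anything msieve reports), the expected fraction of duplicated instances follows the classic occupancy formula:

```shell
# Expected fraction of wasted (duplicated) instances when n instances each
# pick one of k slices uniformly at random:
# expected distinct slices = k * (1 - (1 - 1/k)^n)
awk 'BEGIN { n = 6; k = 41;
             distinct = k * (1 - (1 - 1/k)^n);
             printf "%.3f\n", (n - distinct)/n }'
```

which comes out just under 6%, consistent with the "less than 10%" estimate.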

I personally choose the former method to eliminate overlap, but a hybrid that assigns two GPUs to each range is also quite reasonable.

If you run the -np1 phase on its own and do not alter the default parameters, you're going to generate a LOT of data; running -nps at the same time reduces this. You can reduce it further by lowering the stage1_norm or stage2_norm, as I've detailed in other threads; note that reducing the stage 1 norm also reduces the number of pieces in each search range. For a first effort, perhaps divide the default stage 1 norm by 5, divide the default stage 2 norm by 20, and run -np1 -nps in a single invocation:
./msieve -np1 -nps "stage1_norm={default divided by 5} stage2_norm={default divided by 20} 500000,1500000" -s file1
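Putting the pieces together (per-GPU coefficient ranges plus GPU selection), one way to script the split is a launcher loop. This sketch only echoes the commands rather than running them; the "-g N" GPU-selection flag is assumed from msieve's help listing, and the ranges and norm values are illustrative stand-ins, not tuned numbers:

```shell
# Generate one msieve invocation per GPU, each with its own coefficient slice.
# -g <n> (GPU index) is assumed from msieve's help output; norms are placeholders.
for gpu in 0 1; do
  lo=$((1 + gpu * 500000))
  hi=$(( (gpu + 1) * 500000 ))
  echo "./msieve -np1 -nps -g $gpu \"stage1_norm=1e13 stage2_norm=1e12 $lo,$hi\" -s file$gpu"
done
```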

You can discover the default values by running ./msieve -np1 -nps, waiting until the poly search starts, then exiting and looking in the log file.

An invocation like this should use between 10 and 30% of one CPU core for the size-optimization, though a fast GPU might use more (I run a 750 Ti with usage closer to 10%). A smaller stage 1 norm will use less CPU, though it will also generate slightly fewer polynomials (it's an open question whether the missed polys are ever the best ones; there is a chance, it seems).

Note that the default norms selected by msieve were chosen before the GPU era; we tweak them because GPUs generate candidate polynomials so very quickly.

When your GPU run is done, you'll run ./msieve -npr -s file1 on the size-optimized polys generated by the -nps step; if you do not alter bounds, you should sort and truncate the .ms file to the best 100 (or 200, if you like to be thorough or the GPU run was long) candidates. The -npr step is quite slow; 100 candidates will take over an hour, and a default file of 5000 generated in a GPU-day might take 2+ days.

So, how "large" is your large candidate?

Last fiddled with by VBCurtis on 2016-12-10 at 17:01
Old 2016-12-14, 00:21   #8
aein
 

Quote:
Originally Posted by VBCurtis

So, how "large" is your large candidate?

My number is about 600 bits!

Last fiddled with by aein on 2016-12-14 at 00:21
Old 2016-12-14, 03:44   #9
jasonp
Tribal Bullet
 

A big number, but not outlandishly so. If this is your first factorization you will definitely want to start with smaller numbers.
Old 2019-02-25, 14:09   #10
Neutron3529
 

On Windows, you can either rename the sm35 PTX file to sm50 or generate a new sm_50 PTX file.
The command below assumes you have downloaded and unzipped msieve-153.tar.gz and installed nvcc and Visual Studio (ensure cl.exe is in %PATH% so that cmd can call cl.exe directly):

B:\msieve-1.53\gnfs\poly\stage1\stage1_core_gpu>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin\nvcc.exe" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include" -I. -O3 stage1_core.cu -arch sm_61 -ptx -o stage1_core_sm50.ptx
Here sm_61 can be anything that fits your graphics card, BUT you must keep "-o stage1_core_sm50.ptx", and afterwards copy this file to the folder that contains msieve153_gpu.exe.
