mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GMP-ECM (https://www.mersenneforum.org/forumdisplay.php?f=55)
-   -   ECM for CUDA GPUs in latest GMP-ECM ? (https://www.mersenneforum.org/showthread.php?t=16480)

storm5510 2019-11-23 15:22

[QUOTE=Karl M Johnson;288948]Windows binary wanted :smile:[/QUOTE]

A "working" 64-bit Windows variant would be nice... :bangheadonwall:

EdH 2019-12-20 16:46

Does the GPU branch allow for multi-threading stage 2? I can't seem to find anything in the docs.

My Colab sessions only get two Xeon cores, but using both would "double" the throughput for stage 2.

PhilF 2019-12-20 17:02

[QUOTE=EdH;533283]Does the GPU branch allow for multi-threading stage 2? I can't seem to find anything in the docs.

My Colab sessions only get two Xeon cores, but using both would "double" the throughput for stage 2.[/QUOTE]

I don't think so. Stage 2 is run on the CPU, not the GPU, so I don't think anything about stage 2 gets changed when the program is compiled with the --enable-gpu option.

This is in the readme.gpu file:

[quote]It will compute step 1 on the GPU, and then perform step 2 on the CPU (not in parallel).[/quote]

EdH 2019-12-20 17:38

[QUOTE=PhilF;533286]I don't think so. Stage 2 is run on the CPU, not the GPU, so I don't think anything about stage 2 gets changed when the program is compiled with the --enable-gpu option.

This is in the readme.gpu file:[/QUOTE]
Yeah, I saw that, but I thought I had read somewhere that the latest version had multi-threading. Even if I can't invoke both CPUs in conjunction with a GPU run, if I can run stage 1 with B2=0 and save the residues, I could rerun ECM with both threads against those residues.

I may have to explore ecm.py and see whether there is a way I can both run the GPU branch and keep the CPU filled for stage 2. . .
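Roughly, the plan would look like this (just a sketch; the actual ecm invocations are shown as comments since paths and bounds vary, and the residue file below is a stand-in):

```shell
# Stage 1 on the GPU with B2=0, so stage 2 is skipped and residues are saved:
#   echo "$N" | ecm -gpu -save residues.txt 11e7 0
# Then split the residues round-robin, one chunk per CPU thread,
# and resume each chunk in parallel:
#   ecm -resume chunk.aa 11e7 &   (and likewise for chunk.ab)
THREADS=2
seq 1 1024 > residues.txt            # stand-in for a real 1024-curve residue file
split -n r/$THREADS residues.txt chunk.
wc -l chunk.aa chunk.ab              # 512 lines each
```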

chris2be8 2019-12-21 16:32

[QUOTE=EdH;533283]Does the GPU branch allow for multi-threading stage 2? I can't seem to find anything in the docs.
[/QUOTE]

You need to script it. Basically run stage 1 on the GPU, saving parms to a file, split the file into as many bits as you have CPUs, then run an ecm task for each part.

My latest script (not yet fully tested) is:
[code]
#!/bin/bash

# Script to run ecm on 2 or more cores against the number in $NAME.poly or $NAME.n aided by the gpu doing stage 1.
# It's intended to be called from factMsieve.factordb.pl which searches the logs for factors.

# The GPU can do stage 1 in about 1/2 the time the CPU takes to do stage 2 on one core.

# It expects 5 parms, the filename prefix, log suffix, the B1 to resume, B1 for GPU to use and the number of cores to use.
# The .ini file should have already been created by the caller

#set -x

NAME=$1
LEN=$2
OLDB1=$3
NEWB1=$4
CORES=$5

INI=$NAME.ini
if [[ ! -f $INI ]]; then echo "Can't find .ini file"; exit 1; fi
if [[ -z $LEN ]]; then echo "Can't tell what to call the log"; exit 1; fi
if [[ -z $OLDB1 ]]; then echo "Can't tell previous B1 to use"; exit 1; fi
if [[ -z $NEWB1 ]]; then echo "Can't tell what B1 to use"; exit 1; fi
if [[ -z $CORES ]]; then echo "Can't tell how many cores to use"; exit 1; fi

SAVE=$NAME.save
if [[ ! -f $SAVE ]]; then echo "Can't find save file from last run"; exit 1; fi

LOG=$NAME.ecm$LEN.log

# First split the save file from the previous run and start running them, followed by standard ecm until the GPU has finished.
# /home/chris/ecm-6.4.4/ecm was compiled with -enable-shellcmd to make it accept -idlecmd.
date "+ %c ecm to $LEN digits starts now" >> $LOG

rm -f save.* # -f so a clean start doesn't complain about missing files
split -nr/$CORES $NAME.save save.
rm $NAME.save
for FILE in save.*
do
date "+ %c ecm stage 2 with B1=$OLDB1 starts now" >> $NAME.ecm$LEN.$FILE.log
(nice -n 19 /home/chris/ecm-gpu/trunk/ecm -resume $FILE $OLDB1;nice -n 19 /home/chris/ecm-6.4.4/ecm -c 999 -idlecmd 'ps -ef | grep -q [-]save' -n $NEWB1 <$INI ) | tee -a $NAME.ecm$LEN.$FILE.log | grep actor &
done

# Now start running stage 1 on the gpu
/home/chris/ecm.2741/trunk/ecm -gpu -save $NAME.save $NEWB1 1 <$INI | tee -a $LOG | grep actor
date "+ %c ecm to $LEN digits stage 1 ended" >> $LOG
wait # for previous ecm's to finish

date "+ %c Finished" | tee -a $NAME.ecm$LEN.save.* >> $LOG

grep -q 'Factor found' $LOG $NAME.ecm$LEN.save.* # Check if we found a factor
exit $? # And pass RC back to caller
[/code]

But I've never used colab so don't know how to run things on it.

Chris

EdH 2019-12-22 18:50

[QUOTE=chris2be8;533323]You need to script it. Basically run stage 1 on the GPU, saving parms to a file, split the file into as many bits as you have CPUs, then run an ecm task for each part.

(script snipped - see previous post)

But I've never used colab so don't know how to run things on it.

Chris[/QUOTE]Thanks! I'm looking it over to see how I can incorporate some of the calls. I'm bouncing around between an awful lot of things ATM, which is probably causing some of my difficulties.

Fan Ming 2020-01-04 11:55

GPU-ECM for CC2.0
 
Is anyone who has a Windows development toolchain set up interested in compiling a Windows binary of a relatively new version (for example, 7.0.4-dev, 7.0.4, or 7.0.5-dev) of GPU-ECM for CC 2.0 cards? It would be good to run it on old notebooks. Thanks.:smile:

EdH 2020-04-13 12:28

Revisions >3076 of Dev No Longer Work With CUDA 10.x Due To "unnamed structs/unions" in cuda.h
 
It was recently reported to me that my GMP-ECM-GPU branch instructions for a Colab session no longer work. In verifying the trouble, I too received the following during compilation:
[code]
configure: Using cuda.h from /usr/local/cuda-10.0/targets/x86_64-linux/include
checking cuda.h usability... no
checking cuda.h presence... yes
configure: WARNING: cuda.h: present but cannot be compiled
configure: WARNING: cuda.h: check for missing prerequisite headers?
configure: WARNING: cuda.h: see the Autoconf documentation
configure: WARNING: cuda.h: section "Present But Cannot Be Compiled"
configure: WARNING: cuda.h: proceeding with the compiler's result
configure: WARNING: ## ------------------------------------------------ ##
configure: WARNING: ## Report this to ecm-discuss@lists.gforge.inria.fr ##
configure: WARNING: ## ------------------------------------------------ ##
checking for cuda.h... no
configure: error: required header file missing
Makefile:807: recipe for target 'config.status' failed
make: *** [config.status] Error 1
[/code]Further research per ECM Team request showed the following from config.log:
[code]
In file included from conftest.c:127:0:
/usr/local/cuda-10.0/targets/x86_64-linux/include/cuda.h:432:10: warning: ISO C99 doesn't support unnamed structs/unions [-Wpedantic]
};
^
/usr/local/cuda-10.0/targets/x86_64-linux/include/cuda.h:442:10: warning: ISO C99 doesn't support unnamed structs/unions [-Wpedantic]
};
^
configure:15232: $? = 0
configure: failed program was:
| /* confdefs.h */
[/code]

EdH 2020-04-13 16:35

On the off-chance I could solve this simply by adding a name to the unions referenced above, I tried:
[code]
union noname {
[/code]But alas, no joy:
[code]
| #include <cuda.h>
configure:15308: result: no
configure:15308: checking cuda.h presence
configure:15308: x86_64-linux-gnu-gcc -E -I/usr/local/cuda-10.0/targets/x86_64-linux/include -I/usr/local//include -I/usr/local//include conftest.c
configure:15308: $? = 0
configure:15308: result: yes
configure:15308: WARNING: cuda.h: present but cannot be compiled
configure:15308: WARNING: cuda.h: check for missing prerequisite headers?
configure:15308: WARNING: cuda.h: see the Autoconf documentation
configure:15308: WARNING: cuda.h: section "Present But Cannot Be Compiled"
configure:15308: WARNING: cuda.h: proceeding with the compiler's result
configure:15308: checking for cuda.h
configure:15308: result: no
configure:15315: error: required header file missing
[/code]

EdH 2020-04-19 13:59

[QUOTE=EdH;542518]It was recently reported to me that my GMP-ECM-GPU branch instructions for a Colab session no longer work. . . [/QUOTE]GMP-ECM has been updated to revision 3081 and this is now working in my Colab instances.

"Thanks!" go out to the GMP-ECM Team.

RichD 2020-08-09 13:11

I have had mixed results using ECM-GPU on CoLab. Not that CoLab is the problem; it may be the way I am using it. I run sets of 1024 curves at 11e7 on the GPU. Then I transfer the results file to my local system to run step 2. I've noticed the sigmas are generated consecutively. Is that enough variety, or should I break it down and run twice as many sets at 512 curves each?

Running three sets of 1024 curves at 11e7 failed to find a p43. Another run of two sets of 1024 at 11e7 failed to find a p46. Lastly, a first set of 1024 curves at 11e7 found a p53.

On the GPU I perform:
[CODE]echo <number> | ecm -v -save Cxxx.txt -gpu -gpucurves 1024 11e7[/CODE]

After transferring the 1024-line result file, I break it into four pieces using “head” and “tail”. Then each 256-line file is run by:
[CODE]ecm -resume Cxxx[B]y[/B].txt -one 11e7[/CODE]
where y is a suffix from a to d representing the four smaller files.
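(Side note: the head/tail step can be a single split call; the filename here is just the placeholder from above:)

```shell
# Break a 1024-line resume file into four 256-line pieces, Cxxx.aa .. Cxxx.ad:
seq 1 1024 > Cxxx.txt        # stand-in for the real residue file
split -l 256 Cxxx.txt Cxxx.
wc -l Cxxx.a?                # four files, 256 lines each
# each piece then runs as:  ecm -resume Cxxx.aa -one 11e7
```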

The p53 may be a lucky hit but the p43 & p46 are a big time miss. Should I run more of the smaller sets to get a better “spread” of sigma?



Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2023, Jelsoft Enterprises Ltd.