[QUOTE=Karl M Johnson;288948]Windows binary wanted :smile:[/QUOTE]
A "working" 64-bit Windows variant would be nice... :bangheadonwall: |
Does the GPU branch allow for multi-threading stage 2? I can't seem to find anything in the docs.
My Colab sessions only get two Xeon cores, but using both would "double" the throughput for stage 2. |
[QUOTE=EdH;533283]Does the GPU branch allow for multi-threading stage 2? I can't seem to find anything in the docs.
My Colab sessions only get two Xeon cores, but using both would "double" the throughput for stage 2.[/QUOTE] I don't think so. Stage 2 is run on the CPU, not the GPU, so I don't think anything about stage 2 gets changed when the program is compiled with the --enable-gpu option. This is in the readme.gpu file: [quote]It will compute step 1 on the GPU, and then perform step 2 on the CPU (not in parallel).[/quote] |
[QUOTE=PhilF;533286]I don't think so. Stage 2 is run on the CPU, not the GPU, so I don't think anything about stage 2 gets changed when the program is compiled with the --enable-gpu option.
This is in the readme.gpu file:[/QUOTE] Yeah, I saw that, but I thought I had read somewhere that the latest version had multi-threading. Even if I can't invoke both CPUs in conjunction with a GPU run, if I can run stage 1 with B2=0 and save the residues, I could rerun ECM with both threads against those residues. I may have to explore ecm.py and see if there is a way I can both run the GPU branch and keep the CPU filled for stage 2. . . |
[QUOTE=EdH;533283]Does the GPU branch allow for multi-threading stage 2? I can't seem to find anything in the docs.
[/QUOTE] You need to script it. Basically, run stage 1 on the GPU, saving parms to a file, split the file into as many pieces as you have CPUs, then run an ecm task for each piece. My latest script (not yet fully tested) is:
[code]
#!/bin/bash
# Script to run ecm on 2 or more cores against the number in $NAME.poly or
# $NAME.n, aided by the GPU doing stage 1.
# It's intended to be called from factMsieve.factordb.pl, which searches the
# logs for factors.
# The GPU can do stage 1 in about 1/2 the time the CPU takes to do stage 2
# on one core.
# It expects 5 parms: the filename prefix, the log suffix, the B1 to resume,
# the B1 for the GPU to use, and the number of cores to use.
# The .ini file should have already been created by the caller.
#set -x
NAME=$1
LEN=$2
OLDB1=$3
NEWB1=$4
CORES=$5
INI=$NAME.ini
if [[ ! -f $INI ]]; then echo "Can't find .ini file"; exit; fi
if [[ -z $LEN ]]; then echo "Can't tell what to call the log"; exit; fi
if [[ -z $OLDB1 ]]; then echo "Can't tell previous B1 to use"; exit; fi
if [[ -z $NEWB1 ]]; then echo "Can't tell what B1 to use"; exit; fi
if [[ -z $CORES ]]; then echo "Can't tell how many cores to use"; exit; fi
SAVE=$NAME.save
if [[ ! -f $SAVE ]]; then echo "Can't find save file from last run"; exit; fi
LOG=$NAME.ecm$LEN.log

# First split the save file from the previous run and start running them,
# followed by standard ecm until the GPU has finished.
# /home/chris/ecm-6.4.4/ecm was compiled with --enable-shellcmd to make it
# accept -idlecmd.
date "+ %c ecm to $LEN digits starts now" >> $LOG
rm save.*
split -n r/$CORES $NAME.save save.
rm $NAME.save
for FILE in save.*
do
  date "+ %c ecm stage 2 with B1=$OLDB1 starts now" >> $NAME.ecm$LEN.$FILE.log
  (nice -n 19 /home/chris/ecm-gpu/trunk/ecm -resume $FILE $OLDB1; \
   nice -n 19 /home/chris/ecm-6.4.4/ecm -c 999 -idlecmd 'ps -ef | grep -q [-]save' -n $NEWB1 <$INI) \
    | tee -a $NAME.ecm$LEN.$FILE.log | grep actor &
done

# Now start running stage 1 on the GPU.
/home/chris/ecm.2741/trunk/ecm -gpu -save $NAME.save $NEWB1 1 <$INI | tee -a $LOG | grep actor
date "+ %c ecm to $LEN digits stage 1 ended" >> $LOG
wait  # for the stage-2 ecm's to finish
date "+ %c Finished" | tee -a $NAME.ecm$LEN.save.* >> $LOG
grep -q 'Factor found' $LOG $NAME.ecm$LEN.save.*  # Check if we found a factor
exit $?  # And pass the RC back to the caller
[/code]
But I've never used Colab, so I don't know how to run things on it.

Chris |
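A side note on the `-idlecmd 'ps -ef | grep -q [-]save'` guard in the script above: the `[-]` bracket expression matches a literal `-`, but the pattern's own text never contains the plain substring `-save`, so the grep does not match its own entry in the `ps` listing. A small, self-contained sketch of why that works:

```shell
#!/bin/sh
# The pattern [-]save matches the literal text "-save" in a real command line:
echo "ecm -resume -save foo.txt" | grep -q "[-]save" && echo "matches a real -save"
# ...but the pattern's own text "[-]save" does not contain the substring
# "-save" (a "]" sits before "save"), so a grep listing itself in ps output
# is not a false hit:
echo "grep -q [-]save" | grep -q "[-]save" || echo "no self-match"
```

This is the classic way to avoid the `grep -v grep` extra pipeline stage.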
[QUOTE=chris2be8;533323]You need to script it. Basically run stage 1 on the GPU, saving parms to a file, split the file into as many bits as you have CPUs, then run an ecm task for each part.
My latest script (not yet fully tested) is: [I](script snipped; see above)[/I][/QUOTE]Thanks! I'm looking it over to see how I can incorporate some of the calls. I'm bouncing around among an awful lot of things ATM, which is probably causing some of my difficulties. |
GPU-ECM for CC2.0
Is there anyone who has the necessary Windows development toolkits set up who would be interested in compiling a Windows binary of a relatively new version (for example, 7.0.4-dev, 7.0.4, or 7.0.5-dev) of GPU-ECM for CC2.0 cards? It would be good to run it on old notebooks. Thanks.:smile:
|
Dev Revisions >3076 No Longer Work With CUDA 10.x Due To "unnamed structs/unions" in cuda.h
It was recently reported to me that my GMP-ECM-GPU branch instructions for a Colab session no longer work. In verifying the trouble, I, too, received the following during compilation:
[code]
configure: Using cuda.h from /usr/local/cuda-10.0/targets/x86_64-linux/include
checking cuda.h usability... no
checking cuda.h presence... yes
configure: WARNING: cuda.h: present but cannot be compiled
configure: WARNING: cuda.h: check for missing prerequisite headers?
configure: WARNING: cuda.h: see the Autoconf documentation
configure: WARNING: cuda.h: section "Present But Cannot Be Compiled"
configure: WARNING: cuda.h: proceeding with the compiler's result
configure: WARNING: ## ------------------------------------------------ ##
configure: WARNING: ## Report this to ecm-discuss@lists.gforge.inria.fr ##
configure: WARNING: ## ------------------------------------------------ ##
checking for cuda.h... no
configure: error: required header file missing
Makefile:807: recipe for target 'config.status' failed
make: *** [config.status] Error 1
[/code]Further research, per an ECM Team request, showed the following in config.log:
[code]
In file included from conftest.c:127:0:
/usr/local/cuda-10.0/targets/x86_64-linux/include/cuda.h:432:10: warning: ISO C99 doesn't support unnamed structs/unions [-Wpedantic]
 };
 ^
/usr/local/cuda-10.0/targets/x86_64-linux/include/cuda.h:442:10: warning: ISO C99 doesn't support unnamed structs/unions [-Wpedantic]
 };
 ^
configure:15232: $? = 0
configure: failed program was:
| /* confdefs.h */
[/code] |
On the off-chance I could solve this simply by adding a name to the unions referenced above, I tried:
[code]
union noname {
[/code]But alas, no joy:
[code]
| #include <cuda.h>
configure:15308: result: no
configure:15308: checking cuda.h presence
configure:15308: x86_64-linux-gnu-gcc -E -I/usr/local/cuda-10.0/targets/x86_64-linux/include -I/usr/local//include -I/usr/local//include conftest.c
configure:15308: $? = 0
configure:15308: result: yes
configure:15308: WARNING: cuda.h: present but cannot be compiled
configure:15308: WARNING: cuda.h: check for missing prerequisite headers?
configure:15308: WARNING: cuda.h: see the Autoconf documentation
configure:15308: WARNING: cuda.h: section "Present But Cannot Be Compiled"
configure:15308: WARNING: cuda.h: proceeding with the compiler's result
configure:15308: checking for cuda.h
configure:15308: result: no
configure:15315: error: required header file missing
[/code] |
[QUOTE=EdH;542518]It was recently reported to me that my GMP-ECM-GPU branch instructions for a Colab session no longer work. . . [/QUOTE]GMP-ECM has been updated to revision 3081 and this is now working in my Colab instances.
"Thanks!" go out to the GMP-ECM Team. |
I have had mixed results using ECM-GPU on Colab. Not that Colab is the problem; it may be the way I am using it. I run sets of 1024 curves at 11e7 on the GPU, then transfer the results file to my local system to run stage 2. I've noticed the sigmas are generated consecutively. Is that enough variety, or should I break it down and run twice as many sets at 512 curves each?
Running three sets of 1024 curves at 11e7 failed to find a p43. Another run of two sets of 1024 at 11e7 failed to find a p46. Lastly, the first set of 1024 curves at 11e7 found a p53. On the GPU I perform: [CODE]echo <number> | ecm -v -save Cxxx.txt -gpu -gpucurves 1024 11e7[/CODE] After transferring the 1024-line result file, I break it into four pieces using "head" and "tail". Then each 256-line file is run by: [CODE]ecm -resume Cxxx[B]y[/B].txt -one 11e7[/CODE] where y is a suffix from a to d representing the four smaller files. The p53 may be a lucky hit, but the p43 and p46 are a big-time miss. Should I run more of the smaller sets to get a better "spread" of sigma? |
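The head/tail bookkeeping described above can be sketched as follows; the `Cxxx*.txt` names follow the post, and a numbered dummy file stands in for the real 1024-line residue file so the line accounting is easy to check:

```shell
#!/bin/sh
# Sketch: split a 1024-line GPU residue file into four 256-line pieces
# with head/tail, as described above. seq stands in for real residues.
seq 1 1024 > Cxxx.txt

head -n 256 Cxxx.txt               > Cxxxa.txt   # lines    1..256
head -n 512 Cxxx.txt | tail -n 256 > Cxxxb.txt   # lines  257..512
head -n 768 Cxxx.txt | tail -n 256 > Cxxxc.txt   # lines  513..768
tail -n 256 Cxxx.txt               > Cxxxd.txt   # lines 769..1024

wc -l Cxxxa.txt Cxxxb.txt Cxxxc.txt Cxxxd.txt    # 256 lines each
# Each piece would then be finished on its own core with, e.g.:
#   ecm -resume Cxxxa.txt -one 11e7
```

Since each save-file line is an independent curve with its own sigma, the split only distributes work; it does not change which curves get run, so set size alone should not affect the sigma "spread".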