mersenneforum.org  

2019-11-05, 21:31   #474
fivemack
(loop (#_fork))

Feb 2006
Cambridge, England

18CE₁₆ Posts

Quote:
Originally Posted by EdH
Thanks! I must have gotten something mixed up in my earlier testing. (That's happening a lot, lately.) I thought I ran some 11e7 stage 1 tests on a Colab session in about 40 minutes for 832 curves, but I'm coming up on 2 hours this time. I must have been using a smaller candidate before.

I will modify my routine to save residues, then, and see what other mischief I can come up with.

Thanks, again!
The size of the candidate shouldn't have any effect on the stage 1 GPU time - I believe the ECMGPU code uses fixed-size integers. 40 minutes sounds maybe more plausible for 11e6.

2019-11-05, 22:02   #475
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

2⁴×3×67 Posts

Quote:
Originally Posted by fivemack
The size of the candidate shouldn't have any effect on the stage 1 GPU time - I believe the ECMGPU code uses fixed-size integers. 40 minutes sounds maybe more plausible for 11e6.
It's quite possible it was 11e6, now that you mention it. I think at the time I was checking different values, 11e3, 11e4, 11e5, etc. I might have gotten mixed up, but I was rather certain that I had made note that the 11e7 stage 1 on the GPU Colab instance took longer at 40 minutes than the ecmpi runs locally at 35 minutes. Hmm, maybe I stopped it at 40 minutes, because it took longer. My memory is really great, but for too short a spell. Always a jab to remind me to keep better notes. . .

2019-11-13, 22:00   #476
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

2⁴·3·67 Posts

In playing with my Colab instance of GMP-ECM with GPU enabled (K80 GPU), I have made some observations:

The K80 shows 832 cores. If I run threads up to half of the cores, vs. over half the cores, the time for the run just about doubles. It is pretty close to constant within those halves. For example, running stage 1 for 5+3,1185L at 11e4, it takes just about 22 seconds to complete up to 416 curves, while more than 416 curves takes about 41 seconds. Of note, I am using multiples of 32 cores.

Running a single curve without the GPU takes about 0.4 seconds. Obviously, running more curves on a single CPU core would add time sequentially: 416 curves by CPU would take about 166 seconds.

This appears to indicate that while the GPU sounds impressive with all its cores, a quad core CPU would keep up with the K80 GPU in stage 1 runs.

Am I missing something here?
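
The comparison above can be checked with a few lines of arithmetic. This is only a rough Python sketch using the timings quoted in this post; the perfect scaling of CPU throughput with core count is an idealizing assumption, and the function names are made up for illustration.

```python
# Timings quoted in the post (K80 on Colab, stage 1 at B1 = 11e4).
GPU_CURVES = 416             # largest batch that finishes in the shorter time
GPU_SECONDS = 22.0           # approximate wall time for that batch
CPU_SECONDS_PER_CURVE = 0.4  # one curve on one CPU core


def gpu_curves_per_second(curves: int = GPU_CURVES,
                          seconds: float = GPU_SECONDS) -> float:
    """Stage 1 throughput of the GPU for one batch."""
    return curves / seconds


def cpu_curves_per_second(cores: int,
                          seconds_per_curve: float = CPU_SECONDS_PER_CURVE) -> float:
    """Stage 1 throughput of `cores` CPU cores.

    ASSUMPTION: perfect scaling across cores, which real machines rarely reach.
    """
    return cores / seconds_per_curve


def cpu_cores_to_match_gpu() -> float:
    """Number of (ideal) CPU cores needed to equal the GPU's throughput."""
    return gpu_curves_per_second() * CPU_SECONDS_PER_CURVE


if __name__ == "__main__":
    print(f"GPU:       {gpu_curves_per_second():.1f} curves/s")
    print(f"Quad core: {cpu_curves_per_second(4):.1f} curves/s")
    print(f"Cores to match the GPU: {cpu_cores_to_match_gpu():.1f}")
```

On these figures the K80 manages roughly 19 curves per second against about 10 for an ideal quad core, so the quad core comes out at about half the GPU's rate at this input size rather than level with it.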

2019-11-13, 22:54   #477
R.D. Silverman

Nov 2003

2×3×17×73 Posts

Quote:
Originally Posted by EdH
In playing with my Colab instance of GMP-ECM with GPU enabled (K80 GPU), I have made some observations:

The K80 shows 832 cores. If I run threads up to half of the cores, vs. over half the cores, the time for the run just about doubles. It is pretty close to constant within those halves. For example, running stage 1 for 5+3,1185L at 11e4, it takes just about 22 seconds to complete up to 416 curves, while more than 416 curves takes about 41 seconds. Of note, I am using multiples of 32 cores.

Running a single curve without the GPU takes about 0.4 seconds. Obviously, running more curves on a single CPU core would add time sequentially: 416 curves by CPU would take about 166 seconds.

This appears to indicate that while the GPU sounds impressive with all its cores, a quad core CPU would keep up with the K80 GPU in stage 1 runs.

Am I missing something here?
It looks good to me.

2019-11-14, 13:01   #478
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT)

2×2,837 Posts

Quote:
Originally Posted by EdH
In playing with my Colab instance of GMP-ECM with GPU enabled (K80 GPU), I have made some observations:

The K80 shows 832 cores. If I run threads up to half of the cores, vs. over half the cores, the time for the run just about doubles. It is pretty close to constant within those halves. For example, running stage 1 for 5+3,1185L at 11e4, it takes just about 22 seconds to complete up to 416 curves, while more than 416 curves takes about 41 seconds. Of note, I am using multiples of 32 cores.

Running a single curve without the GPU takes about 0.4 seconds. Obviously, running more curves on a single CPU core would add time sequentially: 416 curves by CPU would take about 166 seconds.

This appears to indicate that while the GPU sounds impressive with all its cores, a quad core CPU would keep up with the K80 GPU in stage 1 runs.

Am I missing something here?
The GPU doesn't care what size of number you give it, while the CPU does. A quad-core CPU probably wouldn't keep up so well on a 1000-bit number (I believe your test case had 720 bits).
I seem to recall that it was possible to compile versions of gpu-ecm that ran faster with a limit of ~256 or ~512 bits. I am not sure whether ~768 worked. I think it needed to be a power of 2.

2019-11-14, 14:18   #479
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

110010010000₂ Posts

Quote:
Originally Posted by henryzz
The GPU doesn't care what size of number you give it, while the CPU does. A quad-core CPU probably wouldn't keep up so well on a 1000-bit number (I believe your test case had 720 bits).
I seem to recall that it was possible to compile versions of gpu-ecm that ran faster with a limit of ~256 or ~512 bits. I am not sure whether ~768 worked. I think it needed to be a power of 2.
Hmm! I was thinking of just giving up on the GPU. Maybe I need to play more after all. . .

2019-11-14, 14:27   #480
henryzz
Just call me Henry

"David"
Sep 2007
Cambridge (GMT)

162A₁₆ Posts

Quote:
Originally Posted by EdH
Hmm! I was thinking of just giving up on the GPU. Maybe I need to play more after all. . .
Be careful if you go down the route of trying gpu-ecm with different limits. I think there was some concern about whether it was appearing to work but wasn't finding all expected factors.

2019-11-14, 14:53   #481
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

C90₁₆ Posts

Quote:
Originally Posted by henryzz
Be careful if you go down the route of trying gpu-ecm with different limits. I think there was some concern about whether it was appearing to work but wasn't finding all expected factors.
Is this discussion available somewhere? I don't remember it in this thread. I think I went through this thread - maybe I skipped some. . . Anyway, I probably won't pursue this too much further, at least not too soon.

Thanks!

2019-11-14, 16:35   #482
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

3×23×61 Posts

Henry is right: GPU-ECM runs at the same speed for any input within its bounds (1019 bits or smaller). So it's relatively a waste of time for small numbers, but quite nice for 900+ bit inputs. When I had a working CUDA setup, I used it for lots of 900-1000 bit inputs. Even in your test case, it doubles the production of a quad-core by doing stage 1 on the GPU and stage 2 on the CPU (with a couple of cores left over, since stage 2 runs faster than stage 1).
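
To illustrate why the GPU pays off mainly for larger inputs, here is a minimal Python sketch. The 1019-bit bound and the flat GPU time come from this thread; the quadratic growth of CPU time with bit length is an assumption (schoolbook multiplication), as is the 720-bit reference size, which was only believed to be right earlier in the thread. All function names are hypothetical and nothing here calls GMP-ECM itself.

```python
GPU_MAX_BITS = 1019                # input bound quoted in this thread
GPU_CURVES_PER_SECOND = 416 / 22   # K80 figure from the earlier test; flat for all sizes


def fits_gpu(n: int) -> bool:
    """True if n is small enough for the fixed-size GPU arithmetic."""
    return n.bit_length() <= GPU_MAX_BITS


def cpu_seconds_per_curve(bits: int,
                          ref_bits: int = 720,
                          ref_seconds: float = 0.4) -> float:
    """Estimated CPU stage 1 time for one curve on a `bits`-bit input.

    ASSUMPTION: cost grows with the square of the bit length, anchored
    at 0.4 s for a (believed) 720-bit input.
    """
    return ref_seconds * (bits / ref_bits) ** 2


def gpu_advantage(bits: int, cpu_cores: int = 4) -> float:
    """GPU throughput divided by the throughput of `cpu_cores` ideal cores."""
    if bits > GPU_MAX_BITS:
        raise ValueError("input too large for the GPU code")
    cpu_rate = cpu_cores / cpu_seconds_per_curve(bits)
    return GPU_CURVES_PER_SECOND / cpu_rate


if __name__ == "__main__":
    for bits in (512, 720, 1000):
        print(f"{bits:5d} bits: GPU is about {gpu_advantage(bits):.1f}x a quad core")
```

Under these assumptions the ratio grows with input size, since the GPU's rate stays fixed while the CPU slows down; the exact numbers depend entirely on the assumed scaling law.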

2019-11-14, 21:37   #483
fivemack
(loop (#_fork))

Feb 2006
Cambridge, England

14316₈ Posts

Quote:
Originally Posted by EdH
In playing with my Colab instance of GMP-ECM with GPU enabled (K80 GPU), I have made some observations:

The K80 shows 832 cores.
That rather surprises me, since a K80 card is specified as having 4992 cores, so you're getting 1/6 of it (and, as you say the timings double when you go to >416 curves, 1/12 of it).

(though my 1080Ti is specified as 3584 cores and runs 1792 curves at a time, so there's a small constant factor there in any case)
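
The fractions above can be worked out exactly. This is a small Python sketch using only the core and curve counts quoted in this thread, plus one labeled assumption about the Colab instance; the constant and function names are illustrative.

```python
from fractions import Fraction

K80_SPEC_CORES = 4992         # whole K80 card, as specified
K80_REPORTED = 832            # figure reported during the ECM run
K80_FAST_BATCH = 416          # largest batch before the time doubles
GTX_1080TI_SPEC_CORES = 3584
GTX_1080TI_CURVES = 1792      # curves run at a time on the 1080Ti

# ASSUMPTION (not stated in the thread): the session sees only one of the
# K80 board's two GPUs, each with half of the card's cores.
K80_SINGLE_GPU_CORES = K80_SPEC_CORES // 2


def share(used: int, total: int) -> Fraction:
    """Exact fraction of `total` cores accounted for by `used`."""
    return Fraction(used, total)


if __name__ == "__main__":
    print("K80 reported / card:   ", share(K80_REPORTED, K80_SPEC_CORES))
    print("K80 fast batch / card: ", share(K80_FAST_BATCH, K80_SPEC_CORES))
    print("1080Ti curves / cores: ", share(GTX_1080TI_CURVES, GTX_1080TI_SPEC_CORES))
    print("K80 reported / one GPU:", share(K80_REPORTED, K80_SINGLE_GPU_CORES))
```

If the single-GPU assumption holds, the reported 832 would be a third of the visible GPU rather than a sixth of the card, which narrows the gap without closing it.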

2019-11-14, 23:09   #484
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

2⁴×3×67 Posts

Thanks for all the info, everyone.

Quote:
Originally Posted by fivemack
That rather surprises me, since a K80 card is specified as having 4992 cores, so you're getting 1/6 of it (and, as you say the timings double when you go to >416 curves, 1/12 of it).

(though my 1080Ti is specified as 3584 cores and runs 1792 curves at a time, so there's a small constant factor there in any case)
I've run nvidia-smi to see what I've been assigned, but it hasn't shown me core count, that I can recognize, anyway. When I run ECM with it enabled, it tells me 832 cores as default, which is supposed to be saturation. Can a GPU serve others at the same time I'm using the 832? I kind of thought not.

I think one of the times I got a P100, it showed 4000+ with my ECM run. Since I only get two Xeon cores to back up all the GPU stage 1s, unless I pull the residues off, it's kind of less than ideal to run much in a GPU-ECM Colab session.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.