mersenneforum.org (https://www.mersenneforum.org/index.php)

 VictordeHolland 2015-03-05 17:24

1M-9M GPU TF vs. CPU P-1/ECM (KWh/factor)

There are some people (including myself) who are doing TF, P-1 and/or ECM in the 1M-9M region. Currently the 2M range is TFed to 65 bits. The goal is to find factors for exponents without known factors.

The lower the exponent, the more effort (GHzd) it takes to TF to the same bitlevel. The lower exponents are also 'cheaper' to do P-1/ECM on, due to the smaller FFT sizes. Which poses the question: at what point does P-1/ECM on a CPU make more sense than TF on a GPU? I know comparing CPUs and GPUs is a bit like comparing apples to oranges, but the idea is to spend electricity wisely (read: lowest KWh/factor).

Of course there are other uses of GPU resources that are more helpful to GIMPS (DCTF, LLTF), but let's take them out of the equation for the moment.

Power and GHzd-d approximation
CPU: Intel i5 2500k 30-33 GHzd-d of P-1/ECM and uses 135W.
GPU: AMD 280X that can TF ~600 GHzd-d (<69bits) in these low ranges and uses 250W.

Power/GHzd: (nice numbers for easier calc)
CPU: 135W * 24h / 32.4GHzd-d = 100Wh/GHzd
GPU: 250W * 24h / 600GHzd-d = 10Wh/GHzd
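
For anyone who wants to plug in their own hardware, the two divisions above can be sketched in a few lines of Python. The function name `wh_per_ghzd` is made up for illustration; the wattages and throughputs are the rounded figures quoted in this post, not measurements.

```python
def wh_per_ghzd(watts: float, ghzd_per_day: float) -> float:
    """Energy in Wh spent per GHz-day of credited work.

    watts * 24 is the energy used in one day (Wh); dividing by the
    daily throughput (GHzd per day) gives Wh per GHzd.
    """
    return watts * 24 / ghzd_per_day


# Figures from the post ("nice numbers"): i5 2500k and AMD 280X.
cpu = wh_per_ghzd(135, 32.4)  # P-1/ECM on the CPU
gpu = wh_per_ghzd(250, 600)   # TF on the GPU

print(f"CPU: {cpu:.0f} Wh/GHzd, GPU: {gpu:.0f} Wh/GHzd")
```

Swapping in a different card or CPU only means changing the two arguments.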

Assuming TF yields 1 factor per 200 runs (because some P-1/ECM has already been done):

Rng | 65->66bit | effort/factor | Wh/factor
2M | 3.74 GHzd | 748 GHzd | 7,480
4M | 1.87 GHzd | 374 GHzd | 3,740
6M | 1.25 GHzd | 250 GHzd | 2,500
8M | 0.93 GHzd | 186 GHzd | 1,860
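
The TF table is just the per-run effort times 200 runs per factor, times the GPU's 10 Wh/GHzd. A minimal sketch that recomputes it follows; the names `tf_65_to_66` and `tf_wh_per_factor` are hypothetical, and the 1-in-200 rate is the assumption stated above, not a measured value.

```python
GPU_WH_PER_GHZD = 10    # from the power approximation above
RUNS_PER_FACTOR = 200   # assumed: 1 factor per 200 TF runs

# GHzd needed to take one exponent from 65 to 66 bits, per range (from the post).
tf_65_to_66 = {"2M": 3.74, "4M": 1.87, "6M": 1.25, "8M": 0.93}


def tf_wh_per_factor(ghzd_per_run: float) -> float:
    """Expected GPU energy (Wh) spent per factor found by TF."""
    return ghzd_per_run * RUNS_PER_FACTOR * GPU_WH_PER_GHZD


for rng, ghzd in tf_65_to_66.items():
    effort = ghzd * RUNS_PER_FACTOR
    print(f"{rng} | {ghzd} GHzd | {effort:.0f} GHzd | {tf_wh_per_factor(ghzd):,.0f}")
```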

I've been doing some P-1 (B1=10e6 B2=200e6) in the 1.5-1.7M range and so far found 83 factors in ~1300 attempts, which works out to about 1/16 (remember nice numbers ;-) ). Expanding that to the higher ranges:

Rng | P-1 GHzd | effort/factor | Wh/factor
2M | 1.94 GHzd | 31.04 GHzd | 3,104
4M | 3.68 GHzd | 58.88 GHzd | 5,888
6M | 5.23 GHzd | 83.68 GHzd | 8,368
8M | 7.71 GHzd | 123.36 GHzd | 12,336
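
The same arithmetic for P-1, as a rough sketch. Carrying the 1-in-16 rate from the 1.5-1.7M range over to the higher ranges is the assumption made in this post; `p1_wh_per_factor` is a made-up name.

```python
CPU_WH_PER_GHZD = 100      # from the power approximation above
ATTEMPTS_PER_FACTOR = 16   # rounded from 83 factors in ~1300 attempts

# Attempts per factor actually observed in the 1.5-1.7M range.
observed_attempts_per_factor = 1300 / 83

# GHzd per P-1 attempt (B1=10e6, B2=200e6), per range (from the post).
p1_ghzd = {"2M": 1.94, "4M": 3.68, "6M": 5.23, "8M": 7.71}


def p1_wh_per_factor(ghzd_per_attempt: float) -> float:
    """Expected CPU energy (Wh) spent per factor found by P-1."""
    return ghzd_per_attempt * ATTEMPTS_PER_FACTOR * CPU_WH_PER_GHZD


print(f"observed: 1 factor per {observed_attempts_per_factor:.1f} attempts")
for rng, ghzd in p1_ghzd.items():
    print(f"{rng} | {ghzd} GHzd | {p1_wh_per_factor(ghzd):,.0f} Wh/factor")
```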

With ECM I ran 2300 curves (B1=5e4 B2=5e6) in the 1.5-1.7M range and found 2 factors. The experts will probably kill me for saying this, but let's assume 1 factor in 1000 curves.

Rng | ECM GHzd | effort/factor | Wh/factor
2M | 0.0845 GHzd | 84.5 GHzd | 8,450
4M | 0.180 GHzd | 180 GHzd | 18,000
6M | 0.270 GHzd | 270 GHzd | 27,000
8M | 0.397 GHzd | 397 GHzd | 39,700
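
Since the ECM rate rests on only 2 factors, it may be worth seeing how much the result moves between the assumed rate (1 in 1000 curves) and the raw observed one (2 in 2300). The sketch below does just that; `ecm_wh_per_factor` is a hypothetical helper, and with a sample this small neither rate should be taken as reliable.

```python
CPU_WH_PER_GHZD = 100  # from the power approximation above

# GHzd per ECM curve (B1=5e4, B2=5e6), per range (from the post).
ecm_ghzd_per_curve = {"2M": 0.0845, "4M": 0.180, "6M": 0.270, "8M": 0.397}

ASSUMED_CURVES_PER_FACTOR = 1000       # the "nice number" used in the table
OBSERVED_CURVES_PER_FACTOR = 2300 / 2  # 2 factors in 2300 curves


def ecm_wh_per_factor(ghzd_per_curve: float, curves_per_factor: float) -> float:
    """Expected CPU energy (Wh) spent per factor found by ECM."""
    return ghzd_per_curve * curves_per_factor * CPU_WH_PER_GHZD


for rng, ghzd in ecm_ghzd_per_curve.items():
    assumed = ecm_wh_per_factor(ghzd, ASSUMED_CURVES_PER_FACTOR)
    observed = ecm_wh_per_factor(ghzd, OBSERVED_CURVES_PER_FACTOR)
    print(f"{rng} | assumed {assumed:,.0f} Wh | observed {observed:,.0f} Wh")
```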

So the rule would be: GPU TF until Wh/F[SUB]TF GPU[/SUB] of the next bitlevel > Wh/F[SUB]CPU P-1/ECM[/SUB]?

That would imply:
2M no further GPU TF
4M to 66bits
6M to 67bits
8M to 68bits
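
The stopping rule can be sketched as a small loop. Two assumptions that are not stated explicitly in the post are baked in here: each extra bitlevel costs twice the GHzd of the previous one, and the 1-in-200 TF success rate stays the same at every bitlevel. The CPU side uses whichever of P-1 or ECM is cheaper per factor. All names (`DATA`, `tf_limit`) are made up for illustration.

```python
GPU_WH_PER_GHZD = 10
CPU_WH_PER_GHZD = 100
TF_RUNS_PER_FACTOR = 200       # assumed TF success rate
P1_ATTEMPTS_PER_FACTOR = 16    # assumed P-1 success rate
ECM_CURVES_PER_FACTOR = 1000   # assumed ECM success rate

# range: (TF 65->66 GHzd, P-1 GHzd per attempt, ECM GHzd per curve), from the post
DATA = {
    "2M": (3.74, 1.94, 0.0845),
    "4M": (1.87, 3.68, 0.180),
    "6M": (1.25, 5.23, 0.270),
    "8M": (0.93, 7.71, 0.397),
}


def tf_limit(tf_ghzd, p1_ghzd, ecm_ghzd, start_bits=65, max_bits=80):
    """Bitlevel to TF to before CPU P-1/ECM becomes cheaper per factor."""
    cpu_wh = CPU_WH_PER_GHZD * min(p1_ghzd * P1_ATTEMPTS_PER_FACTOR,
                                   ecm_ghzd * ECM_CURVES_PER_FACTOR)
    tf_wh = tf_ghzd * TF_RUNS_PER_FACTOR * GPU_WH_PER_GHZD
    bits = start_bits
    # Keep going while the next bitlevel is no more expensive than the CPU route.
    while bits < max_bits and tf_wh <= cpu_wh:
        bits += 1
        tf_wh *= 2  # assumption: effort doubles with each bitlevel
    return bits


for rng, row in DATA.items():
    print(rng, tf_limit(*row))
# prints: 2M 65, 4M 66, 6M 67, 8M 68 (one per line)
```

With other hardware, only the two Wh/GHzd constants need changing; different success rates will of course shift the limits as well.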

Is there something fundamentally wrong with my assumptions, or is GPU TF just still quite efficient in the >4M region?

[B]Disclaimer: [/B]Just to be [U]very clear[/U], this endeavour is purely for FUN! Nothing scientific to be gained here. :smile:

 alpertron 2015-03-05 18:40

[QUOTE=VictordeHolland;397082]
With ECM I ran 2300 curves (B1=5e4 B2=5e6) in the 1.5-1.7M range and found 2 factors. The experts will probably kill me for saying this: but let's assume 1 factor in 1000 curves.
Rng | ECM GHzd | effort/factor | Wh/factor
2M | 0.0845 GHzd | 84.5 GHzd | 8,450
4M | 0.180 GHzd | 180 GHzd | 18,000
6M | 0.270 GHzd | 270 GHzd | 27,000
8M | 0.397 GHzd | 397 GHzd | 39,700
[/QUOTE]

Given the amount of P-1 and TF already done for those numbers, I think it is advisable to run ECM curves with B1 = 250000, B2 = 25E6.

 lycorn 2015-03-05 22:26

[QUOTE=VictordeHolland;397082]There are some people (including myself) that are doing TF, P-1 and/or ECM in the 1M-9M region.[/QUOTE]

And don't forget the < 1M range. Several people are doing ECM/P-1 there, as well as some TF to "kill" the remaining exponents at less than 65 bits. That range is currently the one with the highest percentage of exponents with at least one known factor (over 78%).
I am currently devoting my (scarce) resources to ECM in that range and TF there and in the 1 - 2M range.

 VictordeHolland 2015-03-06 00:17

[QUOTE=lycorn;397118]And don't forget the < 1M range. Several people are doing ECM/P-1 there, as well as some TF to "kill" the remaining exponents at less than 65 bits.[/QUOTE]
I'm not forgetting you ;). The point of this exercise was to calculate (at least approximately) how far TFing would 'make sense' if CPU/GPU energy consumption is the [U]only[/U] concern.

With my current hardware (if I didn't make any big mistakes), the TF limits would seem to be:

Rng | bits
2M | 65
3M | 65
4M | 66
5M | 67
6M | 67
7M | 68
8M | 68
9M | 69

I'm not obliging anyone to use these values. Different CPUs and GPUs could give different values. It could also be that more processing power is available on one of the two. Software could become more efficient in the future, etc.

And people are of course free to use their resources as they please.

 bloodIce 2015-03-07 09:29

I am also interested in the 2M range, but sometimes I go to 1M, <1M and rarely 3M. It is only for fun, so don't take my opinion seriously. But let's talk about 2M. I have a 10x less powerful card (~29 GHzd/d) that is also less efficient than yours (the tables at [url]www.mersenne.ca[/url] say that mine is 1.005 GHzd/W and yours is 1.498 GHzd/W). Even with these numbers I usually go to TF68 in 2M before I succumb to Pminus1 or ECM2. A powerful enough reason here is that it is not fun any more to go to TF70 (or not as effective in finding factors).

Usually, in the same time, I will complete ~250 ECM curves (B1=5e4 B2=5e6) on the CPU. That way I would test the exponent to ~TF75. The only reason I do crazy Pminus1 after ECM (or it was crazy with my previous CPU, now it seems fine) is to fully exhaust all viable options in reasonable time. Pminus1 is rarely of any help after ECM, but sometimes there are surprises.

So, seeing your GPU, I would say go to TF68 in >2M, and clearly at least 1 bitlevel up (TF66). For 1M, I guess TF66 would be fine. Then go to ECM; it is still more efficient at finding factors up to the 3.5M range. For the rest of you who would say that I am an idiot, please read my reason again - JFF.
