[QUOTE=airsquirrels;413881]Clearly GIMPs penalizes my ranking for doing LL on the GPU. Curiously, it doesn't seem like it is worth it for me do the 74->75 work vs. just doing the LL.[/QUOTE]
You've shown that to be the case. You should stick to ->74 with that card. Given that CPU resources still outweigh GPU resources, sticking to TF, which your GPU can do efficiently, is best for the project, I would say... I don't know whether James' numbers/graphs reflect the throughput you get when two copies of clLucas are run.
[QUOTE=airsquirrels;413881]
For LLTF: 38 TF to 74 for 960 GhzDays, clearing a number every 2-3 days for both TF and LL; 17 TF to 75 for 900 GhzDays, clearing a number every 5-6 days for TF, every 2-3 for LL.[/QUOTE] Well... for LL you have to double the time; actually, 2.1 or 2.2 times is more correct than 2.0, because every factor you find will "save" two LL tests [U]and[/U] a bit of P-1 work too. That is why LLTF always goes one bit higher than DCTF for the same exponent range: the amount of time spent on TF doubles with each bit. So, if you need 3 days to do an LL on a 70M+ expo, and doing TF instead finds a factor every 5-6 days, you are still better off (i.e. more helpful to the project) doing TF.

As Mark said, you've shown the case pretty well. I only assumed that the time you quoted is for a [U]single[/U] LL test (which seems reasonable: if you do a DC in one day, then an LL in 2-3 days is right, because you have double the number of iterations and each iteration takes longer; sorry if my assumption was wrong).

[QUOTE]Interestingly, cLucas isn't very effective at utilizing a card to full capacity either. I get better total throughput running two instances on each card, which further improves the numbers for doing LL vs. TF.[/QUOTE]This is very valuable information. Not many of us have new toys like that, and I think we have to open some requests in the clLucas thread. Maybe Bdot or Kracker (or someone else?) will take up the gauntlet...
The 2x credit for a factor found due to skipping both the LL and DC sounds good on paper, but in reality by the time we actually get around to the DC'ing it is going to be years down the road. Most likely we will all have much much faster hardware at that point and the "time saved" for the DC is likely a lot shorter than it would be if we did the DC today.
If we're talking about the choice between work we are actually going to do today with today's hardware, it's still TF vs. the single LL.
[QUOTE=airsquirrels;413951]The 2x credit for a factor found due to skipping both the LL and DC sounds good on paper, but in reality by the time we actually get around to the DC'ing it is going to be years down the road. Most likely we will all have much much faster hardware at that point and the "time saved" for the DC is likely a lot shorter than it would be if we did the DC today.
If we're talking about the choice between work we are actually going to do today with today's hardware, it's still TF vs the single LL.[/QUOTE] If you're talking about time, yes, but not if you're talking about work. The amount of work will still be the same in the future unless a more efficient algorithm is discovered.

You also missed the excitement near the beginning of the year when we ran out of optimally trial-factored DC exponents. What looked like being a year ahead changed when it was decided to assign DC work by default when a new user joins. :)

The last bit is only worth doing when it can eliminate both LL tests. If we miss doing it before the first LL test, it's not worth doing before the second, given current hardware and algorithms. With your card, it's probably worth stopping at 74 and letting someone else take it to 75. The trade-off tends to sit at a slightly higher bit level with NVidia GPUs, iirc.
[QUOTE=Mark Rose;413989]You also missed the excitement near the beginning of the year when we ran out of optimally trial factored DC exponents. What looked like being a year ahead changed when it was decided to assign DC work by default when a new user joins. :)[/QUOTE]
Good times! George changed the assignment policy, assigning DC instead of LL for the "churners". I said I didn't think we could make it, but we /just/ did, because many jumped into the DCTF'ing just in time.
[QUOTE=airsquirrels;413747]My testing against clFFT 2.8 on a mild OC Fury X has been reliable, but ultimately trial factoring is much much more efficient on the GPU than LL tests.
I can do a DC range check fairly quickly on the GPU, while the same check on one of my CPU cores takes 5-6 days. That's 6x faster than a CPU core for LL vs. order(s) of magnitude faster for TF.

[CODE]continuing work from a partial result
M41783789 fft length = 2359296 iteration = 12112
Iteration 20000 0xfaeba26dafa9c190, n = 2359296 err = 0.1172 (0:21 real, 2.1078 ms/iter, ETA 24:27:01)
Iteration 30000 0x792c2daddb35206b, n = 2359296 err = 0.1172 (0:27 real, 2.6578 ms/iter, ETA 30:49:21)
Iteration 40000 0x34e1e2ca8738e03e, n = 2359296 err = 0.1172 (0:26 real, 2.6495 ms/iter, ETA 30:43:08)
Iteration 50000 0xf5608f00c78edc10, n = 2359296 err = 0.1172 (0:27 real, 2.6803 ms/iter, ETA 31:04:09)
Iteration 60000 0x4de84f8901847e82, n = 2359296 err = 0.1172 (0:27 real, 2.6523 ms/iter, ETA 30:44:15)[/CODE][/QUOTE] Try setting a longer checkpoint interval (the -c argument) and see if that has any effect on the speed.
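For reference, a longer checkpoint interval would be set something like this. The -c flag is the one mentioned above; the exact syntax and the value used here are assumptions, so check your clLucas build's help output:

```shell
# Hypothetical invocation: checkpoint every 100000 iterations instead of the
# default, reducing the save-file writes that stall the GPU. The exponent is
# the one from the log above; flag syntax assumed, verify with ./clLucas -h.
./clLucas -c 100000 41783789
```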
[QUOTE=airsquirrels;413951]If we're talking about the choice between work we are actually going to do today with today's hardware, it's still TF vs the single LL.[/QUOTE]
No, because your guy in the future will also have the option to choose between TF and LL. Think about it!
[QUOTE=LaurV;414039]No, because your guy in the future will also have the option to choose between TF and LL. Think about it![/QUOTE]
More to the point: if, tomorrow, a processor were released that is 10x faster at both TF and LL, the judgement of where to stop TF would not change one bit. You're saying "faster hardware in the future, therefore...", but the only speedup that's relevant is one that speeds up one task more than the other. Your claim is equivalent to predicting that future LL tests will take less computational effort (as opposed to time) than today's LL tests, a claim that is difficult to support.
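That scale-invariance can be made concrete with a two-line check: multiply both tasks' times by the same factor and the decision is unchanged, because only their ratio enters the inequality. (Illustrative sketch; the ~2.1 multiplier is the one from earlier in the thread, the timings are made up.)

```python
# "Is one more bit of TF worth it?" reduces to the inequality
#   tests_saved / tf_days > 1 / ll_days,  i.e.  tests_saved * ll_days > tf_days.
# Scaling both timings by any k > 0 leaves it unchanged, so uniformly
# faster hardware cannot move the break-even point.

def tf_beats_ll(tf_days, ll_days, tests_saved=2.1):
    """True if expected LL-days saved by TF exceed the TF time spent."""
    return tests_saved * ll_days > tf_days

today    = tf_beats_ll(tf_days=5.5, ll_days=2.5)
tomorrow = tf_beats_ll(tf_days=0.55, ll_days=0.25)  # everything 10x faster

print(today == tomorrow)  # True: the same verdict either way
```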
It all depends what resource you are optimizing for. In my case, I'm optimizing for my time. If hypothetically my goal is to clear as many exponents as I can this year and my hardware reliability is very high, I'm not much concerned with whether my method of clearing takes a few of someone else's cycles years down the road, especially if those cycles have a low probability of accomplishing anything other than verifying a result that was most likely correct the first time.
That assumes I regularly monitor and verify the sanity of my hardware, but in the hypothetical world I can do that. In that scenario a TF clear and an LL clear both hold the same value to present-day me; the DC check is someone else's fruitless future effort. It's a selfish optimization and not reflective of my own spirit in this project, but given those assumptions it makes some sense.
This view is, in my opinion, both true and valid. Optimize-for-self and optimize-for-project reach two different conclusions.
[QUOTE=VBCurtis;414166]Optimize-for-self and optimize-for-project reach two different conclusions.[/QUOTE]
Probably not. If the project's goal is to find a prime as soon as possible, then a single LL test is what you should optimize for. You don't need a double-check to find a prime.