I currently have 7 cores from laptops helping out: six working on 71-72 and one on 72-73.
It is fun watching how the graphs change each month. |
[QUOTE=Dubslow;294224]When we've raced ahead of the LL wave, sure :smile:
OTOH, chalsall's algos currently predict there's ~200 days of TF work to be done up through 61M, so it might be a while :P[/QUOTE] And by then there will be a need to go to 62 or 63 or 64M to keep ahead of LL. |
[QUOTE=LaurV;294237]I will take 2 exponents that need to be taken 8 bits higher (like 70 to 78, or 71 to 79, or 72 to 80) when I reach home in a few hours, and I will try some timing for a gtx580@782MHz. Depending on how long it takes, I may finish them too... :D[/QUOTE]
So, 4 copies of mfaktc running on 2 gtx580 can spit out one exponent in this range from 69 to 70 bits every 56 seconds. And get 0.3596 GD credit for each. :smile: I just reserved 40, and have done most of them already. At this cadence, and using a small trick*, my system could take one exponent from 0 to 80 bits in 38 hours in the worst case, and 34 hours and 4 minutes in the best case. P95 estimated May 2012 :smile:.

*the trick is to split the 79-to-80 bitlevel in two: I use one instance to do 69-78 bits, for which, taking the base of 4 minutes and doubling per bitlevel, the time is 4+8+16+...+1024 minutes = 34 hours and 4 minutes. The last step of that run, from 77 to 78, takes roughly 1024 minutes, ignoring the fact that mfaktc gets more efficient on longer assignments and that the initial step is shorter than 4 minutes. Those 1024 minutes are 17 hours and 4 minutes, or a maximum of 1140 minutes (19 hours) with some CPU activity (daily work). Together, the whole 69-78 range should take between 34 hours 4 minutes and 38 hours. The second instance of mfaktc then takes the 78 to 79 bitlevel, for which the estimate is also about 35 hours, say 38 maximum.

Then here comes the trick: to fill the GPU to over 90%, instances 3 and 4 have to split the work from 79 to 80 bits between them. For this, I manually created a "fake" checkpoint file, the same way the function "checkpoint_write()" from checkpoint.c in mfaktc does it (in fact, I copied the function to a separate place, compiled it, and launch it with my expo, my class, etc). The "fake" checkpoint file tells mfaktc that 2310 of the 4620 classes were already done. So, when I start the third copy of mfaktc, it starts from class 2311 and says that 36 hours are left (1 day, 12 hours). The 4th copy is started normally, from 79 to 80 bits; it says 3 days are left, and I must take care to stop it when it reaches 2310 classes done.
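The timing arithmetic behind the estimate can be sketched like this; the 4-minute base for the 69→70 step and the per-bitlevel doubling are taken from the post, while the function name is mine:

```python
# Sketch of the timing arithmetic in the post: each TF bitlevel takes
# about twice as long as the previous one, starting from a 4-minute
# 69->70 step.

def bitlevel_minutes(base_minutes, first_bit, last_bit):
    """Minutes for each step first_bit->first_bit+1, ..., last_bit-1->last_bit."""
    return [base_minutes * 2 ** (b - first_bit) for b in range(first_bit, last_bit)]

steps = bitlevel_minutes(4, 69, 78)   # the 69-78 run of instance 1
print(steps[-1])                      # 1024 minutes for the 77->78 step
print(divmod(sum(steps), 60))         # (34, 4): 34 hours 4 minutes total
```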
It would be very useful to have an option for mfaktc to run "from class x to class y"; we could split big bitlevels into "sub-bitlevels" like this, without "faking" the resume function. Anyhow, I would need 3 days to run the two expos I got up to 80 bits; I don't know if I will let them run to the end, I may stop at 76. But I will finish all 40 expos I got from 69 to 70; while I was writing this post another few of them finished, and there are only 6 left, which will finish in the next 6 minutes. edit: finished all 40 expos from 69 to 70, no factor found. I have another 2 expos currently running from 75 to 76, about 1 hour left; I will finish this bitlevel for sure, but I may not continue to 80. |
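A "from class x to class y" option would amount to partitioning mfaktc's 4620 classes among the running instances. A minimal sketch of that bookkeeping, assuming every class costs about the same time (the constant and function name are mine, not mfaktc's):

```python
# Illustrative sketch of splitting one bitlevel's classes across
# instances (the "from class x to class y" idea; names are mine).

NUM_CLASSES = 4620  # mfaktc's standard class count (4*3*5*7*11)

def split_classes(num_instances):
    """Inclusive (first, last) class ranges, one per instance."""
    per = NUM_CLASSES // num_instances
    ranges = [(i * per, (i + 1) * per - 1) for i in range(num_instances)]
    ranges[-1] = (ranges[-1][0], NUM_CLASSES - 1)  # last one absorbs any remainder
    return ranges

print(split_classes(2))  # [(0, 2309), (2310, 4619)] -- the 2310-class split above
```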
Is it worth running exponents to 79-80 bits if you still don't know if they have factors in the lower bit ranges?
Luigi |
[QUOTE=ET_;294266]Is it worth running exponents to 79-80 bits if you still don't know if they have factors in the lower bit ranges?
Luigi[/QUOTE] Most probably not... Lower bitlevels should be finished before one tries higher bitlevels. It was just a tentative on my side to see how fast one exponent can be taken from 72 to 80 in a 2x580 system. To maximize both cards you need to pump at least 4 CPU cores into them, that is why the need to split the range into 4 approximate-equal (in time) units of work, and work all in the same time. And this can not really be done without splitting the last bitlevel, as that takes half of the time. If a factor is found by one of the instances, of course, everything would stop. Meantime I finished the 76th bit for two expos, it took 4 hours and 50 minutes to each, and the estimated time for the 77th bit is 8 hours and 35 minutes. I would stop now and move to other things. [CODE] 4603/4620 | 3.08G | 16.418s | 0m33s | 187.64M/s | 5000 | 1.84% 4608/4620 | 3.08G | 16.008s | 0m16s | 192.45M/s | 5000 | 1.76% 4615/4620 | 3.08G | 16.036s | 0m00s | 192.11M/s | 5000 | 2.10% no factor for M332201657 from 2^75 to 2^76 [mfaktc 0.18 barrett79_mul32] tf(): time spent since restart: 4h 43m 47.781s estimated total time spent: 4h 51m 23.069s Starting trial factoring M332201657 from 2^76 to 2^77 k_min = 113722888088040 k_max = 227445776183814 Using GPU kernel "barrett79_mul32" class | candidates | time | ETA | avg. 
rate | SievePrimes | CPU wait 0/4620 | 6.16G | 31.746s | 8h27m | 194.09M/s | 5000 | 1.90% 3/4620 | 6.16G | 32.128s | 8h32m | 191.78M/s | 5000 | 1.81% 4/4620 | 6.16G | 32.266s | 8h34m | 190.96M/s | 5000 | 1.83% 12/4620 | 6.16G | 31.980s | 8h29m | 192.67M/s | 5000 | 1.90%[/CODE] and [CODE] 4611/4620 | 3.08G | 16.037s | 0m32s | 192.10M/s | 5000 | 2.07% 4616/4620 | 3.08G | 16.147s | 0m16s | 190.79M/s | 5000 | 1.99% 4619/4620 | 3.08G | 16.033s | 0m00s | 192.15M/s | 5000 | 2.20% no factor for M332264749 from 2^75 to 2^76 [mfaktc 0.18 barrett79_mul32] WARNING: can't delete the checkpoint file "mfaktc.ckp" tf(): time spent since restart: 4h 43m 22.811s estimated total time spent: 4h 48m 10.994s Starting trial factoring M332264749 from 2^76 to 2^77 k_min = 113701293852300 k_max = 227402587705487 Using GPU kernel "barrett79_mul32" class | candidates | time | ETA | avg. rate | SievePrimes | CPU wait 0/4620 | 6.16G | 32.258s | 8h35m | 190.97M/s | 5000 | 1.86% 11/4620 | 6.16G | 31.891s | 8h29m | 193.17M/s | 5000 | 1.91% 12/4620 | 6.16G | 31.921s | 8h29m | 192.99M/s | 5000 | 1.93% 15/4620 | 6.16G | 31.818s | 8h26m | 193.61M/s | 5000 | 1.97%[/CODE] |
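The k_min/k_max lines in the output follow from the factor form q = 2kp+1: trial factoring M_p from 2^b to 2^(b+1) scans k from about 2^b/(2p) to 2^(b+1)/(2p). A quick check against the first log above (mfaktc rounds k_min down toward a class boundary, so the logged value sits a few thousand below the raw quotient):

```python
# Any factor q of M_p has the form q = 2*k*p + 1, so TF from 2^b1 to
# 2^b2 scans k in roughly [2^b1 / (2p), 2^b2 / (2p)).

def k_range(p, bits_lo, bits_hi):
    return (2 ** bits_lo) // (2 * p), (2 ** bits_hi) // (2 * p)

k_min, k_max = k_range(332201657, 76, 77)
# Both land within one class block (< 4620) of the logged
# k_min = 113722888088040 and k_max = 227445776183814.
print(k_min, k_max)
```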
[QUOTE=LaurV;294264]So, 4 copies of mfaktc running on 2 gtx580 can spit out one exponent in this range from 69 to 70 bits every 56 seconds. And get 0.3596 GD credit for each.[/QUOTE]
All exponents from 332192831[SUP]*[/SUP] to 332399999 are already at 71 bits. All exponents from 332192831 to 332599999 are already at 68 bits. All exponents from 332192831 to 332999999 are already at 67 bits. [COLOR="DimGray"]*the first possible candidate above 100M digits.[/COLOR] This effort is two-pronged: 1) attempt to get all exponents to 79 or higher (and P-1) before they are handed out for L-L; 2) clear out as many exponents as possible ahead of time (so if Jane User decides to do a random 100M-digit LL [above the current front], the exponent is likely to have had at least a reasonable level of TF). Again, do what you enjoy.:smile: |
Ok, to make you happy, I will take 4000 expos to 68 :P:P
First results after about 100 of them (it takes 25 to 40 seconds each, depending on what else the harvester is doing): [CODE]M332609747 has a factor: 185637517603452208793 [TF:67:68*:mfaktc 0.18 barrett79_mul32]
M332609861 has a factor: 166287716399044373849 [TF:67:68*:mfaktc 0.18 barrett79_mul32][/CODE] By the way, PrimeNet crashed 3 times when I wanted to see my assignment list :D, and I believe that every time I clicked "retry" it in fact assigned me another batch. I never had so many assignments... I am not sure I grabbed all of them; I will check, after I finish what I have, whether anything is left in the list. But I can't even see the list... it is too big, and PrimeNet gives up saying that the request takes too much time. |
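Reported factors like these are cheap to double-check: a q dividing M_p = 2^p − 1 satisfies 2^p ≡ 1 (mod q), and every factor of M_p has the form 2kp+1. A sketch verifying the first factor above:

```python
# A factor q of M_p = 2^p - 1 satisfies pow(2, p, q) == 1, and every
# factor of M_p (prime or composite) has the form q = 2*k*p + 1.

def check_factor(p, q):
    return pow(2, p, q) == 1 and (q - 1) % (2 * p) == 0

p, q = 332609747, 185637517603452208793
print(check_factor(p, q))  # True -- the first factor reported above
print(q.bit_length())      # 68 -- consistent with the TF:67:68 tag
```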
By the way LaurV, if you're doing assignments less than 10 minutes or so, you should try using the LESS_CLASSES version of mfaktc, you might get a noticeable improvement on very short assignments.
|
[QUOTE=Dubslow;294325]By the way LaurV, if you're doing assignments less than 10 minutes or so, you should try using the LESS_CLASSES version of mfaktc, you might get a noticeable improvement on very short assignments.[/QUOTE]
You are talking a foreign language :D I know there is a "less classes" version, as that was (unfortunately) the first one I ever installed when I started to come here around last year. But after I understood what and how, I deleted that version and installed the "correct" one. I had the feeling the less-classes one was for OBD only, and never tried to use it again. So, you say I could use it for 100M assignments too? Well... I will try to find it and play with it, thanks!

edit: meantime I see I found another 3 factors:
332607181: 221376198437107268759
332615743: 189167942656620123641
332602679: 204441886716511071697

edit 2: if "less classes" is significantly faster for such short assignments, then I may try to do all of Uncwilly's range from 67 to 68 or higher. The bad news is, first, that I can't reach that computer again today, and second, that I will be on holiday next week, till April 16, and in this period we won't touch the computers... our SWMBO will be watching like a hawk... We are "suffering" [URL="http://en.wikipedia.org/wiki/Songkran"]Songkran[/URL] here then, and we will retreat to a nice island somewhere in the south for two weeks, to get away from the 40k-50k crazy farangs and 200k-300k crazy Thais who come to ChiangMai especially for the water fights every year (and a few tens of them never make it back to their homes!). We had enough Songkran in our first 2-3 years in Thailand (that was 10 years ago) and have tried to avoid it since. |
It depends on the candidates per class, which is almost exactly proportional to assignment runtime. If candidates per class is too low (too much overhead, reduced efficiency) with 4620 classes, then the obvious way to increase that ratio is to reduce the number of classes. That's why mfaktc is more efficient with longer-running assignments: the candidates-per-class ratio is higher. The higher the ratio, the better, up to a certain point.
You'll have to ask TheJudger to figure out the threshold where less classes is better. (There is a less_classes Windows binary available in the mirror.) Edit: Of course, if you take each expo through 80 bits, then you'll obviously have a very long running assignment and so the standard version would be better. |
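The candidates-per-class argument can be illustrated with a toy cost model; the per-class overhead number below is invented purely for illustration, not measured from mfaktc (the standard build uses 4620 classes, and the LESS_CLASSES build, if I recall correctly, 420):

```python
# Toy model: each class carries a fixed launch/sieve-setup overhead, so
# a short assignment wastes proportionally more time when split into
# 4620 classes than into 420.  The 50 ms overhead is an invented figure.

def total_time(work_seconds, num_classes, overhead_per_class=0.05):
    return work_seconds + num_classes * overhead_per_class

for classes in (4620, 420):
    t = total_time(60.0, classes)  # a ~1 minute assignment
    print(f"{classes} classes: {t:.1f}s total")
```

For a long assignment the fixed per-class cost is negligible either way, which is why the standard 4620-class build is fine for multi-hour bitlevels.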
[QUOTE=Dubslow;294358]Edit: Of course, if you take each expo through 80 bits, then you'll obviously have a very long running assignment and so the standard version would be better.[/QUOTE]I think that the best utilization of the GPU resources is to take exponents that are already around 74 and run them up to 79 or 80.
[url]http://v5www.mersenne.org/report_factoring_effort/?exp_lo=332192831&exp_hi=332399999&bits_lo=74&bits_hi=79&txt=1&exassigned=1&B1=Get+Data[/url] That is what is available in the 'classical' 100M digit range. |