Checkpoint overhead?
Can someone estimate what the overhead of checkpoints is? I decided several weeks ago to turn them off, as mfaktc and my computer are very stable. On rare occasions I need to reboot the computer, and I might lose an hour of processing time if I am too impatient to wait for the current bitlevels to finish.
I am wondering if a month's overhead of checkpoints is more than an hour of lost work time. |
[QUOTE=Chuck;283235]Can someone estimate what the overhead of checkpoints is? I decided several weeks ago to turn them off, as mfaktc and my computer are very stable. On rare occasions I need to reboot the computer, and I might lose an hour of processing time if I am too impatient to wait for the current bitlevels to finish.
I am wondering if a month's overhead of checkpoints is more than an hour of lost work time.[/QUOTE]
I just timed CPs on a W7-64 Core i7-M620 laptop with a slow disk. Per CP:
0.01 ms for creating the checksum (CPU load)
0.2 ms for writing & closing the file
1 ms for remove/rename operations for the backup file (mfakto only; mfaktc just has a remove, ~0.2 ms)
1 ms for committing to disk (fflush)

CPs are written after a class is finished and before more work is loaded on the GPU, so this is "idle time" for the GPU if you just run a single instance. When running more instances per GPU, they will overlap.

So if you calculate for a single instance, with 2 ms per CP, one CP after each class, and 2 seconds per class, then you spend 0.1% of the time writing the CP (this should be pretty much worst case). 0.1% of one month is ~45 min. If you lose 1 h/month due to not writing CPs, you'd already be better off enabling them.

And now you can configure mfaktc to write CPs less frequently: in your case you can set it to the maximum (900 s) and it will still write a CP when you abort it with ^C. Then you spend about 6 seconds per month writing the CPs.

Is anyone still running without checkpoints? :smile: |
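The arithmetic in the estimate above can be reproduced with a few lines of Python. This is only a rough sketch: the 2 ms cost, the 2 s class time and the 900 s maximum delay come from the post, while the 30-day month and the function name are assumptions made for illustration.

```python
# Back-of-the-envelope checkpoint overhead, using the timings quoted in the post.
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month


def checkpoint_overhead_seconds(cp_cost_ms, interval_s, period_s=SECONDS_PER_MONTH):
    """Total seconds spent writing checkpoints over period_s,
    assuming one checkpoint is written every interval_s seconds."""
    checkpoints = period_s / interval_s
    return checkpoints * cp_cost_ms / 1000.0


# Worst case from the post: one ~2 ms checkpoint after every 2 s class.
per_class = checkpoint_overhead_seconds(cp_cost_ms=2.0, interval_s=2.0)

# Checkpoint delay set to the maximum of 900 s.
max_delay = checkpoint_overhead_seconds(cp_cost_ms=2.0, interval_s=900.0)

print(f"one CP per 2 s class: {per_class / 60:.1f} min/month")
print(f"one CP per 900 s:     {max_delay:.2f} s/month")
```

With these inputs the per-class case comes to a bit over 43 minutes per month and the 900 s case to a little under 6 seconds, which matches the rounded figures in the post.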
hehe ramdisk - and all those problems disappear.
-- Craig |
Thanks bdot, that was very helpful. I hadn't looked at checkpoints for some time, since before GPUTO72 I was "lumberjacking" in the M600,000,000 range where a TF run took around a minute (I was using chalsall's MORE_CLASSES disabled version).
I went with 600 as the checkpoint delay. It's nice that one is taken after a CTRL-C. |
[QUOTE=Chuck;283266](I was using chalsall's MORE_CLASSES disabled version)[/QUOTE]
That wasn't me, Guv. |
[QUOTE=chalsall;283272]That wasn't me, Guv.[/QUOTE]
That would have been "mfaktc171apsen.cuda40.sm_multi.LESS_CLASSES", maybe? |
Oh that's right chalsall is the GPUTO72 author — anyway there was a post somewhere with the MORE_CLASSES disabled or LESS_CLASSES enabled and I picked up the executable and used it for a couple of months.
|
[QUOTE=Chuck;283310]Oh that's right chalsall is the GPUTO72 author — anyway there was a post somewhere with the MORE_CLASSES disabled or LESS_CLASSES enabled and I picked up the executable and used it for a couple of months.[/QUOTE]
I've posted an executable without MORE_CLASSES [URL="http://www.mersenneforum.org/showpost.php?p=273900&postcount=363"]here[/URL] (mfaktc 0.17). Oliver |
I just found a factor with 0.18:
[QUOTE]M52248761 has a factor: 3708847255636615579439 [TF:70:72*:mfaktc 0.18 barrett79_mul32]
found 1 factor for M52248761 from 2^70 to 2^72 (partially tested) [mfaktc 0.18 barrett79_mul32]
[/QUOTE]Obviously the prime server does not yet like the nice new accurate messages from version 0.18.
[QUOTE]
No factor lines found: 0
Mfaktc no factor lines found: 0
Mfakto no factor lines found: 0
Factors found: 1
Processing result: M52248761 has a factor: 3708847255636615579439
Insufficient information for accurate CPU credit.
For stats purposes, assuming factor was found using P-1 with B1 = 800000.
CPU credit is 2.4586 GHz-days.
P-1 lines found: 0
LL lines found: 0
Mlucas lines found: 0
Glucas (G29) lines found: 0
Glucas lines found: 0
MacLucasFFTW lines found: 0
CUDALucas lines found: 0
ECM lines found: 0
[/QUOTE]
Edit: OK, I just saw that this is on James Heinrich's todo list. Sorry |
[QUOTE=Radikalinsky;283459]I just found a factor with 0.18:
Obviously the prime server does not yet like the nice new accurate messages from version 0.18.[/QUOTE]
I saw this once. I think it occurred when I uploaded the result before the second, "end of level" line was generated. As in:
[CODE]M52279247 has a factor: 1525757169405396899617 [TF:70:71:mfaktc 0.18 barrett79_mul32]
found 1 factor for M52279247 from 2^70 to 2^71 [mfaktc 0.18 barrett79_mul32]
[/CODE] |
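For anyone who wants to sanity-check a factor line locally before submitting it, here is a minimal Python sketch. The regular expression and the helper names are assumptions made for illustration; this is not how PrimeNet or mfaktc actually parse results. It only looks at the "M<exponent> has a factor: <factor>" prefix, as seen in the result lines quoted in this thread, and then tests whether the factor divides 2^exponent - 1 by modular exponentiation.

```python
import re

# Assumed pattern: matches only the "M<exponent> has a factor: <factor>" prefix
# of a result line and ignores whatever follows in square brackets.
FACTOR_LINE = re.compile(r"^M(\d+) has a factor: (\d+)")


def parse_factor_line(line):
    """Return (exponent, factor) as ints, or None if the line does not match."""
    m = FACTOR_LINE.match(line.strip())
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def divides_mersenne(exponent, factor):
    """True if factor divides 2**exponent - 1, i.e. 2**exponent == 1 (mod factor)."""
    return pow(2, exponent, factor) == 1


line = "M52279247 has a factor: 1525757169405396899617 [TF:70:71:mfaktc 0.18 barrett79_mul32]"
parsed = parse_factor_line(line)
if parsed is not None:
    exponent, factor = parsed
    print(exponent, factor, divides_mersenne(exponent, factor))
```

The second, "found 1 factor for ..." line does not match the pattern, so parse_factor_line returns None for it.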
@Kladner,
I manually submitted both lines. Maybe it is because with partial tests the primenet server makes some assumptions. But as I understand it, the primenet server just does not yet understand all the details of the mfaktc message, both 0.17 and 0.18. Thanks, Rad |