mersenneforum.org  

Old 2011-12-22, 21:57   #1442
Chuck
 
May 2011
Orange Park, FL

375₁₆ Posts
Checkpoint overhead?

Can someone estimate what the overhead of checkpoints is? I decided several weeks ago to turn them off, as mfaktc and my computer are very stable. On rare occasions I need to reboot the computer, and I might lose an hour of processing time if I am too impatient to wait for the current bitlevels to finish.

I am wondering if a month's overhead of checkpoints is more than an hour of lost work time.
Old 2011-12-23, 00:10   #1443
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by Chuck
Can someone estimate what the overhead of checkpoints is? I decided several weeks ago to turn them off, as mfaktc and my computer are very stable. On rare occasions I need to reboot the computer, and I might lose an hour of processing time if I am too impatient to wait for the current bitlevels to finish.

I am wondering if a month's overhead of checkpoints is more than an hour of lost work time.
I just timed checkpoints (CPs) on a Windows 7 64-bit Core i7-M620 laptop with a slow disk.

Per CP:
0.01 ms for creating the checksum (CPU load)
0.2 ms for writing and closing the file
1 ms for the remove/rename operations on the backup file (mfakto only; mfaktc just does a remove, ~0.2 ms)
1 ms for committing to disk (fflush)
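
The four steps in the list above (checksum, write and close, rotate the backup, commit) can be sketched in Python. This is only an illustration, not the actual mfaktc or mfakto code: the function names, the `.tmp` and `.bak` suffixes, the one-line payload format, and the choice of CRC32 are all assumptions made for the example.

```python
import os
import zlib
from typing import Optional


def write_checkpoint(path: str, payload: str) -> None:
    """Write payload plus a CRC32 checksum; keep the previous file as a backup.

    Hypothetical layout: one text line, "<payload> <8 hex digits of CRC32>".
    """
    checksum = zlib.crc32(payload.encode("ascii"))  # step 1: checksum (CPU only)
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="ascii") as f:  # step 2: write and close
        f.write(f"{payload} {checksum:08X}\n")
        f.flush()             # hand Python's buffer to the OS
        os.fsync(f.fileno())  # step 4: ask the OS to commit the data to disk
    if os.path.exists(path):  # step 3: rotate the old checkpoint to a backup
        os.replace(path, path + ".bak")
    os.replace(tmp_path, path)


def read_checkpoint(path: str) -> Optional[str]:
    """Return the payload if the stored checksum matches, otherwise None."""
    try:
        with open(path, encoding="ascii") as f:
            line = f.read().rstrip("\n")
    except FileNotFoundError:
        return None
    payload, _, stored = line.rpartition(" ")
    try:
        stored_crc = int(stored, 16)
    except ValueError:
        return None
    return payload if stored_crc == zlib.crc32(payload.encode("ascii")) else None
```

If the current file fails its checksum on restart, a caller could fall back to `read_checkpoint(path + ".bak")`. The explicit `os.fsync` is usually the expensive step on a slow disk, which fits the roughly 1 ms commit cost measured above.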


CPs are written after a class is finished and before more work is loaded onto the GPU, so this is "idle time" for the GPU if you run just a single instance. When running several instances per GPU, one instance's checkpoint write overlaps with the GPU work of the others.

So if you assume a single instance, 2 ms per CP, one CP after each class, and 2 seconds per class, then you spend 0.1% of the time writing CPs (this should be pretty much the worst case). 0.1% of one month is ~45 min. If you lose 1 h per month by not writing CPs, you'd already be better off enabling them.
And now you can configure mfaktc to write CPs less frequently: in your case you can set the delay to the maximum (900 s), and it will still write a CP when you abort it with ^C. Then you spend about 6 seconds per month writing CPs.
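
For anyone who wants to plug in their own numbers, the arithmetic above can be written out as a short script. It is a rough sketch: the 30-day month and the function name are assumptions, and it ignores the small correction for the checkpoint time itself being part of each interval.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 s, assuming a 30-day month


def checkpoint_seconds_per_month(cp_cost_s: float, interval_s: float) -> float:
    """Seconds per month spent writing checkpoints, one checkpoint per interval."""
    checkpoints = SECONDS_PER_MONTH / interval_s
    return checkpoints * cp_cost_s


# Worst case from the post: 2 ms per CP, one CP after every 2-second class.
per_class = checkpoint_seconds_per_month(0.002, 2.0)
# Maximum checkpoint delay from the post: one CP every 900 seconds.
max_delay = checkpoint_seconds_per_month(0.002, 900.0)

print(f"one CP per class: {per_class / 60:.1f} min/month")
print(f"one CP per 900 s: {max_delay:.2f} s/month")
```

With the figures from the post this works out to about 43 minutes per month at one checkpoint per class, and a little under 6 seconds per month at the 900 s maximum, so the break-even against an hour of lost work is easy to judge either way.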

Still anyone running without checkpoints?
Old 2011-12-23, 02:09   #1444
nucleon
 
Mar 2003
Melbourne

515₁₀ Posts

hehe, ramdisk - and all those problems disappear.

-- Craig
Old 2011-12-23, 03:14   #1445
Chuck
 
May 2011
Orange Park, FL

1565₈ Posts

Thanks Bdot, that was very helpful. I hadn't looked at checkpoints for some time, since before GPUTO72 I was "lumberjacking" in the M600,000,000 range, where a TF run took around a minute (I was using chalsall's MORE_CLASSES-disabled version).

I went with 600 as the checkpoint delay. It's nice that one is taken after a CTRL-C.
Old 2011-12-23, 03:46   #1446
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by Chuck
(I was using chalsall's MORE_CLASSES disabled version)
That wasn't me, Guv.
Old 2011-12-23, 04:12   #1447
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

27AE₁₆ Posts

Quote:
Originally Posted by chalsall
That wasn't me, Guv.
That would have been "mfaktc171apsen.cuda40.sm_multi.LESS_CLASSES", maybe?
Old 2011-12-23, 13:19   #1448
Chuck
 
May 2011
Orange Park, FL

1101110101₂ Posts

Oh, that's right, chalsall is the GPUTO72 author. Anyway, there was a post somewhere with MORE_CLASSES disabled or LESS_CLASSES enabled, and I picked up the executable and used it for a couple of months.
Old 2011-12-23, 15:39   #1449
TheJudger
 
"Oliver"
Mar 2005
Germany

11·101 Posts

Quote:
Originally Posted by Chuck
Oh that's right chalsall is the GPUTO72 author — anyway there was a post somewhere with the MORE_CLASSES disabled or LESS_CLASSES enabled and I picked up the executable and used it for a couple of months.
I've posted an executable without MORE_CLASSES here (mfaktc 0.17).

Oliver
Old 2011-12-25, 03:29   #1450
Radikalinsky
 
Mar 2011
Germany

11 Posts

I just found a factor with 0.18:

Quote:
M52248761 has a factor: 3708847255636615579439 [TF:70:72*:mfaktc 0.18 barrett79_mul32]
found 1 factor for M52248761 from 2^70 to 2^72 (partially tested) [mfaktc 0.18 barrett79_mul32]
Obviously the prime server does not yet like the nice new accurate messages from version 0.18.
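
A result line like the one quoted above can be sanity-checked locally before it is submitted: a number f divides the Mersenne number 2^p - 1 exactly when 2^p mod f equals 1, and the factor should fall inside the reported bit range. The sketch below uses only Python's built-in three-argument `pow`; the function names are made up for the example and have nothing to do with what the server does.

```python
def divides_mersenne(p: int, f: int) -> bool:
    """True if f divides 2**p - 1, i.e. if 2**p is congruent to 1 modulo f."""
    return pow(2, p, f) == 1


def in_bit_range(f: int, lo: int, hi: int) -> bool:
    """True if 2**lo <= f < 2**hi, matching a 'from 2^lo to 2^hi' result line."""
    return (1 << lo) <= f < (1 << hi)


# Small known case: 2**11 - 1 = 2047 = 23 * 89.
print(divides_mersenne(11, 23), divides_mersenne(11, 89))

# Values copied from the result line quoted in this post.
p, f = 52248761, 3708847255636615579439
print(divides_mersenne(p, f), in_bit_range(f, 70, 72))
```

The modular exponentiation takes only a few dozen squarings for an exponent of this size, so the check is effectively instant.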

Quote:
No factor lines found: 0
Mfaktc no factor lines found: 0
Mfakto no factor lines found: 0
Factors found: 1
Processing result: M52248761 has a factor: 3708847255636615579439
Insufficient information for accurate CPU credit. For stats purposes, assuming factor was found using P-1 with B1 = 800000.
CPU credit is 2.4586 GHz-days.
P-1 lines found: 0
LL lines found: 0
Mlucas lines found: 0
Glucas (G29) lines found: 0
Glucas lines found: 0
MacLucasFFTW lines found: 0
CUDALucas lines found: 0
ECM lines found: 0
Edit: Ok, I just saw that this is on James Heinrich's todo list. Sorry

Last fiddled with by Radikalinsky on 2011-12-25 at 03:35 Reason: It's always a good idea to read the documentation ;-)
Old 2011-12-25, 03:35   #1451
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2×3×1,693 Posts

Quote:
Originally Posted by Radikalinsky
I just found a factor with 0.18:


Obviously the prime server does not yet like the nice new accurate messages from version 0.18.
I saw this once. I think it occurred when I uploaded the result before the second, "end of level" line was generated. As in:

Code:
M52279247 has a factor: 1525757169405396899617 [TF:70:71:mfaktc 0.18 barrett79_mul32]
found 1 factor for M52279247 from 2^70 to 2^71 [mfaktc 0.18 barrett79_mul32]

Last fiddled with by kladner on 2011-12-25 at 03:36
Old 2011-12-25, 04:04   #1452
Radikalinsky
 
Mar 2011
Germany

11₁₀ Posts

@Kladner,
I manually submitted both lines. Maybe it is because with partial tests the primenet server makes some assumptions. But as I understand it, the primenet server just does not yet understand all the details of the mfaktc messages, from either 0.17 or 0.18.
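
To show which details are packed into the bracketed part of the 0.18 "has a factor" lines quoted in this thread, here is a small Python parser. It is a guess built from the two example lines only: the field names are invented, and reading the trailing `*` as "range only partially tested" is an assumption based on the "(partially tested)" wording of the summary line, not on any documentation.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern inferred from the two example lines in this thread (an assumption).
FACTOR_LINE = re.compile(
    r"^M(\d+) has a factor: (\d+) \[TF:(\d+):(\d+)(\*?):([^\]]+)\]$"
)


@dataclass
class FactorResult:
    exponent: int   # p in M<p>
    factor: int     # the factor that was found
    bit_lo: int     # lower end of the trial-factoring range, as a power of 2
    bit_hi: int     # upper end of the trial-factoring range, as a power of 2
    partial: bool   # True if the range carries the '*' marker
    program: str    # free text, e.g. "mfaktc 0.18 barrett79_mul32"


def parse_factor_line(line: str) -> Optional[FactorResult]:
    """Parse one 'has a factor' line; return None if it does not match."""
    m = FACTOR_LINE.match(line.strip())
    if m is None:
        return None
    p, f, lo, hi, star, program = m.groups()
    return FactorResult(int(p), int(f), int(lo), int(hi), star == "*", program)


line = ("M52248761 has a factor: 3708847255636615579439 "
        "[TF:70:72*:mfaktc 0.18 barrett79_mul32]")
print(parse_factor_line(line))
```

A line in the older format without the bracketed suffix simply returns `None` here, which is one way a consumer could tell the two formats apart.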

Thanks, Rad
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.