Old 2020-03-08, 22:18   #27
ewmayer
Sep 2002
República de California

2×7×829 Posts

I revisited the currently-disabled checkpointing code yesterday. It predates my experiments with multithreaded runs and thus needs a complete overhaul, but in silver-lining news, I believe I've found a way to do it that will serve both your current "poor man's multithreading" run mode and a future true-multithreaded implementation. Here are 2 code comments describing the schema I have in mind:
/* The factoring checkpoint file is assumed to be in ASCII format, with the following [entries].
User-added comments or annotations are allowed below lines 1 and 2, as long as they do not
trigger hits for the find-substring-in-file searches used to locate the mandatory,
program-autogenerated entries:

Line 1:	[String containing the current exponent stored in pstring.]
		Exponent must be odd (but not necessarily prime), and have digit length corresponding
		to the number * of 64-bit limbs set via -DP*WORD at build time. (-DNWORD means unlimited)
Line 2: [Value of TF_PASSES in the build (16 or 960)]

Followed by anywhere from 1 to TF_PASSES lines of the following form,
which need not be in numeric order but must have no index repeats:

Pass [index < TF_PASSES]: [Max factor-k value reached for this pass number by the run(s) which updated this savefile]
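
For illustration, one way a reader of this format might recognize a "Pass" entry is sketched below. This is a minimal sketch, not Mfactor code: the function name `parse_pass_line` is hypothetical, and it assumes the stored max-k value fits in 64 bits, which the description above does not guarantee.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of Mfactor): parse one line of the form
 *     Pass <index>: <max k reached>
 * Returns 1 and fills *index and *kmax on success. Returns 0 if the line
 * does not match the pattern (e.g. a user annotation) or if the index is
 * not < tf_passes. Assumes the k value fits in 64 bits. */
int parse_pass_line(const char *line, unsigned tf_passes,
                    unsigned *index, uint64_t *kmax)
{
    unsigned idx;
    uint64_t k;

    assert(line && index && kmax);
    if (sscanf(line, "Pass %u: %" SCNu64, &idx, &k) != 2)
        return 0;               /* not a Pass entry */
    if (idx >= tf_passes)
        return 0;               /* index out of range for this build */
    *index = idx;
    *kmax = k;
    return 1;
}
```

A caller would also need to track which indices it has already seen, since the format forbids index repeats; that bookkeeping is left out here.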

Here, "factor-k value" refers to the standard form of prime-exponent Mersenne number factors: M(p) has 1 or more
factors of form q = 2.k.p+1. Mfactor does not in fact require prime exponents, but for nonprime ones will only search
for factors of that form. For example, for the composite-exponent case M(25) = 2^25-1
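
The factor form q = 2.k.p+1 can be checked numerically for M(25). The sketch below is a toy illustration, not Mfactor's sieve: the names `divides_mersenne` and `first_factor_k` are hypothetical, and it assumes q < 2^32 so that the modular products fit in 64 bits. Since 2^25-1 = 31 × 601 × 1801, a search over q = 50k+1 finds 601 (k = 12) and 1801 (k = 36), while 31 divides M(25) but is not of that form.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if q divides 2^p - 1, using binary modular exponentiation.
 * Toy version: requires 1 < q < 2^32 so r*b and b*b cannot overflow. */
int divides_mersenne(uint64_t q, unsigned p)
{
    uint64_t r = 1, b = 2 % q;

    assert(q > 1 && q <= UINT32_MAX);
    while (p) {
        if (p & 1)
            r = r * b % q;      /* multiply in the current power of 2 */
        b = b * b % q;          /* square for the next exponent bit */
        p >>= 1;
    }
    return r == 1;              /* 2^p == 1 (mod q)  <=>  q | 2^p - 1 */
}

/* Smallest k in [1, kmax] for which q = 2*k*p + 1 divides 2^p - 1,
 * or 0 if there is none in that range. */
uint64_t first_factor_k(unsigned p, uint64_t kmax)
{
    for (uint64_t k = 1; k <= kmax; k++) {
        if (divides_mersenne(2 * k * p + 1, p))
            return k;
    }
    return 0;
}
```

With these definitions, `first_factor_k(25, 100)` returns 12, corresponding to q = 601.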

Every 1024th pass through the small-primes sieve, and also following the final pass
through the sieve, write the checkpoint file, with format as described previously.
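
The write schedule just described amounts to a simple predicate. A minimal sketch, assuming a 1-based count of completed sieve passes (the names `checkpoint_due` and `CHECKPOINT_INTERVAL` are hypothetical, not taken from the Mfactor source):

```c
#include <assert.h>
#include <stdint.h>

#define CHECKPOINT_INTERVAL 1024u   /* write every 1024th sieve pass */

/* passes_done:  number of sieve passes completed so far (1-based).
 * total_passes: number of sieve passes the run will make in all.
 * Returns 1 if a checkpoint should be written now, else 0. */
int checkpoint_due(uint64_t passes_done, uint64_t total_passes)
{
    assert(passes_done >= 1 && passes_done <= total_passes);
    return passes_done % CHECKPOINT_INTERVAL == 0   /* periodic write */
        || passes_done == total_passes;             /* final write */
}
```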

Since we expect that multiple jobs and/or threads may be working on the same exponent,
we make such checkpoint-updates atomic as follows:

1. job/thread X acquires file lock and opens checkpoint file <filename> for reading, if it exists.
	No other job/thread may acquire file lock until X releases it;
2. X also opens a 2nd, temporary, file <filename.tmp> for writing, beginning with line 1: exponent (= pstring)
	and line 2: value of TF_PASSES in the build (16 or 960);
3. If there was an existing savefile found in step [1], X copies its contents
	to the .tmp file in [2] line-by-line, only updating the single "Pass *: [max k reached]" entry
	corresponding to the current pass whose progress is being saved via checkpointing;
4. X closes both files, renames <filename.tmp> to <filename>, thus overwriting the now-obsolete
	version of the latter;
5. X releases the file lock and resumes processing factor candidates.
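
Steps 1-5 can be sketched roughly as follows. This is a hedged illustration under stated assumptions, not the planned Mfactor implementation: `update_checkpoint` is a hypothetical name, the code assumes POSIX (`flock`, `rename`), max-k values that fit in 64 bits, and lines shorter than the 1024-byte buffer. It takes the lock on a sidecar file `<filename>.lock` (also an assumption), and it appends the entry when there is no savefile yet or no existing entry for the pass, a case the steps above leave open.

```c
#include <assert.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical sketch: record kmax for one pass. Returns 0 on success, -1 on error. */
int update_checkpoint(const char *fname, const char *pstring,
                      unsigned tf_passes, unsigned pass, uint64_t kmax)
{
    char lockname[1024], tmpname[1024], line[1024];
    unsigned idx;
    int lineno = 0, written = 0, ok;

    assert(pass < tf_passes);
    if (strlen(fname) + 6 > sizeof lockname)    /* room for ".lock" + NUL */
        return -1;
    snprintf(lockname, sizeof lockname, "%s.lock", fname);
    snprintf(tmpname, sizeof tmpname, "%s.tmp", fname);

    /* Step 1: take an exclusive lock, then open the savefile if it exists. */
    int lockfd = open(lockname, O_CREAT | O_RDWR, 0644);
    if (lockfd < 0)
        return -1;
    if (flock(lockfd, LOCK_EX) != 0) { close(lockfd); return -1; }
    FILE *in = fopen(fname, "r");               /* NULL if no savefile yet */

    /* Step 2: start the temporary file with the two header lines. */
    FILE *out = fopen(tmpname, "w");
    if (!out) { if (in) fclose(in); close(lockfd); return -1; }
    fprintf(out, "%s\n%u\n", pstring, tf_passes);

    /* Step 3: copy the old body, rewriting only the entry for this pass. */
    while (in && fgets(line, sizeof line, in)) {
        if (++lineno <= 2)
            continue;                           /* headers were rewritten above */
        if (sscanf(line, "Pass %u:", &idx) == 1 && idx == pass) {
            if (!written)
                fprintf(out, "Pass %u: %" PRIu64 "\n", pass, kmax);
            written = 1;
        } else {
            fputs(line, out);                   /* other passes, annotations */
        }
    }
    if (!written)                               /* no previous entry: append */
        fprintf(out, "Pass %u: %" PRIu64 "\n", pass, kmax);

    /* Step 4: close both files, then rename the temp file over the old one. */
    ok = !ferror(out);
    if (in) fclose(in);
    if (fclose(out) != 0) ok = 0;
    if (ok && rename(tmpname, fname) != 0) ok = 0;

    /* Step 5: closing the descriptor releases the flock. */
    close(lockfd);
    return ok ? 0 : -1;
}
```

The sidecar lock file is a design choice for the sketch: the rename in step 4 replaces the savefile with a new file, so a lock held on the savefile itself would not block a job that opens the file after the rename. A real implementation would presumably also verify that the exponent and TF_PASSES in an existing savefile match the current run before copying it.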