mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Information & Answers (https://www.mersenneforum.org/forumdisplay.php?f=38)
-   -   Trial Factoring for Dummies (https://www.mersenneforum.org/showthread.php?t=25493)

LOBES 2020-04-27 16:08

Trial Factoring for Dummies
 
I've recently got back into things, and really enjoy using mfaktc-win-64.exe to do TF'ing. I've got an nVidia 1080 Ti so it can burn through them at a fairly good pace. But I do have a couple of comments/questions about TF'ing and Manual Testing I would love to get more info on:

1. So let's look at exponent 203437321. I recently used mfaktc to TF from 71 to 74, and no factors were found. So the status (as of this writing, anyway) of this exponent is: [B]No factors below 2^74[/B]. Did my TF work actually save any time for when this exponent is actually tested for being prime, say using LL? Will LL run any faster for whoever tests it because it knows there are no factors below that level? Or did I simply not eliminate the exponent from testing because it was not factored? I'm not entirely clear on how I contributed in this instance.

2. As I said, I enjoy doing the TF work, and like getting manual assignments, both directly from [url]https://www.mersenne.org/manual_assignment/[/url] and also from GPU72 ([url]https://www.gpu72.com/account/getassignments/[/url]). However, it seems to me that the submission process is wide open to fraudulent results? I would never do so, but from looking at the results.txt that gets generated, it would be easy to fudge that to claim work that wasn't actually done.

3. On a slightly different topic, is work that is done on Primenet duplicated on Primegrid? If I'm going to be away from my PC for a while, I sometimes fire up Generalized Fermat Prime Search using BOINC. Is there any "stepping on toes" so to speak between these two projects?

Uncwilly 2020-04-27 16:28

[QUOTE=LOBES;543960]1. So let's look at exponent 203437321. I recently used mfaktc to TF from 71 to 74, and no factors were found. So the status (as of this writing, anyway) of this exponent is: [B]No factors below 2^74[/B]. Did my TF work actually save any time for when this exponent is actually tested for being prime, say using LL? Will LL run any faster for whoever tests it because it knows there are no factors below that level? Or did I simply not eliminate the exponent from testing because it was not factored? I'm not entirely clear on how I contributed in this instance.[/quote]
If you didn't find a factor, we still have to do a primality test. (Currently, because of lower error rates, we are favoring PRP tests. If an exponent passes that, then we will do an LL to nail it down for sure.) On a good GPU we should take that exponent to 79 bits. TF is used to eliminate the need for primality tests; it does not speed up any individual test, but it does speed up the project overall.
[quote]2. As I said, I enjoy doing the TF work, and like getting manual assignments, both directly from [url]https://www.mersenne.org/manual_assignment/[/url] and also from GPU72 ([url]https://www.gpu72.com/account/getassignments/[/url]). However, it seems to me that the submission process is wide open to fraudulent results? I would never do so, but from looking at the results.txt that gets generated, it would be easy to fudge that to claim work that wasn't actually done.[/quote]This is a known issue. If someone reports a false factor, that is easy to detect. If they report a single false "No Factor", that is only possible to detect by repeating the TF. If someone reports 500 results with "No Factor", we can use stats to look at how many factors they 'should' have found. So big fraud is easy to detect. The overhead of producing some checksum during TF has been discussed, and some ideas of how to do it have been mentioned, but no action has been taken.

I will let someone else answer #3

LaurV 2020-04-27 16:29

[QUOTE=LOBES;543960]
1. Did my TF work actually save any time for when this exponent is actually tested for being prime, say using LL?
[/QUOTE]

No.

[QUOTE]
Will LL run any faster for whoever tests it because it knows there are no factors below that level?

[/QUOTE]
No.
[QUOTE]
Or did I simply not eliminate the exponent from testing because it was not factored?

[/QUOTE]
Yes.
[QUOTE]
I'm not entirely clear on how I contributed in this instance.
[/QUOTE]
No contribution for this particular exponent. But if you factor to x bits, you have (about) a one-in-x chance of finding a factor. In that case, you eliminate an exponent, and for every x exponents TF'ed we only need to make x-1 LL/PRP tests and x-1 DC tests, instead of x and x. If the time to TF x exponents to x bits is less than the time for 2 LL/PRP tests (which it usually IS), then it is a gain for the project, and THAT is your contribution.
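To put toy numbers on that trade-off, here is a back-of-the-envelope sketch in Python; the costs are purely illustrative assumptions, not real GIMPS timings:

[CODE]# Illustrative check of the "one factor in x trials" heuristic above.
x = 74                       # bit level TF'ed to
tf_hours_per_exponent = 1.0  # assumed GPU-hours to TF one exponent to x bits
test_hours = 50.0            # assumed cost of one LL/PRP test

spent = x * tf_hours_per_exponent   # TF x exponents, expect ~1 factor
saved = 2 * test_hours              # one first test plus one DC avoided
print(spent < saved)                # True -> net gain for the project
[/CODE]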
[QUOTE]
2. However, it seems to me that the submission process is wide open to fraudulent results?
[/QUOTE]
Yes. It appears so. You can cheat a few times and never get caught. But doing so repeatedly, you will be caught with your paw in the cookie jar, because there is an expected rate of factors which you will not be able to reach (see above) if you cheat. Sooner than you expect: some guys here have nothing better to do than watch their neighbors... :razz:
[QUOTE]
3. On a slightly different topic, is work that is done on Primenet duplicated on Primegrid? If I'm going to be away from my PC for a while, I sometimes fire up Generalized Fermat Prime Search using BOINC. Is there any "stepping on toes" so to speak between these two projects?
[/QUOTE]

Not really. Those guys know what they are doing.

chalsall 2020-04-27 16:33

[QUOTE=LOBES;543960]Did my TF work actually save any time for when this exponent is actually tested for being prime, say using LL?[/QUOTE]

Strictly speaking, no. Only when a factor is found is an FC (LL or PRP) test saved.

BUT, the P-1 run /will/ be slightly faster, because Prime95/mprime spends slightly less time (with a slightly lower chance of finding a factor) the deeper a candidate has been TF'ed.

[QUOTE=LOBES;543960]However, it seems to me that the submission process is wide open to fraudulent results?[/QUOTE]

You are correct. There is no way to prove that a TF run was actually done, except for where a factor is found.

However, GPU72 (and, I believe, Primenet) both keep track of the number of factors that /should/ have been found as a heuristic function of the number of runs done.

When GPU72 was first started, I would often waste days of GPU time double-checking runs of participants whose stats were "outliers". Not once did I ever find anyone cheating (and/or with bad kit).

kriesel 2020-04-27 17:12

[QUOTE=chalsall;543964]There is no way to prove that a TF run was actually done, except for where a factor is found.[/QUOTE]Not currently implemented in client software or on the server, but R. Gerbicz came up with a method for providing a proof of work in the TF no-factor case. See the many links to its discussion at [URL]https://www.mersenneforum.org/showpost.php?p=509936&postcount=2[/URL]
I think a variation could be applied to P-1. See [url]https://www.mersenneforum.org/showpost.php?p=510079&postcount=127[/url]

chalsall 2020-04-27 17:50

[QUOTE=kriesel;543967]See the many links to its discussion at [URL]https://www.mersenneforum.org/showpost.php?p=509936&postcount=2[/URL][/QUOTE]

Again, I must thank you for your thorough curation of this very technical and rarefied space! Very valuable. :tu:

I don't have the time to drill down on all the prior discussion. I will say, however, that Oliver asked (many years ago) if anyone had ideas on how to produce a "checksum" or some other "proof of work" mechanism. Several ideas were proposed, but they all failed to be viable because of the unpredictable (and massive) concurrency of the CUDA TF runs.

If Gerbicz has proposed something which /is/ viable, it would be cool to see it codified and beta-tested.

Not that I think there's much, if any, cheating going on. But there /could/ be more "bad kit" out there than we realize. Having mfakt[c|o] be able to provide a long-term sanity data stream would be very useful to many people.

kriesel 2020-04-27 19:09

[QUOTE=chalsall;543971]Again, I must thank you for your thorough curation of this very technical and rarefied space! Very valuable.[/QUOTE]You're welcome. It's my hope that when I become unwilling or unable (hopefully many years from now) to continue it, someone else will step forward to carry it on somehow. No, I don't have anyone in mind.

LOBES 2020-04-27 19:54

Is there perhaps a better use of my 1080 Ti than doing TF'ing? From what I understand, mfaktc-win-64.exe is only used for Trial Factoring.

What is (if there is) the recommended program for doing actual prime testing using GPU (Windows based only at this time)? That is why I also use PrimeGrid, since BOINC makes it so easy.

chalsall 2020-04-27 20:05

[QUOTE=kriesel;543984]No I don't have anyone in mind.[/QUOTE]

It's a somewhat rare skill-set -- not everyone can do it. But ask a librarian *anything*, and they'll know (or know where to read). :smile:

You're probably best equipped to answer LOBES's questions. I'd be interested in the answer(s) myself.

Dylan14 2020-04-27 20:06

@lobes: For GPU primality testing on Nvidia cards you can either use CUDAlucas, or you can use gpuowl. gpuowl has the advantage that it has Gerbicz error detection (at the slight cost of the test being a probable prime test).
Builds and source code links for gpuowl can be found [URL="https://www.mersenneforum.org/showpost.php?p=488539&postcount=4"]here[/URL], and those for CUDAlucas [URL="https://download.mersenne.ca/CUDALucas"]here[/URL].

kriesel 2020-04-27 20:06

[QUOTE=LOBES;543990]Is there perhaps a better use of my 1080 Ti than doing TF'ing? From what I understand, mfaktc-win-64.exe is only used for Trial Factoring.

What is (if there is) the recommended program for doing actual prime testing using GPU (Windows based only at this time)? That is why I also use PrimeGrid, since BOINC makes it so easy.[/QUOTE]TF will provide the most computing credit from such a GPU, and the most benefit to the GIMPS project per hour of running. P-1 factoring is another possibility, with either CUDAPm1 or gpuowl. If setting out to use a GPU for primality testing or P-1, first run at least one or two doublechecks of the work of others who did first tests, to verify the reliability of your setup. (Get assignments for whatever you run.) Verifying known P-1 factors is another reliability test.
For primality testing, use gpuOwl to run PRP with Gerbicz error checking. A PRP result indicating composite, with the GEC, is almost conclusive proof of not prime. If it returns P, indicating probably prime, it is either a software bug, a false positive, or time to confirm or disprove the discovery of a new prime with multiple LL tests. To doublecheck an LL test, use either CUDALucas or a very recent version of gpuowl. [URL]http://www.mersenneforum.org/showpost.php?p=488291&postcount=2[/URL]

Uncwilly 2020-04-27 20:31

[QUOTE=LOBES;543990]Is there perhaps a better use of my 1080 Ti than doing TF'ing? From what I understand, mfaktc-win-64.exe is only used for Trial Factoring.[/QUOTE]
While you can use it for other things, it is best for GIMPS if you use it for TF. By using GPUs, the level of TF that makes sense is higher, and thus more exponents are eliminated. Also, it makes sense to have GPUs do -all- the TF rather than having the CPUs do any.

R.D. Silverman 2020-04-27 22:14

[QUOTE=chalsall;543971]Again, I must thank you for your thorough curation of this very technical and rarefied space! Very valuable. :tu:

I don't have the time to drill down on all the prior discussion. I will say, however, that Oliver asked (many years ago) if anyone had ideas on how to produce a "checksum" or some other "proof of work" mechanism. Several ideas were proposed, but they all failed to be viable because of the unpredictable (and massive) concurrency of the CUDA TF runs.

If Gerbicz has proposed something which /is/ viable, it would be cool to see it codified and beta-tested.

Not that I think there's much, if any, cheating going on. But there /could/ be more "bad kit" out there than we realize. Having mfakt[c|o] be able to provide a long-term sanity data stream would be very useful to many people.[/QUOTE]

When one requests to perform TD on 2^p-1, have the server provide a set of indices
s = [I1, I2, I3, .....]. Encrypt s with (say) AES. When one does the trial division,
SAVE the remainders r_I = (2^p-1) mod q_I for all I in s, where q_I is the I-th candidate divisor. Then return
HASH(r1 | r2 | r3 | ....) for some chosen hash function.

Bury the AES key within the TD code. Yes, a determined hacker can find and extract the key with enough
effort. But people with the skill to do that are unlikely to want to cheat.
One can even change the key periodically if desired (via a key exchange) to make extracting it a little bit
harder. Yes, the new key can also be extracted, but it requires
extra work.


In other words, return a hash of a random set of concatenated remainders. This at least
proves that the worker did at least some of the divisions. It can be easily checked by
the server. This is similar to a ZKP technique known as "cut and choose".
Change the set s each time.
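A minimal Python sketch of that flow (helper names are hypothetical, and the AES wrapping of s is omitted; a real client would collect the remainders during the TF itself rather than recomputing them):

[CODE]import hashlib

def remainder(p, q):
    # r = (2^p - 1) mod q, via modular exponentiation
    return (pow(2, p, q) - 1) % q

def proof_hash(p, divisors, indices):
    # Concatenate the remainders at the server-chosen indices and hash them.
    blob = b"".join(remainder(p, divisors[i]).to_bytes(16, "big")
                    for i in indices)
    return hashlib.sha256(blob).hexdigest()

# The server, knowing the divisor list and its secret index set s,
# recomputes the same hash to check the submission ("cut and choose").
[/CODE]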

chalsall 2020-04-28 00:41

[QUOTE=R.D. Silverman;544005]Change the set s each time.[/QUOTE]

Wow! I actually understood some of that! :smile:

I guess the question then becomes, what is the (computational) cost in doing this?

Also, it isn't clear to me if this provides any sort of "sanity" check during the TF/TD run? Short of running again (DC'ing a TF -- usually not done), will this help find border-line kit?

That would be a /strong/ motivator for some to run such a code-path. Many use these things for serious work too, where sane, deterministic kit is mission-critical.

R.D. Silverman 2020-04-28 02:26

[QUOTE=chalsall;544017]Wow! I actually understood some of that! :smile:

I guess the question then becomes, what is the (computational) cost in doing this?
[/QUOTE]

Nil. The required remainders get computed anyway during the TD. The encryption
and hash take almost no time.
[QUOTE]

Also, it isn't clear to me if this provides any sort of "sanity" check during the TF/TD run?
[/QUOTE]

Please be more specific.
It is almost impossible to correctly guess the set s, so if the hash is correct it means the
TF was done.

[QUOTE]

Short of running again (DC'ing a TF -- usually not done), will this help find border-line kit?
[/QUOTE]

"border-line kit"???? Meaning?

retina 2020-04-28 02:45

[QUOTE=R.D. Silverman;544005]When one requests to perform TD on 2^p-1, have the server provide a set of indices
s = [I1, I2, I3, .....]. Encrypt s with (say) AES. When one does the trial division,
SAVE the remainders r_I = (2^p-1) mod q_I for all I in s, where q_I is the I-th candidate divisor. Then return
HASH(r1 | r2 | r3 | ....) for some chosen hash function.

Bury the AES key within the TD code. Yes, a determined hacker can find and extract the key with enough
effort. But people with the skill to do that are unlikely to want to cheat.
One can even change the key periodically if desired (via a key exchange) to make extracting it a little bit
harder. Yes, the new key can also be extracted, but it requires
extra work.


In other words, return a hash of a random set of concatenated remainders. This at least
proves that the worker did at least some of the divisions. It can be easily checked by
the server. This is similar to a ZKP technique known as "cut and choose".
Change the set s each time.[/QUOTE]I would look at this from the perspective of a puzzle to be solved.

A simple solution to this puzzle would be to extract the key from the code and submit the required results. Problem solved. Server fooled. Easy.

So we would still need to monitor the expected frequency of successes to catch anyone using the short-cut. The end result is that you've made it slightly more difficult for someone to cheat, at the cost of extra complexity for the server.

And given the number of cheats we've seen in the past (i.e. zero), I don't think we need to go there yet. Let's not try to solve problems that don't yet exist.

axn 2020-04-28 02:58

[QUOTE=retina;544023]at the cost of extra complexity for the server.[/QUOTE]

... which pales in comparison to the extra complexity (and performance hit) to the client.

Not to mention the fact that mfaktc/o don't directly communicate with the server, so all the middlemen/scripts also need to be updated. And people do need to build mfaktc/o from source anyway, so there goes the secrecy of the key.

Also, the index isn't a deterministic thing -- the base sieve used to find the candidate factors has a user-configurable sieve depth (because of efficiency concerns), and so different sets of candidate factors are TD'ed.

R.D. Silverman 2020-04-28 03:01

[QUOTE=retina;544023]I would look at this from the perspective as a puzzle to be solved.

A simple solution to this puzzle would be to extract the key from the code
[/QUOTE]

It is not as easy as you think. Especially when the key is scattered as bits and pieces
throughout the code.

[QUOTE]

So we would still need to monitor the expected frequency of successes to catch anyone using the short-cut. The end result is that you've made it slightly more difficult for someone to cheat, at the cost of extra complexity for the server.
[/QUOTE]

Server complexity/cost is tiny.

R.D. Silverman 2020-04-28 03:13

[QUOTE=axn;544024]... which pales in comparison to the extra complexity (and performance hit) to the client.
[/QUOTE]

You are confused. The added computation would be hard to measure in a benchmark.
It is [b]tiny[/b]: decrypt the index set, gather the residues during the TF, concatenate them,
compute a hash.

[QUOTE]
Not to mention the fact that mfaktc/o don't directly communicate to the server, so all the middlemen/scripts also needs to be updated.
[/QUOTE]

[i]Any[/i] scheme will require code changes.

[QUOTE]
And people do need to build mfaktc/o from source anyway, so there goes the secrecy of the key.
[/QUOTE]

Why do they need source? Why can't you just distribute binaries?
(I suspect I know the answer.) It is the main reason I dislike these devices: lack of
interoperability. There are too many hardware variations. Code is not portable.

This problem can be avoided: Just compile a variety of different binaries. Don't give
out source.

[QUOTE]
Also, the index isn't a deterministic thing -- the base sieve used to find the candidate factors has a user-configurable sieve depth (because of efficiency concerns), and so different sets of candidate factors are TD'ed.[/QUOTE]

It is not "index". It is "indices". And it should suffice to put a upper bound on the
max element in set s.

retina 2020-04-28 03:26

[QUOTE=R.D. Silverman;544025]It is not as easy as you think. Especially when the key is scattered as bits and pieces
throughout the code.[/QUOTE]Key extraction is really easy. Judicious use of a debugger makes it a no-brainer. That is how DVDs and Blu-ray get pwned effortlessly. It doesn't work.

axn 2020-04-28 03:35

[QUOTE=R.D. Silverman;544026]You are confused. The added computation would be hard to measure in a benchmark.
It is [b]tiny[/b]: decrypt the index set, gather the residues during the TF, concatenate them,
compute a hash.[/quote]
You have clearly not done any GPU programming. What sounds like a simple operation in CPU land will have a large impact on GPU performance (not like a 2x hit, but easily 20-30%). But for the sake of argument, let's say this is acceptable.

[QUOTE=R.D. Silverman;544026][i]Any[/i] scheme will require code changes. [/quote]
The problem is that all of these have different authors. And many people have their own scripts. Some people just manually get assignments from the server and launch using the command line. This isn't just a code change -- it is a complex "system" change. The problem is that the whole "GPU TF" thing isn't very streamlined. Anyway, for the sake of argument, let's assume that this can somehow be coordinated.


[QUOTE=R.D. Silverman;544026]Why do they need source? Why can't you just distribute binaries?
(I suspect I know the answer.) It is the main reason I dislike these devices: lack of
interoperability. There are too many hardware variations. Code is not portable.

This problem can be avoided: Just compile a variety of different binaries. Don't give
out source.[/quote]
Hardware variations, OS variations, device driver variations, development platform (CUDA / OpenCL) variations.
Suppose, somehow, we can overcome this problem and force everyone to use only trusted binaries. Then we don't need to do any of this. We can just send a nonce and the client can return HASH(nonce, exponent, bit depth). Since we already trust the client, we can trust it to generate this hash after the full computation is done. No need to mess with the critical TF logic.


[QUOTE=R.D. Silverman;544026]It is not "index". It is "indices". And it should suffice to put an upper bound on the
max element in set s.[/QUOTE]

I'm trying to understand this part. Suppose the server asks the client to compute a hash for indices {10, 20, 30}. The assumption is that the server knows (or can compute) exactly which candidate factors correspond to the given indices. But that is not deterministic on the client side. Because of variable SoE sieve depth (and the inherent nondeterminism in GPU sieve), the exact sequence of candidate factors produced can vary and so the server can't validate the residue.

I suppose I'm misunderstanding you. Maybe you mean the actual factoring candidates (or the k in 2kp+1 for known prime candidates which will never be skipped). It would be tricky to implement such a thing efficiently.

kriesel 2020-04-28 03:53

[QUOTE=R.D. Silverman;544026]Just compile a variety of different binaries. Don't give
out source.[/QUOTE]As things stand now, mfaktc is GPLv3-licensed and the source is out there, and the license specifies that source must be made available. Mfakto was derived from mfaktc.

chalsall 2020-04-28 03:54

[QUOTE=R.D. Silverman;544022]"border-line kit"???? Meaning?[/QUOTE]

Kit which is no longer producing reliable, reproducible (read: sane) results. Soon (potentially) pining for the fjords, etc.

I first got involved with GIMPS by using mprime for ongoing sanity confirmation of deployed kit. This is one of the reasons I only run DCs (on the CPUs) on certain machines. When mprime starts throwing errors, it's time to decommission the machine. Like, soon.

It would be really cool if there was a way of doing similar ongoing monitoring of GPUs. For when they're being used for things like financial modeling, 3D rendering, etc.

Prime95 2020-04-28 05:31

One hitch: The sieve mfaktc uses produces a non-deterministic set of trial factors.
To produce a deterministic set would require a big hit to sieve performance (atomic bit clear operations).

R.D. Silverman 2020-04-28 08:24

[QUOTE=axn;544032]
Suppose, somehow, we can overcome this problem and force everyone to use only trusted binaries. Then we don't need to do any of this. We can just send a nonce and the client can return HASH(nonce, exponent, bit depth).
[/QUOTE]

This proves nothing since it can be computed independently of whether the
divisions took place. You need to include some information generated by
the TF itself.


[QUOTE]


I'm trying to understand this part. Suppose the server asks the client to compute a hash for indices {10, 20, 30}. The assumption is that the server knows (or can compute) exactly which candidate factors correspond to the given indices. But that is not deterministic on the client side. Because of variable SoE sieve depth (and the inherent nondeterminism in GPU sieve), the exact sequence of candidate factors produced can vary and so the server can't validate the residue.
[/QUOTE]

Another reason to dislike GPUs: they are NDTMs. Can you even prove that two
different machines will actually trial divide with the same set of candidate divisors?
Perhaps some machines fail to test one or more candidates 2kp+1 because they skipped
some k that they should not have?

I was specifying a method which I assumed was running on a DTM.

R.D. Silverman 2020-04-28 08:28

[QUOTE=chalsall;544036]Kit which is no longer producing reliable, reproducible (read: sane) results. Soon (potentially) pining for the fjords, etc.
[/QUOTE]

"kit"? What is a "kit"??? Is it the GPU itself? A driver? A bus? etc.
Is this word standard GPU terminology?

retina 2020-04-28 08:37

[QUOTE=R.D. Silverman;544042]"kit"? What is a "kit"??? Is it the GPU itself? A driver? A bus? etc.
Is this word standard GPU terminology?[/QUOTE]I love it. RDS is back to form. :tu:

[url]https://en.wiktionary.org/wiki/kit#Noun[/url] number 4. But I suspect you figured that out from the context.

R.D. Silverman 2020-04-28 08:37

[QUOTE=Prime95;544038]One hitch: The sieve mfaktc uses produces a non-deterministic set of trial factors.


To produce a deterministic set would require a big hit to sieve performance (atomic bit clear operations).[/QUOTE]

So if two different machines trial divide with two different sets of trial factors, how
do you know whether a true factor was actually tested??? You might be missing some.

A small change in the method can resolve different-order testing anyway. Instead of saving specific
residues, just XOR ALL the residues into an accumulator and hash it at the end.
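In Python, the order-independence of that variant looks like this (a sketch only; the real accumulation would happen inside the TF kernel):

[CODE]import hashlib

def xor_accumulator_hash(p, divisors):
    # XOR every remainder (2^p - 1) mod q into one accumulator. XOR is
    # commutative and associative, so the testing order does not matter.
    acc = 0
    for q in divisors:
        acc ^= (pow(2, p, q) - 1) % q
    return hashlib.sha256(acc.to_bytes(16, "big")).hexdigest()
[/CODE]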

R.D. Silverman 2020-04-28 08:53

[QUOTE=retina;544043]I love it. RDS is back to form. :tu:

[url]https://en.wiktionary.org/wiki/kit#Noun[/url] number 4. But I suspect you figured that out from the context.[/QUOTE]

I love it. People who make informal use of an English word during a technical discussion
in which such words may have more than one meaning. Technical discussions require
words to have only one possible meaning.

Enlighten me. Please give a formal definition of the word "kit" in this context. What
items are included in a GPU "kit"?

retina 2020-04-28 08:57

[QUOTE=R.D. Silverman;544045]Enlighten me. Please give a formal definition of the word "kit" in this context. What
items are included in a GPU "kit"?[/QUOTE]ALL of it. Everything that is required to run a test. Including the power cable, the OS, the drivers, and the ability to keep the cat off the keyboard.

kriesel 2020-04-28 10:39

[QUOTE=retina;544046]ALL of it. Everything that is required to run a test. Including the power cable, the OS, the drivers, and the ability to keep the cat off the keyboard.[/QUOTE]Proper selection of every component of the kit. Including no kittens.

retina 2020-04-28 10:45

[QUOTE=kriesel;544053]Proper selection of every component of the kit. Including no kittens.[/QUOTE]Hah, yes I missed that obvious opportunity.

Revised "kit":
ALL of it. Everything that is required to run a test. Including the power cable, the OS, the drivers, and the ability to keep the kitty off the keyboard.

axn 2020-04-28 10:47

[QUOTE=R.D. Silverman;544041]This proves nothing since it can be computed independently of whether the
divisions took place. You need to include some information generated by
the TF itself.
[/QUOTE]

Perhaps we should do some threat modeling before evaluating proposed solutions.

What is the current problem? Someone can fake a "no factor" result by manually typing up the result line and submitting to the server.

A simple checksum that can only be generated by the client should prevent that. It should have the property that each exponent/bit level gives a unique value. Presumably, the client should have enough safeguards that it can't be tricked into computing the checksum without doing the actual work. This would mean encrypting the checkpoint files written to disk.

I don't think that, for the problem as stated, we need proof of work. Provided the server trusts the client, the checksum is basically the client vouching for the work done -- that should be good enough.
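As a sketch of what such a client-vouched checksum might look like (the key name and line format here are hypothetical, not anything mfaktc actually emits):

[CODE]import hashlib
import hmac

CLIENT_KEY = b"key-embedded-in-client"  # placeholder secret

def result_checksum(exponent, bit_lo, bit_hi):
    # Bind the result line to the exact exponent and bit range, so a
    # hand-typed "no factor" line can't be forged without the client.
    msg = f"M{exponent}:TF:{bit_lo}:{bit_hi}".encode()
    return hmac.new(CLIENT_KEY, msg, hashlib.sha256).hexdigest()[:16]

print(result_checksum(203437321, 71, 74))
[/CODE]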

kriesel 2020-04-28 10:52

[QUOTE=R.D. Silverman;544044]So if two different machines trial divide with two different sets of trial factors, how do you know whether a true factor was actually tested??? You might be missing some.[/QUOTE]It's my understanding that the leading GPU TF applications use not too few but too many candidate factors. Incomplete sieving of the factor candidates is accepted because it is higher performance, without excluding any candidate prime factors. It's faster in practice to trial divide the Mersenne number with an occasional composite factor candidate included than to sieve the candidates perfectly down to primes only. All the primes of the proper form in 2[SUP]b[/SUP] to 2[SUP]b+1[/SUP] are in the set of trial factors when doing TF from b to b+1, plus a varied set of composites. Assuming the hardware and software are operating to spec, that is, or sufficiently near it.

That's covered in points 8 and 9 of [URL]https://www.mersenneforum.org/showpost.php?p=508523&postcount=6[/URL] (my attempt to document all the concepts of mfaktc trial factoring I could identify, which had been reviewed by the author of mfaktc)

But suppose one in 10[SUP]6[/SUP] primes were somehow omitted from the TF candidate factors. We would still get nearly all the benefit of a perfect TF process, with all or nearly all the run time of a perfect TF process, over the same bit-size interval.
A usable proof-of-work algorithm needs to be tolerant of both some level of expected software imperfection and of speed optimizations.

Prime95 2020-04-28 13:33

[QUOTE=R.D. Silverman;544044]So if two different machines trial divide with two different sets of trial factors, how
do you know whether a true factor was actually tested??? You might be missing some.[/QUOTE]

All prime factors will be tested. A small non-deterministic set of composite factors will also be tested.

chalsall 2020-04-28 13:52

[QUOTE=Prime95;544067]All prime factors will be tested. A small non-deterministic set of composite factors will also be tested.[/QUOTE]

Please forgive my ignorance, but would it be expensive to somehow sum/hash the vector of dividing candidates, post sieve? Or are they spread over too many vectors?

Edit: Sorry, I just realized this would be useless. Different for every run. Never mind...

Edit2: What about a sum/hash of the vector of candidates /before/ the sieve? Is that deterministic and reproducible? (Sorry... Haven't had my second cup of coffee yet...)

kriesel 2020-04-28 15:55

[QUOTE=Prime95;544067]All prime factors will be tested. A small non-deterministic set of composite factors will also be tested.[/QUOTE]Hmm.
Is this non-deterministic set of composite factors a problem for Gerbicz' TF proof-of-work method? [URL]https://www.mersenneforum.org/showpost.php?p=509277&postcount=30[/URL] If, for the set of q=2kp+1 that are composite but are not sieved out, one or more of (Mp mod q) mod 2[SUP]t[/SUP] is 1, it would seem so. Testing in that case that a candidate k for the proof list gives a prime q (finishing the sieving of only those that are candidates to go into the verification list) should handle that. Maybe that's best done on the CPU. The value of the list formed, as a proof of work, is that it is hard to produce and easy to verify, rather like a found factor. It is also usable in the most common TF case, no factor found; it depends on doing most of the actual TF work; it can be made reliable yet somewhat compact in size; and it does not depend on keeping secrets such as encryption keys secret.

One of the reasons I like the Gerbicz double-mod method is that something analogous might be usable for P-1 verification of work, which is also not currently present. Some fraction of the code could be reused.
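A rough sketch of that selection rule as described above (using sympy's isprime as the finishing sieve; whether this matches Gerbicz' exact proposal should be checked against the linked posts):

[CODE]from sympy import isprime

def proof_candidates(p, ks, t):
    # Keep each k whose q = 2kp+1 satisfies (Mp mod q) mod 2^t == 1, then
    # finish sieving only those few survivors so composite q are excluded.
    out = []
    for k in ks:
        q = 2 * k * p + 1
        if ((pow(2, p, q) - 1) % q) % (1 << t) == 1 and isprime(q):
            out.append(k)
    return out
[/CODE]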

chalsall 2020-04-28 18:03

[QUOTE=axn;544032]Because of variable SoE sieve depth (and the inherent nondeterminism in GPU sieve), the exact sequence of candidate factors produced can vary and so the server can't validate the residue.[/QUOTE]

Please forgive another possibly stupid question (I understand mfaktc largely as a black box)... But if the SieveP is constant during a run, does that make anything more usefully deterministic?

Probably wouldn't help determine sanity during runtime, but it would open the option of random DCs (for kit sanity checks, for example, if the user wanted to check and monitor their own kit).

SethTro 2020-04-28 18:28

[QUOTE=axn;544032]You have clearly not done any GPU programming. What sounds like a simple operation in CPU land will have a large impact on GPU performance (not like a 2x hit, but easily 20-30%). But for the sake of argument, let's say this is acceptable.
[/QUOTE]

Luckily there are some GPU programmers here! I coded up [URL="https://www.mersenneforum.org/showpost.php?p=520512&postcount=199"]my proposal[/URL] and see less than a 5% performance impact (I measured 2-3% over 4 hours of work, but with some margin of error).


[QUOTE=axn;544032]
The problem is that all of these have different authors. And many people have their own scripts. Some people just manually get assignments from the server and launch using the command line. This isn't just a code change -- it is a complex "system" change. The problem is that the whole "GPU TF" thing isn't very streamlined. Anyway, for the sake of argument, let's assume that this can somehow be coordinated.
[/QUOTE]

I asked about distribution somewhere and was told that mfaktc is ~30% of GPU factoring so starting there would be a good idea.

kriesel 2020-04-28 18:52

[QUOTE=SethTro;544094]I asked about distribution somewhere and was told that mfaktc is ~30% of GPU factoring so starting there would be a good idea.[/QUOTE]Wow, I thought mfaktc was more than that, closer to 70%. As far as I know, there's only mfaktc, mfakto, and perhaps TJAOI. On AMD GPUs, mfakto competes for the card with gpuowl's primality testing and P-1, which benefit from AMD's higher DP throughput relative to TF throughput. What are the other GPU TF apps for GIMPS? (I'm always looking for more for the "available software" list: [URL]http://www.mersenneforum.org/showpost.php?p=488291&postcount=2[/URL])

SethTro 2020-04-28 23:57

[QUOTE=kriesel;544099]Wow, I thought mfaktc was more than that, closer to 70%. As far as I know, there's only mfaktc, mfakto, and perhaps TJAOI. On AMD GPUs, mfakto competes for the card with gpuowl's primality testing and P-1, which benefit from AMD's higher DP throughput relative to TF throughput. What are the other GPU TF apps for GIMPS? (I'm always looking for more for the "available software" list: [URL]http://www.mersenneforum.org/showpost.php?p=488291&postcount=2[/URL])[/QUOTE]

I looked up the post that I had based that number on, and I incorrectly recalled it as having a percentage, which it doesn't.
[url]https://www.mersenneforum.org/showthread.php?t=24994[/url]

SethTro 2020-04-29 00:05

[QUOTE=axn;544055]Perhaps we should do some threat modeling before evaluating proposed solutions.

What is the current problem? Someone can fake a "no factor" result by manually typing up the result line and submitting to the server.

A simple checksum that can only be generated by the client should prevent that. It should have the property that each exponent/bit level gives a unique value. Presumably, the client should have enough safeguards that it can't be tricked into computing the checksum without doing the actual work. This would mean encrypting the checkpoint files written to disk.

I don't think that, for the problem as stated, we need proof of work. Provided the server trusts the client, the checksum is basically the client vouching for the work done -- that should be good enough.[/QUOTE]

I think this is a good line of reasoning.

I suspect some small percentage of TF results were never run because manually created NO_FACTOR results covering a larger bit range were submitted (e.g. an NF result for bit range 1-80). Making that harder (via a checksum for the line, even) seems like a good starting place. If you want to do more to verify the calculation (to protect against overclocking / math errors), we should look at my proof-of-work proposal.

chalsall 2020-04-29 16:39

1 Attachment(s)
[QUOTE=retina;544054]Revised "kit": ALL of it. Everything that is required to run a test. Including the power cable, the OS, the drivers, and the ability to keep the kitty off the keyboard.[/QUOTE]

My problem is keeping the kittens off the bed... :smile:

kriesel 2020-04-29 16:42

[QUOTE=chalsall;544197]My problem is keeping the kittens off the bed... :smile:[/QUOTE]I find doors work well. Although if it's the bedroom door, they'll hook it at the bottom with a paw and rattle it in the frame until "he who feeds cats" wakes up, so a more distant door is better.

chalsall 2020-04-29 18:51

[QUOTE=kriesel;544198]I find doors work well.[/QUOTE]

Our house (designed by Linda) is very "open plan". To a cat, there are effectively no internal doors.

chalsall 2020-05-01 16:32

[QUOTE=SethTro;544143]If you want to do more to verify the calculation (to protect against overclocking / math errors), we should look at my proof-of-work proposal.[/QUOTE]

I don't have the time (nor knowledge) to drill down further on your proof-of-concept efforts (of a proof-of-work mechanism). But it sounds interesting.

There would be real value in being able to use mfakt[c|o] for ongoing sanity checking of GPU "kit"*. Even if it requires doing another TF run, some might "opt in" to doing this kind of work. It would be fine if it was run on the same user's GPU(s), since this isn't MP stuff, just (largely) no factors found for MP candidates.

It would be relatively trivial to expand the data stream from mfakt* to include the checksum. A simple parsing change on Primenet, and one additional table or field.

Has anyone with the requisite skillset had a look at this yet? Kit has empirically demonstrated its potential to degrade over time.

* Definition of "Kit" in this context: All of it, sans kittens.

SethTro 2020-05-01 20:11

[QUOTE=chalsall;544375]I don't have the time (nor knowledge) to drill down further on your proof-of-concept efforts (of a proof-of-work mechanism). But it sounds interesting.

There would be real value in being able to use mfakt[c|o] for ongoing sanity checking of GPU "kit"*. Even if it requires doing another TF run, some might "opt in" to doing this kind of work. It would be fine if it was run on the same user's GPU(s), since this isn't MP stuff, just (largely) no factors found for MP candidates.

It would be relatively trivial to expand the data stream from mfakt* to include the checksum. A simple parsing change on Primenet, and one additional table or field.

Has anyone with the requisite skillset had a look at this yet?
[/QUOTE]

I worked on this for like a month but there wasn't interest from other people.

It's ~80% done
* I modified mfaktc to return one of the smaller residuals it found.
* I output the new proof of work line
* James may even have implemented part of verification on the server

[CODE]M59068201 proof_k(8940824503190951977): 29 bits [TF:60:64:mfaktc 0.21 75bit_mul32_gs]
[/CODE]

which can be verified by checking

let residual = pow(2, 59068201, 8940824503190951977) = 422536362
let confidence = log10(tests / P) - log10(K / residual)
in this case confidence = log10(2^63 / 59068201) - log10(8940824503190951977 / 422536362) = 0.86

I calculated the distribution of confidence somewhere. I think the outcome was a mean of 0.5, with confidence < 5 99.9% of the time. You can also store the confidence from many submissions and make sure it averages < 1.
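The same check as runnable Python, with the values copied from the example line above (the printed confidence should land near the quoted 0.86):

[CODE]from math import log10

P = 59068201
K = 8940824503190951977
tests = 2**63                 # approximation used in the example

residual = pow(2, P, K)       # 422536362 per the proof_k line
confidence = log10(tests / P) - log10(K / residual)
print(confidence)             # ~0.86
[/CODE]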

SethTro 2020-05-01 21:53

[QUOTE=SethTro;544404]I worked on this for like a month but there wasn't interest from other people.

It's ~80% done
* I modified mfaktc to return one of the smaller residuals it found.
* I output the new proof of work line
* James may even have implemented part of verification on the server

[/QUOTE]

I took a look at the code again.
[url]https://github.com/sethtroisi/mfaktc/tree/proof[/url]

For the smaller kernels in mfaktc you have access to the residual; for the larger kernels the residual is not always computed, and we'd need a 2nd proof function which would be similar but wouldn't directly minimize the residual, instead minimizing some other intermediate product (e.g. the lower 32 bits of the residual).

chalsall 2020-05-01 22:08

[QUOTE=SethTro;544412]For the smaller kernels in mfaktc you have access to the residual; for the larger kernels the residual is not always computed, and we'd need a 2nd proof function which would be similar but wouldn't directly minimize the residual, instead minimizing some other intermediate product (e.g. the lower 32 bits of the residual).[/QUOTE]

OK... For my edification, how "deep" into the GPU compute results does your proposal go? As in, would this be useful for catching unreliable GPUs (presumably, only after doing a second run, and not seeing the same residue)?

The question many will ask is: is it worthwhile to do this? A missed factor doesn't really impact GIMPS much (and certainly won't result in a missed MP). But I think there may be some interest in this ability (if it isn't too computationally expensive -- the second runs could be a random distribution).

kriesel 2020-05-01 23:38

[QUOTE=chalsall;544414]OK... For my edification, how "deep" into the GPU compute results does your proposal go? As in, would this be useful for catching unreliable GPUs (presumably, only after doing a second run, and not seeing the same residue)?

The question many will ask is: is it worthwhile to do this? A missed factor doesn't really impact GIMPS much (and certainly won't result in a missed MP). But I think there may be some interest in this ability (if it isn't too computationally expensive -- the second runs could be a random distribution).[/QUOTE]
Elsewhere he mentions an overhead of under 5%, perhaps 2-3%, seen in his testing of it, compared to unmodified mfaktc. That's without any spot-check duplicate runs.

SethTro 2020-05-02 00:16

[QUOTE=chalsall;544414]OK... For my edification, how "deep" into the GPU compute results does your proposal go? As in, would this be useful for catching unreliable GPUs (presumably, only after doing a second run, and not seeing the same residue)?

The question many will ask is: is it worthwhile to do this? A missed factor doesn't really impact GIMPS much (and certainly won't result in a missed MP). But I think there may be some interest in this ability (if it isn't too computationally expensive -- the second runs could be a random distribution).[/QUOTE]

My proposal sits at the very end of the GPU compute, so it allows us to verify that the GPU is correctly computing pow(2, P, test) and not making math errors.

Additionally, residuals are randomly distributed (and residuals with many leading zeros are rare), so we are verifying that they are doing a large amount of correct work without having to run a 2nd check.

I'm not sure how much description is needed. If this doesn't make sense I can add more details.

chalsall 2020-05-02 13:27

[QUOTE=SethTro;544425]I'm not sure the amount of description that's needed. If this doesn't make sense I can add more details.[/QUOTE]

It makes sense. However, I would very much like to hear from others who understand this kind of work to a much deeper level than I do.

What do people think? Is this worth implementing? It could be done in such a way that people "opt in" to the (slightly more expensive) code-path. Perhaps running a percentage of jobs using this, for sanity checking?

Thoughts? Oliver, are you monitoring this thread? At the end of the day, it would be up to you to decide to pull this into your main code-base.

Uncwilly 2020-05-02 15:28

My 2 centavos is that any overhead that is not error-checking related should not be more than 0.5%. The loss vs. gain doesn't make sense.

What is the potential gain (speed-wise) to the project? Based upon the examination of "expected number of factors", we don't have systematic cheating. We have had a few idiots. Total loss due to them is nearly nil.

Does this capture missed factors due to errors (thus speeding the project)? No.

Does this find false factors? No.

Is this an actual residual of the calculation, or is it just a manufactured calculation? It is not a natural byproduct of the calculation like the LL residual.



This is my very naive concept.
Why can't the program take the result of the mod operation at certain select k values and do a rolling operation on them, like a CRC? Or on certain classes. The k values to be selected could be calculated based upon the exponent. Doing the operation every 10000 k's tested should not put an undue burden on the operation, and would prove that at least those items were done in sequence.
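A hypothetical sketch of that rolling operation (kriesel explains below why the non-deterministic candidate list defeats it as stated):

[CODE]import zlib

def rolling_crc(p, ks, stride=10000):
    # Fold the remainder at every stride-th tested k into a running CRC32.
    crc = 0
    for i, k in enumerate(ks):
        if i % stride == 0:
            q = 2 * k * p + 1
            r = (pow(2, p, q) - 1) % q
            crc = zlib.crc32(r.to_bytes(12, "big"), crc)
    return crc
[/CODE]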

LaurV 2020-05-02 17:14

[QUOTE=Uncwilly;544456]My 2 centavos[/QUOTE]
+1. :goodposting: :tu:

kriesel 2020-05-02 18:32

As George has already pointed out, among others, the list of k values tried in GPU trial factoring is not reproducible from run to run of a given exponent/bit-level combination, because the factor-candidate sieving is not deterministic, letting some composite factors through to be tried against the Mersenne number. It's done that way because it's faster than perfectly sieving the factor candidates down to primes only.
That means the list of k values will vary, so the n·10^c-th k value will vary, so the CRC will vary, for the same inputs. Fixing that issue would require completely sieving every factor candidate, not just every 10^c-th.

I think Gerbicz' approach is more easily made immune to this leaky-sieve issue.
And since it is a single-word binary mod of something that's already been reduced down to factor size, that part should be very quick. Only the factor candidates for the Gerbicz check-value list need be further screened for primality.

SethTro 2020-05-02 21:49

[QUOTE=Uncwilly;544456]My 2 centavos is that any overhead that is not error-checking related should not be more than 0.5%. The loss vs. gain doesn't make sense.
[/QUOTE]

I'm the only person working on this, and I haven't focused on optimizing the code; that can happen after people have been convinced this is worth doing, which has proved to be an uphill struggle.

[QUOTE=Uncwilly;544456]
What is the potential gain (speed-wise) to the project? Based upon the examination of "expected number of factors", we don't have systematic cheating. We have had a few idiots. Total loss due to them is nearly nil.
[/QUOTE]
Loss comes in many forms: it can be [URL="https://www.mersenneforum.org/showpost.php?p=393055&postcount=145"]ghz-days[/URL], it can be [URL="https://www.mersenneforum.org/showthread.php?t=24103"]human hours[/URL], it can be trust in the project.
You are correct that this won't change the overall speed in any significant way today.
It is my belief that this would increase the reproducibility and overall trust in the project.
My goal would be that TJAOI stops finding bad results, and that no one reruns TF results because they don't trust old results.

[QUOTE=Uncwilly;544456]
Is this an actual residual of the calculation, or is it just a manufactured calculation? It is not a natural byproduct of the calculation like the LL residual.
[/QUOTE]
It [B]is[/B] a natural byproduct of the calculation.
It feels strange for you to not understand this, so maybe your claim is that keeping track of the smallest residual is not a natural byproduct?

[QUOTE=Uncwilly;544456]
This is my very naive concept.
[/QUOTE]
It doesn't feel good to spend time and effort working on something and then have someone call it bad. The use of "very" feels unrestrained.

[QUOTE=Uncwilly;544456]
Why can't the program take the result of the mod operation at certain select k values and do a rolling operation on them, like a CRC? Or on certain classes. The k values to be selected could be calculated based upon the exponent. Doing the operation every 10000 k's tested should not put an undue burden on the operation, and would prove that at least those items were done in sequence.[/QUOTE]

I think this comes back to what we are worried about.

I'm worried about them in this order

1. Having [B]any[/B] verifiable result associated with each TF submission.
1a. Result(s) should be quickly verifiable by the server (e.g. not requiring a 2nd DC)
1b. Result(s) should cover a large percent of work done.
2. Detect when the GPU/Program is not working quickly (e.g. after 10 submissions not 200)
3. Prevent Malicious/Prank submissions (I'd wager I can easily submit 10000 results at the cost of zero compute and have none of them be detected as fake)

Your method provides 1, 1a, and helps with 2, 3
My method provides stronger guarantees on 1b, 2 and 3.

SethTro 2020-05-03 07:48

I cleaned up some debug code and measured impact over the last 24 hours (13 DC TF)

Impact is ~0.4% for adding my new code on DC exponents evaluated with kernel barrett76_mul32_gs

[url]https://colab.research.google.com/drive/1kANlEiZPlkO8r90iD2f02BUuc4IcnMhD#scrollTo=HMlzwXBlejHb[/url]

retina 2020-05-03 08:17

Finding the needles in the haystack
 
The server creates a set of secret random prime-k values and computes the remainders.
Send the remainders to the client.
The client's task is to find the k values that generate the remainders.
Client sends the found k values to the server for verification.

How would that affect the client code if it needs to find some specific remainder values instead of just a remainder of zero, or the smallest remainder?

If you had one remainder then on average the client needs to run 50% of the work before finding the k value.
If you had two remainders then on average the client needs to run 75% of the work before finding both k values.
etc.

Considerations: The set of k values might not exactly match each other when two k values generate the same remainder, so just send one of them and the server can still verify the remainder is correct. The server doesn't need to remember the k values it chooses, just the remainders it sends.
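Sketched from the server's side in Python (helper names hypothetical; "remainder" here means (2^p - 1) mod q, as elsewhere in the thread):

[CODE]import random

def make_challenge(p, k_lo, k_hi, count=2):
    # Pick secret k values and publish only their remainders; the k values
    # themselves can be discarded once the remainders are computed.
    ks = random.sample(range(k_lo, k_hi), count)
    return [(pow(2, p, 2 * k * p + 1) - 1) % (2 * k * p + 1) for k in ks]

def verify(p, k_found, remainder):
    q = 2 * k_found * p + 1
    return (pow(2, p, q) - 1) % q == remainder
[/CODE]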

SethTro 2020-05-03 08:23

[QUOTE=retina;544505]The server creates a set of secret random prime-k values and computes the remainders.
Send the remainders to the client.
The client's task is to find the k values that generate the remainders.
Client sends the found k values to the server for verification.

How would that affect the client code if it needs to find some specific remainder values instead of just a remainder of zero, or the smallest remainder?

If you had one remainder then on average the client needs to run 50% of the work before finding the k value.
If you had two remainders then on average the client needs to run 75% of the work before finding both k values.
etc.

Considerations: The set of k values might not exactly match each other when two k values generate the same remainder, so just send one of them and the server can still verify the remainder is correct. The server doesn't need to remember the k values it chooses, just the remainders it sends.[/QUOTE]

This assumes that mfaktc always computes the residual, which it does not; mfaktc notices when the lower bits of the residual are not zero and doesn't calculate the upper bits, so 90%+ of the time the full residual is never calculated.

retina 2020-05-03 08:26

[QUOTE=SethTro;544506]This assumes that mfaktc always computes the residual, which it does not; mfaktc notices when the lower bits of the residual are not zero and doesn't calculate the upper bits, so 90%+ of the time the full residual is never calculated.[/QUOTE]Oh.

Then how do you find the smallest remainder? Doesn't that require computing the upper bits to check if they are zeros?

SethTro 2020-05-03 10:17

[QUOTE=retina;544507]Oh.

Then how do you find the smallest remainder? Doesn't that require computing the upper bits to check if they are zeros?[/QUOTE]

I change the definition of "smallest" to be the most trailing zeros, and only compute the full residual when the low word is all zero (rare, but it still happens several times per ghz-day).

I like your idea otherwise. It can be reworked to return a lower residual to look for.
The downside is that, depending on which kernel is used, different intermediate values are available; this would require the server to make guesses about which kernel mfaktc will use.
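Illustrated with toy values (the real code would track this against the GPU kernel's 32-bit low word):

[CODE]def trailing_zero_bits(r, width=96):
    # Count the low zero bits of a residual; treat a zero residual (an
    # actual factor) as having the maximum possible count.
    return (r & -r).bit_length() - 1 if r else width

residuals = [0b101000, 0b110000, 0b111]        # toy residuals
print(max(residuals, key=trailing_zero_bits))  # 48 has the most (4)
[/CODE]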

Uncwilly 2020-05-04 16:21

[QUOTE=SethTro;544479]
It [B]is[/B] a natural byproduct of the calculation.
It feels strange for you to not understand this, so maybe your claim is that keeping track of the smallest residual is not a natural byproduct?[/quote]I think I was remembering Bob's suggestion. Sorry for the confusion.
[quote]It doesn't feel good to spend time and effort working on something and then have someone call it bad. The use of "very" feels unrestrained.[/quote]I did not say bad. Again, I am concerned about cost vs. benefit. You have brought the cost down. I don't like Retina's idea of the server sending some secret info (more than the AID) to the factoring program.

[quote]
I'm worried about them in this order

1. Having [B]any[/B] verifiable result associated with each TF submission.
1a. Result(s) should be quickly verifiable by the server (e.g. not requiring a 2nd DC)
1b. Result(s) should cover a large percent of work done.
2. Detect when the GPU/Program is not working quickly (e.g. after 10 submissions not 200)
3. Prevent Malicious/Prank submissions (I'd wager I can easily submit 10000 results at the cost of zero compute and have none of them be detected as fake)

Your method provides 1, 1a, and helps with 2, 3
My method provides stronger guarantees on 1b, 2 and 3.[/QUOTE]
I did say mine was naive. I have no GPU programming experience.
And sorry for the delay in responding. It has been a busy few days.

retina 2020-05-04 22:16

[QUOTE=Uncwilly;544576]I don't like Retina's idea of the server sending some secret info (more than the AID) to the factoring program.[/QUOTE]Just to be clear. The server doesn't need to retain or send any secrets. The remainders can be public information. The initial k-values can be deleted after the remainders have been generated.

chalsall 2020-05-05 19:17

[QUOTE=SethTro;544479]It doesn't feel good to spend time and effort working on something and then have someone call it bad. The use of "very" feels unrestrained.[/QUOTE]

Juggling some things at the moment, but...

Please never take offense from feedback here. It's never personal -- we're just trying to come to convergence.

At the end of the day, having a "Proof of Work" code-path is going to be a hard sell if it adds more than, let's say, 1% to the compute cost. The reason is we /believe/ there's almost no cheating going on.

Also, it's not the end of the world if a factor is missed. It just means that a P-1 run and then (likely, eventually) two LL tests will be done on the candidate, proving it's composite.

However, I personally feel that /some/ would value having some sort of sanity check on the GPU hardware. As said before, some of us these things for more than just TF'ing (where determinism (or, at least, sanity) is coveted). There would be value in being able to determine if our "kit" was sane while running mfakt[c|o] (as opposed to a GPU based DC test).

Lastly, it would be trivial for the servers to deal with this additional data stream. The heavy-lifting is all in the client (and, thus, the authors of same).

kriesel 2020-05-05 19:38

[QUOTE=chalsall;544659]/some/ would value having some sort of sanity check on the GPU hardware. As said before, some of us [use] these things for more than just TF'ing (where determinism (or, at least, sanity) is coveted). There would be value in being able to determine if our "kit" was sane while running mfakt[c|o] (as opposed to a GPU based DC test).[/QUOTE]Especially since issues can reside within specific circuitry or microcode that affects one computation type and not another, or for speed and low user involvement. Issues can be very localized, yet significant.

Remember the Pentium FDIV bug: wrong output only for specific instruction and operand combinations. Various software workarounds were developed. The initial mass-produced chips were wrong by design. [url]https://en.wikipedia.org/wiki/Pentium_FDIV_bug[/url]

greg 2020-05-05 22:23

[QUOTE=Uncwilly;544456]My 2 centavos is that any overhead that is not error-checking related should not be more than 0.5%. The loss vs. gain doesn't make sense.[/QUOTE]


If TF and/or P-1 became verifiable (even if only for a subset of clients), then there is the potential to reward that work with a portion of the prime prize award (even if only a $2.56 check).

chalsall 2020-05-08 19:58

[QUOTE=greg;544673]If TF and/or P-1 became verifiable (even if only a subset of clients) then there is the potential to reward that work with a portion of the prime prize award (even if only a $2.56 check).[/QUOTE]

LOL. Not a bad idea! :smile:

In the official notice, the TF'ers and P-1'ers (etc.) are covered in the "et al" language. But it would be kinda cool getting an official document from GIMPS for being in direct assistance for the next MP found. Just don't fall into Knuth's situation with regards to cheques...

Also, a "bump" on this discussion. Do we have any code to the stage where it could be alpha tested on kit yet? We do have some places to test this...

SethTro 2020-05-08 20:34

[QUOTE=chalsall;544659]
However, I personally feel that /some/ would value having some sort of sanity check on the GPU hardware. As said before, some of us these things for more than just TF'ing (where determinism (or, at least, sanity) is coveted).
[/QUOTE]

I agree with the first sentence.
As for the second sentence, I think it's missing a "use" (between "of us" and "these things"), and I doubly agree with it. I've been thinking about whether this could be adapted to P-1 or ECM, so that a factordb-like service wouldn't have to blindly trust submissions.

SethTro 2020-05-08 21:00

[QUOTE=chalsall;544908]LOL. Not a bad idea! :smile:

In the official notice, the TF'ers and P-1'ers (etc.) are covered in the "et al" language. But it would be kinda cool getting an official document from GIMPS for being in direct assistance for the next MP found. Just don't fall into Knuth's situation with regards to cheques...

Also, a "bump" on this discussion. Do we have any code to the stage where it could be alpha tested on kit yet? We do have some places to test this...[/QUOTE]

Absolutely; the code is at [url]https://github.com/sethtroisi/mfaktc/tree/proof[/url] and you can build it in the standard way.

If you use Linux and CUDA 10.1, I can provide a binary in case you can't compile it yourself.

Everything should look normal, except that the version is incremented and you should see proof lines in your results.txt:

[CODE]M66023333 proof_k(20475767038603731855001): 28 bits [TF:74:75:mfaktc 0.21 barrett76_mul32_gs]
M66023333 proof_k(33270468931786126541321): 27 bits [TF:74:75:mfaktc 0.21 barrett76_mul32_gs]
M66023333 proof_k(33150685168141475305463): 25 bits [TF:74:75:mfaktc 0.21 barrett76_mul32_gs]
M66023333 proof_k(28851305727414925240879): 24 bits [TF:74:75:mfaktc 0.21 barrett76_mul32_gs]
M66023333 proof_k(32228068860752704443631): 22 bits [TF:74:75:mfaktc 0.21 barrett76_mul32_gs]
M66023333 proof_k(23676960155764888060537): 20 bits [TF:74:75:mfaktc 0.21 barrett76_mul32_gs]
[Fri May 1 22:46:32 2020]
no factor for M66023333 from 2^74 to 2^75 [mfaktc 0.21 barrett76_mul32_gs]
[/CODE]

It would actually be nice to build a worktodo.txt that uses many different kernels and verifies they all produce proof logs.

I'll provide a program to verify the proof logs this weekend; currently it's a combination of the check in #47 for small kernels (e.g. 75bit_mul32_gs) and [URL="https://gist.github.com/sethtroisi/a2942e29be6ab43a6eeb4b4987ee3246"]this gist[/URL] for larger kernels.
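
For readers who just want the shape of such a check, here is a minimal sketch. It assumes the value printed in each proof line is the candidate factor f = 2kp+1 itself, and that "bits"/"difficulty" counts how far the residue 2^p mod f falls below the bit level (the "smallest res or most zeros" idea discussed further down); the real definitions live in the branch and gist above:

[CODE]def verify_proof(p, f, claimed_difficulty, bit_min, bit_max):
    """Hypothetical check of one proof line; field meanings are assumptions."""
    if not (2**bit_min <= f < 2**bit_max):
        return False                       # candidate outside the TF range
    if (f - 1) % (2 * p) != 0:
        return False                       # not of the form 2kp+1
    r = pow(2, p, f)                       # r == 1 would mean f divides M(p)
    difficulty = bit_max - r.bit_length()  # leading zero bits of the residue
    return difficulty >= claimed_difficulty

# e.g. verify_proof(66023333, 20475767038603731855001, 28, 74, 75)[/CODE]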

chalsall 2020-05-08 21:19

[QUOTE=SethTro;544915]Absolutely; The code is at [url]https://github.com/sethtroisi/mfaktc/tree/proof[/url] and you can build it in the standard way.[/QUOTE]

Cool. I'll take a look at it (and run a few tests) this weekend.

[QUOTE=SethTro;544915]I'll provide a program to verify the proof logs this weekend; currently it's a combination of the check in #47 for small kernels (e.g. 75bit_mul32_gs) and [URL="https://gist.github.com/sethtroisi/a2942e29be6ab43a6eeb4b4987ee3246"]this gist[/URL] for larger kernels.[/QUOTE]

OK. It would be good to run this across a spectrum of kit and kernel sizes. Do you happen to have a list of candidates which should be run?

And, could you define your datastream a bit more? Should the servers keep the values for each bit level? As in, it's effectively a three-dimensional sparse matrix (Exponent, Bit-level, PoW)?

Separately, does anyone who understands both the maths and the Python understand what's being done here? Black-box to me (and I'm OK with that).

SethTro 2020-05-08 22:04

[QUOTE=chalsall;544916]Cool. I'll take a look at it (and run a few tests) this weekend.

OK. It would be good to run this across a spectrum of kit and kernel sizes. Do you happen to have a list of candidates which should be run?

And, could you define your datastream a bit more? Should the servers keep the values for each bit level? As in, it's effectively a three-dimensional sparse matrix (Exponent, Bit-level, PoW)?

Separately, does anyone who understands both the maths and the Python understand what's being done here? Black-box to me (and I'm OK with that).[/QUOTE]

I plan to do a test run over a large range of exponents at TF=65,70,75 this weekend.

Right now the code outputs multiple proofs per <exponent, bit-level>. I think the server should keep <user, exp, bit-level, min-pow>, where min-pow is the "best" proof of work and best = hardest = rarest = smallest residue (most leading zeros). I need to change the code a little before this happens.
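
A sketch of that server-side bookkeeping, under the assumption that only the single best (hardest) proof per <user, exponent, bit-level> is worth keeping; the names are illustrative:

[CODE]best_proof = {}   # (user, exp, bit_level) -> (difficulty, candidate_f)

def record_proof(user, exp, bit_level, difficulty, candidate_f):
    key = (user, exp, bit_level)
    # keep only the rarest proof: higher difficulty = smaller residue
    if key not in best_proof or difficulty > best_proof[key][0]:
        best_proof[key] = (difficulty, candidate_f)[/CODE]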

I cleaned up the verify script and have a new version you can pass your results.txt file to.
[url]https://gist.github.com/sethtroisi/40b3f0c5d68bd0aaa47007d55cdca7e5[/url]

My TODO list is:
1. Rename proof_k to proof_test or proof (it's not easy to manipulate large ints in mfaktc)
2. Change output format from "X bits" to "X difficulty"
3. Run over large range of inputs and verify proof
4. Update verify script one more time.

SethTro 2020-05-09 12:20

2 Attachment(s)
[QUOTE=SethTro;544918]
My TODO list is:
1. Rename proof_k to proof_test or proof (it's not easy to manipulate large ints in mfaktc)
2. Change output format from "X bits" to "X difficulty"
3. Run over large range of inputs and verify proof
4. Update verify script one more time.
[/QUOTE]

These are all done now.

I ran ~50 exponents (5M to 250M) over several bit ranges (58-60, 60-62, 65-67, 70-71) and verified ~458 proof statements (currently more than one can be generated per work item).

Newest verification script is at [url]https://gist.github.com/sethtroisi/40b3f0c5d68bd0aaa47007d55cdca7e5[/url]

chalsall 2020-05-11 18:56

[QUOTE=SethTro;544948]These are all done now.[/QUOTE]

OK... Sweet. I love data! :smile:

However, one comment (read: change request): your mfaktc delta should print the proof lines after the datestamp line. As it stands, it's a pain to get the timestamp correct when parsing the data from a log file.

The actual proof lines are great to RegEx through -- easy to parse. Thanks for that.
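
As an illustration of that kind of parsing, a minimal sketch assuming the current "M<exp> proof(<value>): <n> difficulty" line format (the earlier "bits" format and any trailing kernel info would need their own patterns):

[CODE]import re

# matches e.g. "M5500021 proof(55435150410008224727): 37 difficulty"
PROOF_RE = re.compile(r"^M(\d+) proof\((\d+)\): (\d+) difficulty\b")

m = PROOF_RE.match("M5500021 proof(55435150410008224727): 37 difficulty")
if m:
    exp, candidate, difficulty = (int(g) for g in m.groups())
    print(exp, candidate, difficulty)[/CODE]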

It's going to be interesting running this on a few different platforms and ensuring we get full convergence on the proofs.

Anyone out there with flaky kit willing to give this a try? It would be good to get such test-runs in the dataset.

P.S. Is now the time to discuss optionally producing XML output?

chalsall 2020-05-11 22:41

[QUOTE=chalsall;545113]However, one comment (read: change request): your mfaktc delta should print the proof lines after the datestamp line. As it stands, it's a pain to get the timestamp correct when parsing the data from a log file.[/QUOTE]

OK... I finally had some cycles to actually run your code (on a slow little 1050 also doing display work). An observation/question...

I see that the different "proof" lines are generated as the run progresses. However, the last proof isn't printed (perhaps it's not supposed to be). As an example (copied from the console where the job was run):[CODE]
got assignment: exp=5500021 bit_min=65 bit_max=67 (4.08 GHz-days)
Starting trial factoring M5500021 from 2^65 to 2^66 (1.36 GHz-days)
k_min = 3353940660840
k_max = 6707881323983
Using GPU kernel "barrett76_mul32_gs"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 11 18:14 | 260 5.7% | 0.408 6m09s | 299.71 82485 n.a.%
M5500021 proof(47281205372229406961): 33 difficulty
May 11 18:15 | 548 12.1% | 0.401 5m38s | 304.94 82485 n.a.%
M5500021 proof(41186001143328300737): 34 difficulty
May 11 18:15 | 923 20.2% | 0.406 5m11s | 301.18 82485 n.a.%
M5500021 proof(55435150410008224727): 37 difficulty
May 11 18:21 | 4619 100.0% | 0.409 0m00s | 298.97 82485 n.a.%
no factor for M5500021 from 2^65 to 2^66 [mfaktc 0.21 barrett76_mul32_gs]
tf(): total time spent: 6m 32.519s[/CODE]

I chose this example because the Proof of Work only covers the first ~20% of the run.

Is this by design? Or is there a way to get a useful hash for the last part of the run?

chalsall 2020-05-13 23:00

[QUOTE=chalsall;545124]Is this by design? Or is there a way to get a useful hash for the last part of the run?[/QUOTE]

Seth, are you out there? Can you answer this question, please?

I'd like to start doing some wide-spread test runs using this, and collecting data. But I'd like the codebase to be stable (and, of course, sane) before I do.

Thanks! :tu:

SethTro 2020-05-14 00:37

Sorry I missed these messages. I rely heavily on MF emails (I get a kick out of "You will probably never see this message because it will end up in your spam folder.") and seem to have missed these.

[QUOTE=chalsall;545113]OK... Sweet. I love data! :smile:

However, one comment (read: change request): your mfaktc delta should print the proof lines after the datestamp line. As it stands, it's a pain to get the timestamp correct when parsing the data from a log file.

The actual proof lines are great to RegEx through -- easy to parse. Thanks for that.
[/QUOTE]

This is a workaround for not having handled resume yet.
This weekend I'll change it so that, at default verbosity, the best proof is printed directly after the status line.

[QUOTE=chalsall;545124]OK... I finally had some cycles to actually run your code (on a slow little 1050 also doing display work). An observation/question...

I see that the different "proof" lines are generated as the run progresses. However, the last proof isn't printed (perhaps it's not supposed to be). As an example (copied from the console where the job was run):[CODE]
got assignment: exp=5500021 bit_min=65 bit_max=67 (4.08 GHz-days)
Starting trial factoring M5500021 from 2^65 to 2^66 (1.36 GHz-days)
k_min = 3353940660840
k_max = 6707881323983
Using GPU kernel "barrett76_mul32_gs"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
May 11 18:14 | 260 5.7% | 0.408 6m09s | 299.71 82485 n.a.%
M5500021 proof(47281205372229406961): 33 difficulty
May 11 18:15 | 548 12.1% | 0.401 5m38s | 304.94 82485 n.a.%
M5500021 proof(41186001143328300737): 34 difficulty
May 11 18:15 | 923 20.2% | 0.406 5m11s | 301.18 82485 n.a.%
M5500021 proof(55435150410008224727): 37 difficulty
May 11 18:21 | 4619 100.0% | 0.409 0m00s | 298.97 82485 n.a.%
no factor for M5500021 from 2^65 to 2^66 [mfaktc 0.21 barrett76_mul32_gs]
tf(): total time spent: 6m 32.519s[/CODE]

I chose this example because the Proof of Work only covers the first ~20% of the run.

Is this by design? Or is there a way to get a useful hash for the last part of the run?[/QUOTE]

This is by design: it prints the best proof as it is discovered. Because the position of the smallest residue is effectively uniform over the run, half of the time the final best proof appears in the first 50% of the run, and 1 in 5 times before the 20% mark.
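
A quick sanity check of that claim, assuming residues behave like i.i.d. uniform draws (so the position of the minimum is uniform over the run):

[CODE]import random

trials, n, before_20 = 10000, 1000, 0
for _ in range(trials):
    draws = [random.random() for _ in range(n)]
    if draws.index(min(draws)) < 0.2 * n:   # best proof in the first 20%?
        before_20 += 1
print(before_20 / trials)   # ~0.2, matching "1 in 5 times before 20%"[/CODE]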

chalsall 2020-05-14 17:52

[QUOTE=SethTro;545332]This is a workaround for not having handled resume yet. This weekend I'll change it so that, at default verbosity, the best proof is printed directly after the status line.[/QUOTE]

OK, great. I'll look out for that.

So you know, I've "forked" your mfaktc delta on GitHub, so I'll pull your changes once they've been applied. I want to make some changes myself, for use in Colab and BOINC runs. I'll send a "Pull request" once I'm happy with my changes.

[QUOTE=SethTro;545332]This is by design: it prints the best proof as it is discovered. Because the position of the smallest residue is effectively uniform over the run, half of the time the final best proof appears in the first 50% of the run, and 1 in 5 times before the 20% mark.[/QUOTE]

OK, thanks for letting me know.

Since I'm more concerned about kit sanity than cheating detection, this is fine. Someone would /really/ have to be motivated to get around this, even in cases of only ~20% coverage.

SethTro 2020-05-18 04:43

I updated the code so that it only prints the best proof right after the NF / Factor line.

This doesn't work with checkpointing yet, but I'm not worried about that until this gets wider adoption.

kruoli 2020-05-20 22:33

[QUOTE=SheriGoddart33;545940]SPAM.[/QUOTE]

I'm quite sure that's a spam profile... At least it has one somewhat fitting post in our forum. xD

