mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Information & Answers (https://www.mersenneforum.org/forumdisplay.php?f=38)
-   -   Trial Factoring for Dummies (https://www.mersenneforum.org/showthread.php?t=25493)

chalsall 2020-04-29 18:51

[QUOTE=kriesel;544198]I find doors work well.[/QUOTE]

Our house (designed by Linda) is very "open plan". To a cat, there are effectively no internal doors.

chalsall 2020-05-01 16:32

[QUOTE=SethTro;544143]If you want to do more to verify the calculating (to prevent against overclocking / math errors) we should look at my proof-of-work proposal.[/QUOTE]

I don't have the time (nor knowledge) to drill down further on your proof-of-concept efforts (of a proof-of-work mechanism). But it sounds interesting.

There would be real value in being able to use mfakt[c|o] for ongoing sanity checking of GPU "kit"*. Even if it requires doing another TF run, some might "opt in" to doing this kind of work. It would be fine if it was run on the same user's GPU(s), since this isn't MP stuff, just (largely) "no factor found" results for MP candidates.

It would be relatively trivial to expand the data stream from mfakt* to include the checksum. A simple parsing change on Primenet, and one additional table or field.

Has anyone with the requisite skillset had a look at this yet? Kit has empirically demonstrated its potential to degrade over time.

* Definition of "Kit" in this context: All of it, sans kittens.

SethTro 2020-05-01 20:11

[QUOTE=chalsall;544375]I don't have the time (nor knowledge) to drill down further on your proof-of-concept efforts (of a proof-of-work mechanism). But it sounds interesting.

There would be real value in being able to use mfakt[c|o] for ongoing sanity checking of GPU "kit"*. Even if it requires doing another TF run, some might "opt in" to doing this kind of work. It would be fine if it was run on the same user's GPU(s), since this isn't MP stuff, just (largely) "no factor found" results for MP candidates.

It would be relatively trivial to expand the data stream from mfakt* to include the checksum. A simple parsing change on Primenet, and one additional table or field.

Has anyone with the requisite skillset had a look at this yet?
[/QUOTE]

I worked on this for like a month but there wasn't interest from other people.

It's ~80% done
* I modified mfaktc to return one of the smaller residuals it found.
* I output the new proof of work line
* James may even have implemented part of verification on the server

[CODE]M59068201 proof_k(8940824503190951977): 29 bits [TF:60:64:mfaktc 0.21 75bit_mul32_gs]
[/CODE]

which can be verified by checking

let residual = pow(2, 59068201, 8940824503190951977) = 422536362
let confidence = log10(tests / P) - log10(K / residual)
in this case confidence = log10(2^63 / 59068201) - log10(8940824503190951977 / 422536362) = 0.86

I calculated the distribution of confidence somewhere. I think the outcome was a mean of 0.5, and 99.9% of the time confidence < 5. You can also store the confidence from many submissions and make sure it averages < 1.
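
For anyone who wants to check the arithmetic themselves, here is a minimal sketch of that verification in Python (my illustration only, not mfaktc code; the 2^63 candidate count is simply the value used in the example above):

[CODE]from math import log10

P = 59068201                      # exponent from the proof line above
CANDIDATE = 8940824503190951977   # factor candidate that produced the small residual
TESTS = 2**63                     # candidate count used in the example above

# Recompute the residual exactly as stated: 2^P mod candidate.
residual = pow(2, P, CANDIDATE)   # the post above reports 422536362

# A small residual is rare, so "confidence" weighs how unlikely this residual is
# against how many candidates were claimed to have been tried.
confidence = log10(TESTS / P) - log10(CANDIDATE / residual)
print(residual, round(confidence, 2))  # ~0.86 for this example
[/CODE]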

SethTro 2020-05-01 21:53

[QUOTE=SethTro;544404]I worked on this for like a month but there wasn't interest from other people.

It's ~80% done
* I modified mfaktc to return one of the smaller residuals it found.
* I output the new proof of work line
* James may even have implemented part of verification on the server

[/QUOTE]

I took a look at the code again.
[url]https://github.com/sethtroisi/mfaktc/tree/proof[/url]

For the smaller kernels in mfaktc you have access to the residual; for the larger kernels the residual is not always computed, so we'd need a second proof function which would be similar but would minimize some other intermediate product (e.g. the lower 32 bits of the residual) rather than the residual itself.

chalsall 2020-05-01 22:08

[QUOTE=SethTro;544412]For the smaller kernels in mfaktc you have access to the residual; for the larger kernels the residual is not always computed, so we'd need a second proof function which would be similar but would minimize some other intermediate product (e.g. the lower 32 bits of the residual) rather than the residual itself.[/QUOTE]

OK... For my edification, how "deep" into the GPU compute results does your proposal go? As in, would this be useful for catching unreliable GPUs (presumably, only after doing a second run, and not seeing the same residue)?

The question many will ask is whether it is worthwhile to do this. A missed factor doesn't really impact GIMPS much (and certainly won't result in a missed MP). But I think there may be some interest in this ability (if it isn't too computationally expensive -- the second runs could be a random sample).

kriesel 2020-05-01 23:38

[QUOTE=chalsall;544414]OK... For my edification, how "deep" into the GPU compute results does your proposal go? As in, would this be useful for catching unreliable GPUs (presumably, only after doing a second run, and not seeing the same residue)?

The question many will ask is whether it is worthwhile to do this. A missed factor doesn't really impact GIMPS much (and certainly won't result in a missed MP). But I think there may be some interest in this ability (if it isn't too computationally expensive -- the second runs could be a random sample).[/QUOTE]
Elsewhere he mentions an overhead of under 5%, perhaps 2-3%, seen in his testing of it, compared to unmodified mfaktc. That's without any spot-check duplicate runs.

SethTro 2020-05-02 00:16

[QUOTE=chalsall;544414]OK... For my edification, how "deep" into the GPU compute results does your proposal go? As in, would this be useful for catching unreliable GPUs (presumably, only after doing a second run, and not seeing the same residue)?

The question many will ask is whether it is worthwhile to do this. A missed factor doesn't really impact GIMPS much (and certainly won't result in a missed MP). But I think there may be some interest in this ability (if it isn't too computationally expensive -- the second runs could be a random sample).[/QUOTE]

My proposal sits at the very end of the GPU compute, so it allows us to verify that the GPU is correctly computing pow(2, P, test) and not making math errors.

Additionally, residuals are randomly distributed (and residuals with many leading zeros are rare), so we are verifying that a large amount of correct work was done without having to run a second check.
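
As a toy illustration of that point (a sketch of my own, not the actual mfaktc kernel code): each candidate's residual 2^P mod candidate is essentially uniform on [0, candidate), so the smallest residual seen shrinks roughly in proportion to how many candidates were genuinely tested.

[CODE]import random

P = 59068201
best = None   # (residual, candidate) with the smallest residual seen so far

# Stand-in for the sieved k values; the real kernel iterates these on the GPU.
for _ in range(100000):
    k = random.randrange(1, 10**10)
    candidate = 2 * k * P + 1
    residual = pow(2, P, candidate)   # same modular exponentiation as trial factoring
    if best is None or residual < best[0]:
        best = (residual, candidate)

# A client that skipped most of the work has almost no chance of reporting an
# unusually small residual, so the minimum residual acts as a cheap proof of work.
print(best)
[/CODE]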

I'm not sure how much description is needed. If this doesn't make sense I can add more details.

chalsall 2020-05-02 13:27

[QUOTE=SethTro;544425]I'm not sure how much description is needed. If this doesn't make sense I can add more details.[/QUOTE]

It makes sense. However, I would very much like to hear from others who understand this kind of work to a much deeper level than I do.

What do people think? Is this worth implementing? It could be done in such a way that people "opt in" to the (slightly more expensive) code path. Perhaps running a percentage of jobs this way, for sanity checking?

Thoughts? Oliver, are you monitoring this thread? At the end of the day, it would be up to you to decide to pull this into your main code-base.

Uncwilly 2020-05-02 15:28

My 2 centavos is that any overhead that is not error-checking related should not be more than 0.5%. Otherwise the loss vs. gain doesn't make sense.

What is the potential gain (speed-wise) to the project? Based upon examination of the "expected number of factors", we don't have systematic cheating. We have had a few idiots; total loss due to them was nearly nil.

Does this capture missed factors due to errors (thus speeding the project)? No.

Does this find false factors? No.

Is this an actual residual of the calculation, or is it just a manufactured calculation? It is not a natural byproduct of the calculation like the LL residual.



This is my very naive concept.
Why can't the program take the result of the mod operation at certain select k values and do a rolling operation on them, like a CRC? Or for certain classes. The k values to be selected could be calculated based upon the exponent. Doing the operation every 10000 k's tested should not put an undue burden on the run, and would prove that at least those items were done in sequence.
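
A minimal sketch of that naive concept in Python (my illustration only; the sampling interval and the CRC folding are assumptions, not anything that exists in mfaktc):

[CODE]from zlib import crc32

P = 59068201
SAMPLE_EVERY = 10000   # "every 10000 k's tested", per the idea above

def rolling_check(k_values):
    """Fold the mod result of every SAMPLE_EVERY-th candidate into a running CRC."""
    check = 0
    for i, k in enumerate(k_values):
        candidate = 2 * k * P + 1
        residue = pow(2, P, candidate)   # the mod result the kernel already computes
        if i % SAMPLE_EVERY == 0:
            check = crc32(residue.to_bytes(16, "little"), check)
    return check
[/CODE]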

LaurV 2020-05-02 17:14

[QUOTE=Uncwilly;544456]My 2 centavos[/QUOTE]
+1. :goodposting: :tu:

kriesel 2020-05-02 18:32

As George has already pointed out, among others, the list of k values tried in GPU trial factoring is not reproducible from run to run of a given exponent/bit-level combination, because the factor-candidate sieving is not deterministic, letting some composite factor candidates through to be tried against the Mersenne number. It's done that way because it's faster than perfectly sieving the candidates down to primes only.
That means the list of k values will vary, so the n·10^c-th k value will vary, so the CRC will vary, for the same inputs. Fixing that issue would require completely sieving every factor candidate, not just every 10^c-th one.

I think Gerbicz's approach is more easily made immune to this leaky-sieve issue.
And since it is a single-word binary mod of something that has already been reduced down to factor size, that part should be very quick. Only the factor candidates on the Gerbicz check-value list would need to be further screened for primality.

