mersenneforum.org (https://www.mersenneforum.org/index.php)
-   -   Trial Factoring for Dummies (https://www.mersenneforum.org/showthread.php?t=25493)

 SethTro 2020-05-02 21:49

[QUOTE=Uncwilly;544456]My 2 centavos is that any overhead that is not error-checking related should not be more than 0.5%. The loss vs. gain doesn't make sense.
[/QUOTE]

I'm the only person working on this, and I haven't focused on optimizing the code; that can happen after people have been convinced this is worth doing, which has proved to be an uphill struggle.

[QUOTE=Uncwilly;544456]
What is the potential gain (speed-wise) to the project? Based upon the examination of "expected number of factors", we don't have systematic cheating. We have had a few idiots. Total loss due to them nearly nil.
[/QUOTE]
Loss comes in many forms: it can be [URL="https://www.mersenneforum.org/showpost.php?p=393055&postcount=145"]ghz-days[/URL], it can be [URL="https://www.mersenneforum.org/showthread.php?t=24103"]human hours[/URL], it can be trust in the project.
You are correct that this won't change the overall speed in any significant way today.
It is my belief that this would increase the reproducibility and overall trust in the project.
My goal would be that TJAOI stops finding bad results, and that no one reruns TF results because they don't trust old results.

[QUOTE=Uncwilly;544456]
Is this an actual residual of the calculation, or is it just a manufactured calculation? It is not a natural byproduct of the calculation like the LL residual.
[/QUOTE]
It [B]is[/B] a natural byproduct of the calculation.
It feels strange for you to not understand this, so maybe your claim is that keeping track of the smallest residual is not a natural byproduct?

[QUOTE=Uncwilly;544456]
This is my very naive concept.
[/QUOTE]
It doesn't feel good to spend time and effort working on something and then have someone call it bad. The use of "very" feels harsh.

[QUOTE=Uncwilly;544456]
Why can't the program take the result to the mod operation at certain select k values and do a rolling operation on them like a CRC? Or certain classes. The k values to be selected could be calculated based upon the exponent. Doing the operation every 10000 k's tested should not put an undue burden on operation and would prove that at least those items were done in sequence.[/QUOTE]

I think this comes back to what we are worried about.

I'm worried about them in this order

1. Having [B]any[/B] verifiable result associated with each TF submission.
1a. Result(s) should be quickly verifiable by the server (e.g. not requiring a 2nd DC)
1b. Result(s) should cover a large percent of work done.
2. Quickly detecting when the GPU/program is not working (e.g. after 10 submissions, not 200)
3. Preventing malicious/prank submissions (I'd wager I could easily submit 10000 results at zero compute cost and have none of them detected as fake)

Your method provides 1 and 1a, and helps with 2 and 3.
My method provides stronger guarantees on 1b, 2, and 3.

 SethTro 2020-05-03 07:48

I cleaned up some debug code and measured the impact over the last 24 hours (13 DC TF).

Impact is ~0.4% for adding my new code on DC exponents evaluated with the barrett76_mul32_gs kernel.

 retina 2020-05-03 08:17

Finding the needles in the haystack

The server creates a set of secret random prime-k values and computes the remainders.
Send the remainders to the client.
The client's task is to find the k values that generate the remainders.
Client sends the found k values to the server for verification.

How would that affect the client code if it needs to find some specific remainder values instead of just a remainder of zero, or the smallest remainder?

If you had one remainder then on average the client needs to run 50% of the work before finding the k value.
If you had two remainders then on average the client needs to run 75% of the work before finding both k values.
etc.

Considerations: The set of k values might not exactly match each other when two k values generate the same remainder, so just send one of them and the server can still verify the remainder is correct. The server doesn't need to remember the k values it chooses, just the remainders it sends.
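The scheme above can be sketched in Python (a toy model only; all the function names are made up, and the real client-side search would be mfaktc's CUDA sweep over candidate factors q = 2kp+1, restricted to the usual residue classes, which this sketch ignores):

```python
# Toy sketch of the challenge-response idea: the server picks secret k values,
# publishes only their remainders, and the client must rediscover matching
# k values by doing the same trial-factoring sweep it would do anyway.
import random

def remainder(p, k):
    q = 2 * k * p + 1          # candidate-factor form for the Mersenne number 2^p - 1
    return pow(2, p, q)        # a factor is found exactly when this equals 1

def server_make_challenge(p, k_range, count=2):
    ks = random.sample(range(1, k_range), count)
    targets = {remainder(p, k) for k in ks}
    return targets             # the secret k values can now be discarded

def client_find_ks(p, k_range, targets):
    found = {}
    for k in range(1, k_range):              # the normal TF sweep over k
        r = remainder(p, k)
        if r in targets and r not in found:
            found[r] = k
        if len(found) == len(targets):
            break
    return found

def server_verify(p, targets, found):
    # One modular exponentiation per claimed k; no secrets retained.
    return (set(found) == targets
            and all(remainder(p, k) == r for r, k in found.items()))
```

Since two k values can generate the same remainder, the targets are kept as a set and any matching k is accepted, per the consideration above.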

 SethTro 2020-05-03 08:23

[QUOTE=retina;544505]The server creates a set of secret random prime-k values and computes the remainders.
Send the remainders to the client.
The client's task is to find the k values that generate the remainders.
Client sends the found k values to the server for verification.

How would that affect the client code if it needs to find some specific remainder values instead of just a remainder of zero, or the smallest remainder?

If you had one remainder then on average the client needs to run 50% of the work before finding the k value.
If you had two remainders then on average the client needs to run 75% of the work before finding both k values.
etc.

Considerations: The set of k values might not exactly match each other when two k values generate the same remainder, so just send one of them and the server can still verify the remainder is correct. The server doesn't need to remember the k values it chooses, just the remainders it sends.[/QUOTE]

This assumes that mfaktc always computes the residual, which it does not: mfaktc notices when the lower bits of the residual are not zero and doesn't calculate the upper bits, so 90%+ of the time the full residual is never calculated.
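To illustrate the early exit (toy Python; in the real CUDA kernel only the low word of the residual exists at the check point, whereas here the full value is computed and then masked, so this only models the control flow):

```python
def tf_check(p, k, bits=32):
    """Toy model of the factor test for q = 2*k*p + 1.
    q divides 2^p - 1 exactly when the residual below is zero."""
    q = 2 * k * p + 1
    res = (pow(2, p, q) - 1) % q
    low = res & ((1 << bits) - 1)   # cheap low-word check done first
    if low != 0:
        return False                # the vast majority of candidates exit here
    return res == 0                 # full residual only needed in rare cases
```

For p = 11 this reports k = 1 (q = 23) and k = 4 (q = 89) as factors of 2^11 - 1 = 2047 and rejects other candidates without ever needing the upper bits.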

 retina 2020-05-03 08:26

[QUOTE=SethTro;544506]This assumes that mfaktc always computes the residual, which it does not: mfaktc notices when the lower bits of the residual are not zero and doesn't calculate the upper bits, so 90%+ of the time the full residual is never calculated.[/QUOTE]Oh.

Then how do you find the smallest remainder? Doesn't that require computing the upper bits to check if they are zeros?

 SethTro 2020-05-03 10:17

[QUOTE=retina;544507]Oh.

Then how do you find the smallest remainder? Doesn't that require computing the upper bits to check if they are zeros?[/QUOTE]

I change the definition of "smallest" to mean the most trailing zero bits, and only compute the full residual when the low word is all zeros (rare, but it still happens several times per ghz-day).

I like your idea otherwise. It can be reworked to return a lower residual to look for.
The downside is that different intermediate values are available depending on which kernel is used, so the server would have to guess which kernel mfaktc will use.
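A sketch of the bookkeeping (toy Python; `best_residual` and friends are made-up names, and mfaktc would do this per class on the GPU):

```python
def trailing_zeros(x):
    """Number of trailing zero bits of x (x > 0)."""
    return (x & -x).bit_length() - 1

def best_residual(p, k_lo, k_hi):
    """Scan a k range and report the k whose residual (2^p - 1) mod (2kp+1)
    has the most trailing zero bits -- a cheap, deterministic token the
    server can verify with a single modular exponentiation."""
    best_k, best_tz = None, -1
    for k in range(k_lo, k_hi):
        q = 2 * k * p + 1
        res = (pow(2, p, q) - 1) % q
        if res == 0:
            continue  # an actual factor; reported through the normal channel
        tz = trailing_zeros(res)
        if tz > best_tz:
            best_k, best_tz = k, tz
    return best_k, best_tz
```

The server recomputes the one residual for the claimed k and checks the trailing-zero count; anyone who skipped part of the range risks having missed the true maximum.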

 Uncwilly 2020-05-04 16:21

[QUOTE=SethTro;544479]
It [B]is[/B] a natural byproduct of the calculation.
It feels strange for you to not understand this, so maybe your claim is that keeping track of the smallest residual is not a natural byproduct?[/quote]I think I was remembering Bob's suggestion. Sorry for the confusion.
[quote]It doesn't feel good to spend time and effort working on something then have someone call it bad. The use of very feels unrestrained.[/quote]I did not say bad. Again, I am concerned about cost vs benefit. You have brought the cost down. I don't like Retina's idea of the server sending some secret info (more than the AID) to the factoring program.

[quote]
I'm worried about them in this order

1. Having [B]any[/B] verifiable result associated with each TF submission.
1a. Result(s) should be quickly verifiable by the server (e.g. not requiring a 2nd DC)
1b. Result(s) should cover a large percent of work done.
2. Quickly detecting when the GPU/program is not working (e.g. after 10 submissions, not 200)
3. Preventing malicious/prank submissions (I'd wager I could easily submit 10000 results at zero compute cost and have none of them detected as fake)

Your method provides 1 and 1a, and helps with 2 and 3.
My method provides stronger guarantees on 1b, 2, and 3.[/QUOTE]
I did say mine was naive. I have no GPU programming experience.
And sorry for the delay responding. It has been a busy few days.

 retina 2020-05-04 22:16

[QUOTE=Uncwilly;544576]I don't like Retina's idea of the server sending some secret info (more than the AID) to the factoring program.[/QUOTE]Just to be clear. The server doesn't need to retain or send any secrets. The remainders can be public information. The initial k-values can be deleted after the remainders have been generated.

 chalsall 2020-05-05 19:17

[QUOTE=SethTro;544479]It doesn't feel good to spend time and effort working on something and then have someone call it bad. The use of "very" feels harsh.[/QUOTE]

Juggling some things at the moment, but...

Please never take offense from feedback here. It's never personal -- we're just trying to come to convergence.

At the end of the day, having a "Proof of Work" code-path is going to be a hard sell if it adds more than, let's say, 1% to the compute cost. The reason is we /believe/ there's almost no cheating going on.

Also, it's not the end of the world if a factor is missed. It just means that a P-1 run and then (likely, eventually) two LL tests will be done on the candidate, proving it's composite.

However, I personally feel that /some/ would value having some sort of sanity check on the GPU hardware. As said before, some of us use these things for more than just TF'ing (where determinism (or, at least, sanity) is coveted). There would be value in being able to determine if our "kit" was sane while running mfakt[c|o] (as opposed to a GPU based DC test).

Lastly, it would be trivial for the servers to deal with this additional data stream. The heavy-lifting is all in the client (and, thus, the authors of same).

 kriesel 2020-05-05 19:38

[QUOTE=chalsall;544659]/some/ would value having some sort of sanity check on the GPU hardware. As said before, some of us [use] these things for more than just TF'ing (where determinism (or, at least, sanity) is coveted). There would be value in being able to determine if our "kit" was sane while running mfakt[c|o] (as opposed to a GPU based DC test).[/QUOTE]Especially since issues can reside within specific circuitry or microcode that affects one computation type and not another; a check like this is also valuable for its speed and low user involvement. Issues can be very localized, yet significant.

Remember the Pentium FDIV bug: wrong output only for specific instruction and operand combinations. Various software workarounds were developed. The initial mass-produced chips were wrong by design. [url]https://en.wikipedia.org/wiki/Pentium_FDIV_bug[/url]

 greg 2020-05-05 22:23

[QUOTE=Uncwilly;544456]My 2 centavos is that any overhead that is not error-checking related should not be more than 0.5%. The loss vs. gain doesn't make sense.[/QUOTE]

If TF and/or P-1 became verifiable (even if only for a subset of clients) then there is the potential to reward that work with a portion of the prime prize award (even if only a $2.56 check).
