mersenneforum.org (https://www.mersenneforum.org/index.php)
-   -   Trial Factoring for Dummies (https://www.mersenneforum.org/showthread.php?t=25493)

 Uncwilly 2020-04-27 20:31

[QUOTE=LOBES;543990]Is there perhaps a better use of my 1080 Ti than doing TF'ing? From what I understand, mfaktc-win-64.exe is only used for Trial Factoring.[/QUOTE]
While you can use it for other things... It is best for GIMPS if you use it for TF. By using GPU's the level of TF that make sense is higher, thus more exponents eliminated. Also, GPU's doing -all- the TF makes sense vs having the CPU's do any.

 R.D. Silverman 2020-04-27 22:14

[QUOTE=chalsall;543971]Again, I must thank you for your thorough curation of this very technical and rarified space! Very valuable. :tu:

I don't have the time to drill down on all the prior discussion. I will say, however, that Oliver asked (many years ago) if anyone had ideas on how to produce a "checksum" or some other "proof of work" mechanism. Several ideas were proposed, but they all failed to be viable because of the unpredictable (and massive) concurrency of the CUDA TF runs.

If Gerbicz has proposed something which /is/ viable, it would be cool to see it codified and beta-tested.

Not that I think there's much, if any, cheating going on. But there /could/ be more "bad kit" out there than we realize. Having mfakt[c|o] be able to provide a long-term sanity data stream would be very useful to many people.[/QUOTE]

When one requests to perform TD on 2^p-1 have the server provide a set of indices
s = [I1, I2, I3, .....]. Encrypt s with (say) AES. When one does the trial division,
SAVE the remainders r_I = 2^p-1 mod p_{I} for all I \in s. Then return
HASH(r1 | r2 | r3 | ....) for some chosen hash function.

Bury the AES key within the TD code. Yes, a determined hacker can find and extract the key with enough
effort. But people with the skill to do that are unlikely to want to cheat.
One can even change the key periodically if desired to making extracting it a little bit
harder. (via a key exchange). Yes, the new key can also be extracted, but it requires
extra work.

In other words, return a hash of a random set of concatened remainders. This at least
proves that the worker did at least some of the divisions. It can be easily checked by
the server. This is similar to a ZKP technique known as "cut and choose".
Change the set s each time.

 chalsall 2020-04-28 00:41

[QUOTE=R.D. Silverman;544005]Change the set s each time.[/QUOTE]

Wow! I actually understood some of that! :smile:

I guess the question then becomes, what is the (computational) cost in doing this?

Also, it isn't clear to me if this provides any sort of "sanity" check during the TF/TD run? Short of running again (DC'ing a TF -- usually not done), will this help find border-line kit?

That would be a /strong/ motivator for some to run such a code-path. Many use these things for serious work too, where sane, deterministic kit is mission-critical.

 R.D. Silverman 2020-04-28 02:26

[QUOTE=chalsall;544017]Wow! I actually understood some of that! :smile:

I guess the question then becomes, what is the (computational) cost in doing this?
[/QUOTE]

nil. The required remainders get computed anyway during the TD. The encryption
and Hash take almost no time.
[QUOTE]

Also, it isn't clear to me if this provides any sort of "sanity" check during the TF/TD run?
[/QUOTE]

It is almost impossible to correctly guess the set s, so if the hash is correct it means the
TF was done.

[QUOTE]

Short of running again (DC'ing a TF -- usually not done), will this help find border-line kit?
[/QUOTE]

"border-line kit"???? Meaning?

 retina 2020-04-28 02:45

[QUOTE=R.D. Silverman;544005]When one requests to perform TD on 2^p-1 have the server provide a set of indices
s = [I1, I2, I3, .....]. Encrypt s with (say) AES. When one does the trial division,
SAVE the remainders r_I = 2^p-1 mod p_{I} for all I \in s. Then return
HASH(r1 | r2 | r3 | ....) for some chosen hash function.

Bury the AES key within the TD code. Yes, a determined hacker can find and extract the key with enough
effort. But people with the skill to do that are unlikely to want to cheat.
One can even change the key periodically if desired to making extracting it a little bit
harder. (via a key exchange). Yes, the new key can also be extracted, but it requires
extra work.

In other words, return a hash of a random set of concatened remainders. This at least
proves that the worker did at least some of the divisions. It can be easily checked by
the server. This is similar to a ZKP technique known as "cut and choose".
Change the set s each time.[/QUOTE]I would look at this from the perspective as a puzzle to be solved.

I simple solution to solve this puzzle would be to extract the key from the code and submit the required results. Problem solved. Server fooled. Easy.

So we would still need to monitor the expected frequency of successes to catch anyone using the short-cut. The end result is you've make it slightly more difficult for someone to cheat, at the cost of extra complexity for the server.

And for the amount of cheats we've seen in the past (i.e. zero) I don't think we need to go there yet. Let's not try to solve problems that don't yet exist.

 axn 2020-04-28 02:58

[QUOTE=retina;544023]at the cost of extra complexity for the server.[/QUOTE]

... which pales in comparison to the extra complexity (and performance hit) to the client.

Not to mention the fact that mfaktc/o don't directly communicate to the server, so all the middlemen/scripts also needs to be updated. And people do need to build mfaktc/o from source anyway, so there goes the secrecy of the key.

Also, the index isn't a deterministic thing -- the base sieve used to find the candidate factors have a user-configurable sieve depth (because of efficiency concerns) and so different amounts of candidate factors are TD-ed.

 R.D. Silverman 2020-04-28 03:01

[QUOTE=retina;544023]I would look at this from the perspective as a puzzle to be solved.

I simple solution to solve this puzzle would be to extract the key from the code
[/QUOTE]

It is not as easy as you think. Especially when the key is scattered as bits and pieces
throughout the code.

[QUOTE]

So we would still need to monitor the expected frequency of successes to catch anyone using the short-cut. The end result is you've make it slightly more difficult for someone to cheat, at the cost of extra complexity for the server.
[/QUOTE]

Server complexity/cost is tiny.

 R.D. Silverman 2020-04-28 03:13

[QUOTE=axn;544024]... which pales in comparison to the extra complexity (and performance hit) to the client.
[/QUOTE]

You are confused. The added computation would be hard to measure in a benchmark.
It is [b]tiny[/b]: Decrypt the index set, gather the residues during the TF, catenate them.
compute a hash.

[QUOTE]
Not to mention the fact that mfaktc/o don't directly communicate to the server, so all the middlemen/scripts also needs to be updated.
[/QUOTE]

[i]Any[/i] scheme will require code changes.

[QUOTE]
And people do need to build mfaktc/o from source anyway, so there goes the secrecy of the key.
[/QUOTE]

Why do they need source? Why can't you just distribute binaries?
(I suspect I know the answer]. It is the main reason I dislike these devices: lack of
interoperability. There are too many hardware variations. Code is not portable.

This problem can be avoided: Just compile a variety of different binaries. Don't give
out source.

[QUOTE]
Also, the index isn't a deterministic thing -- the base sieve used to find the candidate factors have a user-configurable sieve depth (because of efficiency concerns) and so different amounts of candidate factors are TD-ed.[/QUOTE]

It is not "index". It is "indices". And it should suffice to put a upper bound on the
max element in set s.

 retina 2020-04-28 03:26

[QUOTE=R.D. Silverman;544025]It is not as easy as you think. Especially when the key is scattered as bits and pieces
throughout the code.[/QUOTE]Key extraction is really easy. Judicious use of a debugger makes it a no brainer. That is how DVDs and BluRay get pwned effortlessly. It doesn't work.

 axn 2020-04-28 03:35

[QUOTE=R.D. Silverman;544026]You are confused. The added computation would be hard to measure in a benchmark.
It is [b]tiny[/b]: Decrypt the index set, gather the residues during the TF, catenate them.
compute a hash.[/quote]
You have clearly not done any GPU programming. What sounds like a simple operation in CPU land will have a large impact on GPU performance (not like 2x hit, but easily 20-30%). But for the sake of argument, let's say this is acceptable

[QUOTE=R.D. Silverman;544026][i]Any[/i] scheme will require code changes. [/quote]
The problem is that all of these have different authors. Any many people have their own scripts. Some people just manually get assignment from server and launch using command line. This isn't just code change -- it is a complex "system" change. The problem is that the whole "GPU TF" thing isn't very streamlined. Anyway, for the sake of argument, let's assume that this can somehow be coordinated.

[QUOTE=R.D. Silverman;544026]Why do they need source? Why can't you just distribute binaries?
(I suspect I know the answer]. It is the main reason I dislike these devices: lack of
interoperability. There are too many hardware variations. Code is not portable.

This problem can be avoided: Just compile a variety of different binaries. Don't give
out source.[/quote]
Hardware variations, OS variations, device driver variations, development platform (CUDA / OpenCL) variations.
Suppose, somehow we can overcome this problem and force everyone to use only trusted binaries. Then, we don't need to do any of this. We can just send in a nonce and the client can return HASH(nonce, exponent, bit depth). Since we already trust the client, we can trust it to generate this hash after full computation is done. No need to mess up the critical TF logic.

[QUOTE=R.D. Silverman;544026]It is not "index". It is "indices". And it should suffice to put a upper bound on the
max element in set s.[/QUOTE]

I'm trying to understand this part. Suppose the server asks client to compute has for indices {10, 20, 30}. The assumption is that the server knows (or can compute) exactly which candidate factors correspond to the given indices. But that is not deterministic on client side. Because of variable SoE sieve depth (and the inherent nondeterminism in GPU sieve), the exact sequence of candidate factors produced can vary and so the server can't validate the residue.

I suppose I'm misunderstanding you. Maybe you mean the actual factoring candidates (or the k in 2kp+1 for known prime candidates which will never be skipped). It would be tricky to implement such a thing efficiently.

 kriesel 2020-04-28 03:53

[QUOTE=R.D. Silverman;544026]Just compile a variety of different binaries. Don't give
out source.[/QUOTE]As things stand now, mfaktc is GPLV3 licensed and the source is out there, and the license specifies source must be made available. Mfakto was derived from mfaktc.

All times are UTC. The time now is 00:18.