mersenneforum.org  

Old 2019-02-23, 09:56   #12
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

2^6×131 Posts

Email verification should work (well... at least, as you said, it would slow down some badass kids).
Limiting the number of results reported per day for new users should work too, or put them in limbo for a few days; they will get discouraged when they see that their latest bravado does not immediately pop them to the top of the list.
Adding some CRC to the reported results would be the proper way to go.
Proof of work, anybody? Forcing a SHA calculation of about 1 minute, or so...
Sure, all of these were discussed in the past, and they have advantages and disadvantages. There is no "perfect" or "foolproof" method. Fools are intelligent guys...
But small steps, day by day, and we will improve...


Edit: on the other hand, all these are easy to spot and kill. Reporting results in others' names would be more dangerous; for example, imagine me reporting fake TF results for Oliver to discredit him... They would be difficult to spot, due to his high activity, and would later create a lot of headache to re-check his (enormous amount of) work, when people start to find missing factors in his reported results. Right now, nothing stops me from doing so. We used to "trade credit" in the past; for example, I was CPU-bottlenecked but had a lot of GPU power available, and I was doing and reporting TF work for other people who did and reported CPU work (DC) for me, and there was nothing to stop us. I don't believe things have changed since. That's why I said the CRC is the proper way to go...

Last fiddled with by LaurV on 2019-02-23 at 10:05
Old 2019-02-23, 11:17   #13
ATH
Einyen
 
Dec 2003
Denmark

2^2×17×41 Posts

Limit how much credit a brand new account can turn in? Like max 1,000 GHz-days in the first week? or 10,000 GHz-days the first month? or whatever seems fitting.
Maybe 1000*x GHz-days where x is the age of the account in days.
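
A minimal sketch of the 1000*x rule in Python. The function names are hypothetical, and treating a brand-new account as one day old (so it is not capped at zero) is an assumption, not something proposed above.

```python
from datetime import date

GHZ_DAYS_PER_DAY_OF_AGE = 1000.0  # the factor suggested in the post


def credit_cap(created: date, today: date) -> float:
    """Cumulative GHz-days an account of this age may have turned in."""
    # Assumption: an account created today counts as one day old.
    age_days = max((today - created).days, 1)
    return GHZ_DAYS_PER_DAY_OF_AGE * age_days


def within_cap(created: date, today: date,
               credit_so_far: float, new_credit: float) -> bool:
    """True if accepting new_credit keeps the account under its cap."""
    return credit_so_far + new_credit <= credit_cap(created, today)
```

For example, an account created three days ago would be allowed 3,000 GHz-days in total; anything beyond that could be held for review rather than rejected outright.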

Could you add the rule: If 2000 or more trial factor lines are reported without a single found factor it will alert you?
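
A rough sketch of such a rule in Python. The "no factor for" prefix is taken from the mfaktc line quoted elsewhere in this thread; the " has a factor: " marker for a found factor is an assumption about the result format, and the function name is hypothetical.

```python
def first_alert_index(lines, threshold=2000):
    """Return the 0-based index of the line at which the run of
    consecutive no-factor results reaches `threshold`, or None."""
    streak = 0
    for i, line in enumerate(lines):
        if line.startswith("no factor for "):
            streak += 1
            if streak >= threshold:
                return i
        elif " has a factor: " in line:  # assumed factor-found format
            streak = 0  # a found factor resets the run
    return None
```

Lines that match neither pattern are ignored, so mixed submissions (LL, P-1) would not reset or extend the run.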

Last fiddled with by ATH on 2019-02-23 at 11:19
Old 2019-02-23, 16:38   #14
ramgeis
 
 
Apr 2013

5·23 Posts

I would definitely use the expected success rate to detect possible bogus results.
When a user reports hundreds of thousands of GHz-days of work, the actual success rate should converge to the expected one. Too big a difference then triggers an alert.
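
One way this comparison could be sketched in Python. It assumes the commonly quoted GIMPS rule of thumb that the chance of a factor between 2^b and 2^(b+1) is roughly 1/b, and it ignores the effect of earlier P-1 or TF work; the function names and the limit of 4 standard deviations are hypothetical choices.

```python
import math


def p_factor(lo_bits: int, hi_bits: int) -> float:
    """Approximate chance of at least one factor in [2^lo, 2^hi),
    assuming a chance of about 1/b per bit level b."""
    p_none = 1.0
    for b in range(lo_bits, hi_bits):
        p_none *= 1.0 - 1.0 / b
    return 1.0 - p_none


def z_score(ranges, factors_found: int) -> float:
    """Standard deviations between observed and expected successes.
    `ranges` is a list of (lo_bits, hi_bits), one per reported result."""
    probs = [p_factor(lo, hi) for lo, hi in ranges]
    expected = sum(probs)
    variance = sum(p * (1.0 - p) for p in probs)
    if variance == 0.0:
        return 0.0  # nothing to judge yet
    return (factors_found - expected) / math.sqrt(variance)


def is_suspicious(ranges, factors_found: int, limit: float = 4.0) -> bool:
    """Flag accounts whose success rate is far from the expectation."""
    return abs(z_score(ranges, factors_found)) > limit
```

Under these assumptions, 5520 single-bit results at 72 bits would be expected to yield roughly 77 factors, so a report with none at all sits many standard deviations below the expectation. The test works in both directions: far too many factors is flagged as well.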
Old 2019-02-23, 16:52   #15
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

5^2·113 Posts

Quote:
Originally Posted by LaurV View Post
Adding some CRC to the reported results would be the proper way to go.
mfaktc v0.22-pre6 has some code for generating a checksum for each result line. The manual results form will validate the checksum for any results submitted with it, but obviously can't reject results without a checksum since it's not (yet) in common use.
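
To illustrate the general idea only, here is a sketch of a per-line checksum in Python. It is not mfaktc's actual algorithm or format: the `[CRC32:...]` tag and both function names are hypothetical, and CRC-32 from `zlib` is used as a stand-in.

```python
import zlib


def add_checksum(line: str) -> str:
    """Append a CRC-32 of the result line in a hypothetical tag format."""
    crc = zlib.crc32(line.encode("utf-8")) & 0xFFFFFFFF
    return f"{line} [CRC32:{crc:08X}]"


def verify_checksum(stamped: str) -> bool:
    """Recompute the checksum and compare with the one on the line."""
    body, sep, tail = stamped.rpartition(" [CRC32:")
    if not sep or not tail.endswith("]"):
        return False  # no checksum present
    return add_checksum(body) == stamped
```

A plain CRC like this catches copy/paste damage and hand-edited lines, but anyone who knows the algorithm can compute it, so on its own it does not stop deliberate forgery.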
Quote:
Originally Posted by ATH View Post
Could you add the rule: If 2000 or more trial factor lines are reported without a single found factor it will alert you?
There is now a similar rule in place, thanks for the suggestion. And I've fixed the SPE-typo that prevented my false-results alert system from working; it should have flagged these results.
Old 2019-02-23, 17:37   #16
Madpoo
Serpentine Vermin Jar
 
 
Jul 2014

3×5×7×31 Posts

Quote:
Originally Posted by LaurV View Post
...
Adding some CRC to the reported results would be the proper way to go.
Proof of work, anybody? Forcing a SHA calculation of about 1 minute, or so...
If it were asymmetrical so the server could quickly verify, that'd be okay. One glitch is that there are users who genuinely submit a LOT of results daily (think TJAOI or even the Gpu72 bot) so it might place an undue burden on them.
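
A hashcash-style scheme has this asymmetry: the submitter searches for a nonce, while the server checks it with a single hash. The Python sketch below is an illustration under assumptions, not anything PrimeNet implements; the function names and the payload layout (`payload|nonce`) are hypothetical, and the difficulty would need tuning to reach anything like a minute of work.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of a digest."""
    n = int.from_bytes(digest, "big")
    return len(digest) * 8 - n.bit_length()


def _digest(payload: bytes, nonce: int) -> bytes:
    return hashlib.sha256(payload + b"|" + str(nonce).encode()).digest()


def solve(payload: bytes, difficulty_bits: int) -> int:
    """Client side: try nonces until the hash has enough zero bits.
    Expected cost is about 2**difficulty_bits hashes."""
    for nonce in count():
        if leading_zero_bits(_digest(payload, nonce)) >= difficulty_bits:
            return nonce


def verify(payload: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash, regardless of the difficulty."""
    return leading_zero_bits(_digest(payload, nonce)) >= difficulty_bits
```

The burden on high-volume submitters mentioned above remains: the cost is paid per submission, so it would have to be charged per batch rather than per line, or waived for established accounts.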

At least for official Prime95 clients, there's a security check in the result, but for other clients like mfakt* or gpuOwl, there's nothing like that.

We've actually been chatting with Ernst about getting mlucas to be able to use the API (hope I'm not spoiling any secret there...and it may or may not happen), but part of that is that API usage for the most part is restricted to "secure" clients that send a valid security hash in the requests.

We can set Ernst up with his own key so the client/server can trust each other, but the problem comes in because of the need to be open source. I guess it could be like Prime95 where the security bits are not made public and you can build without it, with the net effect being it cripples the automatic communication. Or something like that... I vaguely recall that it adds "UNTRUSTED" to the result or some such.

And besides the API security, it'd be nice to have those other apps be able to include a hash in the result line itself even for manual results.
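
As an illustration of what a keyed hash on a result line could look like, here is a Python sketch using HMAC-SHA256. It is not Prime95's actual scheme, which is not public; the `[SEC:...]` tag, the truncation to 16 hex digits and the function names are all hypothetical.

```python
import hashlib
import hmac


def sign_line(line: str, key: bytes) -> str:
    """Append a truncated HMAC-SHA256 tag computed with a secret key."""
    tag = hmac.new(key, line.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
    return f"{line} [SEC:{tag}]"


def verify_line(stamped: str, key: bytes) -> bool:
    """Server side: recompute the tag with the same key and compare."""
    body, sep, tail = stamped.rpartition(" [SEC:")
    if not sep or not tail.endswith("]"):
        return False  # unsigned line
    expected = sign_line(body, key)
    # Constant-time comparison to avoid leaking how many characters match.
    return hmac.compare_digest(expected.encode("utf-8"),
                               stamped.encode("utf-8"))
```

The scheme is only as good as the secrecy of the key, which is exactly the open-source difficulty described above: a key compiled into a public source tree protects nothing.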

Maybe it wouldn't be too terrible to let the official builds have the security features intact, and if you choose to compile it yourself without that, then you can still check in but under a bit more of a microscope? Not sure how many people "roll their own" and compile things themselves.

Quote:
Edit: on the other hand, all these are easy to spot and kill. Reporting results in others' names would be more dangerous; for example, imagine me reporting fake TF results for Oliver to discredit him... They would be difficult to spot, due to his high activity, and would later create a lot of headache to re-check his (enormous amount of) work, when people start to find missing factors in his reported results. Right now, nothing stops me from doing so. We used to "trade credit" in the past; for example, I was CPU-bottlenecked but had a lot of GPU power available, and I was doing and reporting TF work for other people who did and reported CPU work (DC) for me, and there was nothing to stop us. I don't believe things have changed since. That's why I said the CRC is the proper way to go...
To report results as someone else you'd have to do it in one of two ways...
  • Know their login/password on the site and submit using the manual forms
  • Configure your client with their login id and submit using the client (no password needed for that)

The first one is definitely a problem which is why you should pick good passwords.

The second one would be harder to exploit because the client is doing the communicating and, due to the aforementioned security hashes that Prime95 implements on the API side, would be very hard to automate. You might as well actually do the work and submit it. LOL

Totally preventing fraudulent results from coming in probably won't be possible, but we can do better. Detecting and removing them is feasible as we've seen, because of the "many eyes" who are quick to spot even some minor weirdness.

It's probably a good thing whenever we have these new prime discoveries and people go on their egg hunts, because they pore over the data looking for clues. That's the same type of analysis that helps find bad actors.
Old 2019-02-23, 18:02   #17
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

3^4×109 Posts

Quote:
Originally Posted by Madpoo View Post
One glitch is that there are users who genuinely submit a LOT of results daily (think TJAOI or even the Gpu72 bot) so it might place an undue burden on them.
Just to note, the GPU72 bots never submit results; that's done by each individual worker's clients.

Quote:
Originally Posted by Madpoo View Post
Know their login/password on the site and submit using the manual forms.
Many years back, LaurV and I were trading work. He knew my primenet username (but NOT my password) and was doing TF'ing for me, and I knew his username and was doing DC'ing for him.

LaurV being LaurV, a few years later he played a joke on me, and submitted a tonne of (legitimate) DC work under my username, but somehow used a different Display Name.
Old 2019-02-23, 18:04   #18
TheJudger
 
"Oliver"
Mar 2005
Germany

44F₁₆ Posts

Quote:
Originally Posted by Madpoo View Post
5520 TF results, all of them "no factor found". It'd be astounding to do that many and not find any factors, ya know? And making a point of specifying "1 to 82 bits", because all of them had been TF'd to at least some bit level in the 60's, haven't they?

Plus, with a line like this, doesn't this signify that the mfaktc build only goes up to 76 bits?
Code:
no factor for M900200747 from 2^1 to 2^82 [mfaktc 0.20 barrett76_mul32_gs]
Correct, that kernel can handle numbers up to 2^76. And the lower limit for all current barrett based kernels in mfaktc is 2^64.

Oliver
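
The two limits above are enough for a simple plausibility check on a submitted line. The Python sketch below parses the line format shown in the quoted example; reading the upper limit from the digits in the kernel name is confirmed above only for barrett76_mul32_gs, so applying it to other kernel names is an assumption, and the function name is hypothetical.

```python
import re

# Line format taken from the example quoted above; other clients differ.
LINE_RE = re.compile(
    r"^no factor for M(\d+) from 2\^(\d+) to 2\^(\d+) \[mfaktc (\S+) (\w+)\]$"
)
BARRETT_MIN_BITS = 64  # lower limit stated for current barrett kernels


def implausible_reasons(line: str) -> list:
    """Return a list of reasons why the line cannot be genuine."""
    m = LINE_RE.match(line)
    if not m:
        return ["unrecognized line format"]
    lo_bits, hi_bits = int(m.group(2)), int(m.group(3))
    kernel = m.group(5)
    reasons = []
    k = re.match(r"barrett(\d+)_", kernel)
    if k:
        # Assumption: the digits in the kernel name are its upper limit.
        max_bits = int(k.group(1))
        if hi_bits > max_bits:
            reasons.append(f"{kernel} cannot reach 2^{hi_bits}")
        if lo_bits < BARRETT_MIN_BITS:
            reasons.append(f"{kernel} cannot start at 2^{lo_bits}")
    return reasons
```

The quoted line, claiming 2^1 to 2^82 with barrett76_mul32_gs, fails both checks, while a 2^72 to 2^73 line with the same kernel passes.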
Old 2019-02-23, 18:16   #19
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7·491 Posts

Quote:
Originally Posted by ATH View Post
Limit how much credit a brand new account can turn in? Like max 1,000 GHz-days in the first week? or 10,000 GHz-days the first month? or whatever seems fitting.
Maybe 1000*x GHz-days where x is the age of the account in days.
GTX1080: rated at 1015.8 GhzD/day. Let's round to 1000.

So, one gtx1080-day/week (14% of TF throughput), 10/month (33% of TF throughput), and then a cap of a user account at one gtx1080's TF throughput indefinitely? That would really hamstring any RTX2060 or RTX2080 owner's credit for contribution, or the top producers that run multiple systems and gpus. TheJudger is running the equivalent of 8 RTX2080 Ti's; 10^7 GhzD/year =~ 28,000 GhzD/day. (Awful unit, that GhzDay/day; why not Ge, Gigahertz-equivalent? If you think the corporation GE might take exception, use G e or G.e or Ghe...)

The top couple dozen producers have each averaged, over the past year, more than the proposed cap of 1000 GhzDays/day (365,000 GhzDays/year). Given the current gpu state of the art, a couple orders of magnitude higher threshold would be advisable, with adjustment occasionally as hardware designs advance. https://www.mersenne.org/report_top_500/ https://www.mersenne.ca/mfaktc.php
The last 24 hours of GIMPS was 175.5Thz. TheJudger alone averaged 27.7Thz over the past year. The 24th on the top producer list averaged a bit over 1Thz. The top 24 combined contributed over 108Thz over the past year, or more than 60% of the past day's total GIMPS throughput from all participants.
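
The arithmetic behind these figures can be checked in a few lines of Python. The inputs are the numbers quoted in this post, with the GTX1080 rounded to 1000 GhzD/day and the yearly figure read as 10^7 GhzD; the 7-day week and 30-day month are assumptions about how the percentages were derived.

```python
gtx1080_ghzd_per_day = 1000  # rounded from the rated 1015.8

# Share of one GTX1080's output that the proposed caps would credit.
week_share = 1_000 / (7 * gtx1080_ghzd_per_day)     # first-week cap
month_share = 10_000 / (30 * gtx1080_ghzd_per_day)  # first-month cap

# TheJudger: about 10^7 GhzD per year, expressed per day.
judger_ghzd_per_day = 1e7 / 365

# Top 24 producers (108 Thz) against the last day's total (175.5 Thz).
top24_share = 108 / 175.5

print(f"week cap:  {week_share:.0%} of one GTX1080")
print(f"month cap: {month_share:.0%} of one GTX1080")
print(f"TheJudger: {judger_ghzd_per_day:,.0f} GhzD/day")
print(f"top 24:    {top24_share:.1%} of daily throughput")
```

Since one GhzD per day is one Ghz-equivalent, the per-day figure of roughly 27,400 is in line with the 27.7Thz average and the "=~28,000" estimate quoted in the post.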

I suggest caution about removing or reducing incentives of the top performers. And all were once new users. Someone joining the project with a shiny new GTX 1080 Ti or faster gpu, who might have become a very substantial contributor over time, could quickly get disillusioned and leave. There are consumer gpus on the market that will each produce more than triple the proposed 1000 GhzD/day credit cap.
Currently there is nothing in signup, manual assignment or reporting that enables the user to indicate what their gpu computing capacity is, other than rolling the gpu model name into the computer name in mfaktx or gpuowl; in CUDALucas or CUDAPm1 there is nothing at all. So from some applications, the primenet server does not receive the gpu model and has no basis for computing whether the rate of "results" is plausible. (sorry, earlier version had that backwards plus incomplete!)

Last fiddled with by kriesel on 2019-02-23 at 18:54
Old 2019-02-23, 18:18   #20
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

D6D₁₆ Posts

Quote:
Originally Posted by TheJudger View Post
Correct, that kernel can handle numbers up to 2^76. And the lower limit for all current barrett based kernels in mfaktc is 2^64.

Oliver
So, not just a counterfeit, but one that is easily exposed in more than one way. And claiming to have used an older version than the current release. (Looking forward to 0.22 here...)

Last fiddled with by kriesel on 2019-02-23 at 18:23
Old 2019-02-23, 18:47   #21
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

D6D₁₆ Posts

Quote:
Originally Posted by Madpoo View Post
We've actually been chatting with Ernst about getting mlucas to be able to use the API (hope I'm not spoiling any secret there...and it may or may not happen), but part of that is that API usage for the most part is restricted to "secure" clients that send a valid security hash in the requests.

We can set Ernst up with his own key so the client/server can trust each other, but the problem comes in because of the need to be open source. I guess it could be like Prime95 where the security bits are not made public and you can build without it, with the net effect being it cripples the automatic communication. Or something like that... I vaguely recall that it adds "UNTRUSTED" to the result or some such.
...
It's probably a good thing whenever we have these new prime discoveries and people go on their egg hunts, because they pore over the data looking for clues. That's the same type of analysis that helps find bad actors.
As I understand it from reading Ernst's readme and the bare beginnings of getting started with mlucas myself, standard operating procedure has been that the end user compiles his own mlucas for his OS and hardware type. He's covering a range of Intel processors, AMD, and ARM.
Similarly, at least on the linux side of things, gpuowl and other gpu apps are generally end user compiled. I build Windows executables frequently for gpuowl and share them in the gpuowl thread.
Flashjh and others used to provide CUDALucas executables. Etc.

None of the gpu apps have interfaced the Primenet API or security code. I think there are some structural issues with the cpu-oriented primenet API that relate to that. I suppose one could give the primenet api fake properties for a gpu as a strange sort of single-core cpu to try to shoehorn it in.
Old 2019-02-23, 20:38   #22
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7×491 Posts
Primenet API and documentation

Please review https://www.mersenneforum.org/showth...845#post505845
Post 1: primenet practice in prime95 has outpaced the documentation
2: the existing API is cpu and prime95 oriented, which does not fit gpu applications well
3: early incomplete draft regarding extending the primenet API to support gpu applications, including multi-gpu per system configurations.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.