Old 2003-01-22, 21:24   #177
Old man PrimeNet
 
 
Jan 2003
Altitude>12,500 MSL

10110₂ Posts
A few more loose ends

Comments...?

Web site reporting - what do we want to do here? - Luke Welsh helped me design the PrimeNet hourly status report layouts, particularly the ranges & counts columns (we did it over beers at the Faultline microbrewery in Silicon Valley). I think it will be fun to revisit what we want to see there.

Web site manual forms access - I've seen a few posts here and previously on the Mersenne list suggesting we limit hits by IP-count-per-hour or similar. Is there a quick & easy way to configure this? Otherwise, it's something we could more simply avoid by making a 'sandbox' table exposing a limited subset of exponents.

Web server layer? - We get free logging, connection handling & web hosting features by staying with a CGI or ISAPI interface, but we can also cut it from the loop this time and take sockets directly. I did CGI for PrimeNet v1-v4 - sometimes if the system service layer stopped on an undecidable rule, we'd have a few hundred pnHttp.exe processes pile up in memory... ISAPI would better mitigate that. I frequently checked the www logs as a debug tool showing URL hits against the other layers. I suppose I'm leaning toward keeping the web server layer status quo for continued flexibility.

Pushing GIMPS upgrades to opt-in machines? - It's possible to do this safely and reasonably easily, but does it make sense? It would amount to delivering a compressed, encrypted & signed DLL to the machines folks set up as opt-in for upgrades.

Open source client builds, trusted or open? - Do we propose handling this differently? Today, only folks holding a trusted key can build binaries the server will accept. If we keep it, the key & encoding method should be strengthened. If not, how do we manage providing open source without giving folks a direct door for malicious network access?
Old 2003-01-22, 22:39   #178
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

7,537 Posts

> Web site reporting - what do we want to do here?

This is "easy". There are many other DC projects out there to emulate and improve upon. The reason it should be easy is once we create the database layout I think we can get several volunteers to write a plethora of reports: Project-wide tables and graphs, Team & User standings (forever, last year, last month, overtake reports, by project (LL, dblchk, 10M tests, P-1, TF), recent activity), computers, etc.

> Web site manual forms access

A sandbox or per-hour limit or some such. We do need these forms.

We also need a "retrieve my password" form (with the password emailed to the user).

> Web server layer?

I'm not qualified to comment.

> Pushing GIMPS upgrades to opt-in machines?

Might be nice, might scare some people to death. I vote to ignore for now.

> Open source client builds, trusted or open? - Do we propose handling this differently?

Trusted builds could use a key to gain trusted access.

All open source builds could generate a random key at compile time. We can then limit activity on that key to a few transactions per day. To guard against a really bad guy creating thousands of executables or generating a new key every connect we can also limit the total non-trusted accesses per hour. This is still security through obscurity, but as good as I can come up with in an open source arena.
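A minimal sketch of the two limits described above: a per-key daily cap plus a cap on all non-trusted accesses per hour. The class name, the default limits, and the idea of passing the build key as a string are assumptions for illustration, not part of any existing PrimeNet code.

```python
import time
from collections import deque


class UntrustedKeyLimiter:
    """Caps transactions per build key per day and all untrusted traffic per hour."""

    def __init__(self, per_key_per_day=5, total_per_hour=1000, clock=time.time):
        self.per_key_per_day = per_key_per_day    # hypothetical default
        self.total_per_hour = total_per_hour      # hypothetical default
        self.clock = clock                        # injectable for testing
        self.by_key = {}                          # build key -> timestamps (last 24 h)
        self.all_untrusted = deque()              # timestamps of all accepted hits (last hour)

    @staticmethod
    def _expire(stamps, cutoff):
        # Drop timestamps that have fallen out of the window.
        while stamps and stamps[0] <= cutoff:
            stamps.popleft()

    def allow(self, build_key):
        now = self.clock()
        key_hits = self.by_key.get(build_key, deque())
        self._expire(key_hits, now - 86400)
        self._expire(self.all_untrusted, now - 3600)
        if len(key_hits) >= self.per_key_per_day:
            return False                          # this key used up its daily quota
        if len(self.all_untrusted) >= self.total_per_hour:
            return False                          # fresh keys do not bypass the global cap
        key_hits.append(now)
        self.by_key[build_key] = key_hits         # only accepted keys are remembered
        self.all_untrusted.append(now)
        return True
```

The global cap is what keeps a flood of freshly generated keys in check; the sketch never prunes keys that have gone quiet, which a real server would need to do.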

Another security idea: each reservation should come with a 32-bit key written to worktodo.ini. Without this key you cannot unreserve the exponent (or report a result and obtain cpu credit?). This is to fight malicious unreserving and poaching reserved exponents.
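Roughly, the server side of that idea could look like the sketch below. The `Reservations` class and its method names are hypothetical, the key is shown as 8 hex digits (32 bits), and the way the key would be written into worktodo.ini is deliberately left out.

```python
import secrets


class Reservations:
    """Tracks one 32-bit key per reserved exponent (illustrative only)."""

    def __init__(self):
        self._keys = {}  # exponent -> key as 8 hex digits

    def reserve(self, exponent):
        if exponent in self._keys:
            raise ValueError("exponent already reserved")
        key = secrets.token_hex(4)  # 4 random bytes = 32 bits
        self._keys[exponent] = key
        return key                  # the client would store this next to the assignment

    def is_reserved(self, exponent):
        return exponent in self._keys

    def unreserve(self, exponent, key):
        stored = self._keys.get(exponent)
        # Constant-time comparison; a wrong or missing key leaves the reservation alone.
        if stored is None or not secrets.compare_digest(stored, key):
            return False
        del self._keys[exponent]
        return True
```

The same check could gate result reporting, if we decide credit should require the key as well.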
Old 2003-01-22, 22:42   #179
QuintLeo
 
 
Oct 2002
Lost in the hills of Iowa

111000000₂ Posts

> > Not sure about other small projects. Database structure is going to be very mersenne-centric (or otherwise it will be fat and slow).

> The reservations and results will be Mersenne-centric. The teams / users / computers / stats databases should not

Look at the distributed.net stats code? It's stable, it seems pretty robust, and it DOES handle a lot more load than Mersenne is likely to ever need.

> Pushing GIMPS upgrades to opt-in machines?

As an opt-in, perhaps - but the client will need a LOT of testing before being put into the "push" queue if this is done.

> I like the team managed at server/web-site idea

Seems to be the "common" method, and it works.

> This is a good question for the users - configure everything in one place (the client) vs. two places (the client and the server).

Web-based *client* configuration is a PITA in my opinion - it was one of the things I disliked the most about United Devices. Client doesn't need to know about teams, just your user ID and your machine ID and what it's working on, IMO.

I suspect that a lot of new users won't have any interest in a team - those that do probably got recruited by an existing team, which should be easily capable of guiding them through the "join team" process....

9-)
Old 2003-01-23, 00:24   #180
asdf
 
 
Sep 2002

2²·3·5 Posts

Quote:
Originally Posted by Prime95
Another security idea: each reservation should come with a 32-bit key written to worktodo.ini. Without this key you cannot unreserve the exponent (or report a result and obtain cpu credit?). This is to fight malicious unreserving and poaching reserved exponents.
Extremely good idea! :D
Old 2003-01-23, 03:25   #181
aga
 
Oct 2002

20₁₆ Posts

Quote:
Originally Posted by Old man PrimeNet
ok. if we can mirror databases, shouldn't we be able to build snapshots from replayed transaction logs to run stats for a web server? How often and how large would a log file segment be? Can it be configured for hourly new log file segments that can be quickly post-processed (elsewhere) and then published on a web site (perhaps elsewhere again)?
It sounds easiest to deliver data from the core server(s) to slave stats server(s) using real-time MySQL replication. There is probably no need to get into the mess of batches, at least not with v5. MySQL replication can filter out data the stats servers don't need, so it does not seem to cause security problems in this particular case. (I assume here that at least one core server runs a private MySQL engine, so a GIMPS-only replication log is available.) If we don't filter the replication log, the stats servers will also act as backups (even failovers?) for the core servers, eliminating the need for explicit backups.

This will not work if we want to avoid keeping users' passwords on the stats servers (i.e. when someone logs in at a stats server, the stats server checks authorization with one of the core servers). The question is, do we need that? A public stats server could simply log every login it sees anyway. So wouldn't it be better to use only sufficiently trustworthy stats servers?

But if the core servers don't wipe out processed messages quickly, it might be possible to feed the stats servers with customized, highly compressed data batches (in fact, only result returns matter). In this particular case, I believe the bandwidth saved will not be worth the effort (and the extra disk space, and RAM for db caching, used on the core servers).

Quote:
Originally Posted by Old man PrimeNet
I also recommend we use a different base URL for each version of Prime95 v23+, different in DNS name, path or executable name
I suggest using different subdomains, not paths. Subdomains allow the most flexible routing of requests using DNS, whereas paths can be served by different servers only with a proxy installed, which is not too robust.

Also, it sounds more natural to base the subdomains on the protocol version, not the client version. First, if a new release of mprime/prime95 improves LL speed twofold :) but changes nothing else, spawning a new subdomain is strange. Second, if/when other PrimeNet-enabled software becomes available, it will be a whole mess to maintain all the versions of independent clients. Also, it will be virtually impossible to forget to change the domain name with a new release - if the protocol has changed, it will not work, that's all.

Quote:
Originally Posted by Old man PrimeNet
Security - How about using HTTPS with HTTP fall-back? Or did the visibility of the HTTP data help GIMPS stay low-profile/harmless? What about trusted source builds vs. public open source builds accessing the network?
HTTPS does not sound good. It makes CPU usage on the server noticeably higher, requires additional round trip(s) to establish a connection, takes additional RAM to track secure sessions (which are not needed at all), and triples network traffic. In addition, using HTTPS precludes transparent compression protocols like MNP5/V.42bis/ppp-deflate; thus modem users might experience up to 10x worse network performance, and with modem links that are usually overloaded, that might start causing trouble.

And for all that, it gives no gain. The usual use of HTTPS solves only two tasks: authenticating the web site (making sure there is no DNS spoofing) and preventing third parties from intercepting or altering traffic en route. The latter is not needed for GIMPS at all; the former is a very rare problem and is unlikely ever to affect GIMPS, except maybe a few internet users who use an untrusted DNS server.

Potentially, HTTPS could be used to authenticate users/computers; but for that we need a centralized means of distributing/signing keys. It's going to cause much more headache than it is worth.

HTTPS could be useful for distributing client-side automatic upgrades. But it sounds better to just cryptographically sign all binary modules/plugins, ship the public key with the client-side software, and then use insecure, inexpensive means of software distribution.
Old 2003-01-23, 04:17   #182
aga
 
Oct 2002

2⁵ Posts
Re: A few more loose ends

Quote:
Originally Posted by Old man PrimeNet
Web site reporting - what do we want to do here? - Luke Welsh helped me design the PrimeNet hourly status reports layouts, particularly the ranges & counts columns (we did it over beers at the Faultline microbrewery in Silicon Valley).
BTW, this report can easily be made live, assuming the stats servers receive a live stream of returned results (and unreserved exponents, too). What it boils down to is keeping a 15x300 integer array in a static object. When the stats server receives an event, it subtracts 1 from one cell and adds it to another, periodically checkpointing the array and the data stream to an external database. Then generating an HTML page with up-to-the-second numbers based on the array will be virtually as fast as serving a static page. Not that real-time numbers are really necessary, but it might be quite a bit easier to implement it that way instead of tracking the hourly boundaries.
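A minimal sketch of that in-memory counter array. What the two dimensions mean is an assumption here (15 count categories by 300 exponent ranges), all names are made up for illustration, and the periodic checkpoint to a database is left out.

```python
N_CATEGORIES = 15   # assumed: one row per count column of the status report
N_RANGES = 300      # assumed: one column per exponent range


class LiveStatus:
    """Keeps the status report counts in memory and updates them per event."""

    def __init__(self):
        self.counts = [[0] * N_RANGES for _ in range(N_CATEGORIES)]

    def load(self, category, range_idx, count):
        # Seed a cell, e.g. from the last checkpoint.
        self.counts[category][range_idx] = count

    def on_event(self, range_idx, old_category, new_category):
        # One exponent in this range changed state: move it between cells.
        self.counts[old_category][range_idx] -= 1
        self.counts[new_category][range_idx] += 1

    def total(self):
        return sum(sum(row) for row in self.counts)
```

Because every event only moves one unit between two cells, the grand total stays constant, which makes a cheap sanity check against the checkpointed copy.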

Quote:
Originally Posted by Old man PrimeNet
Web site manual forms access - I've seen a few posts here and previously on the Mersenne list suggesting we limit hits by IP-count-per-hour or similar. Is there a quick & easy way to configure this? Otherwise, it's something we could more simply avoid by making a 'sandbox' table exposing a limited subset of exponents.
I insist on using JSP/Servlets technology. Aside from the frequently mentioned benefits (excellent portability, great performance resulting from a powerful runtime (JIT) compiler (BTW the Java runtime includes an FFT-based math package, so it's trivial and fast to check in real time whether a submitted factor is correct, and the like)), all the tricks are trivial to implement. Here is the basic idea:

We define an application-wide attribute, a container object (hashmap based?). The very first action in processing a web request is querying the container for the number of accesses seen in the last hour (or last minute). If it exceeds the threshold, the user is turned away. Otherwise we add a new record to the hashtable and serve the request in the usual manner. Periodically (when the container becomes large, or just hasn't been maintained for a while), the container is scanned and records that are too old are wiped out. If that is not sufficient, records for IP addresses that made just a few accesses (the expected case) are removed too; IP addresses that generate noticeable traffic might have their records aggregated using small intervals.

Things like that will add just a few dozen CPU cycles to processing a web request. Which other web technology can provide similar flexibility (except maybe embedding application-level C code into Apache)?
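The container logic itself is language-neutral; here is a minimal sketch of it, written in Python for brevity rather than as a servlet. The class name, the default limits, and the sweep trigger are assumptions, and the aggregation of heavy hitters into small intervals is not shown.

```python
import time
from collections import deque


class IpThrottle:
    """Application-wide map of IP address -> recent access timestamps."""

    def __init__(self, limit=60, window=3600.0, max_entries=10000, clock=time.time):
        self.limit = limit                # hypothetical: accesses allowed per window
        self.window = window              # window length in seconds
        self.max_entries = max_entries    # container size that triggers a sweep
        self.clock = clock
        self.hits = {}

    def allow(self, ip):
        now = self.clock()
        stamps = self.hits.setdefault(ip, deque())
        while stamps and stamps[0] <= now - self.window:
            stamps.popleft()              # forget accesses older than the window
        if len(stamps) >= self.limit:
            return False                  # over the threshold: turn the request away
        stamps.append(now)
        if len(self.hits) > self.max_entries:
            self.sweep(now)               # container grew large: wipe stale records
        return True

    def sweep(self, now):
        cutoff = now - self.window
        for ip in list(self.hits):
            stamps = self.hits[ip]
            while stamps and stamps[0] <= cutoff:
                stamps.popleft()
            if not stamps:
                del self.hits[ip]
```

In a servlet container the same object would live as an application-scope attribute and be consulted at the top of request handling.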

Quote:
Originally Posted by Old man PrimeNet
Web server layer? - We get free logging, connection handling & web hosting features by staying with a CGI or ISAPI interface, but we can also cut it from the loop this time and take sockets directly.
Let's use server-side Java instead. Its performance is so good that things like coding a custom HTTP implementation are not needed at all, even for the highest-traffic tasks.

Also, JSP/Servlets can very easily be mapped (deployed) so as to emulate CGI, to support the old clients.

Quote:
Originally Posted by Old man PrimeNet
Open source client builds, trusted or open? - Do we propose handling this differently? Today, only folks holding a trusted key can build binaries the server will accept. If we keep it, the key & encoding method should be strengthened. If not, how do we manage providing open source without giving folks a direct door for malicious network access?
Why not allow new client build keys to be registered easily at the server? (I don't mean allowing it for everyone in the world; this should be coordinated, and there should be a strict limit on how many keys an independent developer can have hosted at the server.) But I'm still not sure it's really necessary.
Old 2003-01-23, 04:53   #183
aga
 
Oct 2002

2⁵ Posts

Quote:
Originally Posted by Prime95
Another security idea: each reservation should come with a 32-bit key written to worktodo.ini. Without this key you cannot unreserve the exponent (or report a result and obtain cpu credit?). This is to fight malicious unreserving and poaching reserved exponents.
Sounds good. But maybe it's sufficient to accept an exponent result only if it is submitted by the user (or user+computer?) who was assigned the exponent?
Old 2003-01-23, 05:31   #184
aga
 
Oct 2002

2⁵ Posts
Re: A few more loose ends

Quote:
Originally Posted by Old man PrimeNet
how do we manage providing open source without giving folks a direct door for malicious network access?
Basically, if GIMPS:

- starts doing 3 or 4 (or 5? or 6? hmm) tests for each exponent;
- accepts results only from user (user+computer?) work was assigned to;
- assigns exponents so that each exponent is tested at least once by some user who has a long-standing participation history and thus can be trusted enough;
- has zero tolerance for fake results, and if one is submitted then the results, stats, and the account itself are promptly wiped from the database (making allowance for hardware failures that went unnoticed, though);
- makes test results public only after all 3 (4, 5) tests have been made independently, to ensure that one cannot steal a correct result found by someone else

All this combined should be pretty safe (at least, assuming that over 50% of participants do not cheat). The question is, will such a painful step toward 100% open software increase GIMPS speed notwithstanding the extra tests done? At the moment it does not look so at all; but who knows, it might change eventually, and when that happens we can go into details on this.
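To make the acceptance rule concrete, here is a minimal sketch of how the server could decide when an exponent's result is settled and may be published. The function name, the tuple layout, and the use of short hex strings as residues are assumptions for illustration; account wiping and assignment checks are out of scope.

```python
from collections import Counter


def agreed_residue(results, required=3, trusted_users=frozenset()):
    """Return the residue to publish, or None if the exponent is not settled yet.

    results: list of (user, residue) pairs reported for one exponent.
    """
    by_user = {}
    for user, residue in results:
        by_user.setdefault(user, residue)   # each user counts once (first report wins)
    if not by_user:
        return None
    residue, votes = Counter(by_user.values()).most_common(1)[0]
    if votes < required:
        return None                         # not enough independent matching tests
    backers = [u for u, r in by_user.items() if r == residue]
    if not any(u in trusted_users for u in backers):
        return None                         # no long-standing participant confirmed it
    return residue
```

Until this returns a value, nothing about the exponent's residue would be shown publicly.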

On the other hand, adopting the practice that residues become public only after the double-check completes might be necessary right now. Currently, the PrimeNet server masks the residues, but with every database synchronization they go into George's database, and there the residues are not masked. There might be quite a few exponents there that are considered double-checked but where in fact the LL test has been run only once (or even zero times, with lower probability).
Old 2003-01-23, 18:47   #185
Paulie
 
 
Aug 2002

223 Posts
Re: A few more loose ends

Quote:
Originally Posted by aga
- accepts results only from user (user+computer?) work was assigned to;
Ouch, that would hurt intrateam trading. If I have a computer die and won't be able to bring it back up, I would wish those exponents to be finished by a team member.

I really like the idea of adding a "security code" to an assignment. Something like an MD5/SHA of a date/time+random seed+M(xxx) should work well.
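Something along these lines, as a rough sketch: SHA-256 stands in for "MD5/SHA", and the function name and the way the three inputs are joined are made up for illustration. Since the seed is random, the server would have to store the resulting code with the assignment rather than recompute it.

```python
import hashlib
import secrets
from datetime import datetime, timezone


def assignment_code(exponent, when=None, seed=None):
    """Hash of date/time + random seed + M(exponent), returned as hex."""
    when = when or datetime.now(timezone.utc)
    seed = seed if seed is not None else secrets.token_bytes(16)
    material = when.isoformat().encode() + seed + f"M({exponent})".encode()
    return hashlib.sha256(material).hexdigest()
```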

One minor problem with that would be if I had a machine die, and I don't have the security code, I couldn't give that assignment to someone else. Maybe allow for the "owner(s)" of a Team to allow reassignment to a particular user within the team. Either that, or I see a team keeping a DB of assignments and security codes. :)

That's an idea. If a client is a member of a team, have a couple of options:

1) An option to get assignments from the Server and/or from the Team Pool. If no exponents are available in the team pool (and the option is checked), then get some from the server and assign them to the team and the user.

2) Allow for a "Team Reassignment" checkbox/option. This way the team owners could move work from rogue borgs or dead machines to other people, but only for those who want this feature. Then, if the client comes back and checks in again, it would see a status flag when trying to update that says "team reassigned", so it would dump that work assignment. If you don't have allow team reassignment checked, then the team owner can't do anything with it, and the exponent expires normally.

This would help the poaching issue, as well as allow for mass assignments of exponents (say for a Gauntlet or other competition with a block of released exponents given to the team via the server).
Old 2003-01-23, 19:23   #186
aga
 
Oct 2002

40₈ Posts
Re: A few more loose ends

Quote:
Originally Posted by Paulie
Quote:
Originally Posted by aga
- accepts results only from user (user+computer?) work was assigned to;
Ouch, that would hurt intrateam trading. If I have a computer die and won't be able to bring it back up, I would wish those exponents to be finished by a team member.
If that's the only computer the person possesses, he/she might have problems giving the checkpoint file to someone else. If there is no checkpoint, just let the exponent time out. There is an infinite supply of exponents, so other team members will not starve because a few exponents timed out.

If that's not the only computer, why not just move the checkpoint to another computer?

If it is really, really necessary to move the checkpoint file to a different user's computer, I suggest moving it together with the ini files and installing a new copy of the client-side software in order to finish the run.

Do computers die that often?

After all, is it really worth involving human labor in moving exponents around? (Isn't it better to spend the time earning a few extra bucks and then install an additional computer to run GIMPS software? That would be a much better contribution of time to GIMPS, even if it takes years to purchase the new computer. It would also provide financial support for CPU vendor(s) to further optimize and speed up CPUs.)

Am I missing something?
Old 2003-01-23, 20:12   #187
Maybeso
 
 
Aug 2002
Portland, OR USA

2·137 Posts
Re: A few more loose ends

Quote:
Originally Posted by Old man PrimeNet
Web site manual forms access - I've seen a few posts here and previously on the Mersenne list suggesting we limit hits by IP-count-per-hour or similar. Is there a quick & easy way to configure this? Otherwise, it's something we could more simply avoid by making a 'sandbox' table exposing a limited subset of exponents.
Many of the suggestions for this involve some decision making and then checking hits against a threshold.

I think we should compare the threshold to a rolling average. This could get messy -- time or memory consuming. But we could fine tune it by increasing or decreasing the rolling window and the update rate. It could even be self adjusting based on resource consumption.

We could also keep an IP list for each user (hidden/encrypted?), with the threshold linked to the user, not the IP. The threshold would start low and go up with the user's contributions. That way teams and farms or labs would automatically earn a higher threshold, and attacks with multiple IPs/machines/userIds would get choked off early. The attacker would have to contribute to the cause to increase his threshold!
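A rough sketch of how the two pieces could fit together: a rolling average of hits per hour, compared against a threshold that grows with the user's returned results. Every name and number here (base threshold, growth per result, cap, window length) is an assumption picked for illustration.

```python
from collections import deque


def user_threshold(results_returned, base=10, per_result=2, cap=1000):
    """Allowed average hits per hour; starts low and rises with contributions."""
    return min(cap, base + per_result * results_returned)


class RollingRate:
    """Hits per hour, averaged over the last `window_hours` hourly buckets."""

    def __init__(self, window_hours=6):
        self.buckets = deque([0] * window_hours, maxlen=window_hours)

    def hit(self):
        self.buckets[-1] += 1      # count a hit in the current hour

    def next_hour(self):
        self.buckets.append(0)     # oldest bucket drops off automatically

    def average(self):
        return sum(self.buckets) / len(self.buckets)


def over_limit(rate, results_returned):
    return rate.average() > user_threshold(results_returned)
```

Widening or narrowing `window_hours` is the tuning knob for the rolling window; the update rate would be however often `next_hour` (or a finer-grained equivalent) is called.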

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.