In my opinion, just use a dedicated or colocated server with a Tier 1 provider instead of running a server on cable/DSL. Cable and DSL lines are not stable enough for 'mission critical' server use and cost more than colocating. Getting a dedicated connection (e.g. ATM) is too costly.
Since the usage isn't all that high, I am pretty sure you can get away with just a 50GB/month plan, which I believe doesn't cost all that much even with a good ISP. |
why not use distributed.net
i didn't see if this discussion came up, so forgive me if i repeat other people.
why not "transfer" this project to distributed.net? dnet has just finished rc5-64, which some people on this board have called a "waste of cpu". dnet says they're looking for new projects and have to release a new client anyhow. sure, gimps runs only on x86 now, but adding in the old lucas code for other architectures would be easy (and the others would eventually be optimized). dnet has tons of clients, is an established tax-free org (needed if gimps wins the $100k anyhow), has a world-wide network of proxies, has a master server, a dba, some web guys, lots of donated bandwidth, gigs of scsi drives, and a statistics system. it has a lot of the work done already, so why not ask for their help? two established projects merging for the better of both. the only downside that i see is the loss of "ownership" that the small gimps community has now. plus the prize money of $100k is a bit more than the RSA $10k, so the disbursement of the prize money might have to change, but george can negotiate with dnet for the optimal solution. opinions? -j |
Re: why not use distributed.net
[quote="veggiespam"]why not "transfer" this project to distributed.net?[/quote]
1. Culture clash. People running d.net generally don't have the patience to run LL tests taking weeks or months to complete; and most people from that crowd would be interested in the prize, but not so much in doing initial (or double) checks on exponents below 30M.
2. Architecture. The problems which d.net was designed to attack have lots of tiny blocks which are simply "done" or "not done". Computers are assigned many blocks at a time; and they inform the server when they are done, but don't pass any answer back.
3. Independence. This discussion arose partly out of a desire to make GIMPS independent of Entropia; going from being dependent upon Entropia to being dependent upon United Devices wouldn't solve that issue. (Yes, d.net is officially a separate organization; but with all their major people employed by United Devices, that's rather a sham IMHO.)
While the above problems aren't insurmountable, I think they surpass the difficulty of setting up a separate server to coordinate GIMPS. |
Re: why not use distributed.net
[quote]1. Culture clash. People running d.net generally don't have the patience to run LL tests taking weeks or months to complete; and most people from that crowd would be interested in the prize, but not so much in doing initial (or double) checks on exponents below 30M.[/quote]
with dnet, you can turn off projects that you do not wish to run. so those without patience would turn off mprime. if it took 5 years to solve the last problem, i don't think taking 3 months to complete a single "block" would be a leap in many people's minds - the novelty of checking your stats every day wears off quickly. as for initial/double checks, dnet double checks now by sending the block to different people to prevent stat-inflating fraud. it just hides this fact from the user. so, instead of actually telling users that they're doing a check, just tell them they're doing a value. the point of this checking for mprime is to see if a bad cpu said "not-prime" when in fact it was prime. so a double check still has the ability to win the prize. it is just a matter of convincing users or just not telling them all the information.
[quote]2. Architecture. The problems which d.net was designed to attack have lots of tiny blocks which are simply "done" or "not done". Computers are assigned many blocks at a time; and they inform the server when they are done, but don't pass any answer back.[/quote]
this is simply not true. dnet has many projects, one of which is OGR (what it is: [url]http://www.distributed.net/ogr/[/url]). with OGR, you basically take a permutation of a bunch of numbers, record the sum, permute again, record again. in the end, you save the lowest value for your "branch" and send the results back.
[quote]3. Independence. This discussion arose partly out of a desire to make GIMPS independent of Entropia ...[/quote]
which arose due to a lack of reliability on entropia's servers. dnet's stat servers seem to have load problems once in a while, but stats are not essential. proxies may go down, but your client will just try another proxy. if the master goes down, you'll never notice, since you're connecting to a proxy.
[quote]While the above problems aren't insurmountable, I think they surpass the difficulty of setting up a separate server to coordinate GIMPS.[/quote]
we're talking about setting up a whole infrastructure here: coordination between the dba, the web guy, a new network code layer, the server client, and the people we politely beg to give us free space inside of the hosting site. this sounds like more work. if gimps does not want to "transfer" or ask for dnet's help, then let us turn it down for the right reasons. i'm willing to help with an independent (non-dnet, non-entropia) gimps project, but i do think asking for dnet's help is the best avenue. -j |
Re: why not use distributed.net
[quote="veggiespam"]with dnet, you can turn off projects that you do not wish to run. so, those with out patience, would turn off mprime.[/quote]
Yes, but they'd probably turn it off *after* being assigned an exponent to test.
[quote]as for initial/double checks, dnet double checks now by sending the block to different people to prevent stat inflating fraud. it just hides this fact from the user. so, instead of actually telling users that they're doing a check, just tell them they're doing a value. the point of this checking for mprime is to see if a bad cpu said "not-prime" when in fact is was prime. so a double check has the ability to win the prize still. it is just a matter of convincing users or just not telling them all the information.[/quote]
First, I don't think d.net does double checks; if they do, that's something I wasn't aware of. Second, the odds of a double-check finding a prime are rather absurdly low. Third, I don't think "not telling them all the information" is a workable strategy when you're asking people to donate their spare cycles.
[quote]OGR[/quote]
I hadn't forgotten about OGR, but I had forgotten that they get useful (non-boolean) results back.
[quote]we're talking about setting a whole infrastructure here, coordination between the dba, the web guy, a new network code layer, the server client, and the people we politely beg to give us free space inside of the hosting site. this sounds like more work.[/quote]
I think people are overestimating the difficulty of setting up a server for something like this. When I ran PiHex, I had a "server" consisting of 200 lines of C running on my home PC -- a windows 95 box connected to the internet via my cable modem. And for a recent paper (Computational investigations of the Prouhet-Tarry-Escott problem) I had 300 machines querying a 50 line script which I was running in my university-provided webspace. While I don't suggest anything quite so crude for GIMPS, the point remains that distributed computing -- especially on such coarse-grained problems as GIMPS -- is *not* a hard thing to automate. |
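To give a feel for how small the core of such a coordinator can be, here is a minimal in-memory sketch in Python. It is not the PiHex server, the script from the paper, or PrimeNet; the class name `Coordinator`, its methods, and the residue string are all assumptions made up for illustration, and a real server would also need persistence, expiry of stale assignments, and authentication.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Coordinator:
    """Hypothetical coarse-grained work server: one exponent per request."""
    pending: list[int]                                  # exponents not yet handed out
    assigned: dict[int, str] = field(default_factory=dict)   # exponent -> client id
    results: dict[int, tuple[str, str]] = field(default_factory=dict)

    def request_work(self, client_id: str) -> Optional[int]:
        """Hand the next pending exponent to a client, or None if none are left."""
        if not self.pending:
            return None
        exponent = self.pending.pop(0)
        self.assigned[exponent] = client_id
        return exponent

    def report_result(self, client_id: str, exponent: int, residue: str) -> bool:
        """Accept a result only from the client the exponent was assigned to."""
        if self.assigned.get(exponent) != client_id:
            return False
        del self.assigned[exponent]
        self.results[exponent] = (client_id, residue)
        return True

    def release(self, exponent: int) -> None:
        """Return an abandoned assignment to the front of the pending queue."""
        if exponent in self.assigned:
            del self.assigned[exponent]
            self.pending.insert(0, exponent)
```

Because each unit of work runs for weeks, the server is only touched at assignment and at completion, which is why the request rate stays low even with many clients.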
As ebx (and Mr T) said: [b]We need a plan![/b]
I would like to suggest that we start by designing proxies. Teams/individuals could set up proxies that buffer work-to-do and results when primenet is down and communicate when primenet is up. These proxies could pass all the authentication information back and forth so that we would still have accountability. This would require a slight change to primenet to recognize and authenticate the proxies so that they could pass info for multiple machines. As far as checking out blocks, they could initially be checked out to the proxy as a machine, and then transferred to the individual machine by the proxy, which would tell primenet what it had done.
A second, parallel effort would be to mirror the stats, either on George's site or on this one. Primenet could forward info to this new site, either as it comes in or periodically in batches. Everyone would be encouraged to use the mirror site, reducing the load on primenet.
Once these two initiatives are complete, we could re-examine the possibility of moving. We would be in a much better position to do so, if we so chose. Even if we stayed, we would still be in a much better position. Joe. |
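The buffering half of the proxy idea can be sketched in a few lines of Python. This is only an illustration under assumptions: `BufferingProxy`, the message fields, and the `send_upstream` callback are hypothetical names, and nothing here reflects the actual primenet protocol. The callback stands in for "try to talk to primenet" and returns False when the server is unreachable.

```python
from collections import deque
from typing import Callable


class BufferingProxy:
    """Hypothetical proxy that queues client results while upstream is down."""

    def __init__(self, send_upstream: Callable[[dict], bool]):
        self.send_upstream = send_upstream   # returns True if upstream accepted it
        self.outbox: deque[dict] = deque()   # messages waiting to be forwarded

    def submit(self, machine_id: str, credentials: str, payload: str) -> None:
        """Accept a result from a client machine and try to forward it."""
        # The machine's own credentials travel with the message, so the
        # upstream server can still attribute the work to that machine.
        self.outbox.append(
            {"machine_id": machine_id, "credentials": credentials, "payload": payload}
        )
        self.flush()

    def flush(self) -> int:
        """Forward queued messages in order; stop at the first failure."""
        sent = 0
        while self.outbox:
            if not self.send_upstream(self.outbox[0]):
                break
            self.outbox.popleft()
            sent += 1
        return sent
```

Messages are removed from the queue only after upstream accepts them, so an outage delays results but does not lose them (as long as the proxy itself stays up; a real proxy would write the queue to disk).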
Unlike most other DC projects, GIMPS is to a certain extent dominated by universities. I think we should appeal to one of those learned institutions for help.
Apart from that, implementing cache/proxy server functionality is smart. This project should be able to run on a medium-bandwidth network. 58 or even 200MB a day is not all that much. I believe the problem is the server(s). The project needs to be built on a high-performance platform, especially if we are to introduce more advanced server-supplied statistics. I could personally host a site like that from a network bandwidth point of view, but I am uncertain that I have a server with the necessary performance available. We need more data on what it is doing.
Meanwhile I think it would be a good idea to collect, without prejudice, all ideas for future functionality. This is important, because this functionality is part of the reason people choose a project - client stability is another. After making up a long and impossible list we can calmly remove the ones that are either too expensive or too difficult to implement. That is the time someone can make a reasonable guesstimate about real server requirements and also real network requirements. As far as I know there are already a few on the table. So let me start by listing them. Feel free to add as many as you can, nothing is stupid at this point.
1. Linux
2. Apache
3. PostgreSQL (or similar)
4. PHP (or similar)
5. Teams with members having their own userids
6. P-1 as a separate work type
7. Better charts and graphs
8. Better synchronization
9. Security (network - machine - os etc)
10. Queue, cache or proxy server functionality
11. Bandwidth
12. Separate (own) server(s) - architecturally probably better with 2, web and db
13. More up-to-date stats (PrimeNet)
14. Graphical front-end to the client
15. Better support for team stats - see TPR as an example
16. Maybe 2 server locations (master and mirror) for availability reasons
17. DB server with lots of memory - performance issue
18. RAID, SCSI and all the paraphernalia of a high-availability server solution
19. More stats (what do people want here, what is useful - list please)
20. ?
PS: Remember we are dealing with 30,000-40,000 users and hopefully growing. This is not a system you run on a win95 box :rolleyes: Alf |
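For a sense of scale, the 58MB and 200MB daily figures can be converted to an average line rate and a monthly total. The short Python calculation below assumes decimal megabytes, a 30-day month, and traffic spread evenly over the day; real traffic is bursty, so peaks will be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_kbit_per_s(megabytes_per_day: float) -> float:
    """Average rate in kbit/s, assuming 1 MB = 1,000,000 bytes."""
    bits_per_day = megabytes_per_day * 1_000_000 * 8
    return bits_per_day / SECONDS_PER_DAY / 1000


def gigabytes_per_month(megabytes_per_day: float, days: int = 30) -> float:
    """Monthly volume in GB, assuming 1 GB = 1000 MB."""
    return megabytes_per_day * days / 1000


for mb in (58, 200):
    print(
        f"{mb} MB/day is about {average_kbit_per_s(mb):.1f} kbit/s on average, "
        f"{gigabytes_per_month(mb):.2f} GB/month"
    )
```

Under those assumptions, 58MB/day averages roughly 5.4 kbit/s (about 1.74GB a month) and 200MB/day roughly 18.5 kbit/s (about 6GB a month), both well inside the 50GB/month plan mentioned earlier in the thread.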
[quote="Prime Monster"]19. More stats (what does people want here, what is useful - list please)[/quote]
Ability to sort status report and cleared exponents report by various criteria. For example, here's a list of ones I run regularly that I'd love to see automated:
Exponent status
- sort by days run
- sort by days to expired
- sort by account ID
- sort by assignment type (ie, F, D, D*, DF, *, " ")
Cleared exponents
- sort by date returned
- sort by factor (for factored exponents)
- sort by account ID
By extension, the ability to sort by -any- column in either of the reports, either ascending or descending, would be nice. Even better would be the ability for the user to mix and match sorts, combined with the ability to specify ranges. For example, here's a sort I do daily on the exponent status list:
- Find all exponents less than 7.1M
- Merge with all exponents less than 12M whose assignment type is * or " "
- Sort by days to expired
This gives me a quick snapshot of which small exponents are expiring each day. Another one I do occasionally is a cross-check between the cleared exponents list and a slightly older version of the exponent status list, looking for recently-cleared exponents in which the cleared report account ID doesn't match the status report account ID. Good for finding poachers. |
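The daily sort and the poacher cross-check described above could look roughly like this in Python. The record layout is an assumption: `Assignment` and its field names are invented for the sketch and do not mirror the real PrimeNet report columns, and the cleared list is reduced to (exponent, account ID) pairs.

```python
from dataclasses import dataclass


@dataclass
class Assignment:
    """One row of a hypothetical exponent status report."""
    exponent: int
    account_id: str
    assignment_type: str   # e.g. "F", "D", "D*", "DF", "*", " "
    days_run: int
    days_to_expire: int


def expiring_small_exponents(rows: list[Assignment]) -> list[Assignment]:
    """Exponents below 7.1M, plus those below 12M of type '*' or ' ',
    sorted by days to expire (soonest first)."""
    selected = [
        r for r in rows
        if r.exponent < 7_100_000
        or (r.exponent < 12_000_000 and r.assignment_type in ("*", " "))
    ]
    return sorted(selected, key=lambda r: r.days_to_expire)


def possible_poaches(
    older_status: list[Assignment], cleared: list[tuple[int, str]]
) -> list[tuple[int, str, str]]:
    """Cleared exponents whose reporting account differs from the account
    that held the assignment in the older status report."""
    owner = {r.exponent: r.account_id for r in older_status}
    return [
        (exponent, owner[exponent], account)
        for exponent, account in cleared
        if exponent in owner and owner[exponent] != account
    ]
```

Writing the two range conditions as a single `or` gives the merged set without duplicates, since every row is tested exactly once.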
Since there is a time-to-market issue and we will never be able to release a perfect server on the first shot, it is important for us to classify all the requirements. What are the must-haves, what are the highly desired, and what are the nice-to-haves? I say phase 1 is a prototype to prove ideas. Phase 2 is at least as good as the current server and starts serving the community. Phase 3 is high flying.
The platform is the easiest to decide, but we need to find a way to fund it. Next comes the hosting. That can, again, be phased in. We don't need a big pipe for the development or even the beta stages. The server code is the most time-consuming. If we are sure where we are heading, it would be good to set up a battlebase (CVS, mailing list, etc.) so we can hand out work items just like gimps itself. The 58MB daily traffic surprised me greatly. I was afraid that I alone would generate this much. :D |
Would someone want to set up a sourceforge account, or should it be hosted privately?
Also, what about asking Team Prime Rib to donate some of their stats code for a baseline or starting point? Andrew |
[quote="adpowers"]Would someone want to set up a sourceforge account, or should it be hosted privately?[/quote]
I guess not. This is different in the sense that not everyone would want to run the code on his/her basement machines. Will sourceforge host non-GPL projects? A description/diagram of the current server would save us a lot of effort. |