mersenneforum.org  

2005-12-14, 21:35   #23
ltd
Apr 2003
2²×193 Posts

After testing the virtual server, which should take place in the next two weeks, I will move the llrnet server over to it. The outage should last an hour at most, and nothing needs to change on the client side because I will reroute the "no-ip" domains to the new server. First, though, I have to test the procedure with dummy data that I still need to create, to make sure everything works without problems.

The next step will need some knowledge of PHP/HTML/Apache and, I think, Linux as well.

I also want to set up the database on the new server, which will allow us to implement the features listed below in the future.
As I am a beginner in PHP, I have no clue so far how to implement my ideas and keep the setup secure. At the moment everything runs on one PC that nobody from the outside world can reach, so I have not done anything to secure access.

The structure I want to implement first needs the following things:
1. MySQL as RDBMS
2. Public HTML pages (NO DB access)
3. PHP scripts that run every hour/day
These scripts generate the pages under 2 and handle some other internal things
4. Non-public HTML pages that make it possible to run special PHP scripts
for special checks, generating special reservation files, ...
These pages should never be visible to the public
5. A directory structure that stores public files such as workunit files to be reserved (PHP must be able to write to it)
6. A directory structure for non-public files (such as returned residues). These files must also be visible to PHP but never to the public
7. The PHP scripts need to access some files from the llrnet server.
So again access to a non-public directory, but this time outside the Apache scope.
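
Points 2 and 3 of the list above (static public pages, regenerated on a schedule so the web server itself never needs DB access) can be sketched roughly as follows. This is only an illustration in Python, with SQLite standing in for the planned PHP scripts and MySQL; the `results` table, its `username` column and all function names are hypothetical and not taken from the project.

```python
import html
import os
import sqlite3
import tempfile


def render_stats_page(rows):
    """Return a static HTML page listing (user, test count) pairs."""
    body = "\n".join(
        f"<tr><td>{html.escape(str(user))}</td><td>{int(tests)}</td></tr>"
        for user, tests in rows
    )
    return (
        "<html><body><table>\n"
        "<tr><th>User</th><th>Tests</th></tr>\n"
        f"{body}\n"
        "</table></body></html>\n"
    )


def publish(path, content):
    """Write to a temp file in the target directory, then rename it over the old page."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(content)
    # mkstemp creates the file readable by its owner only; public pages
    # must be readable by the web server user as well.
    os.chmod(tmp, 0o644)
    os.replace(tmp, path)


def generate(conn, path):
    """Query the (hypothetical) results table and publish the stats page."""
    rows = conn.execute(
        "SELECT username, COUNT(*) FROM results GROUP BY username ORDER BY username"
    ).fetchall()
    publish(path, render_stats_page(rows))
```

A scheduler such as cron would call `generate` once per hour or day. Writing to a temporary file and renaming it means visitors never see a half-written page.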

Now to the plans for the future:

1. Some pages that allow partial access to the DB, making it possible to create new teams or join an existing team.
This means that we will have to implement some authentication method.
2. Pages that allow returning sieve results via a web interface instead of by mail/forum.
3. Pages that allow returning PRP results via a web interface instead of by mail/forum.

I will need all the help I can get to set up the Apache server the right way, so that there are no security problems and no way to damage the DB from the outside world.

The first thing I have to find out is how to implement the job automation (running PHP scripts automatically).
Is there a way to do it with the Apache server, or
will I have to set up some cron jobs?

I am open to every suggestion, and also to additional ideas about what the server should do.

Lars

Last fiddled with by ltd on 2005-12-14 at 21:38

2005-12-14, 22:18   #24
magnav0x
Oct 2005
20₁₀ Posts

Quote:
Originally Posted by ltd
1. MySQL as RDBMS

Just set up MySQL with localhost access only, plus a cron job that backs up the database and transfers the copy to a remote machine for security and integrity (just in case). No need for a full RDBMS.

Quote:
2. Public HTML pages (NO DB access)
3. PHP scripts that run every hour/day
These scripts generate the pages under 2 and handle some other internal things

Easily done by setting up the PHP scripts in crontab, e.g. PHP files that generate stats pages, or whatever else you want, hourly or daily. You're a bit vague here, so I'm not sure exactly what you want.

Quote:
4. Non-public HTML pages that make it possible to run special PHP scripts
for special checks, generating special reservation files, ...
These pages should never be visible to the public

Are you saying you want a page that only the project admins, say, have access to, where they can do these special checks and file generations?

Quote:
5. A directory structure that stores public files such as workunit files to be reserved (PHP must be able to write to it)

Easily done. I'm assuming the PHP scripts that need write access to this directory will be for uploading ranges? Again, a tad vague.

Quote:
6. A directory structure for non-public files (such as returned residues). These files must also be visible to PHP but never to the public

Returned results could be sent to any place in the file system you like. PHP can access them. You could make them all accessible from a central admin page for returned results.

Quote:
7. The PHP scripts need to access some files from the llrnet server.
So again access to a non-public directory, but this time outside the Apache scope.

PHP can do this; there is no need to configure anything special in Apache, assuming the user Apache runs as has permission to access the given folder, such as /opt.

Quote:
Now to the plans for the future:

1. Some pages that allow partial access to the DB, making it possible to create new teams or join an existing team.
This means that we will have to implement some authentication method.

Easy enough; I just need to know what sort of verification you would like. E-mail verification?

Quote:
2. Pages that allow returning sieve results via a web interface instead of by mail/forum.
3. Pages that allow returning PRP results via a web interface instead of by mail/forum.

Both of these can be done with PHP.

Quote:
The first thing I have to find out is how to implement the job automation (running PHP scripts automatically).
Is there a way to do it with the Apache server, or
will I have to set up some cron jobs?

Just write your PHP scripts and run them from a cron job, e.g. "php /somefolder/somescript.php", as often as you like. I assume you mean scheduling a PHP script to run every so often. You will need the PHP CLI installed, but that is no big deal.

Last fiddled with by magnav0x on 2005-12-14 at 22:19

2005-12-14, 22:40   #25
ltd
Apr 2003
2²×193 Posts

Quote:
2. Public HTML pages (NO DB access)
3. PHP scripts that run every hour/day
These scripts generate the pages under 2 and handle some other internal things

Easily done by setting up the PHP scripts in crontab, e.g. PHP files that generate stats pages, or whatever else you want, hourly or daily. You're a bit vague here, so I'm not sure exactly what you want.

You are right. This should implement the stats creation and also the re-creation of the workunit files after factors are submitted.

Quote:
Are you saying you want a page that only the project admins, say, have access to, where they can do these special checks and file generations?

That's exactly what is needed.

Quote:
Easily done. I'm assuming the PHP scripts that need write access to this directory will be for uploading ranges? Again, a tad vague.

You are very good at guessing what I mean. Again, that's what is needed.
The post was only meant to give a short overview of my plans, not to describe everything in detail. If you are interested, I can write something down that describes in detail what I have at the moment and what I think I will need.

It looks like most of my problems come simply from not knowing the Apache config (allow/deny access)
and from not knowing Linux permissions well enough. But let's see it this way: it's a good way to learn something new.

For the authentication, I think it should be possible to change your personal setup online after logging in. That should cover creating/joining/leaving a team and returning some results.

Lars

2005-12-14, 23:20   #26
magnav0x
Oct 2005
2²·5 Posts

Quote:
Originally Posted by ltd
You are right. This should implement the stats creation and also the re-creation of the workunit files after factors are submitted.

That's exactly what is needed.

You are very good at guessing what I mean. Again, that's what is needed.
The post was only meant to give a short overview of my plans, not to describe everything in detail. If you are interested, I can write something down that describes in detail what I have at the moment and what I think I will need.

It looks like most of my problems come simply from not knowing the Apache config (allow/deny access)
and from not knowing Linux permissions well enough. But let's see it this way: it's a good way to learn something new.

For the authentication, I think it should be possible to change your personal setup online after logging in. That should cover creating/joining/leaving a team and returning some results.

Lars

Lars,

OK, I understand those questions now. I'd love to read in detail everything you want done. Post it here for others and/or e-mail me and we will discuss it further. I'd love to help you out. I'm no Linux genius, but I know my way around well enough. Projects like these are always a good way to teach yourself. I learned PHP/SQL about 2 years ago when I wanted stats for Chessbrain (a distributed computing project) but no one had any, so I decided to make my own. It took a while, but I found my way over the hurdles and learned quite a bit about PHP and SQL. Now I'm doing it for a living, and I'm still learning new things every day. I can help you with just about any aspect of what you want above; just let me know and I'll roll up my sleeves and dive in with you.

The hardest thing about helping is knowing how your database is or will be structured. Typically I base code on how the database is laid out. Are you going to do everything from scratch (the database and all), or do you want to keep the current database structure and build around it? Sometimes it's better to start from scratch (redesign), but it depends on whether or not the current setup limits expansion in any way.

Last fiddled with by magnav0x on 2005-12-14 at 23:25

2005-12-15, 08:06   #27
ltd
Apr 2003
2²×193 Posts

The description of the structures I built is a bit long for the forum, so I will mail it to you when the update is ready. (It is always the same: if you are the only programmer on a project, there is no up-to-date documentation.)

As for the structure of the DB, I am not sure we should create a completely new one. The original structure has its weak points, but I am not sure that the load of the additional joins another structure would need makes it the better choice.
For example, the result table holds the status of the k/n pair and also the test information (residue, factor, and the user IDs of the contributors who sent in the results).
The alternative, in my opinion, would be a result table holding only a numeric key, the k/n pair and some status information (sieved, first PRP test, second PRP test), plus tables storing the test results for each type, using the numeric key as a foreign key.
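
The split-table alternative described above could look roughly like this. It is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL; all table and column names, the status codes, and the choice to keep both PRP passes in one table with a `test_no` column are assumptions for illustration, not the project's actual schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE candidate (
    id      INTEGER PRIMARY KEY,          -- the numeric key
    kvalue  INTEGER NOT NULL,
    nvalue  INTEGER NOT NULL,
    status  INTEGER NOT NULL DEFAULT 0,   -- hypothetical: 0 open, 1 sieved, 2 first PRP, 3 second PRP
    UNIQUE (kvalue, nvalue)
);
CREATE TABLE sieve_result (
    candidate_id INTEGER NOT NULL REFERENCES candidate(id),
    factor       TEXT    NOT NULL,
    userid       INTEGER NOT NULL
);
CREATE TABLE prp_result (
    candidate_id INTEGER NOT NULL REFERENCES candidate(id),
    test_no      INTEGER NOT NULL,        -- 1 = first test, 2 = second test
    residue      TEXT    NOT NULL,
    userid       INTEGER NOT NULL
);
"""


def open_db():
    """Create an in-memory database with the sketched schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def residues_for(conn, kvalue, nvalue):
    """Return (test_no, residue) rows for one k/n pair.

    This is the kind of extra join the split layout requires.
    """
    return conn.execute(
        "SELECT p.test_no, p.residue FROM candidate AS c "
        "JOIN prp_result AS p ON p.candidate_id = c.id "
        "WHERE c.kvalue = ? AND c.nvalue = ? ORDER BY p.test_no",
        (kvalue, nvalue),
    ).fetchall()
```

The status table stays narrow, so scans for open work touch less data, while residues and factors are only read when a specific pair is looked up.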

Lars

2005-12-15, 08:41   #28
magnav0x
Oct 2005
2²×5 Posts

Quote:
Originally Posted by ltd
The description of the structures I built is a bit long for the forum, so I will mail it to you when the update is ready. (It is always the same: if you are the only programmer on a project, there is no up-to-date documentation.)

As for the structure of the DB, I am not sure we should create a completely new one. The original structure has its weak points, but I am not sure that the load of the additional joins another structure would need makes it the better choice.
For example, the result table holds the status of the k/n pair and also the test information (residue, factor, and the user IDs of the contributors who sent in the results).
The alternative, in my opinion, would be a result table holding only a numeric key, the k/n pair and some status information (sieved, first PRP test, second PRP test), plus tables storing the test results for each type, using the numeric key as a foreign key.

Lars

Yes, some table joins will help, and so will specifying correct data types (such as using 'tinyint' for a field that will only hold small numbers). More important for a large database is indexing all primary keys and any fields used in join conditions or 'WHERE' clauses. However, indexing will drastically increase hard drive space usage on very large databases (such as ones with millions of rows). I do this in every stats project where I track daily stats for all users over a 30-day period (especially if the project has thousands of users). Indexing can be added at any time after everything else is in place, though. Joins will not increase server load too much. How many rows do you have in your current database?

2005-12-15, 11:56   #29
OmbooHankvald
May 2005
Copenhagen, Denmark
172 Posts

Quote:
Originally Posted by ltd
3. Pages that allow returning PRP results via a web interface instead of by mail/forum.

Maybe you should consider using something like what the 12121 Search does.

Quote:
Originally Posted by OmbooHankvald from 321 thread
3) The 12121 and 2721 have a brilliant way of reserving numbers! But wouldn't it be possible for someone to just submit a fake lresults.txt and destroy the whole thing?

Quote:
Originally Posted by justinsane answer
You are absolutely correct. However, there is a lot more to the reservation system than meets the eye. For example, any webmaster searching any k could feed it a sieve file of any size, and it will automatically be parsed into chunks of any given size, sorted into the reservation system and made available to the public with the click of a button. It is pretty nice when I have limited time for making new work available. I have offered this to a few other groups running similar searches, but was met with little or no desire to use this approach. And to answer your question, there is a multifold response:

1. There is absolutely nothing to gain by doing this, except for maybe a few laughs.
2. It would take a reasonable amount of time to generate a fake lresults.txt file that looks real enough to even be accepted by this parser. (Doable, yes, but worth it?)
3. All submitted results are stored on two different servers and are all subject to double checking and/or verification if (and sometimes if not) flagged as a possible fake.

Maybe you could use the info.
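
The chunking step mentioned in the quoted answer (a sieve file of any size cut into reservation chunks of a given size) can be sketched in a few lines of Python. The actual 12121 Search parser is not shown in the thread; the function name and the assumption of one candidate per line with no header line are purely illustrative.

```python
def split_into_chunks(lines, chunk_size):
    """Split candidate lines into consecutive chunks of at most chunk_size lines.

    Assumes one candidate per line; blank lines are skipped. Any header
    line a real sieve file may carry is not handled here.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    candidates = [line.strip() for line in lines if line.strip()]
    return [
        candidates[start:start + chunk_size]
        for start in range(0, len(candidates), chunk_size)
    ]
```

Each returned chunk could then be written to its own workunit file in the public reservation directory.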

OH

2005-12-15, 13:30   #30
ltd
Apr 2003
1404₈ Posts

The master table has 3653513 entries. That is not that much. Thanks to the indexes, most of the queries are quite fast.
The only problems are a few requests that cannot make good use of the indexes.

For example:
Code:
select kvalue,min(nvalue) as nmin from testresult
  where distributionstate<30 and prptest1=0 and sievestatus=0
  group by kvalue
For each k, this finds the lowest untested n for which no factor has been found and no first PRP test has been returned.
Primary key: kvalue, nvalue
Secondary keys: distributionstate, sievestatus and prptest1
EXPLAIN shows that the sievestatus index is used.

Lars

Last fiddled with by ltd on 2005-12-15 at 13:31

2005-12-15, 14:22   #31
ltd
Apr 2003
772₁₀ Posts

Oh, by the way, I forgot to write:
The columns use the smallest possible data types, such as tinyint or mediumint.
Only columns like the factor column use variable-length types (text).

Lars

Edit: I did some more tests with a different index structure using a combined index. Now the response time is much better. I had used that index in the beginning but had to drop it because of bad timings. In the meantime I restructured the calculations that had forced me to drop the combined index,
and I forgot to give it another try.

Now the timing and resource load of the DB are good again. I think there will be no problem anymore running that beast on a virtual server.
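
A minimal sketch of a combined index for the query from post #30, using Python's sqlite3 module as a stand-in for MySQL. The thread does not say which columns the combined index contains or in what order, so the column order in `idx_open_work` (the two equality columns first, then the range column) is an assumption, as are the index and function names. Whether a given engine actually picks the index has to be checked with its own EXPLAIN output.

```python
import sqlite3

# The query from post #30, unchanged apart from formatting.
LOWEST_OPEN_N = """
SELECT kvalue, MIN(nvalue) AS nmin FROM testresult
WHERE distributionstate < 30 AND prptest1 = 0 AND sievestatus = 0
GROUP BY kvalue
"""


def build_db(rows):
    """Create testresult with a combined index and load (k, n, dist, prp1, sieve) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE testresult ("
        " kvalue INTEGER NOT NULL, nvalue INTEGER NOT NULL,"
        " distributionstate INTEGER NOT NULL, prptest1 INTEGER NOT NULL,"
        " sievestatus INTEGER NOT NULL, PRIMARY KEY (kvalue, nvalue))"
    )
    # Assumed column order: equality columns, then the range column, then
    # the selected columns so the index alone can answer the query.
    conn.execute(
        "CREATE INDEX idx_open_work ON testresult"
        " (sievestatus, prptest1, distributionstate, kvalue, nvalue)"
    )
    conn.executemany("INSERT INTO testresult VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def lowest_open_n(conn):
    """Run the query and return (kvalue, nmin) pairs sorted by kvalue."""
    return conn.execute(LOWEST_OPEN_N + " ORDER BY kvalue").fetchall()


def query_plan(conn):
    """Return the engine's plan description lines for the query."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + LOWEST_OPEN_N)]
```

Printing `query_plan(conn)` plays the same role as MySQL's EXPLAIN in the post above: it shows which index, if any, the engine chose.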

Last fiddled with by ltd on 2005-12-15 at 14:32

2005-12-15, 14:56   #32
ltd
Apr 2003
2²·193 Posts

@magnav0x:

Can you recommend some good literature or a web page about MySQL performance tuning?
I tried a few things in the last few minutes and got some unexpected results.
For example, I found that "distributionstate in (0,10,20)" is faster than "distributionstate<30". In Oracle this makes no difference; there is even a chance that the latter is faster there.

So I need to read up on the special behaviours of MySQL to find additional performance gains.
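
One caveat about the rewrite above: the two predicates only select the same rows if the state codes below 30 are exactly 0, 10 and 20, which the post implies but does not spell out. The small check below illustrates this with Python's sqlite3 module; it says nothing about MySQL timings, and the helper name and the single-column table are made up for the example.

```python
import sqlite3

RANGE_FORM = "distributionstate < 30"
LIST_FORM = "distributionstate IN (0, 10, 20)"


def count_matching(states, predicate):
    """Count rows whose distributionstate satisfies a trusted, constant SQL predicate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE testresult (distributionstate INTEGER NOT NULL)")
    conn.executemany("INSERT INTO testresult VALUES (?)", [(s,) for s in states])
    # The predicate is pasted into the statement, so it must never come
    # from user input; here it is one of the two constants above.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM testresult WHERE " + predicate
    ).fetchone()
    conn.close()
    return count
```

With states drawn only from multiples of ten, both forms count the same rows; as soon as another value below 30 (say 5) appears, the list form silently misses it.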

Lars

2005-12-15, 19:09   #33
magnav0x
Oct 2005
2²·5 Posts

Quote:
Originally Posted by ltd
@magnav0x:

Can you recommend some good literature or a web page about MySQL performance tuning?
I tried a few things in the last few minutes and got some unexpected results.
For example, I found that "distributionstate in (0,10,20)" is faster than "distributionstate<30". In Oracle this makes no difference; there is even a chance that the latter is faster there.

So I need to read up on the special behaviours of MySQL to find additional performance gains.

Lars

You know, I've never used anything other than MySQL, so I don't know anything about Oracle. I will try to dig up some eBooks for you and send them your way via e-mail.

Chris

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.