mersenneforum.org  

Go Back   mersenneforum.org > Prime Search Projects > No Prime Left Behind

2009-05-03, 09:42   #12
IronBits
I ♥ BOINC!
 
Oct 2002
Glendale, AZ. (USA)

1113₁₀ Posts

No objections from me at all. All the data needs to be in the database.

Every offline work record will be flagged, so they can be excluded from the online processing stats reports.

Your project's bread and butter is getting folks to participate in some automated fashion.
They can run the client, get work from a server, return work, and see their name in the online stats system set up for such things.

There will be a different stats system that will show combined offline+online work completed. simple...
2009-05-03, 11:55   #13
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

After composing my post, I reread IronBits's reply right above mine and realized it's basically the exact same idea, so...I'll post this anyway.
Perhaps offline work could be marked as such when it's entered, so that we can keep the database as one but have three different stats displays: Both online and offline, online only, and offline only.
I'm not sure how it would be marked exactly, perhaps just make a new data field to mark it as offline and enter the results so that they'll all be marked as offline.
I do agree that we should hold off on importing manual stats until May is over or until we have something like what I said so that there's no jump in the online stats to throw the Free-DCers off.
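To make the "one database, three displays" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, the column names (including the `is_offline` flag), and the sample rows are all made up for illustration; the real NPLB schema is not shown in this thread and may look quite different.

```python
import sqlite3

# Hypothetical schema: table, columns, and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results ("
    "username TEXT, k INTEGER, n INTEGER, score REAL, "
    "is_offline INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?, ?)",
    [
        ("alice", 301, 600001, 10.0, 0),  # online result
        ("alice", 303, 600005, 12.0, 1),  # offline (manual) result
        ("bob", 305, 600011, 8.0, 0),     # online result
    ],
)

# One WHERE clause per display; all data stays in a single table.
FILTERS = {
    "combined": "1 = 1",
    "online": "is_offline = 0",
    "offline": "is_offline = 1",
}


def score_totals(conn, display):
    """Return {username: total score} for one of the three displays."""
    where = FILTERS[display]  # raises KeyError for an unknown display name
    rows = conn.execute(
        f"SELECT username, SUM(score) FROM results "
        f"WHERE {where} GROUP BY username ORDER BY username"
    )
    return dict(rows.fetchall())
```

Calling `score_totals(conn, "online")` leaves out the flagged rows, so importing offline work would not move the online figures.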
2009-05-03, 13:54   #14
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

3·2,083 Posts

Quote:
Originally Posted by Mini-Geek
Perhaps offline work could be marked as such when it's entered, so that we can keep the database as one but have three different stats displays: Both online and offline, online only, and offline only.
I'm not sure how it would be marked exactly, perhaps just make a new data field to mark it as offline and enter the results so that they'll all be marked as offline.
I do agree that we should hold off on importing manual stats until May is over or until we have something like what I said so that there's no jump in the online stats to throw the Free-DCers off.
yet again

Regarding how to differentiate the two stats, how about this: Three columns would replace the current score column on the k/n pair stats pages. One for manual results, one for automatic results, and one for combined. The combined one, since it's the best and most total representation of the project's work, would be the one which it's sorted to by default, and the one exported to stats systems such as Free-DC's.

How does this sound to you guys? I agree that it would be nice to allow users to track their manual and automatic results separately, but nonetheless, I feel it is important to have the combined total as the primary stats metric, since as stated above, it is the most complete representation of the work done by the project.

Any fallout from "jipped" users at Free-DC could, as Gary suggested, be mitigated by holding off the import for 3 months. However, I must say, I think 3 months is a wee bit more than is necessary--Free-DC will be done with their push at the end of May. How about we start the imports mid-June? That way, Free-DC will have racked up enough points to somewhat counter ROLP's boost from the inclusion of manual results, and we'll be doing it at a somewhat non-crucial time for the project at large.

Max
2009-05-03, 14:16   #15
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

Quote:
Originally Posted by mdettweiler
Regarding how to differentiate the two stats, how about this: Three columns would replace the current score column on the k/n pair stats pages. One for manual results, one for automatic results, and one for combined. The combined one, since it's the best and most total representation of the project's work, would be the one which it's sorted to by default, and the one exported to stats systems such as Free-DC's.

How does this sound to you guys? I agree that it would be nice to allow users to track their manual and automatic results separately, but nonetheless, I feel it is important to have the combined total as the primary stats metric, since as stated above, it is the most complete representation of the work done by the project.
I agree that the combined total should be the default, but I think it would be best to have three separate tables on the stats site. After all, why stop at having three columns for only the score and not the number of k/n pairs or prime score or number of primes? To keep it in one table would lack available information and/or be confusing, in my opinion.
Quote:
Originally Posted by mdettweiler
Any fallout from "jipped" users at Free-DC could, as Gary suggested, be mitigated by holding off the import for 3 months. However, I must say, I think 3 months is a wee bit more than is necessary--Free-DC will be done with their push at the end of May. How about we start the imports mid-June? That way, Free-DC will have racked up enough points to somewhat counter ROLP's boost from the inclusion of manual results, and we'll be doing it at a somewhat non-crucial time for the project at large.
I think that first all the manual files should be prepared and the DB and stats pages prepared to handle manual results. I honestly don't know how long that will take. If it's still before June 8-ish when all that is ready, then I say we wait until June 8-ish so that Free-DC can finish their month-long BBQ and have some time to look at all the stats before we add the manual results and throw them off (unless they look at the online result table). Or maybe add the stuff as soon as it's ready but default to the online results until June 8-ish.

Last fiddled with by Mini-Geek on 2009-05-03 at 14:22 Reason: added last sentence
2009-05-03, 14:35   #16
gd_barnes
 
May 2007
Kansas; USA

2·5·1,009 Posts

Good, I'm glad to see that everyone is OK with things here.

Doing it sooner is fine if Free-DC agrees. I don't think we'd need 3 different score fields as that would make the page a little too busy IMHO. Although I'll leave that up to the more techie folks to decide. What I was thinking is one score field, which defaults to the total score of manual/online stats combined. But at the top of the page there could be links that allow only manual or online stats to be displayed. Of course this would require that each result have an online/manual field internally stored in it. This would work the same for total k/n pairs and primes as it would for k/n pairs score and primes score.

On another note; there is one thing and one person that we're all forgetting in all of this: The huge amount of manual work done by Beyond on the 1st three drives, which is the starting point for our manual file import.

David and other Free-DCers: When we import the manual files for the 1st thru 3rd drives, I THINK Free-DC will be the one to benefit the most! With Beyond and Carlos combined doing a HUGE amount of manual work on those drives long before you got here, I believe Free-DC will get a much larger boost than ROLP.

Where you might complain is when we start to import the individual-k drive results. Even then, though, it's very possible that ROLP's score boost from that combined with Free-DC's boost from the first 3 drives will only be moderately more. In other words, Free-DC folks have been with us from the very beginning crunching away even if some of you came in later.

Max, if you have the time to do an analysis of just the pairs manually processed by team with the files that I sent you, that might help confirm that.

David, either way, we'll hold off on importing them. If Free-DC has more from those 1st three drives, we don't want to make it easier for you guys to catch us. If we have more, we don't want to irritate the Free-DCers by making it more difficult. Therefore, at least for May anyway, things will stay as they are while we get things ready for a future import "behind the scenes".

AMDave and Bok; here you probably thought you were done with the stats programming. Alas, the improvements never stop!


Gary

Last fiddled with by gd_barnes on 2009-05-03 at 14:36
2009-05-03, 14:52   #17
AMDave
 
Jan 2006
deep in a while-loop

292₁₆ Posts

Heh.

I don't think that integrating the manual work will be difficult as long as the formats are consistent.

The manual work will be easy to discern in the database because there is no server ID or port#, however I will add a detail field to discriminate between them.

Producing separate and combined figures is also simple enough.
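As a rough illustration of that rule, the sketch below labels a result as manual when it carries neither a server ID nor a port, then produces separate and combined totals. The `Result` class and its field names are hypothetical and only stand in for whatever the real database records look like.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    # Hypothetical record layout; field names are assumptions.
    user: str
    score: float
    server_id: Optional[str] = None
    port: Optional[int] = None


def source_of(result):
    """Label a result 'manual' when it has neither a server ID nor a port."""
    if result.server_id is None and result.port is None:
        return "manual"
    return "online"


def figures(results):
    """Return separate manual/online totals plus their combined total."""
    totals = {"manual": 0.0, "online": 0.0}
    for r in results:
        totals[source_of(r)] += r.score
    totals["combined"] = totals["manual"] + totals["online"]
    return totals
```

Storing the label in an explicit field, as proposed above, avoids re-deriving it from missing values every time a report runs.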

I am glad for the delay, however, as my current focus is on migrating the stats site and integrating it with the new forum site. Work has commenced on this and is progressing well.
2009-05-03, 16:30   #18
IronBits
I ♥ BOINC!
 
Oct 2002
Glendale, AZ. (USA)

3·7·53 Posts

You're all missing the point about offline and online.
You want folks to join, you want to make it easy for them, you want them to have stats to encourage more participation.
Most new participants will know nothing about the project, they just want points.

We created a system (some parts not fully automated yet) where they could download the client, edit the config file and choose which project they want to crunch, connect to a server, receive and send work, see their name in the stats, and soon, be able to pick their team, if they have registered on the NPLB Forum.
The online clients are slower than the offline clients.
Now users can compare their online participation output with other team mates and be competitive with other online members and teams.

Offline is done mostly by veterans that know how to do manual reservations.
Set up the client, add the work, and manually manage all the computers and cores.
Not friendly to folks who manage large numbers of computers/cores.
Must zip and upload/email results and report when work is completed.
The offline clients are much faster than the online clients, giving them an unfair advantage.

The information needs to be put into the database to make project management easier and complete, and to ensure accurate processing of all k/n pairs so nothing is missed.
This allows the database to become a very useful tool.

The stats system we set up is designed around the online processing of the work.

The three column approach is fine, but, it should be sorted by the online totals.
Those folks that have done offline work need to have a way to compare with the offline folks, and a grand total column that combines both online+offline totals for the veterans to keep track of.

Think of a sporting event, track for example. You have 50 and 100 yard dashes.
Each has 1st, 2nd, and 3rd place winners.
The races are different, however, they all are just running...

Import the work as quickly as you can, as soon as you can, it's for the administrative side of the project and needs to be done.

Those that have done the offline crunching are patient veterans, the stats for the offline folks can be developed for it later and ignored for the current online crunching members.
2009-05-03, 21:48   #19
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

3·2,083 Posts

@IB: well, first of all, the manual client is not "much faster" than the online client. It's only about 6-10% faster, which is definitely a boost, but not really enough to make much of a noticeable difference in someone's stats. They're similar enough that the comparison of different race lengths you used is not quite accurate; in reality it's more like a 1000 meter run vs. 990 meters. Not much difference. That's why I was thinking they should be counted together as the primary metric--especially considering that down the road, we'll likely be transitioning our LLRnet servers into PRPnet servers anyway, which are just as fast as manual LLR.

Here's how I was thinking I would set it up: I would send you guys CSV files just like are automatically imported for LLRnet, except that servercode would be "MN" (for manual) and port would be 0. Dave, as soon as you can get around to setting the stats pages to screen out all results from servercode MN (at least until we can get some real pages set up for the manual work), then we can start importing the data any time and it won't affect the displayed stats until we're ready to switch it on.
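A minimal sketch of that screening step, assuming a CSV with a header row: only the "MN" servercode and port 0 come from the post above. The column names, the "G1" servercode, port 8000, and the sample rows are invented for illustration, since the actual LLRnet export layout is not given here.

```python
import csv
import io

MANUAL_SERVERCODE = "MN"  # from the post above; everything else is assumed

# Hypothetical CSV layout; the real LLRnet import format may differ.
SAMPLE = """user,servercode,port,k,n
alice,G1,8000,301,600001
bob,MN,0,303,600005
"""


def load_results(text, include_manual=False):
    """Parse CSV text and, by default, screen out manual (MN) results."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if include_manual:
        return rows
    return [r for r in rows if r["servercode"] != MANUAL_SERVERCODE]
```

With `include_manual` left at False the displayed stats stay online-only, and switching it on later would need no re-import.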

As for how to display it, it sounds like users on the whole are more in favor of having separate pages, rather than separate columns on the same page, for displaying manual vs. automatic counts. How about this:

-If you click on the link for k/n pairs stats by user or team, you get a page displaying the combined manual+automatic stats. As stated earlier, I believe the stats are so similar that this should be the best and least discriminatory metric to export to Free-DC and other stats sites.

-Links on the top of the combined stats page would go to pages that are laid out the same way, but they only display manual or automatic results, respectively.

A similar system would be set up for the prime stats.
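One way the three pages could share a single code path is sketched below: the same ranking function serves the combined page by default and the manual-only or automatic-only pages on request. The data layout and names are hypothetical, not how the stats site actually works.

```python
from collections import defaultdict

# Hypothetical results: (user, source, k/n pairs processed)
RESULTS = [
    ("alice", "automatic", 120),
    ("alice", "manual", 30),
    ("bob", "automatic", 140),
]


def ranking(results, view="combined"):
    """Rank users by pairs processed for the chosen view (combined by default)."""
    if view not in ("combined", "manual", "automatic"):
        raise ValueError(f"unknown view: {view}")
    totals = defaultdict(int)
    for user, source, pairs in results:
        if view == "combined" or source == view:
            totals[user] += pairs
    # Highest total first, matching a stats page sorted by score.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Changing which page is the default would then only mean changing the default value of `view`.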

Guys, does this sound like a workable solution?

Max
2009-05-03, 22:22   #20
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

Sounds fine to me. We might want to have the links temporarily default to online-only if this is prepared before Free-DC finishes their month-long rally. (I feel like I've said that before)
2009-05-03, 22:30   #21
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

3×2,083 Posts

Quote:
Originally Posted by Mini-Geek
Sounds fine to me. We might want to have the links temporarily default to online-only if this is prepared before Free-DC finishes their month-long rally. (I feel like I've said that before)
Yes--actually, I was thinking that we could simply not have the website even display any manual results until Free-DC's done with their rally. Though, come to think of it, your suggestion would probably be somewhat easier to implement, not to mention more versatile.
2009-05-04, 00:02   #22
gd_barnes
 
May 2007
Kansas; USA

2·5·1,009 Posts

Quote:
Originally Posted by IronBits
The three column approach is fine, but, it should be sorted by the online totals.
Those folks that have done offline work need to have a way to compare with the offline folks, and a grand total column that combines both online+offline totals for the veterans to keep track of.

Those that have done the offline crunching are patient veterans, the stats for the offline folks can be developed for it later and ignored for the current online crunching members.
Quote:
Originally Posted by mdettweiler
As for how to display it, it sounds like users on the whole are more in favor of having separate pages, rather than separate columns on the same page, for displaying manual vs. automatic counts. How about this:

-If you click on the link for k/n pairs stats by user or team, you get a page displaying the combined manual+automatic stats. As stated earlier, I believe the stats are so similar that this should be the best and least discriminatory metric to export to Free-DC and other stats sites.

-Links on the top of the combined stats page would go to pages that are laid out the same way, but they only display manual or automatic results, respectively.

Max

Max, I'm not sure you got to the above part of David's post. He is saying that he thinks that the stats should be sorted by the online stats as a default. You are saying (well, implying) that they should be sorted by the combined stats as a default...two opposite things.

I could go either way.

Here's another solution: On the main menu on the left side, have 3 selections: One for online stats, one for offline stats, and one for combined stats. When people come into the main stats "area", don't default to anything. Set it up so that people can choose their own method of stats display and competing.

Guys, isn't that what it's all about? Choosing your own method of competing against others? I don't have a snowball's chance in heck of competing with the Beyond's or Benson's of the world on top-5000 score but I can try to compete on # of primes (well, at least against Beyond anyway, lol), so that's what I use. For NPLB, it makes sense for us to use all other non-BOINC projects to compete against. To compete with BOINC projects, we'd have to have too many non-math types that know nothing about what we are doing; which most here agree would not be very fun.

I personally like this "non default" method the best. That way everyone can compete in a manner that they would like.

David, whatever we do won't be implemented until well after the end of May. We've already agreed with that. Also, keep in mind that Free-DC will likely benefit the most from any manual stats import from our 1st 3 drives, which are our first area that we will be concentrating on.

That said, I do agree that perhaps we are missing the point so let me say this: I now hear you! Many of the people that are coming to us from Free-DC have huge #'s of resources and would not want the extremely time-consuming task of manual reservations and posting or Emailing of results. So combining manual and online stats as a default would not sit well with them.

I hope a "non default" method of stats display will work best in the future. That way, no one will feel like we are forcing our method of competing on them.

Guys, does this sound fair? AMDave, could such a method of stats display be implemented? Not right away of course. I'm only asking if it is reasonably feasible to do something like that, perhaps sometime in mid-to-late June.


Gary

Last fiddled with by gd_barnes on 2009-05-04 at 00:04

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.