Old 2010-04-17, 10:49   #14
A Sunny Moo
mdettweiler

Originally Posted by gd_barnes
Why are we making this so extremely difficult? I have now stepped in and recommended that we dispense with this entire reloading of every result that we have into the DB.

The intent was that this be a manual-results import, not an all-results import, which would take months.

If some stuff got overwritten, that is very bad news and shows that we did not properly analyze the situation ahead of time. If that is what happened, is there a way that we can restore them?

There should be very few "unknown" results. Perhaps no more than 50-100 on the entire project. For manual results, we should know who did all of them and it's just a matter of looking it up in the 1st post of each drive. The only ones that should end up entered in the DB as "unknown" are where the server might have somehow missed 1-2 of them and either Max, Karsten, or I ran the pairs manually and put them in the file that we keep so that they matched up with the original sieve file.

Max, all results for the entire 5th thru 10th drives and mini-drive are now on Jeepford. Please analyze which ones of them were done manually, associate who did them and when they were done, and load only those into the DB. For each drive, the manual loading should go fairly quickly. Since this project started the 5th drive, 95-98% of all results have been done by the servers. The lion's share of the manual stats import is coming from the 1st thru 3rd drives but even those were largely done by servers.

Going forward, the only drives that should take a little while to load into the DB are the fully manual ones; that is, the individual-k and mini drives. Everything else after the 3rd drive should go very fast.

In the future, before loading any manual pairs into the DB, I want to review what is being loaded.

Let me clarify: nothing has been overwritten. All that happened was that some rather large chunks of work that had never been loaded the first time around were imported just now under "Unknown", thereby uncovering a gaping hole in our DB: an almost 40K LLRnet range from the early 3rd Drive was missing entirely. Since Karsten has all the results on file, all we have to do is "upsert" Karsten's files into the DB (as Dave termed it) so that the "real" results replace the "Unknown"s where applicable. Easy peasy.
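
For anyone wondering what an "upsert" looks like in practice, here is a rough sketch. It is not Dave's actual import code; the `results` table, its columns, the user names, and the `upsert_results` helper are all made up for illustration, and it uses SQLite purely because it is easy to run. The idea is simply: insert a pair if it is missing, and overwrite an existing row only when it is still attributed to "Unknown".

```python
import sqlite3

def upsert_results(conn, rows):
    """Hypothetical helper: add missing (k, n) pairs and replace
    placeholder rows, leaving already-attributed rows untouched."""
    cur = conn.cursor()
    for k, n, user, residue in rows:
        # Add the pair if it is not in the table at all.
        cur.execute(
            "INSERT OR IGNORE INTO results (k, n, user, residue) "
            "VALUES (?, ?, ?, ?)",
            (k, n, user, residue),
        )
        # If the pair was already there as a placeholder, fill in the
        # real attribution. Rows with a known user do not match.
        cur.execute(
            "UPDATE results SET user = ?, residue = ? "
            "WHERE k = ? AND n = ? AND user = 'Unknown'",
            (user, residue, k, n),
        )
    conn.commit()

# Assumed schema: one row per (k, n) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results ("
    "k INTEGER, n INTEGER, user TEXT, residue TEXT, "
    "PRIMARY KEY (k, n))"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        (301, 400001, "Unknown", "A1"),  # placeholder from the bulk import
        (303, 400005, "user_a", "B2"),   # already correctly attributed
    ],
)

# Stand-in for the results kept on file.
upsert_results(
    conn,
    [
        (301, 400001, "user_b", "A1"),  # replaces the "Unknown" row
        (303, 400005, "user_c", "B2"),  # ignored: row already has a user
        (305, 400009, "user_d", "C3"),  # missing pair, inserted
    ],
)

final = conn.execute("SELECT k, n, user FROM results ORDER BY k").fetchall()
print(final)
```

The point of doing it in two statements is that a row with a real user name is never touched, so running the same file through twice should be harmless.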

Rest assured, though: I will not try to import the LLRnet ranges along with the manual ones for the later drives. It was only the 1st, 2nd, and 3rd that I was worried about, since a lot about our servers was in a state of flux at that time and there was a very high probability that errors had been made and gone unnoticed in the years since. Indeed, that's what we turned up here. By the time of the later drives, though, the server process had been cleaned up to run essentially like clockwork, so there shouldn't be anything significant missing from there on out, and therefore no need to "re-import" any further LLRnet ranges.

BTW, regarding the prime issue which we uncovered: having the "Unknown" results imported like this actually revealed a bug in Dave's duplicate-screening process that would otherwise have gone unnoticed. Now a fix is on the way.

Meanwhile, per Dave's suggestion I'm splitting off all posts related to this to a separate thread since they don't really belong in the News thread.