mersenneforum.org  

2003-09-22, 20:04   #1
GP2
 
Sep 2003

2²×3×5×43 Posts
Many username mismatches between database and Primenet

As already mentioned in the Duplicate userids thread, the Sept 1 LUCAS_V.TXT file showed all entries for "GW" (George himself) replaced by the mysterious account id "S122546".

In the next (Sept 9) version of this data file, the entries were changed back to GW, presumably manually by George.


However, further investigation shows numerous instances where Primenet's list of cleared exponents (cleared.txt, updated hourly) and the data file (LUCAS_V.TXT) disagree on the user name for a returned result.


For instance,

cleared.txt
(Sept 22 00:00 with fixed-width changed to comma-separated fields)

Code:
9975863,65,D ,0x974857CF7BDC15__,20-Sep-03 09:45,TempleU-CAS,FL-SLE

LUCAS_V.TXT
(Sept 21 data)

Code:
9975863,TempleU-DI,FL-SLE,WY5,974857CF7BDC15A8,7195223,00000000

In general, there are a bunch of results where

- cleared.txt says "TempleU-CAS" but LUCAS_V.TXT says "TempleU-DI"
- cleared.txt says "PaJaSoft" but LUCAS_V.TXT says "S112790"
- cleared.txt says "haveland" but LUCAS_V.TXT says "S104597"
- etc. (many more examples)

for the same result that was returned.
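
The comparison above can be sketched in a few lines of Python. This is only an illustration, not the script actually used: the field positions are guessed from the two sample lines shown (cleared.txt: exponent, ?, ?, residue, date, userid, computer id; LUCAS_V.TXT: exponent, userid, computer id, ?, residue, ?, ?), and the function names are made up. Since cleared.txt appears to mask the last two hex digits of the residue with "__", the sketch matches on exponent plus residue prefix.

```python
def parse_cleared(line):
    # Assumed layout, inferred from the sample line only.
    f = [x.strip() for x in line.split(",")]
    residue = f[3][2:] if f[3].startswith("0x") else f[3]
    residue = residue.rstrip("_")  # drop the masked trailing digits
    return {"exponent": int(f[0]), "residue": residue,
            "user": f[5], "computer": f[6]}


def parse_lucas_v(line):
    # Assumed layout, inferred from the sample line only.
    f = [x.strip() for x in line.split(",")]
    return {"exponent": int(f[0]), "user": f[1],
            "computer": f[2], "residue": f[4]}


def user_mismatches(cleared_lines, lucas_lines):
    """Return (exponent, cleared_user, lucas_user, computer) for results
    that match on exponent and residue prefix but differ in user name."""
    by_exponent = {}
    for line in lucas_lines:
        rec = parse_lucas_v(line)
        by_exponent.setdefault(rec["exponent"], []).append(rec)

    found = []
    for line in cleared_lines:
        c = parse_cleared(line)
        for v in by_exponent.get(c["exponent"], []):
            same_result = v["residue"].startswith(c["residue"])
            if same_result and c["user"] != v["user"]:
                found.append((c["exponent"], c["user"], v["user"], c["computer"]))
    return found


cleared = ["9975863,65,D ,0x974857CF7BDC15__,20-Sep-03 09:45,TempleU-CAS,FL-SLE"]
lucas_v = ["9975863,TempleU-DI,FL-SLE,WY5,974857CF7BDC15A8,7195223,00000000"]
print(user_mismatches(cleared, lucas_v))
```

Run on the two sample lines, this reports the TempleU-CAS / TempleU-DI disagreement for exponent 9975863.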

To add to the confusion, there are a few cases where cleared.txt also says "TempleU-DI" or "S112790" or "S104597". Usually these are associated with only one or a few specific machines: S112790/texi-router and S104597/meron, for instance, are the only machines for those account ids in cleared.txt.


Looking in HRF5.TXT (list of users) we find:

S112790,Ing. Pavel PaJaSoft Janousek
S104597,Andrew Haveland-Robinson

but no entries for account ids "PaJaSoft" or "haveland".

It seems unlikely that either of these users knowingly switched from a meaningful username to an obscure numerical account_id (or that George temporarily did the same). So this must have happened accidentally somehow.

Note there are numerous other examples for other account ids. PaJaSoft and haveland are just given as two frequent examples.

Finally, as mentioned in the User name change not reflected in BAD file thread, the "BAD" file still contains mostly the old meaningful account ids (PaJaSoft, haveland, etc.) while the LUCAS_V.TXT file contains the new numerical account ids (S112790, S104597). Note, though, that BAD does contain 2 entries for S112790 (and 58 for PaJaSoft).


So in summary, the file LUCAS_V.TXT probably needs to be cleaned up by reassigning the right user names.

But also, does anyone know how these apparently accidental inconsistencies happen and how they can be prevented in the future?
2003-09-22, 23:29   #2
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

17×421 Posts

When prime95 contacts the server to submit results, it sends the userid & password. If prime95 gets the response "bad password", then it asks for a new userid to submit the results. The original theory was that having the results recorded was of paramount importance. Somehow there are rare times when a valid userid/password is sent and the bad password error is returned anyway. That's what happened to me.

Now when this happens, the server thinks these are two separate users using two separate userids. Unless the user merges the two accounts the server will forever treat them as separate users. However, the primenet database is different than my database. My results processing program notices the two userids have the same user name and email address. It then assumes these two accounts should be merged and randomly picks one of the two userids. All past results are transferred to the picked userid.
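
For readers following along, here is a rough Python sketch of the merge rule as described above. It is not the actual results processing program: the record layout (userid, user name, email), the function name, and the sample email addresses are all invented for illustration.

```python
import random
from collections import defaultdict


def merge_map(accounts, rng=None):
    """accounts: iterable of (userid, user_name, email) tuples.
    Returns {userid: picked_userid} for every userid that shares its
    user name and email address with at least one other userid."""
    rng = rng or random.Random()
    groups = defaultdict(list)
    for userid, user_name, email in accounts:
        groups[(user_name, email)].append(userid)

    mapping = {}
    for userids in groups.values():
        if len(userids) > 1:
            picked = rng.choice(userids)  # random pick, as described above
            for userid in userids:
                mapping[userid] = picked
    return mapping


# Hypothetical sample data; the email addresses are made up.
accounts = [
    ("PaJaSoft", "Ing. Pavel PaJaSoft Janousek", "pavel@example.invalid"),
    ("S112790", "Ing. Pavel PaJaSoft Janousek", "pavel@example.invalid"),
    ("someone", "Some Other User", "other@example.invalid"),
]
print(merge_map(accounts, random.Random(0)))
```

Because the pick is random, either userid of a pair can end up as the surviving one, regardless of which was the original.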
2003-09-23, 00:56   #3
GP2
 
Sep 2003

2²·3·5·43 Posts

Quote:
Originally posted by Prime95
My results processing program notices the two userids have the same user name and email address. It then assumes these two accounts should be merged and randomly picks one of the two userids. All past results are transferred to the picked userid.

Hmmm... a couple of things:

The past results from BAD are not transferred to the picked userid; they keep using the original userid. This prevents calculating "badness ratios" for many machines, because the userids in BAD don't match those in LUCAS_V.TXT.

If we could fully match all the results in BAD and LUCAS_V.TXT by userid and computer id, we could identify which machines are the most error-prone. LL results from such machines should be targeted for early double-checking. Currently, this is done for results that have returned a nonzero error code, but many bad results have a zero error code.
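
To make the idea concrete, here is a small Python sketch of a per-machine "badness ratio". The definition used (bad results divided by bad plus verified results for the same userid and computer id) is my assumption, as are the function name and the sample counts. The optional alias table shows why the userids have to agree first: without it, the same machine is counted under two different keys.

```python
from collections import Counter


def badness_ratios(bad_keys, good_keys, alias=None):
    """bad_keys / good_keys: iterables of (userid, computer_id) taken from
    BAD and LUCAS_V.TXT respectively. alias maps a userid to the userid
    it should be counted under. Returns {(userid, computer_id): ratio}."""
    alias = alias or {}

    def canonical(key):
        userid, computer = key
        return (alias.get(userid, userid), computer)

    bad = Counter(canonical(k) for k in bad_keys)
    good = Counter(canonical(k) for k in good_keys)
    return {m: bad[m] / (bad[m] + good[m]) for m in set(bad) | set(good)}


# Made-up counts for illustration only.
bad_keys = [("PaJaSoft", "texi-router")]
good_keys = [("S112790", "texi-router")] * 3

print(badness_ratios(bad_keys, good_keys))
print(badness_ratios(bad_keys, good_keys, alias={"S112790": "PaJaSoft"}))
```

Without the alias table the machine shows up twice, once with ratio 1.0 and once with 0.0; with it, the single entry has ratio 0.25.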

Of the 9154 results in BAD with non-blank error code, 5044 have zero error code (55%).
This rises to nearly 60% if we include error code 80000000 and other relatively benign error codes. So relying on nonzero error code to identify candidates for early double check will miss over half of them.
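
The 55% figure is just the ratio of the two counts quoted above; as a quick check in Python (the function name is arbitrary):

```python
def zero_error_share(zero_count, nonblank_count):
    """Percentage of results with a non-blank error code whose code is zero."""
    return 100.0 * zero_count / nonblank_count


share = zero_error_share(5044, 9154)
print(f"{share:.1f}%")  # prints 55.1%
```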


It's a shame that the userid is picked randomly, because in the case of PaJaSoft and haveland (and others), it's clear that a single errant machine sent in a new userid and the random pick was the wrong one.


It would be very desirable to have BAD and LUCAS_V.TXT agreeing on userids (and hopefully the "right" userid). And maybe Primenet too...

Do you by any chance keep a log of cases where userids were randomly merged?
2003-09-23, 02:43   #4
GP2
 
Sep 2003

A14₁₆ Posts

Quote:
Originally posted by GP2
If we could fully match all the results in BAD and LUCAS_V.TXT by userid and computer id, we could identify which machines are the most error-prone. LL results from such machines should be targeted for early double-checking. Currently, this is done for results that have returned a nonzero error code, but many bad results have a zero error code.

On second thought, there's a bit of a problem with this: the HRF3.TXT file stores only the userid, not the computer id. So while we can use BAD and LUCAS_V.TXT to identify error-prone machines, we don't know which specific machine returned any given single-checked result in HRF3.TXT. This hampers identifying candidate exponents for early double-checking.

Still, this computer-id information for single-checked exponents must be known internally (or perhaps it could simply be added to HRF3.TXT).
2003-09-24, 15:40   #5
GP2
 
Sep 2003

2²×3×5×43 Posts

There is probably a way to largely automate the matching of mismatched usernames, by systematic comparison of exponents in cleared.txt, BAD, and LUCAS_V.TXT as well as comparison of unique machine names in common between the two users, and a final visual inspection in HRF5.TXT for extra confirmation.

This would automatically detect S112790=PaJaSoft and maybe a few dozen others.
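
A rough Python sketch of the kind of automation I have in mind, to be taken as a starting point only. It assumes the mismatched results have already been collected as (exponent, cleared.txt user, LUCAS_V.TXT user, computer id) tuples; the function name, the threshold, and the sample rows are invented for illustration.

```python
from collections import Counter, defaultdict


def propose_aliases(mismatches, min_results=2):
    """Count how often each (cleared_user, lucas_user) pair disagrees on
    the same result and collect the machine names involved. Pairs seen at
    least min_results times are proposed as aliases for manual review."""
    pair_counts = Counter()
    pair_machines = defaultdict(set)
    for _exponent, cleared_user, lucas_user, computer in mismatches:
        pair = (cleared_user, lucas_user)
        pair_counts[pair] += 1
        pair_machines[pair].add(computer)

    return sorted(
        (pair, count, sorted(pair_machines[pair]))
        for pair, count in pair_counts.items()
        if count >= min_results
    )


# Made-up rows; the exponents and machine names here are not real data.
sample = [
    (1000003, "PaJaSoft", "S112790", "machine-a"),
    (1000033, "PaJaSoft", "S112790", "machine-b"),
    (1000037, "haveland", "S104597", "machine-c"),
]
for pair, count, machines in propose_aliases(sample):
    print(pair, count, machines)
```

Pairs below the threshold are left out on purpose, since a single disagreement could be a coincidence; those would still need the visual check against HRF5.TXT.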

Once that's done, we can match usernames in cleared.txt and the data files, and then run a script to check whether they're fully in sync.

I'll give this a try and report the results.
2003-09-24, 21:15   #6
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

17×421 Posts

I have a file containing all merged userids. Send me an email if you want it.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.