mersenneforum.org > Data Many username mismatches between database and Primenet

 2003-09-22, 20:04 #1 GP2     Sep 2003 2,579 Posts

Many username mismatches between database and Primenet

As already mentioned in the Duplicate userids thread, the Sept 1 LUCAS_V.TXT file showed all entries for "GW" (George himself) replaced by the mysterious account id "S122546". In the next (Sept 9) version of this data file, this was changed back to GW, presumably manually by George.

However, further investigation shows numerous instances where Primenet's list of cleared exponents (cleared.txt, updated hourly) and the data file (LUCAS_V.TXT) disagree on the user name for a returned result. For instance:

cleared.txt (Sept 22 00:00, with fixed-width changed to comma-separated fields)

Code:
9975863,65,D ,0x974857CF7BDC15__,20-Sep-03 09:45,TempleU-CAS,FL-SLE

LUCAS_V.TXT (Sept 21 data)

Code:
9975863,TempleU-DI,FL-SLE,WY5,974857CF7BDC15A8,7195223,00000000

In general, there are a bunch of results where
- cleared.txt says "TempleU-CAS" but LUCAS_V.TXT says "TempleU-DI"
- cleared.txt says "PaJaSoft" but LUCAS_V.TXT says "S112790"
- cleared.txt says "haveland" but LUCAS_V.TXT says "S104597"
- etc. (many more examples)
for the same result that was returned.

To add to the confusion, there are a few cases where cleared.txt also says "TempleU-DI" or "S112790" or "S104597". Usually these are associated with only one or a few specific machines: S112790/texi-router and S104597/meron, for instance, are the only machines for those account ids in cleared.txt.

Looking in HRF5.TXT (list of users) we find:

S112790,Ing. Pavel PaJaSoft Janousek
S104597,Andrew Haveland-Robinson

but no entries for account ids "PaJaSoft" or "haveland".

It seems unlikely that either of these users knowingly switched from a meaningful username to an obscure numerical account_id (or that George temporarily did the same). So this must have happened accidentally somehow.

Note there are numerous other examples for other account ids. PaJaSoft and haveland are just given as two frequent examples.

Finally, as mentioned in the User name change not reflected in BAD file thread, the "BAD" file still contains mostly the old meaningful account ids (PaJaSoft, haveland, etc.) while the LUCAS_V.TXT file contains the new numerical account ids (S112790, S104597). Note, though, that BAD does contain 2 entries for S112790 (and 58 for PaJaSoft).

So in summary, the LUCAS_V.TXT file probably needs to be cleaned up by reassigning the right user names. But also, does anyone know how these apparently accidental inconsistencies happen and how they can be prevented in the future?
 2003-09-22, 23:29 #2 Prime95 P90 years forever!     Aug 2002 Yeehaw, FL 2^3·5·173 Posts

When prime95 contacts the server to submit results, it sends the userid and password. If prime95 gets the response "bad password", then it asks for a new userid to submit the results. The original theory was that having the results recorded was of paramount importance.

Somehow there are rare times when a valid userid/password is sent and the bad password error is returned anyway. That's what happened to me.

When this happens, the server thinks these are two separate users using two separate userids. Unless the user merges the two accounts, the server will forever treat them as separate users.

However, the primenet database is different from my database. My results processing program notices the two userids have the same user name and email address. It then assumes these two accounts should be merged and randomly picks one of the two userids. All past results are transferred to the picked userid.
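The merge step described in this post can be sketched roughly as follows. This is only an illustration, not the actual results processing program: the `Account` record, the `merge_accounts` function, and the email addresses are hypothetical, and the surviving userid is chosen with `random.choice` only because the post says the pick is random.

```python
import random
from collections import defaultdict
from typing import NamedTuple


class Account(NamedTuple):
    userid: str  # e.g. "PaJaSoft" or "S112790"
    name: str    # real name, as listed in HRF5.TXT
    email: str   # hypothetical field; not part of the public files


def merge_accounts(accounts, rng=random):
    """Map every userid to the userid that survives the merge.

    Accounts sharing the same (name, email) are assumed to belong to one
    user; one of their userids is picked at random to stand for all of them.
    """
    groups = defaultdict(list)
    for acct in accounts:
        groups[(acct.name, acct.email)].append(acct.userid)

    mapping = {}
    for userids in groups.values():
        survivor = rng.choice(userids)
        for uid in userids:
            mapping[uid] = survivor
    return mapping


# Names come from the HRF5.TXT lines quoted above; emails are placeholders.
accounts = [
    Account("PaJaSoft", "Ing. Pavel PaJaSoft Janousek", "pavel@example.invalid"),
    Account("S112790", "Ing. Pavel PaJaSoft Janousek", "pavel@example.invalid"),
    Account("haveland", "Andrew Haveland-Robinson", "andrew@example.invalid"),
]
mapping = merge_accounts(accounts, random.Random(0))
```

With a random pick, "PaJaSoft" and "S112790" always end up under one userid, but nothing favors the meaningful name over the numerical one.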
2003-09-23, 00:56   #3
GP2

Sep 2003

2,579 Posts

Quote:
 Originally posted by Prime95 My results processing program notices the two userids have the same user name and email address. It then assumes these two accounts should be merged and randomly picks one of the two userids. All past results are transferred to the picked userid.
Hmmm... a couple of things:

The past results from BAD are not transferred to the picked userid; they keep using the original userid. This prevents calculating "badness ratios" for many machines, because the userids in BAD don't match those in LUCAS_V.TXT.

If we could fully match all the results in BAD and LUCAS_V.TXT by userid and computer id, we could identify which machines are the most error-prone. LL results from such machines should be targeted for early double-checking. Currently, this is done for results that have returned a nonzero error code, but many bad results have a zero error code.

Of the 9154 results in BAD with non-blank error code, 5044 have zero error code (55%).
This rises to nearly 60% if we include error code 80000000 and other relatively benign error codes. So relying on nonzero error code to identify candidates for early double check will miss over half of them.
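A rough sketch of the "badness ratio" idea, assuming BAD uses the same comma-separated layout as the LUCAS_V.TXT line quoted in the first post (exponent first, then userid, then computer id). That layout, the function names, and the sample rows are assumptions made for illustration.

```python
from collections import Counter


def machine_key(line):
    """Return (userid, computer_id) from one comma-separated result line.

    Assumes field 2 is the userid and field 3 the computer id, as in the
    LUCAS_V.TXT sample quoted in the first post.
    """
    fields = [f.strip() for f in line.split(",")]
    return fields[1], fields[2]


def badness_ratios(bad_lines, good_lines):
    """Fraction of each machine's results (across both files) found in BAD."""
    bad = Counter(machine_key(line) for line in bad_lines)
    good = Counter(machine_key(line) for line in good_lines)
    ratios = {}
    for key in bad.keys() | good.keys():
        total = bad[key] + good[key]  # a Counter returns 0 for missing keys
        ratios[key] = bad[key] / total
    return ratios


# Made-up rows, for illustration only.
bad_rows = ["1000003,userA,pc1,X,0000000000000001,1,00000000"]
good_rows = [
    "1000033,userA,pc1,X,0000000000000002,1,00000000",
    "1000037,userA,pc2,X,0000000000000003,1,00000000",
]
ratios = badness_ratios(bad_rows, good_rows)
```

If BAD records a machine under "PaJaSoft" while LUCAS_V.TXT records it under "S112790", the two files produce different keys for the same machine and the ratio comes out wrong, which is the problem described above.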

It's a shame that the userid is picked randomly, because in the case of PaJaSoft and haveland (and others), it's clear that a single errant machine sent in a new userid and the random pick was the wrong one.

It would be very desirable to have BAD and LUCAS_V.TXT agreeing on userids (and hopefully the "right" userid). And maybe Primenet too...

Do you by any chance keep a log of cases where userids were randomly merged?

2003-09-23, 02:43   #4
GP2

Sep 2003

2,579 Posts

Quote:
 Originally posted by GP2 If we could fully match all the results in BAD and LUCAS_V.TXT by userid and computer id, we could identify which machines are the most error-prone. LL results from such machines should be targeted for early double-checking. Currently, this is done for results that have returned a nonzero error code, but many bad results have a zero error code.

On second thought there's a bit of a problem with this... the HRF3.TXT file only stores the userid, it doesn't store the computer id. So while we can use BAD and LUCAS_V.TXT to identify error-prone machines, we don't know which specific machine returned any given single-checked result in HRF3.TXT. So this hampers identifying candidate exponents for early double-check.

Still, this computer-id information for single-checked exponents must be known internally (or perhaps it could simply be added to HRF3.TXT).

 2003-09-24, 15:40 #5 GP2     Sep 2003 2,579 Posts There is probably a way to largely automate the matching of mismatched usernames, by systematic comparison of exponents in cleared.txt, BAD, and LUCAS_V.TXT as well as comparison of unique machine names in common between the two users, and a final visual inspection in HRF5.TXT for extra confirmation. This would automatically detect S112790=PaJaSoft and maybe a few dozen others. Once that's done, we can match usernames in cleared.txt and the data files, and then could run a script to see if they're fully in sync. I'll give this a try and report the results.
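A minimal sketch of the exponent-comparison step, using the two sample lines from the first post. The field positions are inferred from those samples, and the function names are hypothetical. The sketch assumes every cleared.txt row carries a residue whose last digits are masked with underscores. It covers only the cleared.txt versus LUCAS_V.TXT comparison, not BAD or the machine-name check.

```python
from collections import Counter


def parse_cleared(line):
    """cleared.txt row (comma-separated form) -> (exponent, residue prefix, userid)."""
    f = [x.strip() for x in line.split(",")]
    residue = f[3][2:] if f[3].lower().startswith("0x") else f[3]
    return int(f[0]), residue.rstrip("_").upper(), f[5]


def parse_lucas(line):
    """LUCAS_V.TXT row -> (exponent, full residue, userid)."""
    f = [x.strip() for x in line.split(",")]
    return int(f[0]), f[4].upper(), f[1]


def userid_mismatches(cleared_lines, lucas_lines):
    """Count (cleared userid, LUCAS_V userid) pairs that differ for the same result."""
    lucas = {}
    for line in lucas_lines:
        exponent, residue, user = parse_lucas(line)
        lucas.setdefault(exponent, []).append((residue, user))

    pairs = Counter()
    for line in cleared_lines:
        exponent, prefix, user = parse_cleared(line)
        for residue, other in lucas.get(exponent, []):
            # Treat as the same result if the full residue starts with the
            # unmasked part of the cleared.txt residue.
            if residue.startswith(prefix) and other != user:
                pairs[(user, other)] += 1
    return pairs


cleared = ["9975863,65,D ,0x974857CF7BDC15__,20-Sep-03 09:45,TempleU-CAS,FL-SLE"]
lucas_v = ["9975863,TempleU-DI,FL-SLE,WY5,974857CF7BDC15A8,7195223,00000000"]
pairs = userid_mismatches(cleared, lucas_v)
```

Pairs that recur many times would be candidates for a single user, still to be confirmed against shared machine names and HRF5.TXT.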
 2003-09-24, 21:15 #6 Prime95 P90 years forever!     Aug 2002 Yeehaw, FL 2^3×5×173 Posts I have a file containing all merged userids. Send me an email if you want it.


All times are UTC.