Thread: LLRnet servers for NPLB
2009-08-20, 07:34 #1209 gd_barnes May 2007 Kansas; USA 244658 Posts

After further investigation, I see that the joblist.txt file is updated continuously. Although it is possible that something could slip through the cracks during a power outage and cause a minimal # of pairs to be assigned twice, the problem should be very isolated. I was able to trace just 10 rejected pairs in port G8000 that had been handed out twice back to the power outage. That is certainly a reasonable explanation for a small # of them. The numerous rejected results since then are another matter and may NOT be a result of the server dropping and then coming back up.

Now, we have another problem today. Karsten, you have already returned 19 rejected results as of 2:00 AM CDT Aug. 20th (7:00 AM GMT), all of which already had results returned by you from about 11:45 AM CDT (4:45 PM GMT) Aug. 19th. Here is a list of them:

Code:
345 971680
339 971681
321 971682
345 971686
339 971769
321 971773
315 971774
339 971777
321 971858
339 971859
327 971861
339 971861
345 971931
327 971936
345 971937
321 971941
339 972019
345 972019
327 972024

Here is what I found out:

In checking stdout.log, it appears that all of the offending pairs had originally been assigned to you (or me in 2-3 cases), then they appeared to be cancelled in error, and then they were re-assigned to you.

Karsten, at this point, I have 3 revelations for you:

Revelation #1: All of the pairs were originally returned by you in < 3 mins., some in as little as 36 seconds and further...they were the only pairs on that day with such short timings!! Now, unless you've invented some new software or mathematics that none of the rest of us are aware of, that would not be possible unless you crunched those pairs before you got them from the server. Would you care to enlighten us on how you managed that?
Revelation #2: In every case above, it appears that you reserved them in 30-pair chunks BUT...only the last 4 in EVERY case ended up with a problem.

Conclusion: Something is going wrong with the last 4 pairs out of the 30 that Karsten caches each time.

Max, here is an example from the stdout.log file:

Code:
connection closed (socket 4)
connection reqeust from 14e48554:1242 (socket 5)
(20 misc. proposed pairs)
Proposing pair 327/971841 to kar_bon
Proposing pair 333/971843 to kar_bon
Proposing pair 345/971844 to kar_bon
Proposing pair 315/971845 to kar_bon
Proposing pair 327/971845 to kar_bon
Proposing pair 345/971847 to kar_bon
Proposing pair 321/971858 to kar_bon }
Proposing pair 339/971859 to kar_bon } all were eventually cancelled, handed out
Proposing pair 327/971861 to kar_bon } a 2nd time, and ultimately rejected
Proposing pair 339/971861 to kar_bon }
connection closed (socket 5)
connection reqeust from 14e48554:1247 (socket 4)

In every case above, it was always the last 4 pairs before the connection was closed that were cancelled and handed out a 2nd time. This caused the results to be returned twice, with the 2nd set of results being rejected.

Revelation #3: All of the offending pairs were cancelled and reassigned in one fell swoop in consecutive fashion!

To demonstrate my great predictive capability (lol), I will predict that before today is done, the following pair will also be rejected:

333 972027

That is because it is the final pair in the final group of 4 above and is also the final pair that was cancelled in error and reassigned in the group of 20 consecutive cancelled-reassigned pairs.

Max, Karsten, or anyone else...any thoughts on this?

Gary

Last fiddled with by gd_barnes on 2009-08-20 at 08:26
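
For anyone who wants to repeat the "last 4 pairs before the connection closed" check on their own stdout.log, here is a rough Python sketch. It assumes the log lines look exactly like the excerpt in the post ("Proposing pair k/n to user" and "connection closed ...") and that each batch of proposals is followed by its own "connection closed" line with no other client interleaved. The function name `tail_pairs_per_connection` and the `tail` parameter are made up for illustration and are not part of LLRnet.

```python
import re

# Assumed line shape, taken only from the log excerpt in the post:
#   "Proposing pair 327/971841 to kar_bon"
PROPOSE_RE = re.compile(r"Proposing pair (\d+)/(\d+) to (\S+)")


def tail_pairs_per_connection(log_lines, tail=4):
    """For each connection, return the last `tail` pairs proposed before it closed.

    Hypothetical helper: it treats everything between two
    "connection closed" lines as one batch.
    """
    tails = []
    current = []  # (k, n, user) tuples proposed since the last close
    for line in log_lines:
        match = PROPOSE_RE.search(line)
        if match:
            k, n, user = int(match.group(1)), int(match.group(2)), match.group(3)
            current.append((k, n, user))
        elif line.strip().startswith("connection closed"):
            if current:
                tails.append(current[-tail:])
            current = []
    return tails


if __name__ == "__main__":
    # Shortened copy of the excerpt from the post.
    sample = [
        "connection closed (socket 4)",
        "connection reqeust from 14e48554:1242 (socket 5)",
        "Proposing pair 345/971847 to kar_bon",
        "Proposing pair 321/971858 to kar_bon",
        "Proposing pair 339/971859 to kar_bon",
        "Proposing pair 327/971861 to kar_bon",
        "Proposing pair 339/971861 to kar_bon",
        "connection closed (socket 5)",
    ]
    for batch in tail_pairs_per_connection(sample):
        for k, n, user in batch:
            print(k, n, user)
```

The (k, n) pairs it returns can then be compared by hand, or with a set intersection, against the list of rejected results. If several clients are connected at once, the batches would have to be separated by socket number first, which this sketch does not attempt.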