mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU to 72 (https://www.mersenneforum.org/forumdisplay.php?f=95)
-   -   Assignment died of dysentery (https://www.mersenneforum.org/showthread.php?t=16352)

kladner 2012-11-07 06:23

[QUOTE=flashjh;317347]Can you post a screen shot? Is there a reason you can't just not run the DC instance until you want more exponents?[/QUOTE]

I could post a screen shot, on the next occasion. The thing is, I haven't run the "slaved" (DC) instance in over a month. It is the full-time, P-1 [U]3rd worker[/U] which has acquired DC assignments. I most recently moved the DC assignments to my CuLu worktodo, and manually balanced P-1 assignments among the three workers. It will take a day or so to see how things happen next.

(However, those CuLu assignments are going to languish for quite a while.)

garo 2012-11-07 11:08

I find that if you have two instances on the same CPU but with different work preferences, the Primenet server gets confused sometimes.

kladner 2012-11-07 16:28

[QUOTE=garo;317397]I find that if you have two instances on the same CPU but with different work preferences, the Primenet server gets confused sometimes.[/QUOTE]

It has occurred to me that perhaps I should use a different machine name for the second instance. I can see how that might cause confusion.

garo 2012-11-08 09:55

I am not sure that even that would work. Prime95/net generates a CPUID for each machine, and if the ID is the same for both instances, they will essentially be the same machine in Prime95's eyes. One thing you could do is manually check that those IDs in the prime.txt (or local.txt) file are NOT the same; if they are, delete the ID from one instance's file and see if that helps.
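If you would rather not eyeball the two files, here is a rough Python sketch of that check. It is only an illustration: I am assuming the files are plain key=value text and that the ID settings have names ending in "ID". The exact key names are not confirmed here, so the script just reports every such key whose value is identical in both files. The function names and the command-line usage are made up for this example.

```python
import sys
from pathlib import Path


def read_ini(text):
    """Parse simple key=value lines, skipping blanks and [section] headers."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def shared_ids(text_a, text_b):
    """Return {key: value} for ID-like keys with the same value in both files.

    Assumption: the settings of interest have names ending in "ID".
    """
    a, b = read_ini(text_a), read_ini(text_b)
    return {
        key: value
        for key, value in a.items()
        if key.upper().endswith("ID") and value and b.get(key) == value
    }


if __name__ == "__main__":
    if len(sys.argv) == 3:
        # Paths to the same settings file from each of the two instances.
        first = Path(sys.argv[1]).read_text(errors="replace")
        second = Path(sys.argv[2]).read_text(errors="replace")
        clashes = shared_ids(first, second)
        if clashes:
            for key, value in clashes.items():
                print(f"Same value in both instances: {key}={value}")
        else:
            print("No identical ID-like settings found.")
    else:
        print("usage: python check_ids.py <instance1 file> <instance2 file>")
```

Run it once on the two prime.txt files and once on the two local.txt files. Back up a file before deleting anything from it.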

kladner 2012-11-09 00:07

ATM, I have followed a suggestion from chalsall, which was to change the work type for the thread which was getting DCs instead of P-1. Specifically, I changed it from P-1 to DC, then had P95 phone home to update the information. I then immediately changed it back to P-1, and forced another update with the server. Since the worktodo sections for each worker were filled and balanced (manually), no assignments were gotten during this exercise.

However, as assignments have completed, a few more have been obtained and they have all been correct P-1s so far. I'm going to observe for a while before I consider anything else.

kracker 2012-11-09 00:10

[QUOTE=kladner;317609]ATM, I have followed a suggestion from chalsall, which was to change the work type for the thread which was getting DCs instead of P-1. Specifically, I changed it from P-1 to DC, then had P95 phone home to update the information. I then immediately changed it back to P-1, and forced another update with the server. Since the worktodo sections for each worker were filled and balanced (manually), no assignments were gotten during this exercise.

However, as assignments have completed, a few more have been obtained and they have all been correct P-1s so far. I'm going to observe for a while before I consider anything else.[/QUOTE]

[OT]
Happy 1000'th post b'day btw!
[/OT]

kladner 2012-11-09 00:48

[QUOTE=kracker;317610][OT]
Happy 1000'th post b'day btw!
[/OT][/QUOTE]

Thanks! I hadn't noticed.

kladner 2012-11-13 17:49

[QUOTE=kladner;317609]ATM, I have followed a suggestion from chalsall, which was to change the work type for the thread which was getting DCs instead of P-1. Specifically, I changed it from P-1 to DC, then had P95 phone home to update the information. I then immediately changed it back to P-1, and forced another update with the server. Since the worktodo sections for each worker were filled and balanced (manually), no assignments were gotten during this exercise.

However, as assignments have completed, a few more have been obtained and they have all been correct P-1s so far. I'm going to observe for a while before I consider anything else.[/QUOTE]

A subsequent assignment run (manual, so I can watch it) turned up another DC on the 3rd worker. Checking the CPU settings on my PrimeNet account showed that the 3rd worker had not switched back to P-1 when I changed the setting in P95. I changed that and am waiting to see what happens with further assignment runs.

kladner 2012-12-30 16:39

1 Attachment(s)
I just noticed that the charts for Overall Worker Progress and LLTF progress do not agree as to how many assignments I have out. The Overall line agrees with all the other pages I have checked, such as "View Assignments" and "Individual Overall Statistics."

petrw1 2013-01-02 19:45

[QUOTE=chalsall;315696]OK, OK... I'll look into it.

It will be a Stupid Programmer Error.[/QUOTE]

Getting worse . . .
Workers Double Check Testing Progress: reports 29 out and 532 done
If I click on me for my personal report it shows 9 out and 600 done.
9 out is correct. I'm not sure what is correct for done.

Could it be related to when or how I unreserve GPU72 assignments? I tend to do that. Lately I either unreserve directly from the client so that the proxy will catch it OR if I have to unreserve from the server I will also unreserve from GPU72. I didn't always do it this way.

chalsall 2013-01-02 20:51

[QUOTE=petrw1;323402]Could it be related to when or how I unreserve GPU72 assignments? I tend to do that. Lately I either unreserve directly from the client so that the proxy will catch it OR if I have to unreserve from the server I will also unreserve from GPU72. I didn't always do it this way.[/QUOTE]

Thanks -- that was (part of) the problem. If you unreserved the candidate from the server, but not from GPU72 or your client, it would set a status code in the Assignment record which some of the other reports didn't recognize. Note that this was only for those using the Proxy. And only you were affected by this.

I'm having to go through about 50 records by hand to reverse out of this mix-up. Everything should be nominal in a few hours.

(The other part had to do with a quick-and-dirty fix I made to one of the back-end scripts to ignore LMH work until a more scalable solution was implemented. Naturally, it was late and I didn't test it, so for the last couple of days the Workers report was not being updated for DC and LL...)


All times are UTC.