mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   Unintended assignment assimilation bug (https://www.mersenneforum.org/showthread.php?t=21584)

ckdo 2016-09-17 16:24

Unintended assignment assimilation bug
 
A little over an hour ago, I went ahead and registered four assignments on M7351 and another four on M8389. I was somewhat surprised to see that I actually got [I]five[/I] assignments on M7351, and that a pre-existing anonymous assignment on M7351 was now mine.

This is the contents of Prime95's window (AIDs redacted, more worker lines skipped):
[code]
[Main thread Sep 17 16:49] Optimizing for CPU architecture: Core i3/i5/i7, L2 cache size: 256 KB, L3 cache size: 6 MB
[Comm thread Sep 17 16:49] Registering assignment: ECM M7351
[Main thread Sep 17 16:49] Starting workers.
[Worker #1 Sep 17 16:49] Worker starting
[Worker #3 Sep 17 16:49] Waiting 10 seconds to stagger worker starts.
[Worker #1 Sep 17 16:49] Setting affinity to run worker on logical CPU #1
[Worker #2 Sep 17 16:49] Waiting 5 seconds to stagger worker starts.
[Worker #4 Sep 17 16:49] Waiting 15 seconds to stagger worker starts.
[Comm thread Sep 17 16:49] Assignment registered as:
[Comm thread Sep 17 16:49] Registering assignment: ECM M8389
[Comm thread Sep 17 16:49] Assignment registered as: 7335[...]342D
[Comm thread Sep 17 16:49] Sending expected completion date for M8389: Oct 14 2016
[Comm thread Sep 17 16:49] Registering assignment: ECM M7351
[Comm thread Sep 17 16:49] Assignment registered as: 21B9[...]9D55
[Comm thread Sep 17 16:49] Sending expected completion date for M7351: Oct 05 2016
[Comm thread Sep 17 16:50] Registering assignment: ECM M8389
[Comm thread Sep 17 16:50] Assignment registered as: 1315[...]2298
[Comm thread Sep 17 16:50] Sending expected completion date for M8389: Oct 14 2016
[Comm thread Sep 17 16:50] Registering assignment: ECM M7351
[Comm thread Sep 17 16:50] Assignment registered as: 7366[...]B72A
[Comm thread Sep 17 16:50] Sending expected completion date for M7351: Oct 05 2016
[Comm thread Sep 17 16:50] Registering assignment: ECM M8389
[Comm thread Sep 17 16:50] Assignment registered as: 9C90[...]971D
[Comm thread Sep 17 16:50] Sending expected completion date for M8389: Oct 14 2016
[Comm thread Sep 17 16:50] Registering assignment: ECM M7351
[Comm thread Sep 17 16:50] Assignment registered as: F536[...]AF20
[Comm thread Sep 17 16:50] Sending expected completion date for M7351: Oct 05 2016
[Comm thread Sep 17 16:50] Registering assignment: ECM M8389
[Comm thread Sep 17 16:50] Assignment registered as: 41CD[...]B0DC
[Comm thread Sep 17 16:50] Sending expected completion date for M8389: Oct 14 2016
[Comm thread Sep 17 16:50] Registering assignment: ECM M7351
[Comm thread Sep 17 16:50] Assignment registered as: 162A[...]B6E0
[Comm thread Sep 17 16:50] Sending expected completion date for M7351: Oct 05 2016
[Comm thread Sep 17 16:50] Done communicating with server.
[/code]

Take note of (a) the "Assignment registered" line with no AID and (b) the out-of-order (re-?) registration that was last in line.

My prime.log agrees:
[code]
[Sat Sep 17 16:49:57 2016 - ver 28.5]
Registering assignment: ECM M7351
Assignment registered as:
Registering assignment: ECM M8389
Assignment registered as: 7335[...]342D
Sending expected completion date for M8389: Oct 14 2016
Registering assignment: ECM M7351
Assignment registered as: 21B9[...]9D55
Sending expected completion date for M7351: Oct 05 2016
Registering assignment: ECM M8389
Assignment registered as: 1315[...]2298
Sending expected completion date for M8389: Oct 14 2016
Registering assignment: ECM M7351
Assignment registered as: 7366[...]B72A
Sending expected completion date for M7351: Oct 05 2016
Registering assignment: ECM M8389
Assignment registered as: 9C90[...]971D
Sending expected completion date for M8389: Oct 14 2016
Registering assignment: ECM M7351
Assignment registered as: F536[...]AF20
Sending expected completion date for M7351: Oct 05 2016
Registering assignment: ECM M8389
Assignment registered as: 41CD[...]B0DC
Sending expected completion date for M8389: Oct 14 2016
Registering assignment: ECM M7351
Assignment registered as: 162A[...]B6E0
Sending expected completion date for M7351: Oct 05 2016
[/code]

This is what the top part of /workload lists:
[code]
CPU Name Core Exp. Work Type Age Assigned Updated Next Update Est. Compl. Days to go

durandal 1 7,351 ECM 25 2016-08-23 2016-08-23 2017-02-09
durandal 2 7,351 ECM 0 2016-09-17 2016-09-17 2016-09-18 2016-10-05 18
durandal 3 7,351 ECM 0 2016-09-17 2016-09-17 2016-09-18 2016-10-05 18
durandal 4 7,351 ECM 0 2016-09-17 2016-09-17 2016-09-18 2016-10-05 18
durandal 1 7,351 ECM 0 2016-09-17 2016-09-17 2016-09-18 2016-10-05 18
[/code]

And the lower part of /workload has:
[code]
ECM2=D5CB[...]1D45,1,2,7351,-1,44000000,4400000000,150
ECM2=21B9[...]9D55,1,2,7351,-1,44000000,4400000000,635
ECM2=7366[...]B72A,1,2,7351,-1,44000000,4400000000,635
ECM2=F536[...]AF20,1,2,7351,-1,44000000,4400000000,634
ECM2=162A[...]B6E0,1,2,7351,-1,44000000,4400000000,635
[/code]

The 150 curves assignment is not mine (well, I guess it is now, technically). I'm unsure whether this is a client-side bug, a server-side one, or a combination of both. For now, I will leave the extra assignment untouched to aid the bug hunting. I would be glad if those capable would fix this bug.

GP2 2016-09-17 18:49

The only situation I can think of where an ECM assignment is spontaneously generated is if you have the setting ContinueECM=1 in prime.txt and a factor is found.

For instance, if you have an assignment for 500 curves and a factor is found on the 293rd curve, then this setting will create a new assignment for 207 curves, so that it completes the full original allotment of 500. Otherwise, without ContinueECM=1 it just moves on to the next exponent immediately, having only done 293 curves instead of 500.

The [URL="http://www.mersenne.org/report_exponent/?exp_lo=7351&full=1"]exponent report for M7351[/URL] says there was an assignment to you on 2016-08-23. What do your prime.log and results.txt files show for that date?

ckdo 2016-09-17 21:47

Let me restate that: up until earlier today that 150 curves assignment from 8/23 was an anonymous one, seemingly reserved via the manual assignment form and never since updated. No new assignment was created - the ownership of an existing one changed.

As far as the log files on my machines are concerned, they don't mention M7351 except for my previous assignments, which have finished.

I didn't have M7351 listed on my workload page just before I edited the worktodo.txt to reserve it four times, and I certainly wouldn't have requested a single 150 curves assignment on that exponent.

ckdo 2016-10-12 07:42

The bug has manifested itself again. Different machine, different version of Prime95, different exponent, same story.

Prime.log:

[code]
[Sun Oct 09 20:12:47 2016 - ver 28.9]
Registering assignment: ECM M8923
Assignment registered as:
Registering assignment: ECM M8999
Assignment registered as: 4CA6[...]7127
Sending expected completion date for M8999: Aug 25 2017
Registering assignment: ECM M8681
Assignment registered as: 6D46[...]3CC6
Sending expected completion date for M8681: Oct 20 2016
Registering assignment: ECM M8923
Assignment registered as: F1B6[...]76F3
Sending expected completion date for M8923: Oct 29 2016
Registering assignment: ECM M8681
Assignment registered as: 1C68[...]EE6A
Sending expected completion date for M8681: Oct 19 2016
Registering assignment: ECM M8923
Assignment registered as: C76E[...]6301
Sending expected completion date for M8923: Oct 29 2016
Registering assignment: ECM M8923
Assignment registered as: 2034[...]8032
Sending expected completion date for M8923: Oct 19 2016
Registering assignment: ECM M8999
Assignment registered as: 35F8[...]4AC6
Sending expected completion date for M8999: Oct 28 2016
Registering assignment: ECM M8923
Assignment registered as: 240B[...]ADE9
Sending expected completion date for M8923: Oct 20 2016
[/code]

/workload, top:

[code]
leela 4 8,923 ECM 2 2016-10-09 2016-10-11 2016-10-12 2016-10-18 7
leela 1 8,923 ECM 2 2016-10-09 2016-10-11 2016-10-12 2016-10-19 7
leela 2 8,923 ECM 2 2016-10-09 2016-10-11 2016-10-12 2016-10-28 16
leela 3 8,923 ECM 2 2016-10-09 2016-10-11 2016-10-12 2016-10-27 16
leela 1 8,923 ECM 36 2016-09-05 2016-09-05 2017-02-22
[/code]

/workload, bottom:

[code]
ECM2=D3AD[...]9B62,1,2,8923,-1,44000000,4400000000,150
ECM2=F1B6[...]76F3,1,2,8923,-1,44000000,4400000000,500
ECM2=C76E[...]6301,1,2,8923,-1,44000000,4400000000,500
ECM2=2034[...]8032,1,2,8923,-1,44000000,4400000000,500
ECM2=240B[...]ADE9,1,2,8923,-1,44000000,4400000000,500
[/code]

As usual :sad:, the 150 curves assignment was anonymous before.

Come to think of it, I guess another occurence of the bug was way back in 2014 (on yet another machine): M11251.

Sadly, I couldn't find a backup copy of prime.log from the relevant timeframe and it has since been overwritten, so all I have is an expired assignment that has been the next expected one for that machine since 2014 (no kidding).

But that is yet another bug that's been around for ages (and I have more).

In summary... #FIXME, maybe?

ckdo 2016-10-20 07:56

And today, I snatched M9227:

[code]
[Thu Oct 20 08:53:10 2016 - ver 28.9]
Registering assignment: ECM M9227
Assignment registered as:
Registering assignment: ECM M9227
Assignment registered as: 21C5[...]F949
Sending expected completion date for M9227: Nov 04 2016
Registering assignment: ECM M9227
Assignment registered as: 59A1[...]520A
Sending expected completion date for M9227: Nov 04 2016
Registering assignment: ECM M9227
Assignment registered as: D6BB[...]9AE7
Sending expected completion date for M9227: Oct 31 2016
Registering assignment: ECM M9227
Assignment registered as: C897[...]3D2F
Sending expected completion date for M9227: Oct 31 2016
[/code]

Since I'm not gonna process any of those 150 curves assignments anyway, I unreserved them - have enough "expired" assignments already (whereever these came from - ca. 15,000 that I know of, probably more).

May the AIDs aid in tracking this bug down:

[code]
ECM2=D5CBD485D2A3023A8EFCB3F847961D45,1,2,7351,-1,44000000,4400000000,150
ECM2=D3ADC9DB04E8110B5A343F9E53119B62,1,2,8923,-1,44000000,4400000000,150
ECM2=898467170D9AE57BD200F0C1A8EC770E,1,2,9227,-1,44000000,4400000000,150
[/code]

Bonus info: The faulty assignment was always on worker #1, and reserved right after starting Prime95 (hence the absurd completion dates, yet another bug). I do not intend to switch to manual communication to see if I can get the bug to occur again that way.

Prime95 2017-03-08 06:58

I do not understand why prime.log reported an empty string as the assimilated assignment ID. I could replicate the assimilation, but not the empty assignment ID.

I did change the register assignment PHP code so that it will not transfer an anonymous ECM assignment.

Transferring assignments from the anonymous account was implemented for manual LL tests that were accidentally assigned to users that forgot to log in to the web site. It was also put in place for assignments that were received prior to the Primenet v5 server going live. Perhaps we should consider disabling this "feature" completely.


All times are UTC. The time now is 18:21.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.