mersenneforum.org  

Old 2015-10-13, 20:54   #1
fivemack
(loop (#_fork))
 
Feb 2006
Cambridge, England

2×29×109 Posts
What does 'expected completion date' even mean?

Code:
[Work thread Oct 13 21:18] Iteration: 34620000 / 40049099 [86.44%], ms/iter:  8.061, ETA: 12:09:25
[Comm thread Oct 13 21:19] Updating computer information on the server
[Comm thread Oct 13 21:19] Sending expected completion date for M40049099: Oct 15 2015
[Comm thread Oct 13 21:19] Sending expected completion date for M36400057: Oct 26 2015
Why, when mprime has been getting at least 23x12 CPU-hours of compute per 24 hours realtime for the last 48 hours, does it still put in a fudge factor of more than two when estimating the expected completion date? Why does it think the next number will take eleven days when this one took not quite four?
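
The "more than two" can be checked from the log itself. This is a minimal Python sketch, assuming the Comm thread timestamp (Oct 13 21:19) is "now", the year is 2015, and "Oct 15 2015" means no earlier than 00:00 on that day; the variable names are illustrative only.

```python
from datetime import datetime, timedelta

# Values copied from the log above; the year is assumed to be 2015.
now = datetime(2015, 10, 13, 21, 19)
eta = timedelta(hours=12, minutes=9, seconds=25)  # ETA printed by the work thread

# Assumption: the reported date means "not before midnight on Oct 15".
reported_earliest = datetime(2015, 10, 15)

projected_finish = now + eta                      # finish implied by the ETA
fudge_factor = (reported_earliest - now) / eta    # timedelta / timedelta gives a float

print("ETA-based finish:", projected_finish)
print("lower bound on fudge factor:", round(fudge_factor, 2))
```

Under those assumptions the ETA points to the morning of Oct 14, and the date sent to the server is at least a little over twice as far away.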
Old 2015-10-13, 21:05   #2
Madpoo
Serpentine Vermin Jar
 
Jul 2014

29·113 Posts

Quote:
Originally Posted by fivemack
Code:
[Work thread Oct 13 21:18] Iteration: 34620000 / 40049099 [86.44%], ms/iter:  8.061, ETA: 12:09:25
[Comm thread Oct 13 21:19] Updating computer information on the server
[Comm thread Oct 13 21:19] Sending expected completion date for M40049099: Oct 15 2015
[Comm thread Oct 13 21:19] Sending expected completion date for M36400057: Oct 26 2015
Why, when mprime has been getting at least 23x12 CPU-hours of compute per 24 hours realtime for the last 48 hours, does it still put in a fudge factor of more than two when estimating the expected completion date? Why does it think the next number will take eleven days when this one took not quite four?

I've found the ETAs generated by Prime95/mprime to be wildly erratic and inaccurate.

It all has to do with the rolling average that it calculates as it goes, but even then, if left to its own devices for a while, the ETAs it spits out are off by (as you've noted) a factor of two or even more sometimes.

I stuff work into all my systems manually and I try not to give any machine more than 2-3 weeks of stuff at a time so that I can add in new things here and there without too much waiting. So being able to look at the stats and get at least a vague idea of how many days I have is kind of important.

It's to the point now where I can guess, more or less, but it's still weird when a system decides to freak out and the rolling average drops below 1000 for some reason.

On a certain system I can make the rolling average extremely wrong by doing one large (70M) and one small (35M) exponent on different workers. For some reason that really makes Prime95 have a hard time figuring out just how fast it's going.

Plus I think the automatic rolling average tops out at 4000, so on a few dual 6-core systems it hits 4000 and then stops, but to be accurate it would need to be somewhere around 4500. I can set it to that but then the auto thing kicks in at some point and puts it back to 4000 so I just give up.

I wouldn't worry about it too much, most of the time. There are the exceptions where an unusually optimistic setting will pick up what it thinks is a few weeks of work but then the client is chugging along so slowly that it finishes a single LL test in a year. We saw that with some of those grandfathered assignments.

Those systems either run something else that's a CPU hog, or they have their client configured really weirdly, or the system is only powered on for 5 hours a day, 5 days a week or something bizarre. Either way, the rolling average only goes down so low as well, and at some point its ETAs are so overly optimistic as to be ludicrous. LOL
Old 2015-10-13, 21:15   #3
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·43·83 Posts

The ETA value you posted is based on the 8.061 ms/iter average over the last few thousand iterations.
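
That ETA can be reproduced from the numbers in the work-thread line quoted in the first post. A small Python sketch of the arithmetic, assuming the ETA is simply the remaining iterations times the recent ms/iter figure (variable names are illustrative):

```python
# Values copied from the work-thread log line in the first post.
total_iterations = 40_049_099
done_iterations = 34_620_000
ms_per_iter = 8.061

remaining = total_iterations - done_iterations
seconds_left = remaining * ms_per_iter / 1000.0

# Break the remaining time into hours, minutes and seconds.
hours, rem = divmod(int(seconds_left), 3600)
minutes, seconds = divmod(rem, 60)
print(f"{remaining} iterations left, about {hours:02d}:{minutes:02d}:{seconds:02d}")
```

The result is about 12 hours 9 minutes, within a couple of seconds of the logged 12:09:25; the small difference is presumably down to the ms/iter value being rounded in the log.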

The expected completion dates sent to the server are based on a weighted moving average of your computer's actual performance vs. the expected performance over many 12-hour intervals. If your computer has been off or doing other work at any time during the last several days, then it may take some time for the moving average to get back to where it should be for a computer that is on 24x7.
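
To illustrate why such an average is slow to recover, here is a rough Python sketch of an exponentially weighted moving average updated once per 12-hour interval. It is not the actual Prime95/mprime code: the function names, the weight of 0.1 and the starting value of 450 are all assumptions, the scale where 1000 means "as fast as expected" is taken from the "CPU rolling average" display, and the cap of 4000 is only the figure guessed at earlier in this thread.

```python
def update_rolling_average(current, actual, expected, weight=0.1, cap=4000.0):
    """Blend one 12-hour interval into the average (1000 = as fast as expected).

    weight and cap are hypothetical parameters, not values from Prime95.
    """
    observed = 1000.0 * actual / expected
    blended = (1.0 - weight) * current + weight * observed
    return min(cap, blended)


def scaled_days(nominal_days, rolling_average):
    """Assumed scaling: a slower-than-expected machine gets a longer estimate."""
    return nominal_days * 1000.0 / rolling_average


avg = 450.0  # hypothetical value after a stretch of downtime
for _ in range(4):  # four 12-hour intervals = 48 hours at full speed
    avg = update_rolling_average(avg, actual=12.0, expected=12.0)

print(round(avg), round(scaled_days(4.0, avg), 1))
```

With these made-up settings the average is still well below 1000 after 48 hours of full-speed running, so an estimate scaled by it stays pessimistic for several more days.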

The good news is that the estimated completion dates aren't too important to the primenet server. They affect when your next work unit is queued up.

Last fiddled with by Prime95 on 2015-10-13 at 21:17
Old 2015-10-13, 22:13   #4
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

4424₁₀ Posts

Quote:
Originally Posted by Prime95
The ETA value you posted is based on the 8.061 ms/iter average over the last few thousand iterations.

The expected completion dates sent to the server are based on a weighted moving average of your computer's actual performance vs. the expected performance over many 12-hour intervals. If your computer has been off or doing other work at any time during the last several days, then it may take some time for the moving average to get back to where it should be for a computer that is on 24x7.

The good news is that the estimated completion dates aren't too important to the primenet server. They affect when your next work unit is queued up.
I have an odd case: I have a 4-core CPU that is a very impressive i7-4770 but is so RAM constrained that if I run even 2 cores on LL/PC/PM1/ECM its efficiency drops by almost 50%. So I run LL on 2 cores and TF on 2 other cores. Running TF on the other 2 cores is not screaming fast either but it does not impact the 2 cores doing LL and I just hate wasting cores.

Here are some specs:
Code:
Model	Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
Features	4 core, hyperthreaded, Prefetch,SSE,SSE2,SSE4,AVX,AVX2,FMA,
Speed	3.392 GHz (24.503 GHz P4 effective equivalent)
Computer Memory	4008 MB   configured usage 800 MB day / 800 MB night
CPU rolling average	2083 / 1000 (208%)   Execution priority 1 (default)
It has been running with this setup for much of the past year. An LL test in the 60M range takes about 22 days, for a daily throughput of about 11.5 GHz-days/day for these 2 cores; the 2 cores doing TF have a throughput of about 9.5 GHz-days/day.

However to this day, a year later, it continues to tell me it will complete a 60M LL in 8 days (not 22) on the PrimeNet web assignments page.
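
The size of the mismatch can be put in numbers. This Python sketch uses only the figures in this post; the last step assumes the estimate is inversely proportional to the rolling average, which is a guess about how the scaling works rather than something confirmed in the thread.

```python
# Figures from this post.
reported_days = 8        # what the PrimeNet assignments page shows
observed_days = 22       # how long a 60M LL test actually takes here
rolling_average = 2083   # from the "CPU rolling average" line (1000 = nominal)

# How far off the reported estimate is.
overestimate = observed_days / reported_days

# Assumption: estimate is proportional to 1 / rolling_average, so this is the
# value that would have produced a 22-day estimate.
implied_average = rolling_average * reported_days / observed_days

print(f"estimate is off by a factor of {overestimate:.2f}")
print(f"rolling average that would match reality: about {implied_average:.0f}")
```

If that assumption holds, the LL workers behave like a machine rated somewhat below nominal, while the machine-wide figure says it is running at more than twice nominal.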
Old 2015-10-14, 00:06   #5
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·43·83 Posts

Quote:
Originally Posted by petrw1
However to this day, a year later, it continues to tell me it will complete a 60M LL in 8 days (not 22) on the PrimeNet web assignments page.
The moving average code certainly has its deficiencies.
Old 2015-10-14, 14:44   #6
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

1000101001000₂ Posts

Quote:
Originally Posted by Prime95
The moving average code certainly has its deficiencies.

It works great for every other PC... something about this one is confusing it.
Old 2015-10-14, 15:51   #7
Madpoo
Serpentine Vermin Jar
 
Jul 2014

29×113 Posts

Quote:
Originally Posted by petrw1
It works great for every other PC... something about this one is confusing it.

It's probably the different work types you're doing. Like, would it base the rolling average on the LL work, or the TF work, or some mixture of both?

It sounds like it's seeing the TF workers and it may be going a bit faster than expected for that system, so that's affecting the estimations for the LL work also.

In the end though, unless it could somehow maintain a separate rolling average for each worker thread instead of an aggregate, my advice is to just ignore it and do your own estimations on how long things will take.
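
A toy Python sketch of the difference between an aggregate and a per-worker average. The per-worker ratios and the nominal duration below are made up, chosen only so that the output roughly reproduces the 8-day vs. 22-day gap described above; they are not measurements, and `estimate_days` is a hypothetical helper, not anything in Prime95.

```python
from statistics import mean

# Made-up speed ratios (actual / expected) per worker; NOT measured values.
worker_ratio = {"LL-1": 0.76, "LL-2": 0.76, "TF-1": 3.40, "TF-2": 3.40}

# Hypothetical nominal duration of a 60M LL test on this machine, in days.
nominal_ll_days = 16.7


def estimate_days(nominal_days, ratio):
    """Assumed scaling: faster-than-expected workers finish sooner."""
    return nominal_days / ratio


aggregate = mean(worker_ratio.values())  # one figure for the whole machine

print("aggregate estimate:", round(estimate_days(nominal_ll_days, aggregate), 1))
for name in ("LL-1", "LL-2"):
    per_worker = estimate_days(nominal_ll_days, worker_ratio[name])
    print(name, "per-worker estimate:", round(per_worker, 1))
```

In this toy setup the fast TF workers pull the aggregate up, so the LL estimate comes out far too short, while a per-worker figure lands close to what is actually observed.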
Old 2015-10-15, 22:59   #8
fivemack
(loop (#_fork))
 
Feb 2006
Cambridge, England

18B2₁₆ Posts

So, I decided that probably it was more sensible to run 12 double-checks, one per thread, rather than one double-check over 12 threads.

I edited local.txt to have WorkerThreads=12 and ThreadsPerTest=1, rather than the other way around, and restarted mprime.

It started collecting trial-factor-to-67-bits jobs for numbers around 212.383 million, fifteen for each of the eleven threads that weren't working on the single double-check that I had assigned. Each of these jobs seems to take about 55 minutes; presumably they'd take a few minutes on a GPU, so I can't see why I'm doing them at all.

This is odd, because I have WorkPreference=101 which I thought meant 'only give me double-checks'.

Last fiddled with by fivemack on 2015-10-15 at 22:59
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.