mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Lounge (https://www.mersenneforum.org/forumdisplay.php?f=7)
-   -   Prime95 Stops Mid-Test, Starts New One (https://www.mersenneforum.org/showthread.php?t=10482)

jinydu 2008-07-15 15:53

Prime95 Stops Mid-Test, Starts New One
 
A dreaded bug appears to have reared its ugly head again:

[CODE]39405757 70 21461505 211.6 11.5 71.5 11-Jul-08 02:10 17-Dec-07 00:57 C040FD929 2659 v19/v20
40792421 69 16308479 59.0 25.0 61.0 15-Jul-08 15:42 17-May-08 15:34 C040FD929 2659 v19/v20
[/CODE]

Apparently, on July 11, a certain instance of Prime95 stopped working on M39405757 and since then has instead been working on M40792421. There's probably more to the story than that though, as it couldn't possibly have finished over 16 million iterations in just 4 days. I don't know what happened to the backup file for M39405757, or even whether it still exists. Note that I always set Prime95 to contact the server daily, so that I can detect problems quickly.
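The "16 million iterations in just 4 days" reasoning can be checked with a little arithmetic. The sketch below is only illustrative: it assumes the third field of each status line is the current iteration count, the first date/time pair is the last update, and the second pair is the assignment date. Those field meanings are a guess from the layout, not something stated in this thread.

```python
from datetime import datetime

# Month abbreviations are mapped by hand so parsing does not depend on the locale.
MONTHS = {m: i + 1 for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split())}

def parse_stamp(date_text, time_text):
    """Parse a '11-Jul-08' date and a '02:10' time into a datetime."""
    day, mon, year = date_text.split("-")
    hour, minute = time_text.split(":")
    return datetime(2000 + int(year), MONTHS[mon], int(day), int(hour), int(minute))

def parse_status_line(line):
    """Split one status line; the field meanings are assumptions."""
    f = line.split()
    return {
        "exponent": int(f[0]),
        "iterations": int(f[2]),                 # assumed: current iteration
        "last_update": parse_stamp(f[6], f[7]),  # assumed: last check-in
        "assigned": parse_stamp(f[8], f[9]),     # assumed: assignment date
    }

old = parse_status_line(
    "39405757 70 21461505 211.6 11.5 71.5 11-Jul-08 02:10 17-Dec-07 00:57 C040FD929 2659 v19/v20")
new = parse_status_line(
    "40792421 69 16308479 59.0 25.0 61.0 15-Jul-08 15:42 17-May-08 15:34 C040FD929 2659 v19/v20")

# Time between the two last-update stamps, and the rate needed to reach
# the second exponent's iteration count from zero within that window.
elapsed = (new["last_update"] - old["last_update"]).total_seconds()
rate_needed = new["iterations"] / elapsed
print(f"{elapsed / 3600:.1f} hours between updates")
print(f"{rate_needed:.1f} iterations/second needed to start from zero")
```

Under those assumptions the second test would have needed roughly 41 iterations per second for about 109.5 hours. The second line also carries a 17-May-08 date, so that exponent may well have been in progress long before July 11.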

This is really starting to get on my nerves. I've already suffered wasted CPU years from this sort of problem before (a computer stopping work on an exponent before it has finished and starting a new one). In the past, I tolerated the problem; I just let the old exponent expire and accepted the CPU work as lost. But my patience is wearing thin.

Any suggestions?

Thanks

S485122 2008-07-15 19:07

Is there anything in your prime.log or results files? You could increase the rate at which results files are written via the Options / Preferences menu, and you could use the following options:[code]You can have the program generate save files every n iterations. The files will
have a .XXX extension where XXX equals the current iteration divided by n.
In prime.ini enter:
InterimFiles=n

You can have the program output residues every n iterations. The default value
is the InterimFiles value. In prime.ini enter:
InterimResidues=n[/code]
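As a rough illustration of adding those two lines, here is a minimal sketch that sets a key in a section-less key=value file. The helper name `set_option` and the value 1000000 are made up for the example; pick whatever n suits you, and work on a copy of prime.ini rather than the live file.

```python
def set_option(ini_text, key, value):
    """Return ini_text with key=value set, replacing an existing entry if present."""
    lines = ini_text.splitlines()
    entry = f"{key}={value}"
    for i, line in enumerate(lines):
        # Compare only the part before the first '=' so values are ignored.
        if line.split("=", 1)[0].strip().lower() == key.lower():
            lines[i] = entry
            break
    else:
        # Key not found anywhere: append it at the end.
        lines.append(entry)
    return "\n".join(lines) + "\n"

sample = "UsePrimenet=1\nDiskWriteTime=30\n"
updated = set_option(sample, "InterimFiles", 1000000)      # hypothetical n
updated = set_option(updated, "InterimResidues", 1000000)  # hypothetical n
print(updated, end="")
```

Plain string handling is used instead of `configparser` because the files posted in this thread have no `[section]` headers, which `configparser` requires by default.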
In this specific case anyway, querying the V5 server gives the following result:[code]Exponent User name Computer name Residue Error code (if any) Date found
39405757 Jin Y Du C040FD929 753D0A98BE5DC5__[/code]So I suppose you got the credit and did not lose anything...

Jacob

Brian-E 2008-07-15 20:52

To add to what Jacob says:
The status file which I collected from the v4 server on 10 July 2008 gave the following information:

[code]39405757 70 18566145 206.6 -22.3 37.7 02-Jun-08 08:48 17-Dec-07 00:57 jinydu C040FD929[/code]last updated on [B]2 June[/B]. So it seems that your machine was [B]not[/B] updating its information every day for some reason.

It seems that, as Jacob indicates, your test of M39405757 did in fact finish. It does seem strange though that the status which you give from the update of 11 July indicates a progress of only 21+ million iterations.

jinydu 2008-07-16 10:23

Strangely enough, the prime and results files are very short and contain only one entry each, both from 2007. But I know that many exponents have come and gone since then.

Also, M39405757 is on the worktodo file.

One other thing is that this is a dual-core computer; so I am running two instances of Prime95. However, I only see one prime and one results file.

Thanks

S485122 2008-07-16 16:26

Which version of the program are you running?
Can you check in which directory they run (under Windows, see Services, 'Path to executable')?
Could you post or PM your configuration files (after removing password and e-mail address)?

Jacob

robo_mojo 2008-07-16 16:31

permissions

jinydu 2008-07-18 10:05

I'm running version 24.14.

Not so sure what you mean by your second question; there is a folder C:/windows, but no sub-folder called Services in there. Also, the Task Manager does have a tab called Services, but Prime95 is not listed there.

Here is the prime file:

Windows95Service=0
OldUserID=
OldUserPWD=
UserPWD=***************
UserName=Jin Du
UserEmailAddr=****************
Newsletters=0
UserID=jinydu
AskedAboutMemory=1
UsePrimenet=1
DialUp=0
DaysOfWork=5
WorkPreference=2
OutputIterations=1
ResultsFileIterations=999999999
DiskWriteTime=30
NetworkRetryTime=2
NetworkRetryTime2=60
DaysBetweenCheckins=28
TwoBackupFiles=1
SilentVictory=0

... and the local file:

OldCpuType=12
OldCpuSpeed=2660
CPUHours=24
DayMemory=300
NightMemory=300
DayStartTime=450
DayEndTime=1410
ComputerID=C040FD929
LastEndDatesSent=1185395015
RollingStartTime=0
Affinity=0
SelfTest2048Passed=1
RunOnBattery=1

And finally, here is the results.txt file. As you can see, it definitely can't be right:

[Thu Jul 26 04:23:01 2007]
UID: jinydu, User: Jin Du, *****************
[Thu Jul 26 05:25:44 2007]
Self-test 2048K passed!

(******** means that I am omitting the email address or password)
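One detail in the prime file above stands out: it contains DaysBetweenCheckins=28, which does not obviously match the earlier statement that Prime95 is set to contact the server daily. The sketch below just reads a few of the posted lines and reports that key. It assumes, from the name alone, that the key controls the check-in interval, and it cannot tell which instance actually reads this file.

```python
def parse_flat_ini(text):
    """Parse section-less key=value lines into a dict (later keys win)."""
    options = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            options[key.strip()] = value.strip()
    return options

# A few lines copied from the posted prime file.
posted = """\
UsePrimenet=1
DaysOfWork=5
DiskWriteTime=30
DaysBetweenCheckins=28
TwoBackupFiles=1
"""

options = parse_flat_ini(posted)
days = int(options["DaysBetweenCheckins"])
print(f"DaysBetweenCheckins={days}")
if days > 1:
    print("This file does not ask for a daily check-in.")
```

If the instances are reading a different copy of the file somewhere else on the disk, the value in that copy is the one that matters.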

S485122 2008-07-18 16:21

I was a bit cryptic, to say the least. I meant "If you are running Windows and running Prime95 as a service, go to the Services applet (via Manage or via Control Panel, Administrative Tools, Services). Then look at the properties of the Prime95 service(s); one non-editable field is 'Path to executable'."

How are you running your two instances? In different directories? By setting affinity? But then you should have a prim0000.ini and a prim0001.ini file.

Jacob

jinydu 2008-07-18 16:38

When I first set up Prime95 on the computer (around a year ago), I set up one instance normally (i.e. it just appeared after the installation finished). I got a second instance running by going to Command Prompt, typing cd c:\Program Files\Prime95 and then typing prime95 -A1. Finally, I opened the Advanced tab on the first instance, went to Affinity, unchecked "Let program run on any CPU", and set it to run on CPU 0; I then did the same with the second instance (except I set it to run on CPU 1 instead).

I don't really know what is going on when I do that (other than one instance affects only one core while the other instance only affects the other core); I was just following the instructions from one of the more obscure sections of the Prime 95 readme.

S485122 2008-07-18 16:54

You should have used two different directories... When you start Prime95 -a1 or Prime95 -a0 the configuration files are loca0000.ini and loca0001.ini, instead of local.ini. The same goes for prime.ini (prim0000.ini and prim0001.ini), prime.log (prim0000.log and prim0001.log), results.txt (resu0000.txt and resu0001.txt) and worktodo.ini (work0000.ini and work0001.ini). You should try to reconstruct the files in two different directories. What happens now is that both instances mix up their work, reporting time, results... Make a backup copy of the directory first to be sure you do not lose too much work if you make a mistake while setting up the two directories.

Jacob

jinydu 2008-09-05 16:45

Sorry to resurrect this thread; but I have three facts that are together quite amusing.

1) According to Primenet v5, I already finished a first-time LL test on M39405757.

2) M39405757 still appears on my Primenet v4 Individual account report under "Exponents Assigned".

3) The relevant computer has Windows Vista and 5 user accounts (let's call them a, b, c, d and e). If user b is not logged on and I log into a, two instances of Prime95 automatically start up (neither of which is testing M39405757). If user b is logged on and I log into a, nothing happens. If user a is not logged on and I log into b, one instance of Prime95 automatically starts up and tests M39405757. If user a is logged on and I log into b, nothing happens. Nothing ever happens with the other 3 users, as far as I know.

Interesting: the two user accounts prevent each other from running Prime95 simultaneously, but seem otherwise unaware of each other. The workaround seems easy enough though; I just make sure to log in to a first every time I turn on the computer, and not log off until I turn off the computer.

cheesehead 2008-09-05 17:14

[quote=S485122;137979]You should have used two different directories...[/quote]... or one could do what I do, using only one directory:

Start Prime95 without "-a0", but with the affinity set to 0 ("opened the Advanced tab on the first instance, went to Affinity, unchecked 'Let program run on any CPU', set it to run on CPU 0"),

then start a second instance of Prime95 with "-a1".

Both of mine coexist quite happily using the same directory.

(Note: I manually start the two instances after booting the system.)

[quote]When you start Prime95 -a1 or Prime95 -a0 the configuration files are loca0000.ini and loca0001.ini, instead of local.ini. The same goes for prime.ini (prim0000.ini and prim0001.ini), prime.log (prim0000.log and prim0001.log), results.txt (resu0000.txt and resu0001.txt) and worktodo.ini (work0000.ini and work0001.ini).[/quote]But if one does it my way, there are no "...0000" files. The instance running on CPU 0 (but without "-a0", remember) simply uses the same file names as default single-CPU Prime95, namely local.ini, prime.ini, prime.log, results.txt, and worktodo.ini. So the instance on CPU 0 would have just continued the work started by the initial no-affinity setup jinydu had.

The "-a1" instance does use the "...0001" file names. Also, its save file names all have ".001" appended. If one were to have the "-a1" instance take over a test started on the CPU 0 instance, one would need to add ".001" to the existing save file name for that exponent, or else the "-a1" instance would start over from the beginning to test that exponent, unaware of what the CPU 0 instance had already accomplished.
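The naming pattern described in the last few posts can be written down as a small sketch. It only generalises from the examples given in this thread (first four letters of the name plus a four-digit instance number, and ".001" on save files for "-a1"); the function names are hypothetical and the behaviour for other instance numbers is an assumption.

```python
DEFAULT_FILES = ["local.ini", "prime.ini", "prime.log", "results.txt", "worktodo.ini"]

def instance_file_name(default_name, instance=None):
    """Name used by an instance started with -a<instance>; None means no -a switch."""
    if instance is None:
        return default_name
    stem, dot, ext = default_name.partition(".")
    # First four letters of the stem, then the instance number padded to 4 digits.
    return f"{stem[:4]}{instance:04d}{dot}{ext}"

def instance_save_name(save_name, instance=None):
    """Save-file name for an instance; the 3-digit suffix generalises the '.001' case."""
    return save_name if instance is None else f"{save_name}.{instance:03d}"

for name in DEFAULT_FILES:
    print(name, "->", instance_file_name(name, 1))

# p2C09241 is a save-file name that shows up later in this thread.
print(instance_save_name("p2C09241", 1))
```

So handing a test from the plain instance to the "-a1" instance would, on this reading, mean renaming its save file with the ".001" suffix.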

Note again: I manually start the two instances, so I don't actually know if anything would go wrong in my setup if I had them start automatically instead.

jinydu 2008-09-06 01:04

[QUOTE=cheesehead;140984]... or one could do what I do, using only one directory:

Start Prime95 without "-a0", but with the affinity set to 0 ("opened the Advanced tab on the first instance, went to Affinity, unchecked 'Let program run on any CPU', set it to run on CPU 0"),

then start a second instance of Prime95 with "-a1".[/QUOTE]

Funny. That's precisely what I do, except that the instances start automatically.

[QUOTE=cheesehead;140984]
Note again: I manually start the two instances, so I don't actually know if anything would go wrong in my setup if I had them start automatically instead.[/QUOTE]

Apparently something does. For instance, there is only one results file; and it doesn't have any results since the first self-test, even though both instances have completed plenty of exponents. Of course, the data that should be in the files must be on the hard drive somewhere, or else the instances of Prime95 wouldn't work. But I haven't been able to find them, even using Windows Search. To be sure, I haven't tried that hard because the instances are generally working fine.

jinydu 2008-09-06 07:32

Is there a way to force an instance to reveal where in the hard drive it is writing its save files, recording its results, etc.? Because it definitely isn't in C:\Program Files\Prime95.

S485122 2008-09-06 07:44

Jinydu,

What you must do is go to "Control Panel, Administrative Tools, Services" or to "My Computer, Manage" and choose the "Services and Applications, Services" line. Then ask for the properties of the Prime95 services that are listed. The field "Path to executable" should provide the answers to your questions.

If you want, I can provide you with a .reg file that starts all necessary instances, and the accompanying local.ini and prime.ini files...

Which OS are you using, XP or Vista?

Jacob

jinydu 2008-09-06 08:16

That computer is using Vista. I couldn't find Prime95 in the list of services, despite the fact that 2 instances are running.

It's funny... There are 3 dual-core computers on which I am running 2 instances of Prime95. The only one on which the files are clearly being saved correctly (with one file called results and another called resu0001, with both files where I expect them to be) is on the computer running XP.

EDIT: I just found all relevant files for one of the other two computers. They were hiding in an obscure Hidden Folder. C:\Users\user\AppData\Local\VirtualStore\Program Files\Prime95. Weird; I definitely didn't type that in when I was installing Prime95.

S485122 2008-09-06 09:24

[QUOTE=jinydu;141099]That computer is using Vista.

EDIT: I just found all relevant files for one of the other two computers. They were hiding in an obscure Hidden Folder. C:\Users\user\AppData\Local\VirtualStore\Program Files\Prime95. Weird; I definitely didn't type that in when I was installing Prime95.[/QUOTE]Like I said in another post (and perhaps another thread): with Vista it is best to run Prime95 from a directory inside Documents. Vista does not allow users to write in \Program Files; instead it uses two different paths: \ProgramData and the one you found. That last one can also contain \Users\<user>\AppData\Local\VirtualStore\Windows to accommodate programs like SAP that still want to write their .ini files to the Windows directory. I suppose those features are documented, but not for end-users.
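To make the redirection concrete, here is a minimal sketch that maps a protected path to the VirtualStore location, using the two paths quoted in this thread as the pattern. The function name and the `users_root` default are assumptions for illustration; it builds the path as a string and does not ask Windows where a file really went.

```python
import ntpath  # Windows path rules, usable on any platform

def virtual_store_path(original, user, users_root=r"C:\Users"):
    """Guess where a redirected write to `original` would land for `user`."""
    # Drop the drive letter and leading separator, keep the rest of the path.
    _drive, rest = ntpath.splitdrive(original)
    return ntpath.join(users_root, user, "AppData", "Local", "VirtualStore",
                       rest.lstrip("\\/"))

print(virtual_store_path(r"C:\Program Files\Prime95", "user"))
print(virtual_store_path(r"C:\Windows", "user"))
```

Since each user account gets its own VirtualStore folder under this pattern, two accounts starting Prime95 from the same \Program Files directory could plausibly end up with separate sets of files.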

You should try an "Advanced Search" for "prim*.ini" and/or "work*.ini" where you ask the search to look at non-indexed files and locations, including hidden directories. (Vista is really annoying for searches that are not those imagined by Microsoft, i.e. documents in the user profile and mail stores.)

Since Prime95 starts automatically on the Vista machines and since it is not a service, it must be started because it is in one of your Startup directories (the common one or the user one, \Users\<user>\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup). You can also use msconfig or Windows Defender to look at the startup programs.

You might consider upgrading your computers to Windows XP : [url=http://dotnet.org.za/codingsanity/archive/2007/12/14/review-windows-xp.aspx]Review: Windows XP[/url] :-)

Jacob

jinydu 2008-09-06 13:47

Thanks for the advice. I think I've got a solid handle on what's going on with two of the computers.

Now for more weirdness... This is about my laptop (one of the two computers that runs Vista).

I run two instances of Prime95. The one that uses Core 0 is testing M37209241 while the one that uses Core 1 is testing M21164029. Instance 1 appears to work just fine; it starts automatically at boot-up and picks up right where it left off.

Instance 0 on the other hand has problems. It does not start up automatically at boot-up; I have to use the Command Prompt to open it, despite the fact that I checked "Start at Bootup" under Options. More importantly, it always starts at 11.57% every time I open it; and I'm not sure why (I suspect it lost the ability to overwrite save files. No idea why it would do so at 11.57%).

I've attached a picture of the Prime95 folder. The fact that the file p2C09241 hasn't been updated since Aug. 15 does look suspicious; but I'm not sure what to do about it.

Thanks again

Prime95 2008-09-06 15:44

Start at Bootup does not work under Vista. In the name of security, MS no longer supports services with a GUI.

S485122 2008-09-06 18:11

Like I told you, with Vista just move all files to a subdirectory of Documents.

If you install Prime95 on a Vista machine with the installer, change the destination directory to a subfolder of Documents. You will have less trouble.

Jacob

jinydu 2008-09-07 06:09

I copy-pasted the Prime95 folder to My Documents; but I still have the same problems. However, I suspect that deleting everything and installing Prime95 in My Documents from scratch could work.

Before I try anything like that though... Is there a way to patch up the issue with Instance 0 without losing 11.57% of the M37209241 test? I think the Prime95 instance may be writing to one file and reading from another, causing it to always start from the same place.

S485122 2008-09-07 07:51

Before moving the program around you should have stopped it and exited.

Then you do an advanced search (as described in a previous post) for all possible configuration files (prim*.ini, loca*.ini, resu*.txt and work*.ini) and save files (p2C09241 and q2C09241 and even a possible r2C09241 if there are problems writing files in the Prime95 working directory. A search for ?2C09241 should do the trick.)
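If the Vista search dialog is uncooperative, the same hunt can be done with a short script. This is only a sketch: the patterns are the ones listed above, with a trailing * added to the save-file pattern so that a suffix such as ".001" is also caught, and the demo runs against a throwaway folder rather than a real disk.

```python
import fnmatch
import os
import tempfile

CONFIG_PATTERNS = ["prim*.ini", "loca*.ini", "resu*.txt", "work*.ini"]
SAVE_PATTERNS = ["?2C09241*"]  # trailing * also catches a suffix such as ".001"

def find_files(root, patterns):
    """Walk `root` and return sorted paths whose file name matches any pattern."""
    hits = []
    for folder, _dirs, files in os.walk(root):  # hidden folders are walked too
        for name in files:
            if any(fnmatch.fnmatch(name, p) for p in patterns):
                hits.append(os.path.join(folder, name))
    return sorted(hits)

# Demo on a temporary folder so nothing on the real disk is touched.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "VirtualStore", "Prime95")
    os.makedirs(nested)
    for name in ["prime.ini", "work0001.ini", "p2C09241", "readme.txt"]:
        open(os.path.join(nested, name), "w").close()
    found = [os.path.basename(p)
             for p in find_files(root, CONFIG_PATTERNS + SAVE_PATTERNS)]
    print(found)
```

To search for real, call `find_files` with a starting folder such as the user profile or the drive root; expect it to take a while and to skip nothing just because it is hidden.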

Now you must decide which of the save files and other files to keep. Best to move them to different subdirectories of your future Prime95 directory; be careful not to overwrite one with the other.

Decide which configuration files you want to use.

Choose the save files you are going to use based on their modification date. Another way to do it is to simply try each file in turn: COPY (not move) it from one of the directories where you put them into the Prime95 directory and start Prime95; you will immediately see how far the test had progressed when that particular file was saved. Use the menu File / Stop / Continue to stop and restart Prime95; there is no need to exit the program.
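Picking by modification date is easy to script once the candidate files are collected. The sketch below ranks paths newest first; the file names come from this thread, but the timestamps in the demo are made up purely to show the ordering.

```python
import os
import tempfile

def newest_first(paths):
    """Return the given paths sorted by modification time, newest first."""
    return sorted(paths, key=os.path.getmtime, reverse=True)

# Demo with two empty files and invented modification times.
with tempfile.TemporaryDirectory() as root:
    older = os.path.join(root, "p2C09241")
    newer = os.path.join(root, "q2C09241")
    for path, mtime in [(older, 1_218_800_000), (newer, 1_220_700_000)]:
        open(path, "w").close()
        os.utime(path, (mtime, mtime))  # set access and modification time
    ranked = [os.path.basename(p) for p in newest_first([older, newer])]
    print(ranked)
```

The newest file is not guaranteed to be the furthest along if two instances were writing to different places, so trying each copy in turn, as described above, remains the more reliable check.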

Jacob

jinydu 2008-09-07 14:21

I backed up the files, uninstalled Prime95, and reinstalled it in 2 separate folders. Now I think I've almost solved the problem.

A new problem has appeared though unfortunately. Instance 0 is incapable of contacting the server. I keep getting Error 2250: Server Unavailable, even though the server is online and the other instance has no trouble contacting the server.

S485122 2008-09-07 14:51

primenet.ini ? Firewall ? Which user account ? Guest may not have the rights to get to Internet ?...

Jacob

jinydu 2008-09-07 15:33

I don't think it can be a firewall or the wrong user account. The other instance is running side-by-side with it and is able to communicate just fine.

Here is the local file:

OldCpuType=12
OldCpuSpeed=2493
CPUHours=10
DayMemory=1024
NightMemory=1024
DayStartTime=450
DayEndTime=1410
ComputerID=CECA33825
LastEndDatesSent=1218864918
RollingStartTime=1220797504
RollingAverage=737
Affinity=0
SelfTest2048Passed=1

Here is the prime.ini file:

Windows95Service=1
OldUserID=
OldUserPWD=
UserPWD=*******(starred out)********
UserName=Jin Du
UserEmailAddr=*******(starred out)********
Newsletters=1
UserID=jinydu
AskedAboutMemory=1
UsePrimenet=1
DialUp=0
DaysOfWork=5
WorkPreference=0
Left=295
Top=100
Right=1255
Bottom=659
OutputIterations=1
ResultsFileIterations=999999999
DiskWriteTime=30
NetworkRetryTime=2
NetworkRetryTime2=60
DaysBetweenCheckins=1
TwoBackupFiles=1
SilentVictory=0
Advanced=1
ManualComm=0

mdettweiler 2008-09-08 02:35

[quote=jinydu;141288]I don't think it can be a firewall or the wrong user account. The other instance is running side-by-side with it and is able to communicate just fine.[/quote]
Are you sure that Windows Firewall isn't blocking access to the new copy of Prime95 since it's in an "unrecognized" location? (That is, maybe it had already been programmed to allow the old version access to the internet, but not the new one?)


All times are UTC.