Monitoring multiple client machines?
How are you tracking all of your client machines to make sure they are up and running properly?
Has anyone cobbled together a simple monitoring tool that you can glance at and see that all of your client machines are up and running properly? I'm running Win7 and all of my clients are on the same network. |
Not sure how "automated" you want.
I simply have them all set to send in new completion dates daily. Then I check my assignment list on the V5 server to see if any are "late". On a related topic, I have written VB programs that parse data from a couple of V5 web pages, so I guess one could write a program to parse the assignment details and look for updates more than 24 hours old. |
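The "updates more than 24 hours old" check could be sketched roughly as follows. This is only a minimal sketch: it assumes the assignment details have already been parsed into machine name / last update time pairs (the page scraping itself is not shown), and the names `find_late` and `max_age` are made up for illustration.

```python
from datetime import datetime, timedelta

def find_late(last_updates, now, max_age=timedelta(hours=24)):
    """Return the names of machines whose last update is older than max_age.

    last_updates: dict mapping machine name -> datetime of its last update
                  (assumed to have been parsed from the assignment list).
    """
    return sorted(name for name, seen in last_updates.items()
                  if now - seen > max_age)

if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    now = datetime(2010, 1, 10, 12, 0)
    updates = {
        "box1": datetime(2010, 1, 10, 3, 0),   # 9 hours ago
        "box2": datetime(2010, 1, 8, 22, 0),   # 38 hours ago
    }
    print(find_late(updates, now))
```

Passing `now` in explicitly keeps the check easy to test; a scheduled job would pass the current time.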
For a small number of machines, as in my case, I manually run mprime over ssh in a screen session, and run conky over X11 forwarding to watch CPU & temps.
For a larger number of machines, and especially if you don't care about watching progress & stats, you'd probably want something a little better. Gentoo has an initscript for mprime which I'll probably modify eventually to automatically connect to my workstation and tell me about progress. If you just want to check if it's up, you might just run: ssh x.x.x.x ps -e | grep "mprime" |
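For more than a couple of machines, the same ssh check could be looped over a list of hosts. The following is a rough sketch, not a finished tool: it assumes key-based ssh login is already set up (hence `BatchMode=yes`), that the process is named exactly `mprime`, and that host names are given on the command line. The function names are hypothetical.

```python
import subprocess
import sys

def mprime_running(ps_output, name="mprime"):
    """Return True if any line of `ps -e -o comm=` output equals name."""
    return any(line.strip() == name for line in ps_output.splitlines())

def check_host(host, timeout=10):
    """Run ps on host over ssh.

    Returns True/False for running/not running, or None if the host
    could not be reached or ssh itself failed.
    """
    try:
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", host, "ps", "-e", "-o", "comm="],
            capture_output=True, text=True, timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    if result.returncode != 0:
        return None
    return mprime_running(result.stdout)

if __name__ == "__main__":
    labels = {True: "up", False: "mprime NOT running", None: "unreachable"}
    # Hosts come from the command line, e.g.: python check.py boxA boxB
    for host in sys.argv[1:]:
        print(f"{host}: {labels[check_host(host)]}")
```

Matching the whole process name avoids the usual false positive where a plain grep also matches other commands containing "mprime".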
If you want to be able to run different jobs from different machines at different times, yet keep only one instance of your data files, mount the remote machine over nfs and symlink the datafiles into the gimps directory. That should avoid having to scp the files back and forth.
Unfortunately I don't think there is support for multiple worktodo files to organize different sets of tasks, but that's easy to manage anyway. It would be useful to have something which does the following:
[list]
[*]If a gimps directory on a server is mounted by fstab on a remote machine, reduce the number of work threads on the server, hand the tail of the work threads to the client machine and start processing them there.
[*]If the client machine is offline, detect this, and increase the work threads on the server again to take over. The overall effect is that only one machine is working on a job at a time.
[/list]
This would be useful for people with a few computers with multicore processors that aren't always online, and a server which is. Reducing the threads on the server makes use of all the available resources: even though the server's throughput drops, we're still working on the same total number of jobs. |
[QUOTE=Smorg;200721]I just got another idea that I'll probably use.
[/QUOTE] Great idea. However, if there is network storage that is accessible to all machines, you could also use this feature of Prime95.
[quote=old help file]In prime.ini you can force the program to use different filenames for 6 files. This is in response to a user that is running security software that prevents writing to any file with a .ini extension. There may well be other uses. You can also change the working directory (identical to the -W command line argument).
prime.ini=your_filename
local.ini=your_filename
worktodo.ini=your_filename
prime.log=your_filename
prime.spl=your_filename
results.txt=your_filename
WorkingDir=your_directory_name[/quote]
You can then set up each machine with a unique folder/directory: \\server\my_user_name\personal\prime95\machine_name_01 etc. Then you can manage the machines and check the various files. I think that I need to do this on my future borgim boxen. |
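With one folder per machine on shared storage, "checking the various files" could be reduced to looking at file modification times. The sketch below is only an illustration under several assumptions: every subfolder of the share root is one machine, `results.txt` is the file to watch, and 24 hours is a sensible threshold. Since results.txt may only change when a result is produced, a different file or a longer window may fit better. The function name is hypothetical.

```python
import os
import sys
import time

def stale_machines(root, filename="results.txt", max_age_hours=24, now=None):
    """Return sorted machine folder names under root whose `filename`
    is missing or was last modified more than max_age_hours ago."""
    now = time.time() if now is None else now
    stale = []
    for entry in sorted(os.listdir(root)):
        folder = os.path.join(root, entry)
        if not os.path.isdir(folder):
            continue  # ignore stray files in the share root
        path = os.path.join(folder, filename)
        try:
            age_hours = (now - os.path.getmtime(path)) / 3600
        except OSError:
            stale.append(entry)  # file missing or unreadable
            continue
        if age_hours > max_age_hours:
            stale.append(entry)
    return stale

if __name__ == "__main__":
    # Share root is given on the command line, e.g. the prime95 folder.
    for root in sys.argv[1:]:
        for name in stale_machines(root):
            print(f"{name}: no recent update")
```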
Another way might be to use unionfs or aufs in combination with nfs.
On the server, create /var/gimps/clientbox containing all datafiles and symlink the ones the server is working on into /var/gimps (the default location for mprime on Gentoo). Mount /var/gimps/clientbox via sshfs or nfs under /mnt/* on a client, then union mount that into /var/gimps on the client. What would be really interesting is a setup where a worktodo file gets union mounted on the server over the top of its normal worktodo file when a client comes online, so that the entire process works with no scripting or toggling of config files whatsoever. |