
Anyone managing an HPC cluster has probably wondered at some point about its overall performance and usage. How many jobs were completed last month, what was the average job duration, how long did jobs wait in the queue, how many CPU slots did they require? These are all good questions, with answers buried somewhere in your DRM’s accounting files.
If you are using Grid Engine, and assuming you have the usual “default cell” installation, the relevant file is $SGE_ROOT/default/common/accounting. The command that extracts information from this file is “qacct”. If you type “man qacct”, you will see that qacct produces summaries of wall-clock, CPU, and system time, grouped by categories such as hostname, queue name, owner name, etc., so there is a good chance that the information you are looking for is readily available. If, however, you need something that qacct does not provide, the accounting file is formatted for easy parsing. Each line in the file corresponds to one computing task, and each line holds more than 40 different accounting fields, separated by the ‘:’ character. The meaning of each field is documented in the man pages (“man accounting”), so getting the information you need with standard UNIX tools should not be difficult.
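As a rough illustration of how little parsing is needed, here is a minimal Python sketch that computes, per owner, the number of tasks, the average time spent waiting in the queue, and the average wall-clock time. The field positions (owner, submission_time, start_time, ru_wallclock) are assumptions based on the field order commonly listed in the accounting man page; check “man accounting” on your own installation before relying on them. The function name summarize is made up for this example.

```python
from collections import defaultdict

# Assumed zero-based positions of the ':'-separated fields.
# Verify against "man accounting" for your Grid Engine version.
OWNER = 3        # owner name
SUBMIT = 8       # submission_time (epoch)
START = 9        # start_time (epoch)
WALLCLOCK = 13   # ru_wallclock


def summarize(lines):
    """Return {owner: (tasks, mean_wait, mean_wallclock)} from accounting lines."""
    totals = defaultdict(lambda: [0, 0.0, 0.0])
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines (some versions write a '#' header).
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) <= WALLCLOCK:
            continue  # too short to be a record we can use
        submit = int(fields[SUBMIT])
        start = int(fields[START])
        if start == 0:
            continue  # task never started, so it has no meaningful wait time
        entry = totals[fields[OWNER]]
        entry[0] += 1
        entry[1] += start - submit
        entry[2] += float(fields[WALLCLOCK])
    return {
        owner: (count, wait / count, wall / count)
        for owner, (count, wait, wall) in totals.items()
    }
```

To run it against a real file, pass an open file handle, for example summarize(open(path)) with path pointing at your accounting file. The times come out in whatever unit your version writes; the sketch does not convert them.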