A watchdog utility to keep track of job execution

Bruno, R. (INFN)


Abstract: The current gLite and LCG middleware services do not offer easy access to the Worker Nodes (WN) that run users' jobs. This is a consequence of the middleware design itself, since it is not good practice to give users direct access to Grid resources. However, some jobs take a long time to complete, and users may need to check the files produced so far or estimate how much longer the job will take to finish. Both gLite and LCG offer several ways to keep track of job execution: interactive or checkpointable jobs, metadata, or information system commands and/or APIs. Unfortunately, none of these mechanisms is very user friendly; they require deep knowledge of the middleware and may also require several changes to existing code. A simple way to track job execution is a set of scripts, called watchdog, that is sent to the WN along with the job.
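The abstract does not describe the watchdog scripts themselves, so the following is only a minimal sketch of the general idea, written in Python for illustration. It assumes a wrapper that starts the job as a child process and periodically reports the elapsed time and the files present in the working directory. The names `snapshot`, `watch` and `report` are hypothetical and do not come from the original work, and the way a report would reach the user from the WN is left open as a callback.

```python
import os
import subprocess
import sys
import time


def snapshot(directory):
    """Return {filename: size_in_bytes} for the regular files in directory."""
    files = {}
    for entry in os.scandir(directory):
        if entry.is_file():
            files[entry.name] = entry.stat().st_size
    return files


def print_report(elapsed, files):
    """Default report (hypothetical): write one status line to stderr."""
    print(f"[{elapsed:.0f}s] {files}", file=sys.stderr)


def watch(command, workdir=".", interval=60.0, report=print_report):
    """Run command in workdir and report its progress until it exits.

    report(elapsed_seconds, files) is called every `interval` seconds while
    the job is running, and once more after it has finished.
    Returns the exit code of the job.
    """
    start = time.monotonic()
    job = subprocess.Popen(command, cwd=workdir)
    while True:
        try:
            # Block for at most `interval` seconds waiting for the job.
            code = job.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            # Job still running: emit an intermediate status report.
            report(time.monotonic() - start, snapshot(workdir))
            continue
        # Job finished: emit a final report and hand back its exit code.
        report(time.monotonic() - start, snapshot(workdir))
        return code
```

Under these assumptions, a job wrapper could call `watch(["./my_job.sh"], interval=300)`, where `./my_job.sh` is a placeholder for the user's executable. The job itself needs no code changes, which is the property the abstract emphasizes.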

Keyword(s): gLite ; grid services ; Watchdog


 Record created 2011-01-31, last modified 2011-01-31
