ROCKS clusters usually operate under the supervision of PBS, allowing users to receive exclusive node partitions for their jobs. On the other hand, Domus was originally conceived for shared clusters. It is possible, however, to operate Domus in a way that is compatible with the operation of PBS. In order to do so, one must keep in mind the following:
- Whichever cluster nodes PBS gives to the "user", the Domus monitorization services (see 3 - Monitorization Services) must already be running on them; the best way to ensure this is for the cluster administrator (or for the "domus" user, if empowered to do so) to deploy those services across the entire ROCKS cluster or, at least, on the nodes managed by PBS;
- A "user" may ask PBS for a node set on which to deploy a Domus (logical) cluster by running "qsub domus_qsub.sh", where "domus_qsub.sh" is a pseudo-job, used to grab a certain number of nodes, for a certain amount of time, to be used at will by Domus. Typically, the "user" will make a local copy of the script "~domus/.domus/bin/domus_qsub.sh" and modify it to suit their own needs; the relevant modifications include redefining:
- the number of cluster nodes (nodes=...)
- the number of processors per node (ppn=...)
- the duration of the time slot (in seconds) requested for the nodes (walltime=... and _WALLTIME=...)
- the path of the file that will hold the names of nodes selected by PBS (_PBS_NODES=...)
- If the "user" wants to operate a Domus cluster named "mycluster", the related configuration file (usually ~/.domus/etc/domus#mycluster#conf) should then be modified to match the requirements and results of the previous step:
- DOMUS_CLUSTER_INTERFACES_FILE_PATH is expected to have the same value as _PBS_NODES
- DOMUS_SUPERVISOR_LIFETIME and DOMUS_SERVICE_LIFETIME are expected to have a common value, and such value should be less than _WALLTIME
- If the Domus cluster is deactivated, the only way to reactivate it later is to have PBS give the same node set to the "user"; to ensure this, the "user" must modify the "domus_qsub.sh" script so that it asks for the nodes by name (the names are known from the previous _PBS_NODES file); this is done using a directive like "#PBS -l nodes=node1:node2:..."
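The consistency rules above, between the settings in the local copy of "domus_qsub.sh" and those in the Domus configuration file, can be checked before submitting the pseudo-job. The following is only a minimal sketch: the function name check_domus_pbs_settings is hypothetical and not part of Domus, the values are passed in by hand rather than parsed from either file, and the two lifetimes are assumed to be expressed in seconds, like _WALLTIME.

```shell
#!/bin/sh
# Hypothetical helper, not part of Domus: checks the three rules stated above.
# Arguments (assumed to be copied by hand from the two files):
#   $1  _PBS_NODES                          (from the local domus_qsub.sh)
#   $2  _WALLTIME, in seconds               (from the local domus_qsub.sh)
#   $3  DOMUS_CLUSTER_INTERFACES_FILE_PATH  (from the Domus configuration file)
#   $4  DOMUS_SUPERVISOR_LIFETIME           (assumed to be in seconds)
#   $5  DOMUS_SERVICE_LIFETIME              (assumed to be in seconds)
# Returns 0 if all rules hold, 1 otherwise; problems are reported on stderr.
check_domus_pbs_settings() {
    cdps_nodes=$1
    cdps_walltime=$2
    cdps_interfaces=$3
    cdps_supervisor=$4
    cdps_service=$5
    cdps_status=0

    # Rule 1: the interfaces file must be the node file written for PBS.
    if [ "$cdps_interfaces" != "$cdps_nodes" ]; then
        echo "DOMUS_CLUSTER_INTERFACES_FILE_PATH differs from _PBS_NODES" >&2
        cdps_status=1
    fi

    # Rule 2: both lifetimes must share a common value.
    if [ "$cdps_supervisor" -ne "$cdps_service" ]; then
        echo "DOMUS_SUPERVISOR_LIFETIME and DOMUS_SERVICE_LIFETIME differ" >&2
        cdps_status=1
    fi

    # Rule 3: the lifetimes must be strictly less than _WALLTIME.
    if [ "$cdps_supervisor" -ge "$cdps_walltime" ] ||
       [ "$cdps_service" -ge "$cdps_walltime" ]; then
        echo "lifetimes must be less than _WALLTIME" >&2
        cdps_status=1
    fi

    return "$cdps_status"
}
```

For instance, calling check_domus_pbs_settings with the same path as first and third argument, a _WALLTIME of 3600 and both lifetimes set to 3000 returns 0, whereas a lifetime of 3600 or more makes it return 1.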