TORQUE is an open-source resource manager providing control over batch jobs and distributed compute nodes. It is a community effort based on the original *PBS project and, with more than 1,200 patches, has incorporated significant advances in the areas of scalability, fault tolerance, and feature extensions contributed by NCSA, OSC, USC, the U.S. Dept of Energy, Sandia, PNNL, U of Buffalo, TeraGrid, and many other leading-edge HPC organizations.[1]
1. Install TORQUE packages
$ sudo yum install 'torque*' libtorque
2. Configure TORQUE on the server
$ cd /usr/share/doc/torque-2.1.10/
$ sudo vi torque.setup
Change the lines
qmgr -c 'create queue batch'
qmgr -c 'set queue batch queue_type = execution'
qmgr -c 'set queue batch started = true'
qmgr -c 'set queue batch enabled = true'
qmgr -c 'set queue batch resources_default.walltime = 1:00:00'
qmgr -c 'set queue batch resources_default.nodes = 1'
qmgr -c 'set server default_queue = batch'
to
qmgr -c 'create queue batch'
qmgr -c 'set queue batch queue_type = execution'
qmgr -c 'set queue batch started = true'
qmgr -c 'set queue batch enabled = true'
qmgr -c 'set queue batch resources_default.walltime = 72:00:00' # each job gets a default walltime limit of 72 hours
qmgr -c 'set queue batch resources_default.nodes = 1'
qmgr -c 'set queue batch max_running = 2' # at most two jobs from this queue run at the same time
qmgr -c 'set queue batch max_user_run = 5' # a single user can have at most five jobs running in this queue at once
qmgr -c 'set server default_queue = batch'
Then run the script, naming root as the TORQUE administrator:
$ sudo ./torque.setup root
3. Set up the server nodes
The default TORQUE configuration folder on Fedora 12 is /var/torque.
In that folder, create the file server_priv/nodes with one line per compute node, like this:
node01 np=2
Here node01 is the node's hostname, and np=2 means the node has 2 processors.
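For a single-machine setup, where the server and the compute node are the same host, the nodes entry can be generated rather than typed by hand. The following is a minimal sketch, not part of the TORQUE tooling: the function name nodes_entry and the output file nodes.sample are made up for illustration, and it assumes that the name printed by uname -n is the hostname the server should use for this node.

```shell
#!/bin/sh
# Print a server_priv/nodes entry for the local machine.
# nodes_entry and nodes.sample are hypothetical names used only in this sketch.
nodes_entry() {
    # uname -n: this machine's node name (assumed to match its TORQUE hostname)
    # getconf _NPROCESSORS_ONLN: number of processors currently online
    printf '%s np=%s\n' "$(uname -n)" "$(getconf _NPROCESSORS_ONLN)"
}

# Write the entry to a local file first so it can be reviewed.
nodes_entry > nodes.sample
cat nodes.sample
```

After checking the generated line, copy it into place, for example with sudo cp nodes.sample /var/torque/server_priv/nodes. Writing to a local file first avoids overwriting an existing nodes file by accident.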
4. Initialize/Configure TORQUE on each compute node
Create the file mom_priv/config (the pbs_mom configuration file) like this:
$pbsserver localhost # hostname of the machine running pbs_server
$logevent 255 # bitmap of which events to log
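5. Start the daemon services
Enable the three services so that they start at boot:
$ sudo chkconfig pbs_mom on
$ sudo chkconfig pbs_sched on
$ sudo chkconfig pbs_server on
Note that chkconfig only registers the services for the next boot. To start a service that is not yet running without rebooting, use the service command, for example sudo service pbs_mom start.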
6. Test the service configuration
Verify that all nodes are reporting correctly:
$ pbsnodes -a
View the server configuration:
$ qmgr -c 'p s'
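On a cluster with more than a few nodes, the output of pbsnodes -a gets long, and a small filter can help spot nodes that are not ready. The following is a sketch under assumptions: it expects the usual pbsnodes -a layout, with each node name flush left and its attributes indented as "key = value" lines, and the function name not_free_nodes is made up for illustration. Note that a node in a state such as job-exclusive is busy rather than broken, so the list is a starting point, not a diagnosis.

```shell
#!/bin/sh
# Read `pbsnodes -a` output on stdin and print every node whose state is not "free".
# not_free_nodes is a hypothetical helper name used only in this sketch.
not_free_nodes() {
    awk '
        # An unindented, non-empty line is taken to be a node name.
        /^[^ \t]/ { node = $1 }
        # An indented "state = VALUE" line belongs to the most recent node.
        $1 == "state" && $2 == "=" && $3 != "free" { print node ": " $3 }
    '
}
```

Typical use would be pbsnodes -a | not_free_nodes, which prints nothing when every node is free.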
The setup is now complete and the system is ready for use. To submit a job to the queue, use the qsub command:
$ qsub batchjob
Here batchjob is a file containing job settings and the command lines to run.
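As an illustration of what such a batchjob file might contain, here is a minimal sketch that writes one to the current directory. The job name hello and the five-minute walltime are arbitrary example values; the queue name batch matches the queue created in step 2. Lines starting with #PBS are read by qsub as job settings and are plain comments to the shell.

```shell
#!/bin/sh
# Write a minimal example job script to ./batchjob.
cat > batchjob <<'EOF'
#!/bin/sh
#PBS -N hello
#PBS -q batch
#PBS -l nodes=1,walltime=00:05:00
#PBS -j oe

# PBS_O_WORKDIR is set by TORQUE to the directory qsub was run from;
# fall back to the current directory when the script is run by hand.
cd "${PBS_O_WORKDIR:-.}" || exit 1
echo "Job running on $(uname -n)"
EOF
```

Here -N sets the job name, -q selects the queue, -l requests resources, and -j oe merges standard error into standard output. After submitting with qsub batchjob, the job's status can be checked with qstat. Because the settings are ordinary shell comments, the script can also be tried directly with sh batchjob before it is submitted.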
This is only a simple configuration for using TORQUE on Fedora 12. More detailed configuration is documented on clusterresources.com [2].
References
[1] http://www.clusterresources.com/products/torque-resource-manager.php
[2] ClusterResources. TORQUE Administrator's Guide. v2.3