I'm hoping that someone here knows a little about PBS...
It used to be that when I submitted a job on my cluster with the 'qsub' command, I had to use the fully qualified domain name to send it to a specific node, like:
qsub -l nodes=node1.domainname.com myjob.sh
instead of just doing
qsub -l nodes=node1 myjob.sh
So I went and shortened the names of all my nodes in the /var/spool/pbs/nodes file to use the short name instead. PBS is still working fine, but my /var/log/messages file keeps filling up with errors saying that the head node can't talk to the pbs_mom service on the other nodes. Was I supposed to make this change in any file on the worker nodes as well? Am I missing something?
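In case it helps to see it, here is a rough sketch of the kind of edit I mean. The hostnames, the np=2 attribute, and the temporary file are made up for illustration, and it runs against a scratch copy rather than the real nodes file:

```shell
# Hypothetical stand-in for the nodes file (made-up hostnames and attributes).
nodes_file=$(mktemp)
printf '%s\n' 'node1.domainname.com np=2' 'node2.domainname.com np=2' > "$nodes_file"

# Drop the domain part from the leading hostname on each line,
# leaving anything after the hostname untouched.
short_nodes=$(sed -E 's/^([^.[:space:]]+)\.[^[:space:]]*/\1/' "$nodes_file")
printf '%s\n' "$short_nodes"

# Clean up the scratch copy.
rm -f "$nodes_file"
```

So each line goes from the full name to just the part before the first dot, and nothing else on the line changes.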