To whom it may concern:
Target penguins: Slurm users who ran into 'Low RealMemory' on compute nodes and edited the conf file.
Still, sinfo shows these nodes as drained (slurmctld.log is clean!), even after restarting the compute and control nodes.
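A hedged side note: as far as I understand, a node drained for 'Low RealMemory' is not necessarily returned to service just because the conf file was corrected, and the drain state may need to be cleared by hand with scontrol. The sketch below was not part of the steps in this post. It lists drained nodes from sinfo output and prints (does not run) a resume command for each one. The helper name drained_nodes is made up for illustration, and the Slurm part is skipped when sinfo is not installed.

```shell
#!/bin/sh
# Print the names of nodes whose state starts with "drain" (drained, draining).
# Reads "NODE STATE" pairs on stdin, e.g. from: sinfo -N -h -o "%N %T"
# NOTE: drained_nodes is a hypothetical helper, not a Slurm command.
drained_nodes() {
    awk '$2 ~ /^drain/ { print $1 }' | sort -u
}

# Only talk to Slurm if the client tools are installed on this machine.
if command -v sinfo >/dev/null 2>&1; then
    # Show the reason recorded for each down/drained node.
    sinfo -R
    # Print, but do not execute, a resume command per drained node.
    for node in $(sinfo -N -h -o "%N %T" | drained_nodes); do
        echo "to resume: scontrol update NodeName=$node State=RESUME"
    done
fi
```

The resume commands are only echoed so that nothing changes until the printed reason has been checked; running them needs Slurm administrator rights.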
In most cases, stopping and restarting slurmctld works.
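For reference, a minimal sketch of what that stop/start sequence could look like. It assumes the service is named slurmctld, like the package in this post, which may not hold on every install. The function name slurmctld_restart_cmds is hypothetical. It only prints the commands, since running them needs root.

```shell
#!/bin/sh
# Build the stop/start sequence for the controller daemon.
# ASSUMPTION: the service unit / init script is called "slurmctld";
# some installs may use a different name.
slurmctld_restart_cmds() {
    if command -v systemctl >/dev/null 2>&1; then
        # systemd-based machines
        printf '%s\n' "systemctl stop slurmctld" "systemctl start slurmctld"
    else
        # SysV-style init scripts
        printf '%s\n' "service slurmctld stop" "service slurmctld start"
    fi
}

# Dry run: print the commands. Run them as root (e.g. via sudo) to apply.
slurmctld_restart_cmds
```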
However, this time the sequence choked with
Code:
Starting slurm central management daemon: slurmctld failed!
complaining that it had no permission to start/stop the daemon.
What I did
Code:
aptitude reinstall slurmctld
was helpful.
I know it is NOT the best solution, but it worked.