Linux Cluster - Random Node Crash!
Hi,

I have a peculiar problem with my Linux cluster: an application crashes my nodes at random, and I cannot identify the source of the frequent crashes.

Let me start by explaining my setup. I have a 16-node, 32-CPU Linux cluster running Red Hat 7.2. It runs an application called LS-DYNA through batch software, and this application is what causes the nodes to crash.

Initially I had 1 GB of swap for 2 GB of RAM. I increased the swap to 2 GB, which I thought would solve the issue, but the nodes still keep crashing. I looked through the log files for any signs of trouble but couldn't find anything.

Is there any way I can find out the reason for the frequent crashes? Are there commands I should run or particular log files I should check? Any advice, suggestions, or comments would be very helpful.

Thanks in advance,
insane