Machines in a production environment are supposed to be stable and not see much change. So when you encounter segfaults in a production environment, the first thing to do is look for clues (aka auditing). This means checking system and daemon logs for changes and anomalies, user login records, running processes, network connections, file integrity (package manager and applications) and so on. The goal of auditing is not "fixing things" but building an understanding of the situation and trying to pinpoint the cause(s). Only when you know the cause can you decide whether it is something to be fixed, adjusted, eradicated or whatever else. Having another machine take over this machine's functionality, restoring or reinstalling should only be done once you know what the cause is, because restoring without fixing the loopholes will land you in the same situation over and over again.
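As a starting point for the log part of such an audit, here is a minimal Python sketch that pulls segfault entries out of kernel log lines and counts them per process and module. It assumes the message layout the x86 Linux kernel commonly prints ("name[pid]: segfault at ... ip ... sp ... error ..."); the function names and the sample lines are made up for illustration, and other architectures or kernel versions may format the line differently.

```python
import re
from collections import Counter

# Assumed layout of the x86 Linux kernel segfault message, e.g.
#   myapp[1234]: segfault at 0 ip 00007f... sp 00007ffd... error 4 in libfoo.so.1[7f...+20000]
# The trailing " in <module>[...]" part is optional.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[\]]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<module>[^\s\[]+)\[)?"
)


def parse_segfaults(lines):
    """Return one dict per log line that looks like a kernel segfault report."""
    events = []
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            events.append({
                "comm": match.group("comm"),          # process name
                "pid": int(match.group("pid")),
                "addr": int(match.group("addr"), 16),  # faulting address
                "module": match.group("module"),       # None if not reported
            })
    return events


def summarize(events):
    """Count segfaults per (process name, module) pair."""
    return Counter((e["comm"], e["module"]) for e in events)


# Hypothetical sample lines, invented for this example.
SAMPLE_LOG = [
    "[  101.000001] myapp[1234]: segfault at 0 ip 00007f1c2a3b4c5d sp 00007ffd12345678 error 4 in libfoo.so.1[7f1c2a3a0000+20000]",
    "[  102.000002] usb 1-1: new high-speed USB device number 2 using xhci_hcd",
    "[  205.000003] myapp[1301]: segfault at 0 ip 00007f1c2a3b4c5d sp 00007ffd12345999 error 4 in libfoo.so.1[7f1c2a3a0000+20000]",
    "[  300.000004] other[42]: segfault at 10 ip 0000000000401136 sp 00007ffc00000000 error 6",
]
```

You could feed it the lines of whatever file or command output holds your kernel messages (that location differs per distribution). If the same process and module keep showing up, that narrows down where to look next; it does not tell you the cause by itself.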