Use of logs to find unexpected corruption
This success story is owed to the detailed logs available in Linux.
The problem was actually trivial. /etc/fstab had changed. I can't say how or why, but the fstab line for one of the possible swap partitions had been commented out, and the entry for the partition originally created for this purpose was missing altogether. The system was therefore running with no swap enabled.
So my 512 MB system worked well until an application, typically a browser, pushed memory usage past that limit.
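A check for this condition can be sketched in a few lines of Python. This is only an illustration: the function name `swap_entries` and the device names in the sample text are hypothetical, not taken from my actual fstab, and the parsing assumes the usual whitespace-separated fstab fields with the filesystem type in the third column.

```python
def swap_entries(fstab_text):
    """Split swap lines in fstab text into (active, commented_out) lists."""
    active, commented = [], []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        is_comment = line.startswith("#")
        # Drop any leading '#' so a disabled entry can still be parsed.
        fields = line.lstrip("#").split()
        # An fstab entry has at least: device, mount point, type, options.
        if len(fields) >= 4 and fields[2] == "swap":
            (commented if is_comment else active).append(" ".join(fields))
    return active, commented


# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """\
# /etc/fstab: static file system information.
/dev/sda1  /     ext3  defaults  0  1
#/dev/sda5  none  swap  sw        0  0
"""

active, commented = swap_entries(SAMPLE_FSTAB)
if not active:
    print("No active swap entry; commented-out candidates:", commented)
```

On a real system the text would come from reading /etc/fstab; an empty `active` list alongside a non-empty `commented` list matches the situation described above.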
Earlier in my Linux experience, I would have been tempted to try a fresh install. That would probably have worked, but then I would have needed to reinstall some of my applications, with excursions into unstable repositories, which can be a time-consuming adventure.
However, dmesg, kern.log and messages contained some thought-provoking lines, all of which either stated or pointed towards OOM (out-of-memory) errors or their consequences. It was a blurry picture, and early research suggested bugs that had since been fixed by kernel patches and updates. It was easy to install a new kernel, which naturally didn't touch the real problem but has resulted in smoother and cooler operation.
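Picking the relevant lines out of a long log can be sketched as below. The search pattern is an assumption about how OOM events are typically worded in kernel logs, the sample lines are made up for illustration rather than copied from my logs, and `find_oom_lines` is a hypothetical helper name.

```python
import re

# Assumed wording of OOM-related kernel messages; adjust to what your logs show.
OOM_PATTERN = re.compile(r"out of memory|oom-killer|oom_kill", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the log lines that mention an out-of-memory event."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    "kernel: eth0: link up\n",
    "kernel: firefox invoked oom-killer\n",
    "kernel: Out of memory: kill process 1234 (firefox)\n",
]

for hit in find_oom_lines(SAMPLE_LOG):
    print(hit)
```

The same function accepts an open file object, so the text of kern.log or the saved output of dmesg can be passed in directly.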
After reading some material relating OOM to swap problems, in which users were advised to make more memory available rather than to tweak OOM settings, I moved on from the logs to "top", "free" and, sensing a solution, "swapon". Finally, I added the required line to fstab and enabled the swap partition with swapon.
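The same swap figures that "free" reports can be read from /proc/meminfo. The sketch below parses text in that format; `swap_kib` is a hypothetical helper name and the sample values are invented to resemble a 512 MB machine with no swap enabled.

```python
def swap_kib(meminfo_text):
    """Extract SwapTotal and SwapFree (in kB) from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("SwapTotal", "SwapFree"):
            # Lines look like "SwapTotal:       0 kB"; keep the number only.
            values[key] = int(rest.split()[0])
    return values


# Invented sample, for illustration only.
SAMPLE_MEMINFO = """\
MemTotal:         515000 kB
MemFree:           12000 kB
SwapTotal:             0 kB
SwapFree:              0 kB
"""

swap = swap_kib(SAMPLE_MEMINFO)
if swap.get("SwapTotal", 0) == 0:
    print("No swap is enabled")
```

On a live system the text would come from reading /proc/meminfo; a SwapTotal of zero before the fix and a non-zero value after running swapon would confirm that the change took effect.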
I hope that my future adventures in Linux will involve more lateral thinking, and that this experience and others like it will encourage all of us to make full use of the logs and of the wealth of material available to help interpret them.