LinuxQuestions.org

cfnielse 04-06-2011 03:08 PM

Problem on Linux NAND root filesystem
 
I have a JFFS2 root filesystem running on a large NAND device. Every once in a while, when I restart the computer, I get error messages about invalid ELF headers when /sbin/init or /bin/login try to load shared libraries like /lib/libpam.so.0.

The errors cause a kernel panic and I'm stuck reflashing the NAND to get the computer up and running again.

Does anyone have any idea what might be causing this? Is the NAND being corrupted when it's running and then unable to restart? Is JFFS2 unreliable?

Example:

/sbin/init: error while loading shared libraries: /lib/libc.so.6: invalid ELF header
Kernel panic - not syncing: Attempted to kill init!
Call Trace:
[dffc1d20] [c0007c74] (unreliable)
[dffc1d60] [c0020d68]
[dffc1db0] [c0025074]
[dffc1e80] [c00250bc]
[dffc1f40] [c000f340]

foottuns 04-06-2011 03:15 PM

follow this link, this might help you out...

http://www.linuxquestions.org/questi...raries-786638/

cfnielse 04-06-2011 04:29 PM

Thanks for the response.

You're right, the library files are getting corrupted.

I tested it out by corrupting a couple myself using dd and the behaviour was identical to what I have been observing.
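
For anyone wanting to spot this kind of damage before rebooting, here is a minimal sketch in Python. It assumes Python is available on the target (or that the filesystem can be mounted on another machine), and the function names are made up for illustration. It only checks that each shared library still starts with the 4-byte ELF magic number, so it will miss corruption further into the file, and it may flag the occasional `.so` file that is legitimately a text linker script.

```python
import os

# Every ELF file begins with these four bytes.
ELF_MAGIC = b"\x7fELF"


def has_elf_magic(path):
    """Return True if the file starts with the 4-byte ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


def find_suspect_libraries(lib_dir="/lib"):
    """List regular files with '.so' in their name that lack the ELF magic."""
    suspects = []
    for root, _dirs, files in os.walk(lib_dir):
        for name in files:
            if ".so" not in name:
                continue
            path = os.path.join(root, name)
            # Skip symlinks; the file they point to is checked on its own.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                if not has_elf_magic(path):
                    suspects.append(path)
            except OSError:
                # An unreadable library is worth reporting too.
                suspects.append(path)
    return sorted(suspects)


if __name__ == "__main__":
    for path in find_suspect_libraries():
        print(path)
```

Running it just before a restart would at least show whether the files are already bad while the system is still up.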

Now I've got to figure out the cause of the corruption. The dynamic library files in the /lib folder are there to be loaded by programs like /sbin/init during startup. I can't imagine that they ever get written to during runtime, which I guess rules out a failed JFFS2 write to those files.

So what is left as a cause? Either JFFS2 has a terrible bug that causes it to ruin innocent bystander files, or the NAND itself is becoming corrupt.
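
One way to narrow down when the files change is to record checksums of /lib on a known-good system and compare them later. The following is a rough sketch only; the function names are hypothetical, and it assumes the manifest is kept somewhere other than the NAND in question. It cannot by itself tell a JFFS2 bug from failing flash, but it can show whether the damage is already present at runtime or only appears after a restart.

```python
import hashlib
import os


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each regular file under directory to its SHA-256 digest."""
    manifest = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # Symlinks are skipped; their targets are hashed directly.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            manifest[path] = sha256_of(path)
    return manifest


def changed_files(old, new):
    """Return paths from the old manifest that are missing or differ in new."""
    return sorted(path for path in old if new.get(path) != old[path])
```

A typical use would be to call `build_manifest("/lib")` right after reflashing, save the result (for example with the `json` module), and then compare it against a fresh manifest with `changed_files` before and after each restart.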

foottuns 04-07-2011 04:51 AM

What distro are you using now?

theNbomr 04-09-2011 11:34 AM

Just to throw a bit at the wall...
What if the error messages about invalid ELF headers are caused by corruption in the filesystem itself? If the filesystem returns a read containing data that isn't actually a file, or data belonging to some other file, then that could account for the corruption, since the filesystem in general is a read/write system. Is there any way you can run fsck against the borked flash? Maybe create a separate partition that can be restored in isolation in order to get a bootable system, but still usable for diagnostics?

--- rod.

foottuns 04-10-2011 12:30 PM

you might try this...

http://www.mail-archive.com/linux-39.../msg57147.html


This week at work I had a problem with my Xen server. The problem was in my filesystem; what I had to do was restart Xen and run a scan of the filesystem. You might try fsck -A -v, and also have a look at fsck --help.


All times are GMT -5.