Problems w/ Embedded Linux - SDCard corruption... (power cycling?)
Hi...
This thread may morph once or twice before conclusions can be made. I'll try to edit the title and/or start new threads to keep things on track.

What I would like to know is: what are people doing to avoid corrupting SDCards in an embedded Linux box? I know, don't power down the box. Well, I've no control over that. So far, it has been suggested to me to mount the SDCard with the "sync" option (I think) to inhibit write caching. Hmm, I'll have to try that. Do any of you have other ideas I might try?

But what of the root file system? On most embedded Linux boxes I have seen, the root file system is installed in a flash chip on the PCB. What experiences have others had with that? It occurs to me that if the SDCard can be corrupted, the root file system can be corrupted just as easily.

You know, I had thought the embedded Linux root file system was a tarball extracted from FLASH and written to RAM on every boot. That would eliminate almost all possible permanent corruption problems. If the RAM image were corrupted, just reboot and all would be fine. But on the platform we are OEM'ing, I have seen files stick around in the root directory tree. To me that means the root file system is just sitting in FLASH, and if it is corrupted it will stay corrupted.

Just what is the Standard Operating Procedure when it comes to booting an embedded Linux box?

-thanks |
I'm curious, what are your concerns here? Power failure? The card being pulled out? Poorly written software? I've booted computers off thumb drives many, many times and I've never had one corrupt on me. Is this a problem you're currently experiencing?
But for some general info, a lot of thumb-drive Linux distros use squashfs to store a compressed, read-only filesystem in a single file, then mount it (or copy it into RAM) and keep any changes in RAM. This can help with your "restore to working state on reboot" issue. But please, tell us more about what you're trying to do. The more details we know, the better we can help. |
In a perfect system we would unmount the SDCard before powering down the Linux box. However, this is probably not going to happen. Even if we make provisions for controlling mounting in the Qt program, it is more likely the power will simply be removed.

To prevent corruption I am considering mounting the SDCard without write caching enabled. I believe the mount option is simply "sync" (as in mount -o sync). But I don't have any embedded Linux experience with this approach. It may slow down the Linux box drastically or, worse, cause premature failure of the SDCard by increasing the number of write events. I am also considering issuing the sync command several times in the script that kicks off the Qt application, but I doubt that will help much.

So I am here looking for other ways to mitigate SDCard corruption due to powering off the device without unmounting the mass storage device. I thought of this forum because I thought you all might have had to wrestle with these problems already. I would think embedded Linux boxes, being regarded not as computers but as appliances, would likely be power cycled all the time.

-thanks |
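For what it's worth, one application-side mitigation that is often used alongside (or instead of) a "sync" mount can be sketched as follows. This is only an illustration in Python, not the Qt application from this thread, and `write_durably` is a made-up name. The idea is to write to a temporary file, force it to the card with fsync, and only then rename it over the real file. It assumes a file system where rename is atomic; on a FAT-formatted card that guarantee is weaker, so treat this as a way to shrink the window for corruption rather than a cure.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes to `path` via a temp file, fsync, then rename.

    Hypothetical helper for illustration. On file systems with atomic
    rename, a power cut should leave either the old or the new contents.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file on the same file system as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the file data to the device
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Also flush the directory entry; some file systems reject this,
    # so a failure here is ignored.
    try:
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    except OSError:
        pass
```

Only the files written this way pay the fsync cost, which is usually cheaper on the card than mounting the whole volume with "sync".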
OK, you seem to have left out some pertinent details in your first post.
1. You are using some specific, pre-made device that you appear to have little or no control over.
2. This system is running some sort of purpose-built software (using Qt) that you have little or no control over.
3. The system already runs on an SD card in some manner, and you're trying to augment/modify the system to prevent corruption.

Am I right in these assumptions? And I'll ask again, are you actually experiencing corruption? This is important, because the type of corruption can be very telling. |
Going on, it gets a bit more difficult to explain. I have seen SDCards that the embedded Linux box will not mount. But the same cards mount without issue on Win7 and WinXP boxes. One of them even mounted on my Ubuntu laptop.

To make matters worse, I reformatted the SDCard I could not mount on the embedded Linux system. As expected, it mounted (as before) on the Win7 box. But it STILL did not mount on the embedded Linux box. Not until after an image copy from a good SDCard in the SDCard duplicator did the bad SDCard mount on the embedded Linux box.

I suspect the MBR because, I believe, reformatting only affects the partition and not the MBR. But why such varied behavior? Why does the embedded Linux box have such a hard time mounting the SDCard? Why are Win7, WinXP and Linux so tolerant, if the MBR is causing the problem?

-thanks |
Jeez. You have the worst kind of problem to troubleshoot; intermittent and hard to reproduce. I can see how frustrating your issue is.
Unfortunately, it seems that your embedded system may not have full filesystem/mounting software installed. When you use a full OS (Win/Lin), they have more tools at their disposal to deal with potential issues. Your embedded box seems to be lacking, so if it encounters an error, it has no choice but to die (all speculation on my part, seeing as I don't really know your system). Is there any way for you to extract log files from your embedded device? Any sort of error message would be very helpful. Also, have you tried doing a bit comparison of a "corrupt" card with a "good" card? That might be able to tell you what's getting corrupted, though you may have to climb down into the sub-filesystem muck to get any usable info. |
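A rough sketch of the bit comparison suggested above, assuming both cards have already been dumped to image files (for example with dd). The function name, the 512-byte sector size and the `limit` parameter are my own choices for illustration. It only reports which sector numbers differ, which is usually enough to tell whether the damage is in sector 0, in the FAT area, or in file data.

```python
SECTOR_SIZE = 512  # assumed sector size for SD cards


def differing_sectors(good_image, bad_image, limit=20):
    """Return up to `limit` sector numbers where the two images differ."""
    diffs = []
    with open(good_image, "rb") as good, open(bad_image, "rb") as bad:
        sector = 0
        while len(diffs) < limit:
            a = good.read(SECTOR_SIZE)
            b = bad.read(SECTOR_SIZE)
            if not a and not b:
                break  # both images exhausted
            if a != b:
                # Also triggers when one image is shorter than the other.
                diffs.append(sector)
            sector += 1
    return diffs
```

If sector 0 shows up in the list, that points at the MBR or boot sector; differences only further in suggest the file system itself.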
That's interesting. What you're saying is that the device file (e.g. /dev/sdb1) is not being created when a "corrupt" SD card is being put in. Is that right?
If that's the case, I would almost wonder if it's the embedded device that's the problem. Have you tried several different units with the same "corrupt" card? I know that wouldn't explain why Ubuntu won't mount it, but it would be nice to try and rule that possibility out.

And yes, AFAIK, when a mass storage driver loads properly, the kernel will generate the proper device file. A good way to check what's happening is to insert the card and then run "dmesg | tail". This will display the most recent kernel log lines, which should indicate a new storage device has been added. You may want to try that and see what you get.

You are correct on how the MBR works. There is some good info in the Wikipedia article on FAT. Basically, the boot section is just a jump vector to the first OS boot instruction sector. There is a bunch of other stuff there too, but none of it should be modified in normal read/write ops (AFAIK). It's certainly not obsolete, though. That's how bootable thumb drives work. |
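If you can get a dd image of the first sector of a card, a quick way to look at the MBR discussed above is to decode it by hand. The sketch below is my own illustration (the function name and the returned dictionary layout are made up); it relies on the classic MBR layout of four 16-byte partition entries starting at byte 446 and the 0x55 0xAA signature in the last two bytes. One caveat: a card formatted without a partition table has a FAT boot sector in sector 0, which also ends in 0x55 0xAA, so in that case the "partition" entries decoded here would be meaningless.

```python
import struct


def read_mbr(image_path):
    """Decode the partition table from the first 512 bytes of an image."""
    with open(image_path, "rb") as f:
        sector = f.read(512)
    if len(sector) < 512:
        raise ValueError("image is shorter than one sector")

    info = {
        # Last two bytes of a valid MBR are 0x55 0xAA.
        "signature_ok": sector[510:512] == b"\x55\xaa",
        "partitions": [],
    }
    for i in range(4):
        entry = sector[446 + 16 * i: 446 + 16 * (i + 1)]
        # status, (CHS start skipped), type, (CHS end skipped),
        # first sector as LBA, sector count; all little-endian.
        status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:  # type 0 marks an unused slot
            info["partitions"].append({
                "index": i,
                "bootable": status == 0x80,
                "type": ptype,
                "lba_start": lba_start,
                "sectors": sectors,
            })
    return info
```

Running this on the first sector of a good card and of a bad card, and comparing the two results, should show quickly whether the partition table is what differs.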
Thanks for all the help so far. I still have not tried the dmesg trick. It is an embedded system though and I don't know if that command is available.
The Wikipedia page is great. I'll have to take the time to read it thoroughly. I don't know if I mentioned these other constraints on the problem that we have found so far:

That the dd image from a bad SDCard can be dd'ed to a good SDCard, and the good card then acts like the bad one.
That a good dd SDCard image put on a bad SDCard will make the bad SDCard good again.
And, of course, that formatting the bad SDCard on a Win7 box makes no difference at all.

Now, after looking briefly at the Wikipedia page, I am wondering what part the x86 code found in the MBR plays. I understand that this code is used to tell the computer how to boot the OS on the mass storage device the MBR is on. We are not booting from the SDCard. But we are also not running on an x86 target. This is an ARM-based target.

I wonder. All the large computers that can work w/ the bad SDCard are x86 boxes. Sure, there is Win & Linux this and that. But they are all running on x86 boxes. Could that be the reason the bad SDCards are not working on the target? |
It's entirely possible that something either within ARM or the ARM drivers is causing your issue, but unfortunately this is where my experience and knowledge ends. I've never done work this low-level before, and have only ever casually used ARM systems.
Your dd trick, along with the info that formatting (at least on Windows) seems to have no effect, seems to reinforce the idea that it's something in the file table / MBR that's causing your issue. I'm sorry to say that that's all the help I can offer for now. The dmesg command should be in most Linux systems. I would be surprised if it wasn't, but then again, this whole issue is a bit surprising. One suggestion I would make is to try to find some other ARM device (phone, PDA, etc.), stick a "corrupted" SD card in it, and see if it is recognized. |
-thanks |
If it's a relatively newer device, it's likely to be running on ARM. But either way, it would be good to test the corrupt cards in a range of embedded/"limited" devices (i.e. not PCs).
|