LinuxQuestions.org
03-02-2018, 02:15 AM   #1
OkCalis
Linux kernel panics on custom board: mtd->read(...) returned ECC error


Hi,

I've been working on a MityDsp-L138F (an SoM featuring TI's OMAPL138 ARM+DSP processor), as well as a custom-manufactured board that also contains OMAPL138, akin to MityDsp. So far, I've used Linux on MityDsp (ARM side) with kernel and file system installed on NOR and NAND flashes, respectively, and on our custom board, the file system has been an NFS. And both systems seem to work with the (almost) same kernel image and file system.

Recently, I've tried flashing the JFFS2 file system provided in the board support package (which is the same file system I've been using on MityDsp), but although I believe I followed the same steps, I failed to get the Linux kernel to work properly on the custom board. Here's the error I get when trying to boot. I've attached the boot log as well as my environment variables.

When I try booting the kernel, U-Boot appears to read the kernel correctly, and the kernel starts to boot, only to throw an ECC error. How can I overcome this problem?

Thanks in advance,
OkCalis
Attached file: kernel-panics-on-custom.log (165.8 KB)
 
03-02-2018, 03:05 AM   #2
business_kid
You're really on your own with a kernel error. ECC would point at ram, which is probably in general setup. But you have the config and the board. If it's any consolation, I'm on 4.9.45-dec8, meaning it's my 8th attempt at getting a kernel that runs everything.
 
03-02-2018, 07:26 PM   #3
blue_z
Quote:
Originally Posted by OkCalis
... a custom-manufactured board that also contains OMAPL138, akin to MityDsp.
Exactly how similar or dissimilar is this custom board to the MityDsp?
Since there is no Device Tree to differentiate each board, the boards have to essentially be identical in order to use a shared kernel.


Quote:
Originally Posted by OkCalis
Recently, I've tried flashing the JFFS2 file system provided in the board support package (which is the same file system I've been using on MityDsp), but although I believe I followed the same steps, ...
Exactly what "steps" did you follow (i.e. there's more than one way to install a JFFS2 rootfs)?
Did you verify this installation, i.e. mount it and access files?


Quote:
Originally Posted by OkCalis
When I try booting the kernel, U-Boot appears to read the kernel correctly, and the kernel starts to boot, only to throw an ECC error.
The kernel seems to be read from something other than NAND flash, so that provides little information.
Twice you mention a singular "ECC error", but the log indicates a plethora of ECC error messages as well as JFFS2 CRC and check_node_data messages.

Another poster guesses that "ECC would point at ram".
But the boot log clearly indicates that this is an incorrect guess.
First, there is no ECC enabled for main memory:
Code:
Memory policy: ECC disabled, Data cache writethrough
Second, the ECC errors are only being reported by mtd->read().
Third, the numerous messages from jffs2_scan_eraseblock() and check_node_data indicate that the filesystem in NAND flash has issues.

What have you done to verify that this NAND flash chip is actually functional on this custom board?
Have you tried a simpler test of reading this JFFS2 rootfs, e.g. using U-Boot?

Regards

Last edited by blue_z; 03-02-2018 at 08:29 PM.
 
03-05-2018, 12:21 AM   #4
OkCalis
Hi,
Thanks for your response.

Quote:
Exactly how similar or dissimilar is this custom board to the MityDsp?
Since there is no Device Tree to differentiate each board, the boards have to essentially be identical in order to use a shared kernel.
To my knowledge, the custom board is different from MityDsp in the following aspects:
1. The OMAPL138 on the custom board does not use the SPI flash, unlike the one on MityDsp. Hence, both the kernel and file system have to be on the NAND flash.
2. The custom board uses UART1 as the serial console port, while MityDsp uses UART2.
3. As for Ethernet, the PHY ID of the custom board is "0:01", whereas that of MityDsp is "0:03".

I modified the kernel I was using with MityDsp according to these changes, and it worked properly on the custom board with a network file system.

Quote:
Exactly what "steps" did you follow (i.e. there's more than one way to install a JFFS2 rootfs)?
I followed the steps provided here. To be specific, I
- Partitioned the NAND flash into three as follows:
Code:
U-Boot > mtdparts
device nand0 <nand>, # parts = 3
 #: name                size            offset          mask_flags
 0: userfs              0x00e00000      0x00000000      0
 1: rootfs              0x08000000      0x00e00000      0
 2: unused              0x17200000      0x08e00000      0
- Downloaded the JFFS2 file supplied by Critical Link to board via TFTP.
- Flashed the downloaded file system onto the "rootfs" partition.
- Updated the bootargs so Linux would look for the file system in the respective partition.
- Downloaded the Linux kernel (modified version in the custom board case) to the board via TFTP.
- Booted the kernel.

Quote:
Did you verify this installation, i.e. mount it and access files?
No, I don't know how to do that. Do you mean I should mount the JFFS2 on the Linux system that boots with an NFS rootfs to see if it is working? Is that possible?

Quote:
The kernel seems to be read from something other than NAND flash, so that provides little information.
You're right, it is read directly from the RAM (transferred through TFTP).

Quote:
Twice you mention a singular "ECC error", but the log indicates a plethora of ECC error messages as well as JFFS2 CRC and check_node_data messages.
You're right, it throws tons of ECC errors. I didn't mean it was a single error; sorry if I misled you.

Quote:
Another poster guesses that "ECC would point at ram".
But the boot log clearly indicates that this is an incorrect guess.
I thought so.

Quote:
What have you done to verify that this NAND flash chip is actually functional on this custom board?
The NAND chip is functional because there are multiple baremetal DSP application binaries on it, and I can confirm that I can correctly reflash those binaries, and the DSP reads the binaries correctly.

Quote:
Have you tried a simpler test of reading this JFFS2 rootfs, e.g. using U-Boot?
No, I didn't know it was possible to read a rootfs using U-Boot. I mean, sure, I can read it as raw data and see what's stored in each byte, but how can I actually use that to verify a file system is correctly installed?
 
03-05-2018, 02:47 AM   #5
blue_z
Quote:
Originally Posted by OkCalis
To my knowledge, the custom board is different from MityDsp in the following aspects:...
Is the NAND flash chip different?


Quote:
Originally Posted by OkCalis
I followed the steps provided here. To be specific, I ...
The key step is to erase the flash first.
Forgetting to erase the flash is one possible cause of the symptoms you see.
Refer to the MTD FAQ for JFFS2.


Quote:
Originally Posted by OkCalis
No, I don't know how to do that.
Since you wrote the rootfs image using U-Boot, you could read it if U-Boot is configured with that capability (see below).

An alternative method of installing a JFFS2 rootfs that I find simpler and foolproof is to use Linux.
Erase the partition, mount the partition, and then untar an archive of the rootfs.
This works because an erased flash partition is treated as an empty JFFS2 when first mounted.
Boot (e.g. TFTP) a Linux kernel that uses an initramfs or a rootfs via NFS to perform this installation.



Quote:
Originally Posted by OkCalis
The NAND chip is functional because there are multiple baremetal DSP application binaries on it, and I can confirm that I can correctly reflash those binaries, and the DSP reads the binaries correctly.
To me that's only a partial confirmation.
The read functionality by the TI OMAP chip is not tested.


Quote:
Originally Posted by OkCalis
No, I didn't know it was possible to read a rootfs using U-Boot. I mean, sure, I can read it as raw data and see what's stored in each byte, but how can I actually use that to verify a file system is correctly installed?
You need a U-Boot configured/built with CONFIG_CMD_JFFS2, and maybe CONFIG_CMD_MTDPARTS.
Look at the end of cmd/jffs2.c for the syntax of the fsinfo, fsload, and ls (renamed fsls in newer versions) commands in U-Boot.
Also study the comments at the beginning of that file for salient environment variables.
(It's been a long time since I've used these commands, so I can't be more precise.)

BTW it's strange to still use a release candidate version (even worse a rc1) long after there's an actual release (i.e. U-Boot version 2017.05).

Regards

Last edited by blue_z; 03-06-2018 at 12:42 AM.
 
03-07-2018, 01:47 AM   #6
rocq
If you boot with a NFS rootfs, are you then able to mount and access the filesystem on the NAND? Are you able to mount and access the filesystem from u-boot?

You may want to review the ECC settings of the flash: Software or hardware controlled, 1-bit correction, 4-bit correction, ...
 
03-07-2018, 03:29 AM   #7
OkCalis
blue_z,

Quote:
Is the NAND flash chip different?
I suspect it is different, but its capacity is 512 MB like the chip on MityDsp. Not sure about the other parameters, though, like erase, page sizes, and so on. (I'll send an update if I can find it out.)

Quote:
The key step is to erase the flash first.
I know; I always erase it before writing on it.

Quote:
To me that's only a partial confirmation.
The read functionality by the TI OMAP chip is not tested.
To verify the functionality of the NAND, I did the following test. First, I downloaded the JFFS2 onto the RAM via TFTP, say, at address 0xC0000000. Second, I read the previously-flashed JFFS2 from the NAND to address 0xC2000000 of RAM. Lastly, I compared the two memory portions of RAM, with command "cmp 0xC0000000 0xC2000000 0x0132E000", which did not return a difference until "0xC132E000 != 0xC332E000". Here, the size of the JFFS2 file is 0x0132D9D4, and 0x0132E000 is the rounded file size, so I infer that the file system can be written and read properly. Am I right?
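One way to sanity-check that reading of the `cmp` output is to turn the two reported addresses back into offsets from their buffer bases. The sketch below is only arithmetic on the numbers quoted in this post; it assumes the addresses printed by `cmp` are byte addresses, and the helper name is made up for illustration.

```python
def mismatch_offset(base_a, base_b, addr_a, addr_b):
    """Return the buffer offset at which cmp reported its first difference."""
    off_a = addr_a - base_a
    off_b = addr_b - base_b
    if off_a != off_b:
        raise ValueError("the two addresses do not refer to the same offset")
    return off_a

ROUNDED_SIZE = 0x0132E000   # image size rounded up to a 0x800-byte unit

offset = mismatch_offset(0xC0000000, 0xC2000000, 0xC132E000, 0xC332E000)
print(hex(offset))               # 0x132e000
print(offset >= ROUNDED_SIZE)    # True
```

The first reported difference sits exactly at the end of the rounded image, so under that assumption every byte of the written range read back identical.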

Quote:
BTW it's strange to still use a release candidate version (even worse a rc1) long after there's an actual release (i.e. U-Boot version 2017.05).
My colleagues were working on the board last year, before I got hired. They mentioned they had to modify the U-Boot code, so it would work with the custom board despite the hardware differences, namely UART and Ethernet. I don't know much about the U-Boot running on it.



rocq,

Quote:
If you boot with a NFS rootfs, are you then able to mount and access the filesystem on the NAND?
I tried doing that and ran into the following 'superblock' error.
Code:
root@mityomapl138:~# mount -t jffs2 /dev/mtdblock1 nand/
Cowardly refusing to erase blocks on filesystem with no valid JFFS2 nodes
empty_blocks 0, bad_blocks 0, c->nr_blocks 8
mount: /dev/mtdblock1: can't read superblock
 
03-08-2018, 01:24 AM   #8
OkCalis
I noticed something odd about MityDsp.

Code:
device nand0 <nand>, # parts = 3
 #: name                size            offset          mask_flags
 0: userfs              0x00e00000      0x00000000      0
 1: rootfs              0x08000000      0x00e00000      0
 2: unused              0x17200000      0x08e00000      0

active partition: nand0,0 - (userfs) 0x00e00000 @ 0x00000000

defaults:
mtdids  : none
mtdparts: none
bootargs=mem=96M console=ttyS1,115200n8 mtdparts=nand:14M(userfs),128M(rootfs),-(unused) root=/dev/mtdblock0 rw noatime rootfstype=jffs2 ip=none
Although the "rootfs" is the partition #1, as seen in mtdparts, the Linux boot parameter that properly boots the kernel is "root=/dev/mtdblock0". Isn't this supposed to throw an error because the JFFS2 is flashed onto "mtdblock1"? Oddly, it does throw an error when I change the boot parameter to "mtdblock1".
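For reference, the `mtdparts=` string in these bootargs can be expanded into offsets and sizes with a few lines of Python. This is a rough sketch, not the kernel's parser: it only handles plain sizes with an optional K/M/G suffix and `-` for "the rest of the device", the 512 MiB device size is taken from the thread, and the function name is hypothetical.

```python
import re

UNITS = {"": 1, "k": 1 << 10, "m": 1 << 20, "g": 1 << 30}

def expand_mtdparts(spec, device_size):
    """Expand 'dev:size(label),...' into (index, label, offset, size) tuples."""
    _device, parts = spec.split(":", 1)
    offset = 0
    layout = []
    for index, part in enumerate(parts.split(",")):
        match = re.fullmatch(r"(-|\d+)([kKmMgG]?)\((\w+)\)", part)
        if match is None:
            raise ValueError(f"unsupported partition spec: {part!r}")
        number, unit, label = match.groups()
        if number == "-":
            size = device_size - offset      # '-' means "rest of the device"
        else:
            size = int(number) * UNITS[unit.lower()]
        layout.append((index, label, offset, size))
        offset += size
    return layout

for index, label, offset, size in expand_mtdparts(
        "nand:14M(userfs),128M(rootfs),-(unused)", 512 << 20):
    print(f"{index}: {label:8s} offset=0x{offset:08x} size=0x{size:08x}")
```

The result reproduces the table printed by U-Boot's `mtdparts` command above. Whether the kernel actually honours this string is a separate question that only the kernel boot log can answer.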

On the other hand, the custom board needs its boot parameter to be set as "root=/dev/mtdblock1" to start booting (before printing ECC errors). Here's the bootargs and mtdparts of the custom board.

Code:
device nand0 <nand>, # parts = 3
 #: name                size            offset          mask_flags
 0: userfs              0x00e00000      0x00000000      0
 1: rootfs              0x08000000      0x00e00000      0
 2: unused              0x17200000      0x08e00000      0

active partition: nand0,0 - (userfs) 0x00e00000 @ 0x00000000

defaults:
mtdids  : none
mtdparts: none
bootargs=mem=96M console=ttyS2,115200n8 mtdparts=nand:14M(userfs),128M(rootfs),-(unused) root=/dev/mtdblock1 rw noatime rootfstype=jffs2 ip=none
Moreover, I tried booting the custom board with the uImage provided by MityDsp, instead of the custom kernel I built. It still cannot boot properly, but interestingly, it does not give any ECC errors, either. Here's its bootlog:

Code:
Linux version 3.2.0 (root@mitydsp) (gcc version 4.5.4 20120305 (prerelease) (GCC) ) #1 PREEMPT Mon Jan 13 11:06:16 EST 2014
CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00053177
CPU: VIVT data cache, VIVT instruction cache
Machine: MityDSP-L138/MityARM-1808
Memory policy: ECC disabled, Data cache writethrough
DaVinci da850/omap-l138/am18x variant 0x1
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 24384
Kernel command line: mem=96M console=ttyS2,115200n8 mtdparts=nand:14M(userfs),128M(rootfs),-(unused) root=/dev/mtdblock1 rw noatime rootfstype=jffs2 ip=none
PID hash table entries: 512 (order: -1, 2048 bytes)
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 96MB = 96MB total
Memory: 91368k/91368k available, 6936k reserved, 0K highmem
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    vmalloc : 0xc6800000 - 0xfea00000   ( 898 MB)
    lowmem  : 0xc0000000 - 0xc6000000   (  96 MB)
    modules : 0xbf000000 - 0xc0000000   (  16 MB)
      .text : 0xc0008000 - 0xc052c598   (5266 kB)
      .init : 0xc052d000 - 0xc0555000   ( 160 kB)
      .data : 0xc0556000 - 0xc05b5a80   ( 383 kB)
       .bss : 0xc05b5aa4 - 0xc05e2490   ( 179 kB)
NR_IRQS:245
Console: colour dummy device 80x30
Calibrating delay loop... 148.88 BogoMIPS (lpj=744448)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
devtmpfs: initialized
DaVinci: 144 gpio irqs
print_constraints: dummy: 
NET: Registered protocol family 16
baseboard_pre_init: Entered
mityomapl138_setup_nand: using 16 bit data
baseboard_init [IndustrialIO]...
bio: create slab <bio-0> at 0
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
set_machine_constraints: VDCDC1: failed to enable
tps65023 1-0048: failed to register tps65023
tps65023: probe of 1-0048 failed with error -121
Advanced Linux Sound Architecture Driver Version 1.0.24.
Switching to clocksource timer0_1
musb-hdrc: version 6.0, ?dma?, otg (peripheral+host)
Waiting for USB PHY clock good...
musb-hdrc musb-hdrc: USB OTG mode controller at fee00000 using PIO, IRQ 58
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 4096 (order: 3, 32768 bytes)
TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
TCP reno registered
UDP hash table entries: 256 (order: 0, 4096 bytes)
UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
JFFS2 version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
msgmni has been set to 178
io scheduler noop registered (default)
start plist test
end plist test
Serial: 8250/16550 driver, 3 ports, IRQ sharing disabled
serial8250.0: ttyS0 at MMIO 0x1c42000 (irq = 25) is a 16550A
serial8250.0: ttyS1 at MMIO 0x1d0c000 (irq = 53) is a 16550A
serial8250.0: ttyS2 at MMIO 0x1d0d000 (irq = 61) is a 16550A
console [ttyS2] enabled
brd: module loaded
at24 1-0050: 256 byte 24c02 EEPROM, read-only, 0 bytes/write
MityOMAPL138: Read Factory Config Failed: -110
ahci ahci: forcing PORTS_IMPL to 0x1
ahci ahci: AHCI 0001.0100 32 slots 1 ports 3 Gbps 0x1 impl platform mode
ahci ahci: flags: ncq sntf pm led clo only pmp pio slum part ccc 
scsi0 : ahci_platform
ata1: SATA max UDMA/133 mmio [mem 0x01e18000-0x01e19fff] port 0x100 irq 67
NAND device: Manufacturer ID: 0x01, Chip ID: 0xbc (AMD NAND 512MiB 1,8V 16-bit)
Bad block table not found for chip 0
Bad block table not found for chip 0
Scanning device for bad blocks
ata1: SATA link down (SStatus 0 SControl 300)
Bad block table written to 0x00001ffe0000, version 0x01
Bad block table written to 0x00001ffc0000, version 0x01
Creating 2 MTD partitions on "davinci_nand.1":
0x000000000000-0x000008000000 : "rootfs"
0x000008000000-0x000020000000 : "homefs"
davinci_nand davinci_nand.1: controller rev. 2.5
spi_davinci spi_davinci.1: DMA: supported
spi_davinci spi_davinci.1: DMA: RX channel: 18, TX channel: 19, event queue: 0
m25p80 spi1.0: m25p64-nonjedec (8192 Kbytes)
Creating 8 MTD partitions on "m25p80":
0x000000000000-0x000000010000 : "ubl"
0x000000010000-0x000000090000 : "u-boot"
0x000000090000-0x0000000a0000 : "u-boot-env"
0x0000000a0000-0x0000000b0000 : "periph-config"
No LCD configured
MII PHY configured
0x0000000b0000-0x000000100000 : "reserved"
0x000000100000-0x000000400000 : "kernel"
0x000000400000-0x000000600000 : "fpga"
0x000000600000-0x000000800000 : "spare"
spi_davinci spi_davinci.1: Controller at 0xfef0e000
CAN device driver interface
mcp251x spi1.1: MCP251x didn't enter in conf mode after reset
mcp251x spi1.1: Probe failed
mcp251x spi1.1: probe failed
davinci_mdio davinci_mdio.0: davinci mdio revision 1.5
davinci_mdio davinci_mdio.0: detected phy mask fffffffd
davinci_mdio.0: probed
davinci_mdio davinci_mdio.0: phy[1]: device 0:01, driver unknown
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci ohci.0: DA8xx OHCI
ohci ohci.0: new USB bus registered, assigned bus number 1
Waiting for USB PHY clock good...
ohci ohci.0: irq 59, io mem 0x01e25000
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 1 port detected
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
mousedev: PS/2 mouse device common for all mice
omap_rtc omap_rtc: rtc core: registered omap_rtc as rtc0
omap_rtc: RTC power up reset detected
i2c /dev entries driver
cpuidle: using governor ladder
cpuidle: using governor menu
davinci_mmc davinci_mmc.0: Using DMA, 4-bit mode
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
dsd1791 spi1.2: Failed to add route LLOUT->Line Out
asoc: dsd1791 <-> davinci-mcasp.0 mapping ok
ALSA device list:
  #0: MityDSP-L138 INDIO
TCP cubic registered
NET: Registered protocol family 17
can: controller area network core (rev 20090105 abi 8)
NET: Registered protocol family 29
can: raw protocol (rev 20090105)
------------[ cut here ]------------
WARNING: at arch/arm/mach-davinci/da850.c:1108 da850_regulator_init+0x30/0x54()
Unable to obtain voltage regulator for CVDD; voltage scaling unsupported
Modules linked in:
[<c000d518>] (unwind_backtrace+0x0/0xe0) from [<c001d030>] (warn_slowpath_common+0x4c/0x64)
[<c001d030>] (warn_slowpath_common+0x4c/0x64) from [<c001d0c8>] (warn_slowpath_fmt+0x2c/0x3c)
[<c001d0c8>] (warn_slowpath_fmt+0x2c/0x3c) from [<c0013c34>] (da850_regulator_init+0x30/0x54)
[<c0013c34>] (da850_regulator_init+0x30/0x54) from [<c0014828>] (davinci_cpu_init+0x34/0xbc)
[<c0014828>] (davinci_cpu_init+0x34/0xbc) from [<c021dd10>] (cpufreq_add_dev+0x120/0x220)
[<c021dd10>] (cpufreq_add_dev+0x120/0x220) from [<c0193c8c>] (sysdev_driver_register+0xb0/0x128)
[<c0193c8c>] (sysdev_driver_register+0xb0/0x128) from [<c021d118>] (cpufreq_register_driver+0xd0/0x17c)
[<c021d118>] (cpufreq_register_driver+0xd0/0x17c) from [<c019631c>] (platform_drv_probe+0x14/0x18)
[<c019631c>] (platform_drv_probe+0x14/0x18) from [<c019520c>] (driver_probe_device+0xd4/0x198)
[<c019520c>] (driver_probe_device+0xd4/0x198) from [<c0195330>] (__driver_attach+0x60/0x84)
[<c0195330>] (__driver_attach+0x60/0x84) from [<c01944a0>] (bus_for_each_dev+0x4c/0x78)
[<c01944a0>] (bus_for_each_dev+0x4c/0x78) from [<c0194b18>] (bus_add_driver+0x98/0x214)
[<c0194b18>] (bus_add_driver+0x98/0x214) from [<c0195894>] (driver_register+0xa0/0x120)
[<c0195894>] (driver_register+0xa0/0x120) from [<c0196728>] (platform_driver_probe+0x18/0x98)
[<c0196728>] (platform_driver_probe+0x18/0x98) from [<c000887c>] (do_one_initcall+0x90/0x168)
[<c000887c>] (do_one_initcall+0x90/0x168) from [<c052d7e8>] (kernel_init+0x78/0x11c)
[<c052d7e8>] (kernel_init+0x78/0x11c) from [<c0009cb8>] (kernel_thread_exit+0x0/0x8)
---[ end trace be3cd69ebcb0f4e8 ]---
console [netcon0] enabled
netconsole: network logging started
davinci_emac davinci_emac.1: using random MAC addr: ae:d1:b3:cc:e8:37
omap_rtc omap_rtc: setting system clock to 2000-01-01 00:00:00 UTC (946684800)
VFS: Mounted root (jffs2 filesystem) on device 31:1.
devtmpfs: error mounting -2
Freeing init memory: 160K
Kernel panic - not syncing: No init found.  Try passing init= option to kernel. See Linux Documentation/init.txt for guidance.
[<c000d518>] (unwind_backtrace+0x0/0xe0) from [<c0323c4c>] (panic+0x58/0x188)
[<c0323c4c>] (panic+0x58/0x188) from [<c00087c8>] (init_post+0xa0/0xc4)
[<c00087c8>] (init_post+0xa0/0xc4) from [<c052d85c>] (kernel_init+0xec/0x11c)
I'm rather confused. What do you suggest I should do next?

Thanks again,
OkCalis

Last edited by OkCalis; 03-08-2018 at 01:25 AM. Reason: Fixing typo
 
03-08-2018, 03:26 AM   #9
rocq
From the link where you got the instructions:
Quote:
Given a root filesystem in JFFS2 format, you need to transfer the image to the sector/partition of NAND that you want to load. From U-boot, given the MityDSP-L138 stock partitioning (these can be examined by the "mtdparts" command from u-Boot), this can be accomplished by the following commands. These commands assume:

- A NAND root filesystem partition size of 128 MBytes (the default configuration in a CL supplied kernel)
- A root filesystem image size less than 96MiB (100663296bytes).

u-Boot> mw.b 0xC2000000 0xFF 0x06000000
u-Boot> tftp 0xC2000000 myserver:/path/to/root_filesystem.jffs2
u-Boot> nand erase 0 0x8000000
u-Boot> nand write.jffs2 0xC2000000 0 0x<rounded_filesize>

Please NOTE that the "<rounded_filesize>" must be a hex number of the filesize rounded up to the next page size of the NAND device. In the case of MityDSP, this is 0x800. This is critical as overwriting additional data/pages to NAND will result in the kernel JFFS2 subsystem to inject ECC errors as it begins to write data/files to the partition. For example, if your jffs2 image size is 0x45cde74, then the argument passed to the nand write.jffs2 command must be 0x45ce000. Example using wolframalpha, equation: ceiling(0x45cde74, 0x800)

Please NOTE the ".jffs2" extension on the "nand write" command. This is critical. The extension (also ".e" and ".i" can be used) causes the u-Boot program to skip over bad blocks while writing the filesystem image to the NAND device. This is how the JFFS2 drivers in linux (or any other OS) expect the data to be stored in NAND. Not using the extension causes a straight write, and the u-Boot program will attempt to use bad blocks in the device.
Quote:
u-Boot> nand erase 0 0x8000000
u-Boot> nand write.jffs2 0xC2000000 0 0x<rounded_filesize>
I'm not sure, but shouldn't the bold-marked arguments (the '0' offset in each of the two commands) be '1' in your case? From what I found, that particular argument specifies the offset (0x..) or the partition number.

I would try the following: Do a read of an amount of bytes at the offsets of both partitions and compare them with the contents of your rootfs image.
Code:
# (Not sure whether these should be "nand read" or "nand read.jffs2".)

# Check partition 1 (rootfs), using partition number
mw.b 0xC2000000 0xFF 32
nand read 0xC2000000 1 32
md.b 0xC2000000 32

# Check partition 1 (rootfs), using its offset from the mtdparts listing
mw.b 0xC2000000 0xFF 32
nand read 0xC2000000 0x00e00000 32
md.b 0xC2000000 32

# Since you may have accidentally overwritten partition 0 (userfs), check it as well
# Check partition 0 (userfs), using partition number
mw.b 0xC2000000 0xFF 32
nand read 0xC2000000 0 32
md.b 0xC2000000 32

# Check partition 0 (userfs), using address
mw.b 0xC2000000 0xFF 32
nand read 0xC2000000 0x0 32
md.b 0xC2000000 32
You can use a hex editor to open the rootfs image so you can compare the U-Boot outputs with the image.
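If a hex editor is not at hand, the comparison can also be scripted on a host PC. The sketch below parses console output in the usual `md.b` layout (address, colon, up to 16 hex bytes, then an ASCII column) and compares it with the first bytes of the image. The sample dump and the expected bytes are made-up placeholders, not output from this board; in practice the expected bytes would be read from the image file. A little-endian JFFS2 image normally starts with the node magic 0x1985, which shows up as `85 19`.

```python
import re

def parse_md_b(text):
    """Collect the data bytes from U-Boot 'md.b' style dump lines."""
    data = bytearray()
    for line in text.splitlines():
        match = re.match(r"^[0-9a-fA-F]{8}:((?: [0-9a-fA-F]{2}){1,16})", line)
        if match:
            data += bytes.fromhex(match.group(1))
    return bytes(data)

# Hypothetical console capture, pasted from the serial terminal.
sample_dump = """\
c2000000: 85 19 03 20 0c 00 00 00 b1 b0 1e e4 ff ff ff ff    ... ............
c2000010: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff    ................
"""

# Placeholder for the first 32 bytes of the image file on the host.
image_head = bytes.fromhex("85190320 0c000000 b1b01ee4") + b"\xff" * 20

nand_head = parse_md_b(sample_dump)
print(nand_head == image_head)         # True
print(nand_head[:2] == b"\x85\x19")    # True
```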

Quote:
Please NOTE that the "<rounded_filesize>"
This could be a clue here. What is the size of your image and what value did you use for the <rounded_filesize>?
 
03-08-2018, 05:26 AM   #10
OkCalis
Hi rocq,
Thanks for your response.

Quote:
I'm not sure, but shouldn't the bold marked arguments be '1' in your case? From what I found that particular argument specifies the offset (0x..) or partition number.
In my case, I replaced it with "rootfs", the name of the partition, like:

Code:
nand erase rootfs 0x8000000
Quote:
I would try the following: Do a read of an amount of bytes at the offsets of both partitions and compare them with the contents of your rootfs image.
Do you suggest doing this on MityDsp (where the file system works), or the custom board (where it doesn't)?
If it is the former, I can't compare the file system on my flash to the JFFS2 image because that file system has been on that flash for months, and I've made so many changes to it since then. (Am I missing something here?)
If it is the latter, I've already done that as I mentioned in my last post.
Quote:
What is the size of your image and what value did you use for the <rounded size>
The image size is 0x0132D9D4, making the rounded file size 0x0132E000, i.e., ceil(0x132D9D4, 0x800), where 0x800 (2 KB) is the page size of my flash.
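The rounding itself is easy to double-check. The snippet below is a small sketch of the "round up to the next multiple of 0x800" rule from the quoted instructions, applied to both the image size in this thread and the example size from the instructions; the function name is just illustrative.

```python
def round_up(size, unit):
    """Round size up to the next multiple of unit."""
    return -(-size // unit) * unit

print(hex(round_up(0x0132D9D4, 0x800)))   # 0x132e000
print(hex(round_up(0x045CDE74, 0x800)))   # 0x45ce000
```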
 
03-12-2018, 02:53 AM   #11
rocq
Quote:
Do you suggest doing this on MityDsp (where the file system works), or the custom board (where it doesn't)?
Do whatever you want. I'm just suggesting that you should read and compare the NAND contents with the image you've written. I would read the contents from NAND on the device that isn't working and compare the output with the rootfs image file. You can use any hex editor to check. You just need to check at the offset so you only compare the first couple of bytes (32 in my example).

PS. According to the documentation I found, the offset argument of the nand read, write and erase commands is either an address or a partition number, but I noticed you used the partition name instead. That's probably how it's supposed to be used.
 
03-12-2018, 04:09 AM   #12
blue_z
Quote:
Originally Posted by OkCalis
I suspect it is different, but its capacity is 512 MB like the chip on MityDsp. Not sure about the other parameters, though, like erase, page sizes, and so on. (I'll send an update if I can find it out.)
Rather than suspect that they are different, simply look at the boards and read off the chip labels!

Quote:
Originally Posted by OkCalis
To verify the functionality of the NAND, ... Am I right?
That's a sufficient and reassuring test.

Quote:
Originally Posted by OkCalis
My colleagues were working on the board last year, before I got hired. They mentioned they had to modify the U-Boot code, so it would work with the custom board despite the hardware differences, namely UART and Ethernet. I don't know much about the U-Boot running on it.
There are ways to generate a patch file of all of your colleagues' changes, and then apply those changes to the released version.
But that can probably wait for another day.

Quote:
Originally Posted by OkCalis
Although the "rootfs" is the partition #1, as seen in mtdparts, the Linux boot parameter that properly boots the kernel is "root=/dev/mtdblock0". Isn't this supposed to throw an error because the JFFS2 is flashed onto "mtdblock1"? Oddly, it does throw an error when I change the boot parameter to "mtdblock1".
Instead of looking at U-Boot partitions or the bootargs/kernel-command-line, inspect the Linux kernel boot log for actual MTD partition information.
See next comment: /dev/mtdblock0 seems to be the rootfs per the actual list of MTD partitions.

Quote:
Originally Posted by OkCalis
Moreover, I tried booting the custom board with the uImage provided by MityDsp, instead of the custom kernel I built. It still cannot boot properly, but interestingly, it does not give any ECC errors, either. Here's its bootlog:
You're not seeing any ECC errors because a completely different region of NAND is defined for the mounted partition.
Look at the kernel boot log, not U-Boot or the bootargs/command-line, for what the kernel is actually using as MTD partitions:
Code:
NAND device: Manufacturer ID: 0x01, Chip ID: 0xbc (AMD NAND 512MiB 1,8V 16-bit)
Bad block table not found for chip 0
Bad block table not found for chip 0
Scanning device for bad blocks
ata1: SATA link down (SStatus 0 SControl 300)
Bad block table written to 0x00001ffe0000, version 0x01
Bad block table written to 0x00001ffc0000, version 0x01
Creating 2 MTD partitions on "davinci_nand.1":
0x000000000000-0x000008000000 : "rootfs"
0x000008000000-0x000020000000 : "homefs"
This /dev/mtdblock1 does not occupy the same blocks as the /dev/mtdblock1 of your custom kernel's boot log.
The partition definitions in the kernel command line were ignored, possibly indicating that there's a hardcoded partition table in that MityDsp kernel.
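The mismatch can be made concrete with a little interval arithmetic. The sketch below uses the partition boundaries from the stock kernel's boot log and assumes the image was written at the start of the U-Boot "rootfs" partition (offset 0x00e00000) with the rounded size 0x0132E000 mentioned earlier in the thread; the helper is illustrative only.

```python
def overlap(a, b):
    """Return the (start, end) overlap of two half-open ranges, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None

# Partitions created by the stock MityDsp kernel (from its boot log).
stock = {"mtdblock0 rootfs": (0x00000000, 0x08000000),
         "mtdblock1 homefs": (0x08000000, 0x20000000)}

# Assumed location of the image: start of the U-Boot "rootfs" partition.
image = (0x00E00000, 0x00E00000 + 0x0132E000)

for name, span in stock.items():
    hit = overlap(span, image)
    print(name, "->", "no image data" if hit is None
          else f"image data at 0x{hit[0]:08x}-0x{hit[1]:08x}")
```

Under those assumptions the stock kernel's /dev/mtdblock1 contains none of the written image, and its /dev/mtdblock0 contains the image only at a 14 MiB offset rather than at its start.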

Quote:
Originally Posted by OkCalis
I'm rather confused. What do you suggest I should do next?
Several things to try:
A. Change the partition definition in the bootargs for your custom kernel to
Code:
mtdparts=nand:128M(rootfs),-(homefs)
OR

B1. Erase the NAND flash.
B2. Boot with the custom kernel and NFS rootfs, then mount the JFFS2 on the NAND.
B3. Write some files to the JFFS2, and verify.
B4. Unmount, reboot, mount and verify.
This test is outlined in the MTD FAQ.
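
Steps B1 to B4 could look roughly like the sketch below. It assumes a reasonably recent mtd-utils that provides flash_erase with the -j (JFFS2 cleanmarker) option; older releases ship flash_eraseall instead. The partition number, mount point and test file are placeholders to be replaced with values checked against /proc/mtd, and jffs2_roundtrip and run are hypothetical helper names. By default the script only prints the commands, because the erase is destructive.

```shell
#!/bin/sh
# Sketch of the erase / mount / write / verify cycle for a JFFS2 partition.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
: "${DRY_RUN:=1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = MTD partition number (assumption: verify it in /proc/mtd first)
# $2 = mount point
# $3 = any existing file to use as test data
jffs2_roundtrip() {
    n=$1
    mnt=$2
    src=$3
    run flash_erase -j "/dev/mtd$n" 0 0           # B1: erase the whole partition
    run mount -t jffs2 "/dev/mtdblock$n" "$mnt"   # B2: mount the empty JFFS2
    run cp "$src" "$mnt/testfile"                 # B3: write a file ...
    run cmp "$src" "$mnt/testfile"                #     ... and verify it
    run umount "$mnt"                             # B4: unmount; after a reboot,
                                                  #     mount and cmp again
}
```

Calling jffs2_roundtrip 1 /mnt/nand /etc/hostname prints the five commands for review; the same call with DRY_RUN=0 executes them.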

Regards

Last edited by blue_z; 03-12-2018 at 04:36 AM.
 
Old 03-13-2018, 03:15 AM   #13
OkCalis
Member
 
Registered: Dec 2017
Posts: 34

Original Poster
Rep: Reputation: Disabled
Hi. Here's what I did:

I edited the kernel configuration file where NAND partitions are defined and rebuilt the kernel, so the "creating n MTD partitions on davinci_nand.1" message matches the partitions defined in the U-Boot environment, that is:

Code:
device nand0 <nand>, # parts = 3
 #: name                size            offset          mask_flags
 0: userfs              0x00e00000      0x00000000      0
 1: rootfs              0x08000000      0x00e00000      0
 2: unused              0x17200000      0x08e00000      0
Then, I booted with this custom kernel and an NFS root file system.

My "/proc/mtd" is:

Code:
dev:    size   erasesize  name
mtd0: 00e00000 00020000 "userfs"
mtd1: 08000000 00020000 "rootfs"
mtd2: 17200000 00020000 "unused"
mtd3: 00010000 00010000 "ubl"
mtd4: 00080000 00010000 "u-boot"
mtd5: 00010000 00010000 "u-boot-env"
mtd6: 00010000 00010000 "periph-config"
mtd7: 00050000 00010000 "reserved"
mtd8: 00300000 00010000 "kernel"
mtd9: 00200000 00010000 "fpga"
mtd10: 00200000 00010000 "spare"
The first three MTDs are my NAND partitions, which seem correct.
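
One way to cross-check a listing like this against the U-Boot table is to convert the hex sizes into KiB. A small sketch, where mtd_sizes is a hypothetical helper that reads /proc/mtd-style text on stdin, so it also works on a saved log:

```shell
#!/bin/sh
# Print each MTD partition's size in KiB from /proc/mtd-style input.
mtd_sizes() {
    while read -r dev size erasesize name; do
        case $dev in
            mtd*:) ;;        # data line, e.g. mtd1: 08000000 00020000 "rootfs"
            *) continue ;;   # skip the "dev: size erasesize name" header
        esac
        echo "${dev%:} $name $(( 0x$size / 1024 )) KiB"
    done
}
```

On the target it would be run as mtd_sizes < /proc/mtd. For the three NAND entries above, the sizes work out to 14 MiB, 128 MiB and 370 MiB, which add up to 512 MiB (0x20000000).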

Then, I tried mounting the JFFS2 with the "mount" command.

Code:
mount -t jffs2 /dev/mtd1 ~/nand
This returned an error stating "mtd1" is not a block device. Hence, I tried the following:

Code:
mount -t jffs2 /dev/mtdblock1 ~/nand
And this one returned tons of ECC errors, even more than U-Boot does. (Do you know the difference between "mtdX" and "mtdblockX", btw?)

What can be inferred from this?

-----------------------------------

PS:
Quote:
B1. Erase the NAND flash.
B2. Boot with the custom kernel and NFS rootfs, then mount the JFFS2 on the NAND.
B3. Write some files to the JFFS2, and verify.
B4. Unmount, reboot, mount and verify.
I couldn't try this because U-Boot itself is installed on NAND and I don't know the address range where it resides. I had to come up with the above test.

-----------------------------------

Addendum:
I repeated this test on the MityDsp board (where flash file system works).
I booted with NFS, then issued the command: "mount -t jffs2 /dev/mtdblock1 ~/nand" and it successfully mounted the file system. I read and wrote files on the mounted file system, unmounted it, and verified that files can be written.
This process did not work on my custom board.

Last edited by OkCalis; 03-13-2018 at 04:43 AM. Reason: Addendum
 
Old 03-13-2018, 08:41 PM   #14
blue_z
Member
 
Registered: Jul 2015
Location: USA
Distribution: Ubuntu, Lubuntu, Mint, custom embedded
Posts: 104

Rep: Reputation: Disabled
Quote:
Originally Posted by OkCalis
Here's what I did:
...
And this one returned tons of ECC errors, even more than U-Boot does.

What can be inferred from this?
That you cannot follow sensible suggestions?

What you did was a waste of time.
The results were completely predictable, given that you only changed (a) the method of specifying the partitions and (b) when the partition is mounted.


Quote:
Originally Posted by OkCalis
Do you know the difference between "mtdX" and "mtdblockX", btw?
Yes, I do.
You need to study the MTD documentation.
The answer you seek is in the first section titled "MTD overview", which means that it's a basic concept.
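
In short, /dev/mtdX is a character device that gives raw access to the flash partition (it is what mtd-utils tools such as flash_erase operate on), while /dev/mtdblockX is a block-device layer over the same partition, which is what mount expects. That is why the first mount attempt reported "not a block device". A quick way to see the difference on a target, with dev_kind as a hypothetical helper name:

```shell
#!/bin/sh
# Report whether a device node is a character or a block device.
dev_kind() {
    if [ -c "$1" ]; then
        echo char
    elif [ -b "$1" ]; then
        echo block
    else
        echo other
    fi
}
```

On a system with MTD support, dev_kind /dev/mtd1 should report char and dev_kind /dev/mtdblock1 should report block, assuming both nodes exist.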


Quote:
Originally Posted by OkCalis
I couldn't try this because ...
Yes, you could.


Quote:
Originally Posted by OkCalis
... the U-Boot itself is installed on NAND ...
Wrong, where did you get this idea?
If you had studied the kernel boot log as previously suggested, the MTD partition names clearly indicate that the U-Boot binary and environment are stored in serial flash, rather than NAND flash.
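
Besides the partition names, kernels that expose /sys/class/mtd also report a flash type per MTD device, which is a more direct check of which chip the kernel thinks a partition lives on. A sketch, assuming that sysfs directory is present on the kernel in question (very old kernels may lack it); mtd_types is a hypothetical helper, and its optional argument only exists so the function can be pointed at a different directory:

```shell
#!/bin/sh
# List each MTD device with its name and flash type as reported by sysfs.
# The optional argument overrides the sysfs directory.
mtd_types() {
    sysdir=${1:-/sys/class/mtd}
    for d in "$sysdir"/mtd*; do
        [ -f "$d/type" ] || continue   # skip entries without a type attribute
        echo "${d##*/} $(cat "$d/name") $(cat "$d/type")"
    done
}
```

NAND partitions are reported as "nand", whereas SPI serial flash typically shows up as "nor" or "dataflash". Note that this reflects what the kernel's board configuration declares, which is not necessarily how the hardware is actually wired.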


Quote:
Originally Posted by OkCalis
I don't know the address range where it resides.
Yes, that is just part of the problem.
You don't seem to know where anything is stored in the various flash devices.
You probably weren't even aware of the existence of the serial flash chip.


Quote:
Originally Posted by OkCalis
I had to come up with the above test.
No, you didn't have to. It was a waste of time that produced no salient information.

Quote:
Originally Posted by OkCalis
I repeated this test on the MityDsp board (where flash file system works).
Other than the practice using the mount command, this is another waste of time that produced no salient information.

If you're going to ignore the suggestions of experts that wrote the MTD FAQ for JFFS2, then you're wasting everybody's time.

Regards

Last edited by blue_z; 03-13-2018 at 09:31 PM.
 
Old 03-14-2018, 12:57 AM   #15
OkCalis
Member
 
Registered: Dec 2017
Posts: 34

Original Poster
Rep: Reputation: Disabled
Quote:
Wrong, where did you get this idea?
If you had studied the kernel boot log as previously suggested, the MTD partition names clearly indicate that the U-Boot binary and environment are stored in serial flash, rather than NAND flash.
That's the problem. I'm 100% sure that U-Boot is located on the NAND flash. Although there is an SPI flash on the board, the OMAP chip doesn't use it; it's solely used by the FPGA side. Hence, the components expected to reside in SPI flash by the kernel (such as UBL, U-Boot, environment variables) are actually on NAND.

I've been aware of that partition table in the boot log, as I do my best to follow your sensible suggestions and I also appreciate them, I really do. Guess my best is sometimes not enough. Sorry; I didn't mean to waste your time nor anyone else's.
 
  



Tags
arm, ecc, kernel panic

