LinuxQuestions.org
Linux - Kernel This forum is for all discussion relating to the Linux kernel.

Old 07-11-2018, 01:11 PM   #1
knopper1
LQ Newbie
 
Registered: Jul 2018
Location: Germany
Distribution: KNOPPIX
Posts: 1

Zero sized files after eject + remount, confusing behavior of __mark_inode_dirty()


Hello,

For the last few 4.x kernel versions, we have observed strange behavior when (software-)ejecting a mounted device after a file copy, then replugging and remounting it.

In our case, the scenario is reliably reproducible:
  1. Code:
    mount /media/sdb1
    (vfat-formatted USB storage device, listed in /etc/fstab with standard options and without the sync option)
  2. Code:
    cp files /media/sdb1/
  3. Code:
    eject /media/sdb1
    (Note that eject also umounts the partition properly)
  4. Replug the same USB pen drive.
    Up to this point, "dmesg" shows no errors.
  5. Code:
    mount /media/sdb1
    At this point, after a successful mount, "dmesg" shows the following trace:
    Code:
    WARNING: at fs/fs-writeback.c:1196 __mark_inode_dirty+0x1d0/0x1d4() 
    bdi-block not registered 
    ... some unrelated vfs functions in the calling hierarchy ...
  6. Code:
    ls -l /media/sdb1
    ...but the files show up as copied OK, with correct file size. Content checksum is also OK.
  7. Code:
    cp files2... /media/sdb1/
    Copying some more files now (new ones).
  8. Code:
    eject /media/sdb1
  9. (Replug...)
  10. Code:
    mount /media/sdb1
    The trace appears again, as in step 5.
  11. Code:
    ls -l /media/sdb1
    Now all files copied after the SECOND mount show up with size zero, while those copied after the FIRST mount are still OK. dosfsck reports the file system as corrupt: the space used by the newly copied files is allocated, yet the files themselves show up as zero-sized.
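
The checks in steps 6 and 11 can be scripted so that every run of the scenario is compared in the same way. The following is only a sketch: the helper names list_zero_sized, record_checksums and verify_checksums are made up for illustration, GNU coreutils md5sum is assumed, and the checksum list is assumed to be stored outside the USB device.

```bash
# Hypothetical helpers for checking a mount point after the remount.

# Print all regular files of size zero below directory $1.
list_zero_sized() {
    # -type f: regular files only; -size 0: empty files
    find "$1" -type f -size 0 | sort
}

# Store checksums of all files below directory $1 in list file $2.
# Paths are recorded relative to $1, so the list survives a remount.
record_checksums() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 ) > "$2"
}

# Compare the files below directory $1 against list file $2.
# Returns non-zero if any file is missing or has different content.
verify_checksums() {
    # The list is fed on stdin so that a relative $2 still works after cd.
    ( cd "$1" && md5sum -c --quiet - ) < "$2"
}
```

A possible use would be to run `record_checksums /media/sdb1 /tmp/sums` right after step 7, and `list_zero_sized /media/sdb1` together with `verify_checksums /media/sdb1 /tmp/sums` after step 10.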

The "bdi-block not registered" message in the trace originates from a WARN() in the internal function __mark_inode_dirty() in fs/fs-writeback.c, and seems to be related to the wrong file sizes observed. It could indicate that the device is already gone at the time its inodes are to be marked dirty.
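
To see after which mount the warning shows up, the kernel log can be searched for the message text. A minimal sketch, in which count_bdi_warnings is a hypothetical helper name and the search string is the one quoted in the trace above:

```bash
# Hypothetical helper: count kernel log lines that contain the warning text.
# Reads log text on stdin and prints the number of matching lines.
count_bdi_warnings() {
    # -F: fixed string match, -c: print the count of matching lines.
    # "|| true" keeps the exit status at zero when nothing matches.
    grep -F -c 'bdi-block not registered' || true
}
```

Running `dmesg | count_bdi_warnings` after each mount in steps 1, 5 and 10 would show whether the count grows with every remount.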

My best guess is that "eject" causes some information about the unmounted file system to stay cached internally, and that this information is later reused incorrectly.

After many experiments, the following patch seems to "fix" the problem: the kernel trace no longer appears after the second and subsequent mounts, and with the same steps as before all file data and metadata arrive completely in the file system, with no more zero-sized files. I'm still fairly sure it is wrong in some way, but I have not observed any bad effects yet: neither growth nor shrinkage of the dirty page ratio, nor data loss of any kind.

Code:
--- linux-4.17.5/fs/fs-writeback.c.orig 2018-07-09 23:58:45.000000000 +0200
+++ linux-4.17.5/fs/fs-writeback.c      2018-07-11 05:36:23.000000000 +0200
@@ -2115,8 +2115,11 @@ void __mark_inode_dirty(struct inode *in
 {
        struct super_block *sb = inode->i_sb;
        int dirtytime;
 
+       /* skip if s_bdev is unset */
+       if(sb->s_bdev == NULL) return;
+        
        trace_writeback_mark_inode_dirty(inode, flags);
 
        /*
         * Don't do this for I_DIRTY_PAGES - that doesn't actually

The "bdi-block not registered" WARN() message turns up frequently in a web search. It would not bother me if it were not followed by a real file system problem.

Now my questions:
  1. How and why can sb->s_bdev become NULL at this point, anyway? It seems to happen quite frequently, even when "eject" is not used to unmount the storage device.
  2. What bad effects could the patch have? I could imagine inodes that are not backed by a block device never getting written, because they are no longer enqueued properly. Can this happen, and how?
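
As a rough aid for the second question, the sketch below lists mounted file systems whose source is not a /dev/ path, for example tmpfs or NFS. The assumption, not verified against the 4.17 sources in this thread, is that such file systems leave sb->s_bdev as NULL, in which case the early return in the patch would skip dirtying for all of their inodes. The helper name list_nodev_mounts is hypothetical, and the /dev/ prefix test is only a heuristic.

```bash
# Hypothetical helper: read mount-table lines in the /proc/self/mounts format
# (source, mount point, type, options, ...) on stdin and print
# "mountpoint fstype" for every entry whose source is not a /dev/ path.
list_nodev_mounts() {
    while read -r src mnt fstype _rest; do
        case "$src" in
            /dev/*) ;;                                # looks block-backed, skip
            *) printf '%s %s\n' "$mnt" "$fstype" ;;   # candidate for s_bdev == NULL
        esac
    done
}
```

On a test machine it could be run as `list_nodev_mounts < /proc/self/mounts`. Writing a file to one of the listed writable mounts on a patched kernel, then checking whether the data survives a remount or reboot, would be one way to probe for the feared side effect.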

Regards
KK

Last edited by knopper1; 07-11-2018 at 01:13 PM.
 
Old 07-22-2018, 09:48 AM   #2
Mara
Moderator
 
Registered: Feb 2002
Location: Grenoble
Distribution: Debian
Posts: 9,681

It looks like a kernel issue. You've done enough investigation to post your report to the Linux Kernel Mailing List, which is the place to report such bugs. If you need assistance with that, please let us know.
 
Old 07-24-2018, 02:56 AM   #3
AwesomeMachine
LQ Guru
 
Registered: Jan 2005
Location: USA and Italy
Distribution: Debian testing/sid; OpenSuSE; Fedora; Mint
Posts: 5,480

I've never used eject, but perhaps it will work if you run 'sync' before eject.
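
The suggestion could be tried with a small wrapper like the one below. It is only a sketch: flush_and_eject is a hypothetical helper name, and the EJECT_CMD variable is an illustrative override that allows a dry run without a device. Whether the extra sync changes the outcome is untested here.

```bash
# Hypothetical helper: flush dirty data, then eject the given mount point.
flush_and_eject() {
    sync                        # ask the kernel to write out all dirty data
    ${EJECT_CMD:-eject} "$1"    # then unmount and eject, as in steps 3 and 8
}
```

With this in place, `flush_and_eject /media/sdb1` would replace the plain eject call in steps 3 and 8.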
 
  

