LinuxQuestions.org
Linux - Kernel This forum is for all discussion relating to the Linux kernel.

Old 08-21-2009, 09:28 AM   #1
jmcdonald_mcds
LQ Newbie
 
Registered: Apr 2008
Location: Houston, TX
Distribution: Fedora 8
Posts: 5

Rep: Reputation: 0
EBUSY failure from write() when writing large blocks (>256k) of data...


I suspect there is a simple answer to this and that it is just a problem of my own ignorance... so I've come here looking to correct that.

I am using the mhvtl Virtual Tape Library driver under Fedora 8 (kernel 2.6.23.9). (mhvtl, for those who aren't familiar with it, lets you emulate a virtual tape library... i.e. a set of tape drives with autoloaders, etc. using files on one of your hard disks.) I've downloaded the sources, built and loaded the kernel module and user-space drivers, and gotten the basics working fine. My defined 'tape drives' show up under /dev and seem to function, for the most part, like real tape drives.

But one problem has me stumped... if I try to write more than about 256K in a single write() call, I get a failure status with errno = EBUSY. Calls to write anything <= 256k seem to work fine.

I dug into the code... both the kernel module (which looks like it accomplishes all its reads/writes via the ioctl file_operation), and the user-mode Virtual Tape driver... but from the debugging output I put into those guys, it looks like the 300k write call never even gets that far... it's getting filtered out somewhere before it gets to the mhvtl drivers. I'm just guessing here that some other driver earlier in the call chain is returning the failure code.

This same test program works fine, though, when writing to a real Tape drive. So there must be some way of letting the intermediate driver know that the mhvtl driver can handle a larger write. I just haven't been able to find out what that is. I don't see any 256K limit anywhere in the mhvtl driver code. (Perhaps 256k is a default value?)
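A minimal sketch of the kind of test described above, for anyone who wants to reproduce it. The function names are made up for illustration, and the device path you pass in (for example /dev/nst0) is an assumption; use whatever node your virtual drive shows up as.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Try to write `nbytes` of filler data to `fd` in ONE write() call.
 * Returns 0 if the call wrote everything, the errno value if write()
 * failed, or -1 on a short write or an allocation failure.
 */
int probe_write(int fd, size_t nbytes)
{
    char *buf = malloc(nbytes);
    ssize_t written;
    int saved_errno;

    if (buf == NULL)
        return -1;
    memset(buf, 0xA5, nbytes);

    written = write(fd, buf, nbytes);
    saved_errno = errno;    /* capture before free() can disturb it */
    free(buf);

    if (written < 0)
        return saved_errno;
    return ((size_t)written == nbytes) ? 0 : -1;
}

/*
 * Open `path` write-only, probe it with a single write of `nbytes`,
 * and close it again. Returns the errno from open() if the open fails,
 * otherwise whatever probe_write() returns.
 */
int probe_device(const char *path, size_t nbytes)
{
    int fd = open(path, O_WRONLY);
    int result;

    if (fd < 0)
        return errno;
    result = probe_write(fd, nbytes);
    close(fd);
    return result;
}
```

Calling probe_device() on the drive with 256 * 1024 and then with 300 * 1024 and comparing the two return values against EBUSY should show whether the size alone triggers the failure.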

So that's the crux of my problem, and the reason I come to you folks. Would anyone happen to have experience with this, and could you offer any suggestions on where to look?

Thanks in advance for any help you can give.

John McD
 
Old 08-21-2009, 12:08 PM   #2
David1357
Senior Member
 
Registered: Aug 2007
Location: South Carolina, U.S.A.
Distribution: Ubuntu, Fedora Core, Red Hat, SUSE, Gentoo, DSL, coLinux, uClinux
Posts: 1,302
Blog Entries: 1

Rep: Reputation: 107
Quote:
Originally Posted by jmcdonald_mcds
I'm just guessing here that some other driver earlier in the call chain is returning the failure code.
Did you try running "strace" on your application to see exactly what function fails?

"EBUSY" is not in the list of normal "errno" values for "write". However, from the man page for WRITE(2): "Other errors may occur, depending on the object connected to fd."

You might also try grepping for "EBUSY" in the driver and the library.
 
Old 08-21-2009, 12:56 PM   #3
jmcdonald_mcds
LQ Newbie
 
Registered: Apr 2008
Location: Houston, TX
Distribution: Fedora 8
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by David1357
Did you try running "strace" on your application to see exactly what function fails?
Yep. strace says the write() call failed with an errno of EBUSY.

Quote:
You might also try grepping for "EBUSY" in the driver and the library.
I had that thought as well. EBUSY doesn't appear anywhere in any of the driver code.

So I got 'determined' and went in and put syslog() calls (or printk calls, for the kernel driver) in almost every function in the kernel-mode source file and in the user-mode driver. I rebuilt, reloaded the drivers, and ran the test program again. I could see my new log calls for the shorter writes, but not for the 300k write. It's as if my test program had simply not made that call... though strace verified that it had.

It's got me scratching my head.
 
Old 08-21-2009, 01:31 PM   #4
David1357
Senior Member
 
Registered: Aug 2007
Location: South Carolina, U.S.A.
Distribution: Ubuntu, Fedora Core, Red Hat, SUSE, Gentoo, DSL, coLinux, uClinux
Posts: 1,302
Blog Entries: 1

Rep: Reputation: 107
Quote:
Originally Posted by jmcdonald_mcds
It's got me scratching my head.
Now you've got me scratching my head. Try running this program and see what your resource limits are:
Code:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <sys/time.h>
#include <sys/resource.h>

typedef struct _rlimit_entry
{
    const char *pszName;
    const int  iValue;
} rlimit_entry, *prlimit_entry;

#define RLIMIT_ENTRY(name) \
    { #name, name }

static rlimit_entry rlimit_table[] =
{
    RLIMIT_ENTRY(RLIMIT_AS),
    RLIMIT_ENTRY(RLIMIT_CORE),
    RLIMIT_ENTRY(RLIMIT_CPU),
    RLIMIT_ENTRY(RLIMIT_DATA),
    RLIMIT_ENTRY(RLIMIT_FSIZE),
    RLIMIT_ENTRY(RLIMIT_LOCKS),
    RLIMIT_ENTRY(RLIMIT_MEMLOCK),
    RLIMIT_ENTRY(RLIMIT_MSGQUEUE),
    RLIMIT_ENTRY(RLIMIT_NICE),
    RLIMIT_ENTRY(RLIMIT_NOFILE),
    RLIMIT_ENTRY(RLIMIT_NPROC),
    RLIMIT_ENTRY(RLIMIT_RSS),
    RLIMIT_ENTRY(RLIMIT_RTPRIO),
//    RLIMIT_ENTRY(RLIMIT_RTTIME),
    RLIMIT_ENTRY(RLIMIT_SIGPENDING),
    RLIMIT_ENTRY(RLIMIT_STACK),
};

#define ARRAY_SIZE(array) \
    (sizeof((array)) / sizeof((array)[0]))

int main(void)
{
    struct rlimit rlData;
    int iIndex;
    int iResult;
    int iReturn;

    // Assume failure
    iReturn = EXIT_FAILURE;

    for (iIndex = 0; iIndex < ARRAY_SIZE(rlimit_table); iIndex++)
    {
        iResult = getrlimit(rlimit_table[iIndex].iValue, &rlData);
        if (-1 == iResult)
        {
            fprintf(stderr, "ERROR: Could not get resource limits for %s: %s\n", rlimit_table[iIndex].pszName, strerror(errno));
            goto err;
        }

        printf("%-20s: ", rlimit_table[iIndex].pszName);
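        // rlim_t is not guaranteed to be unsigned long, so cast explicitly
        printf("rlim_cur = %10llu, ", (unsigned long long)rlData.rlim_cur);
        printf("rlim_max = %10llu\n", (unsigned long long)rlData.rlim_max);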
    }

    // Sweet success!
    iReturn = EXIT_SUCCESS;

err:

    return(iReturn);
}
Build it using "gcc -g -o <filename> <filename>.c" where <filename> is whatever you decide to name it.
 
Old 08-24-2009, 08:06 AM   #5
jmcdonald_mcds
LQ Newbie
 
Registered: Apr 2008
Location: Houston, TX
Distribution: Fedora 8
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by David1357
Now you've got me scratching my head. Try running this program and see what your resource limits are:
...
No joy... I don't see anything in the limits that looks to be a problem.

Code:
 ./getLimits 
RLIMIT_AS           : rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615
RLIMIT_CORE         : rlim_cur =   20480000, rlim_max =   20480000
RLIMIT_CPU          : rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615
RLIMIT_DATA         : rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615
RLIMIT_FSIZE        : rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615
RLIMIT_LOCKS        : rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615
RLIMIT_MEMLOCK      : rlim_cur =      32768, rlim_max =      32768
RLIMIT_MSGQUEUE     : rlim_cur =     819200, rlim_max =     819200
RLIMIT_NICE         : rlim_cur =          0, rlim_max =          0
RLIMIT_NOFILE       : rlim_cur =       1024, rlim_max =       1024
RLIMIT_NPROC        : rlim_cur =      16111, rlim_max =      16111
RLIMIT_RSS          : rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615
RLIMIT_RTPRIO       : rlim_cur =          0, rlim_max =          0
RLIMIT_SIGPENDING   : rlim_cur =      16111, rlim_max =      16111
RLIMIT_STACK        : rlim_cur =   10485760, rlim_max = 18446744073709551615
I've also been trading e-mails with Mark Harvey, the fellow developing mhvtl... he's confirmed the same behavior on Ubuntu. Curious, but at least it's consistent. :-)

John M
 
Old 08-24-2009, 01:27 PM   #6
David1357
Senior Member
 
Registered: Aug 2007
Location: South Carolina, U.S.A.
Distribution: Ubuntu, Fedora Core, Red Hat, SUSE, Gentoo, DSL, coLinux, uClinux
Posts: 1,302
Blog Entries: 1

Rep: Reputation: 107
Quote:
Originally Posted by jmcdonald_mcds
I've also been trading e-mails with Mark Harvey, the fellow developing mhvtl...
The only thing I found that might be related is in the SCSI tape driver (drivers/scsi/st.c): if kzalloc fails there, the driver reports EBUSY, unless a signal is pending, in which case it reports EINTR:
Code:
static struct st_request *st_allocate_request(struct scsi_tape *stp)
{
        struct st_request *streq;

        streq = kzalloc(sizeof(*streq), GFP_KERNEL);
        if (streq)
                streq->stp = stp;
        else {
                DEBC(printk(KERN_ERR "%s: Can't get SCSI request.\n",
                            tape_name(stp)););
                if (signal_pending(current))
                        stp->buffer->syscall_result = -EINTR;
                else
                        stp->buffer->syscall_result = -EBUSY;
        }

        return streq;
}
To copy from user space to kernel space, a driver needs to use copy_from_user. If some part of the kernel or some other driver is trying to allocate the target buffer on-the-fly, the allocation of 256KB might fail, and the code might return EBUSY. This may seem strange because ENOMEM would appear to be the more appropriate choice. However, not every line of the kernel makes sense to the casual observer.

Are you trying to do back-to-back 256KB writes? If so, if you are not opening the device with O_SYNC, you may be returning early from write (i.e. before it is finished). This might cause the next write to fail with EBUSY.
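A rough sketch of how one might test the O_SYNC idea, assuming nothing about how the tape driver actually treats the flag (the thread does not establish that). The function names are hypothetical; the device path is whatever your setup uses.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Open `path` write-only with O_SYNC, so that each write() is asked to
 * complete before returning. Returns the descriptor, or -1 with errno set.
 */
int open_sync_writer(const char *path)
{
    return open(path, O_WRONLY | O_SYNC);
}

/*
 * Issue `count` back-to-back write() calls of `nbytes` each.
 * Returns how many calls succeeded in full. *first_errno receives the
 * errno of the first failing call, or stays 0 if no call failed
 * (a short write also stops the loop and leaves it at 0).
 */
int back_to_back_writes(int fd, const char *buf, size_t nbytes,
                        int count, int *first_errno)
{
    int i;

    *first_errno = 0;
    for (i = 0; i < count; i++) {
        ssize_t n = write(fd, buf, nbytes);

        if (n < 0) {
            *first_errno = errno;
            break;
        }
        if ((size_t)n != nbytes)
            break;
    }
    return i;
}
```

Running back_to_back_writes() once on a descriptor from open_sync_writer() and once on a plain O_WRONLY descriptor would show whether the flag makes any difference to when EBUSY appears.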

Last edited by David1357; 08-25-2009 at 10:38 AM. Reason: Fixed a typo
 
Old 08-25-2009, 08:50 AM   #7
jmcdonald_mcds
LQ Newbie
 
Registered: Apr 2008
Location: Houston, TX
Distribution: Fedora 8
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by David1357
The only thing I found that might be related is in the SCSI tape driver (drivers/scsi/st.c): if kzalloc fails there, the driver reports EBUSY, unless a signal is pending, in which case it reports EINTR:
. . .
Are you trying to do back-to-back 256KB writes? If so, if you are not opening the device with O_SYNC, you may be returning early from write (i.e. before it is finished). This might cause the next write to fail with EBUSY.
I suspect it may not be this function itself (since that call seems to be allocating an st_request*), but it may be the equivalent function that tries to allocate the buffer to hold the data to be written. It probably defaults to 256k and has a bug in the logic for handling requests that take more than one of its default buffers.

Re: doing back-to-back writes, yes, my test program is doing that. But it seems to be the size of the write request that consistently causes the failure, and not the frequency. I can do a whole bunch of smaller writes with no EBUSY failure, but even *one* write of a block larger than 256k will cause a failure.

I'm still poking at it a bit, but have been pulled off to fight other 'fires', as it were, so this is a background task. (Isn't that always the way it goes?) We've set the Blocking size in our app down to 256k, which lets us use vtl for testing... so I'm really just chasing this one now to try and improve vtl itself.

Anyway, thanks for the feedback. That's a good lead on where to look.
 
Old 08-27-2009, 08:36 AM   #8
jmcdonald_mcds
LQ Newbie
 
Registered: Apr 2008
Location: Houston, TX
Distribution: Fedora 8
Posts: 5

Original Poster
Rep: Reputation: 0
Mystery solved...

Mark Harvey, the mhvtl maintainer, tracked it down. The mhvtl driver was specifying an sg_tablesize of 64 in its scsi_host_template structure when it called scsi_host_alloc(). Apparently, this tells the SCSI driver to allocate 64 pages, at 4096 bytes each on our system, for data transfer. That is where the 256k limit comes from.

He says he'll be raising that limit in the next mhvtl release.
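The arithmetic behind the limit can be checked with a few lines of C. This is only a sketch and it assumes each of the sg_tablesize entries covers at most one page, which is how the explanation above reads; the function names are made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Upper bound on a single transfer, ASSUMING each of the `sg_tablesize`
 * scatter-gather entries maps at most one page of `page_size` bytes.
 */
size_t max_transfer_bytes(unsigned int sg_tablesize, size_t page_size)
{
    return (size_t)sg_tablesize * page_size;
}

/* Returns 1 if a write of `nbytes` fits under that bound, 0 otherwise. */
int fits_in_one_transfer(size_t nbytes, unsigned int sg_tablesize,
                         size_t page_size)
{
    return nbytes <= max_transfer_bytes(sg_tablesize, page_size);
}

/* Print the bound for this machine's page size, as reported by sysconf(). */
void print_transfer_limit(unsigned int sg_tablesize)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0) {
        fprintf(stderr, "could not determine the page size\n");
        return;
    }
    printf("sg_tablesize=%u, page=%ld bytes, limit=%zu bytes\n",
           sg_tablesize, page,
           max_transfer_bytes(sg_tablesize, (size_t)page));
}
```

With sg_tablesize = 64 and 4096-byte pages the bound is 64 * 4096 = 262144 bytes, i.e. 256k, so a 300k write does not fit while a 256k write does.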
 
Old 08-27-2009, 12:02 PM   #9
David1357
Senior Member
 
Registered: Aug 2007
Location: South Carolina, U.S.A.
Distribution: Ubuntu, Fedora Core, Red Hat, SUSE, Gentoo, DSL, coLinux, uClinux
Posts: 1,302
Blog Entries: 1

Rep: Reputation: 107
Quote:
Originally Posted by jmcdonald_mcds
Apparently, this tells the SCSI driver to allocate 64 pages, at 4096 bytes each on our system, for data transfer.
Sounds like an easy fix! Don't forget to mark this thread as "solved" using the "Thread Tools" drop down.
 
  

