LinuxQuestions.org
Linux - Kernel This forum is for all discussion relating to the Linux kernel.

Old 11-17-2011, 03:53 PM   #1
papaLou
LQ Newbie
 
Registered: Jan 2011
Posts: 12

Rep: Reputation: 0
Mutex is always locked? Not functioning properly...


I am developing a driver for a device on Linux kernel 2.6.35.7. I create a mutex (I have tried both DEFINE_MUTEX() and mutex_init()), and as soon as it is created mutex_is_locked always returns 1. If I call mutex_unlock and then check mutex_is_locked, it still returns 1.

When I call mutex_trylock it returns 0, as expected, because it cannot acquire the mutex. But when I call mutex_lock_interruptible it returns 0, as if it had acquired the lock.

This is very frustrating: it worked on a previous kernel, but since we upgraded our kernel it has been useless.

Am I missing some change to the Linux kernel mutex API? I have searched and found all the info on kernel mutexes I need, but the behavior I'm seeing is not described anywhere...


EDIT: I am now noticing that after I call mutex_unlock, mutex_is_locked returns 1, and after I call mutex_lock, mutex_is_locked returns 0.
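
The two return values compared above follow opposite conventions, which is easy to misread: mutex_trylock returns 1 when it takes the lock and 0 when it does not, while mutex_lock_interruptible returns 0 on success and a negative errno (-EINTR) if a signal interrupts the wait. Here is a minimal userspace sketch of those conventions. struct sketch_lock and the sketch_* functions are hypothetical stand-ins, not kernel code, and because the model is single-threaded it reports a held lock as an interrupted wait instead of sleeping.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kernel mutex; not kernel code. */
struct sketch_lock {
    bool held;
};

/* Mirrors the mutex_trylock() convention: 1 if taken, 0 if already held. */
static int sketch_trylock(struct sketch_lock *l)
{
    if (l->held)
        return 0;
    l->held = true;
    return 1;
}

/*
 * Mirrors the mutex_lock_interruptible() convention: 0 on success,
 * -EINTR if the wait was interrupted. Assumption: this model cannot
 * sleep, so a held lock is reported as an interrupted wait.
 */
static int sketch_lock_interruptible(struct sketch_lock *l)
{
    if (l->held)
        return -EINTR;
    l->held = true;
    return 0;
}

/* Releases the lock; releasing an unheld lock is treated as a caller bug. */
static void sketch_unlock(struct sketch_lock *l)
{
    assert(l->held);
    l->held = false;
}
```

Under these conventions a 0 from the trylock call means "not acquired", while a 0 from the interruptible call means "acquired", so the two zeros do not contradict each other by themselves.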

Last edited by papaLou; 11-17-2011 at 04:29 PM.
 
Old 11-17-2011, 04:35 PM   #2
kbp
Senior Member
 
Registered: Aug 2009
Posts: 3,790

Rep: Reputation: 653
What's the return code from mutex_init()?
 
Old 11-17-2011, 05:01 PM   #3
papaLou
LQ Newbie
 
Registered: Jan 2011
Posts: 12

Original Poster
Rep: Reputation: 0
The mutex_init I have appears to be a macro. How do I look at the return code?
 
Old 11-17-2011, 05:09 PM   #4
kbp
Senior Member
 
Registered: Aug 2009
Posts: 3,790

Rep: Reputation: 653
Maybe you could post a code snippet; it might be easier.
 
Old 11-17-2011, 05:33 PM   #5
papaLou
LQ Newbie
 
Registered: Jan 2011
Posts: 12

Original Poster
Rep: Reputation: 0
Will work on posting a snippet.

Q: Are mutexes binary, or do they keep a count of requests and only block when you try to lock a mutex with a count of zero?
I've been printing out the mutex_name->count.counter value and it seems to increment and decrement like a semaphore. My code works when I use that count for my conditional locks and unlocks.
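
For reference, in kernels of this era the count field is a state value, not a count of requests: 1 means unlocked, 0 means locked, and a negative value means locked with possible waiters. mutex_is_locked simply tests whether the count differs from 1. A rough userspace model of that encoding follows. struct model_mutex and the model_* helpers are hypothetical names for illustration, and the real field is an atomic_t updated with atomic operations.

```c
#include <assert.h>

/* Hypothetical model of the 2.6.x-era counter; the real field is an atomic_t. */
struct model_mutex {
    int count; /* 1: unlocked, 0: locked, negative: locked, possible waiters */
};

#define MODEL_MUTEX_INIT { 1 }

/* Same test as the header of that era: anything other than 1 means locked. */
static int model_is_locked(const struct model_mutex *m)
{
    return m->count != 1;
}

/* Uncontended acquire: 1 -> 0. Returns 1 on success, 0 if already held. */
static int model_acquire(struct model_mutex *m)
{
    if (m->count != 1)
        return 0;
    m->count = 0;
    return 1;
}

/* A would-be waiter marks the lock contended by driving the count negative. */
static void model_note_waiter(struct model_mutex *m)
{
    assert(m->count <= 0); /* only meaningful while the lock is held */
    m->count = -1;
}

/* Release: in this model the count goes back to 1 however contended it was. */
static void model_release(struct model_mutex *m)
{
    assert(m->count != 1);
    m->count = 1;
}
```

Read this way, a counter that moves down on lock and up on unlock is what a binary mutex looks like from the inside, and a printed value of 1 means "free", not "one holder".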
 
Old 11-17-2011, 05:40 PM   #6
kbp
Senior Member
 
Registered: Aug 2009
Posts: 3,790

Rep: Reputation: 653
Maybe you should use down/up for locking and unlocking, then.

Mutexes are binary:
http://geekswithblogs.net/shahed/arc.../09/81268.aspx

Last edited by kbp; 11-17-2011 at 05:44 PM.
 
Old 11-17-2011, 06:04 PM   #7
papaLou
LQ Newbie
 
Registered: Jan 2011
Posts: 12

Original Poster
Rep: Reputation: 0
That is what I thought. This behavior would indicate a bug in the kernel I compiled, I guess.
 
Old 11-17-2011, 07:23 PM   #8
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,665
Blog Entries: 4

Rep: Reputation: 3945
"A bug in the kernel ...?" Uhh, no.

How do I say this nicely? Well, maybe I don't have to mince too many words because, after all, we're talking about computer software and not about each other. It's a bug, all right, and (how do I say this?) it's in your code somewhere. Because, realistically, there is no other plausible place that it could be. Now, there could obviously be many root causes, and you're going to have to be very methodical (you imply that the code worked properly before, which is a very important finding), but the correct operation of the kernel is more or less something that you can take for granted.
 
Old 11-18-2011, 03:42 PM   #9
papaLou
LQ Newbie
 
Registered: Jan 2011
Posts: 12

Original Poster
Rep: Reputation: 0
Did not mean to offend. I meant an incompatibility with the libraries we have: they have been modified (some updated, some not), so it would be a developer-created bug.
 
Old 11-19-2011, 11:07 AM   #10
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,665
Blog Entries: 4

Rep: Reputation: 3945
Offend? Pshaw. "Nothing personal, it's just ones and zeroes."

Definitely you're going to have to get your libraries ship-shape first. If you're mixing apples and oranges in the same basket then it will be pragmatically impossible to figure anything out. As of course we all know. You don't say if these are your libraries or not. If they are, then I trust that you are using version control, in which case you can look back through the source-code modifications since the "last known-good" point. Or maybe you can just start by rebuilding everything. Also make very sure that you know which libraries, and which versions thereof, are actually being referenced by the code. (Any possible source of uncertainty you can think of ... chase it down, answer it, and prove that you have the right answer.)

We can exclude the kernel from consideration because, if mutex locking was not working properly, nothing in the system would work ... the darned thing would have crashed into a kernel-panic long before now. The semantics of the kernel do change from time to time, e.g. across major-release boundaries, but fundamental interfaces do tend to remain stable, and mutual-exclusion is just about as "fundamental" as it gets.

Therefore, it is going to be an issue with user-land application code. I'd look very closely at the guts of whatever user-land libraries are used to access your driver. Many bugs like this can masquerade as what you think they are. (For example, what if the return code really is nonzero, but the calling sequence is busted such that your app always sees zero?) You need to be able to, among other things, prove that you know what the return code should be (tracing the value all the way from its origin to the point where it is given to userland), and then also prove that the application-side code always comes up with the correct answer in every single case.
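
To make the "app always sees zero" scenario concrete, here is a small sketch under stated assumptions: fake_driver_op is a hypothetical stand-in for whatever entry point the real driver exposes, and the two wrappers are invented for illustration. One discards the result and reports success, the other passes it through unchanged.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a driver entry point; always reports "busy". */
static int fake_driver_op(void)
{
    return -EBUSY;
}

/* Buggy wrapper: calls the driver but discards its result, so callers see 0. */
static int wrapper_swallows_error(void)
{
    fake_driver_op();
    return 0;
}

/* Correct wrapper: passes the driver's result through unchanged. */
static int wrapper_propagates_error(void)
{
    return fake_driver_op();
}

/* Usage: the two wrappers disagree about the very same driver call. */
static void compare_wrappers(void)
{
    assert(wrapper_swallows_error() == 0);
    assert(wrapper_propagates_error() == -EBUSY);
}
```

Checking the value at each layer like this, from the driver outward, is one way to find where a nonzero code gets lost.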

Last edited by sundialsvcs; 11-19-2011 at 11:10 AM.
 
  

