LinuxQuestions.org
04-02-2008, 02:10 PM   #1
guru_stpetebeach
LQ Newbie
 
Registered: Mar 2008
Location: St Petersburg, Florida
Distribution: ubuntu, gentoo & debian
Posts: 28

Rep: Reputation: 17
hal and dbus api - multithreaded is causing deadlock


I seem to be getting a deadlock in an application I am writing using hal and dbus.

The main thread (#1) calls hal_dispatch(timeout=100) in a loop. Here is its backtrace during the deadlock:
#0 0x00002b349aa8ee06 in poll () from /lib/libc.so.6
#1 0x00002b3497c486c7 in ?? () from /usr/lib/libdbus-1.so.3
#2 0x00002b3497c470b4 in ?? () from /usr/lib/libdbus-1.so.3
#3 0x00002b3497c359de in ?? () from /usr/lib/libdbus-1.so.3
#4 0x00002b3497c364a0 in ?? () from /usr/lib/libdbus-1.so.3
#5 0x000000000043585d in hal_dispatch (timeout=100) at shell.cc:52
#6 0x000000000042e830 in main (argc=1, argv=0x7fff1366e638, envp=0x7fff1366e648) at main.cc:204

A second thread (#2), a service thread, calls a function to retrieve a property of a volume (a CD-ROM disc) using its UDI. I have tried to solve the deadlock by creating a separate HalContext for this service thread, in two ways:
1. Using libhal_ctx_init_direct()
2. Using dbus_bus_get(), libhal_ctx_new(), libhal_ctx_set_dbus_connection(), libhal_ctx_init().

Method #2 is the standard method to create a hal/dbus connection and is the same method I use to create the main hal context in the main loop.

Both init sequences are called from inside service thread #2, and in both cases the libhal init function fails.

So I am back to trying to use the main loop's HalContext. But this deadlocks again. Here is the backtrace of thread #2 trying to query a device property:
#0 0x00002b3499d036a6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1 0x00002b3497c4e411 in ?? () from /usr/lib/libdbus-1.so.3
#2 0x00002b3497c35554 in ?? () from /usr/lib/libdbus-1.so.3
#3 0x00002b3497c359b5 in ?? () from /usr/lib/libdbus-1.so.3
#4 0x00002b3497c367e2 in ?? () from /usr/lib/libdbus-1.so.3
#5 0x00002b3497c374fb in ?? () from /usr/lib/libdbus-1.so.3
#6 0x00002b3497c36bac in dbus_connection_send_with_reply_and_block () from /usr/lib/libdbus-1.so.3
#7 0x00002b3497e68c0e in libhal_device_get_property_string () from /usr/lib/libhal.so.1
#8 0x00000000004351da in hal_get_property (ctx=0x65f2d0, udi=0x6868f8 "/org/freedesktop/Hal/devices/volume_label_080211_1004",
key=0x2aaaaaabb82d "volume.mount_point", s=0x407fffa0) at shell.cc:59
#9 0x00002aaaaaab7e0a in task_archive_dvd (task=0x66d780) at archive.cc:548
#10 0x00002aaaaaab8156 in archive_import_handler (service=0x669d68, task=0x66d780) at archive.cc:626
#11 0x0000000000437452 in device_service (self=0x669d68) at sys.cc:168
#12 0x000000000043263b in service_t::s_run (pthis=0x669d68) at service.cc:34
#13 0x00002b3499cff047 in start_thread () from /lib/libpthread.so.0
#14 0x00002b349aa96f4d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()


I am sure I am doing something wrong here, but info on hal/dbus is sparse. API call docs are all I can find. Much of my other hal/dbus code works great because it's in the main thread. In fact, all of this worked when my process was single-threaded and I serialized all the work done by my app.

My guess is that I need a mutex to keep the main loop from dispatching while another thread uses the hal context. What I would really like is to create a second HalContext for each of my service threads so they can query HAL; these extra contexts would typically not be used for event notification.
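To make the mutex idea concrete, here is a minimal sketch of serializing a dispatch loop and a worker thread on one shared object. FakeContext, dispatch_once, get_property and run_demo are hypothetical names made up for illustration; nothing here calls libhal or libdbus, and whether this approach cures the deadlock described above is an assumption, not something confirmed in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for a HalContext and its D-Bus connection.
// It only models "one object shared by two threads".
struct FakeContext {
    int dispatched = 0;  // number of dispatch passes performed
    int queries = 0;     // number of property queries served
};

std::mutex ctx_mutex;              // guards every use of the shared context
std::atomic<bool> running{true};   // tells the dispatch loop when to stop

// Main-loop side: hold the lock only for one short dispatch pass.
void dispatch_once(FakeContext& ctx) {
    std::lock_guard<std::mutex> lock(ctx_mutex);
    ++ctx.dispatched;  // real code would call its dispatch function here
}

// Worker side: take the same lock around the blocking query.
std::string get_property(FakeContext& ctx, const std::string& key) {
    std::lock_guard<std::mutex> lock(ctx_mutex);
    ++ctx.queries;     // real code would call the property getter here
    return "value-of-" + key;
}

// Runs a dispatch loop and one worker concurrently; returns queries served.
int run_demo(int n_queries) {
    FakeContext ctx;
    running = true;
    std::thread main_loop([&ctx] {
        while (running) {
            dispatch_once(ctx);
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });
    std::thread worker([&ctx, n_queries] {
        for (int i = 0; i < n_queries; ++i) {
            std::string v = get_property(ctx, "volume.mount_point");
            assert(!v.empty());
        }
    });
    worker.join();
    running = false;
    main_loop.join();
    return ctx.queries;  // safe to read: both threads have been joined
}
```

One design point: the lock is held across the whole dispatch pass, so a dispatch that blocks for its full timeout would stall every worker for that long. A short timeout, or a non-blocking dispatch, keeps that delay small.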

Anyone a dbus/hal expert?

Thanks,
Colin
 
04-02-2008, 04:20 PM   #2
jailbait
LQ Guru
 
Registered: Feb 2003
Location: Virginia, USA
Distribution: Debian 12
Posts: 8,337

Rep: Reputation: 548
You describe your problem in terms of the order in which various functions are called. A much better way to analyze the problem is to describe which mutexes are locked and in what order.

As a simple theoretical example of a deadlock, suppose that two threads each lock both mutex A and mutex B. You can deadlock when thread 1 holds mutex A and is waiting for mutex B to be released, while thread 2 holds mutex B and is waiting for mutex A to be released.

There are two general solutions to a multiple-mutex interlock. One is for all threads to always lock the mutexes in the same order. In the simple A,B example you could have the rule that all threads locking both A and B must do so in the order B, then A.
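A minimal sketch of the fixed-order rule, using standard C++ threads rather than anything from hal or dbus. The names mutex_a, mutex_b, add_in_order and run_ordered are invented for the illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex mutex_a;
std::mutex mutex_b;
int shared_total = 0;  // only touched while both mutexes are held

// Every thread follows the same rule: lock B first, then A.
// Because no thread ever holds A while waiting for B, the
// circular wait from the A,B example cannot form.
void add_in_order(int amount) {
    std::lock_guard<std::mutex> hold_b(mutex_b);
    std::lock_guard<std::mutex> hold_a(mutex_a);
    shared_total += amount;
}

// Two threads each add 1 `iterations` times; returns the final total.
int run_ordered(int iterations) {
    assert(iterations >= 0);
    shared_total = 0;
    auto work = [iterations] {
        for (int i = 0; i < iterations; ++i) add_in_order(1);
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    return shared_total;
}
```

The rule only helps if every code path obeys it, including library callbacks that take their own locks, which is why diagramming the actual lock order matters.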

If your program logic is such that you cannot guarantee the order of locking, then all threads must issue a multiple lock covering every mutex they use. In my A,B example, all threads must issue a multiple lock for A and B when they first need to lock either one. Issuing multiple locks slows down performance, since some of the locks will be held longer than otherwise necessary.
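In C++, one way to express a "multiple lock" is std::lock, which acquires several mutexes together using a deadlock-avoidance algorithm. This is only a sketch of the idea with made-up names (lock_a, lock_b, bump_from_a_side, bump_from_b_side, run_multi_lock); it is not taken from the code under discussion.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex lock_a;
std::mutex lock_b;
int guarded_count = 0;  // only touched while both mutexes are held

// This thread names A first...
void bump_from_a_side() {
    std::lock(lock_a, lock_b);  // takes both, or waits without holding one
    std::lock_guard<std::mutex> hold_a(lock_a, std::adopt_lock);
    std::lock_guard<std::mutex> hold_b(lock_b, std::adopt_lock);
    ++guarded_count;
}

// ...and this one names B first. With std::lock the differing
// argument order does not create a deadlock.
void bump_from_b_side() {
    std::lock(lock_b, lock_a);
    std::lock_guard<std::mutex> hold_b(lock_b, std::adopt_lock);
    std::lock_guard<std::mutex> hold_a(lock_a, std::adopt_lock);
    ++guarded_count;
}

// Runs both variants concurrently; returns the final count.
int run_multi_lock(int iterations) {
    assert(iterations >= 0);
    guarded_count = 0;
    std::thread t1([iterations] {
        for (int i = 0; i < iterations; ++i) bump_from_a_side();
    });
    std::thread t2([iterations] {
        for (int i = 0; i < iterations; ++i) bump_from_b_side();
    });
    t1.join();
    t2.join();
    return guarded_count;
}
```

As noted above, the cost is that both mutexes are held for the whole critical section even when only one of them is needed at a given moment.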

So to analyze your deadlock problem you need to diagram the order in which each thread locks on the various mutexes in use.

Another possible cause of your deadlock is not releasing mutexes when they are no longer needed. So in your thread/mutex diagram you also need to show when each mutex is released.

---------------------------
Steve Stites
 
04-07-2008, 11:20 AM   #3
guru_stpetebeach
LQ Newbie
 
Registered: Mar 2008
Location: St Petersburg, Florida
Distribution: ubuntu, gentoo & debian
Posts: 28

Original Poster
Rep: Reputation: 17
Hi Jailbait,

Thanks for the reply. However, I am very familiar with mutexes, semaphores and race conditions. In the code I am running, I haven't used any mutexes or defined any critical sections yet.

Why no critical sections yet? Only because I know the execution to be linear right now. By that I mean the initial thread initializes the hal context and makes some calls on it. Then it creates thread 2, and thread 2 tries to make some hal calls but locks up. Going back to thread 1: after it starts thread 2, it does call hal_dispatch(), which may be my source of contention. I am going to remove this call temporarily to see what happens. I have a few other ideas for testing the behavior of the hal API too.

The real problem, though, is that I cannot create more than a single hal context to make calls on. Without that, I will be left serializing all hal calls, yuk. My application is something like a web server. The user defines how many worker threads there are, and the application queues requests/jobs to each of the worker threads. Ideally, each worker thread would have its own hal context.
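If only one context can exist, one way to serialize the calls without sprinkling locks through the workers is to let a single thread own the context and have workers queue requests to it. The sketch below shows that pattern in standard C++. FakeContext, ContextOwner and run_workers are hypothetical names, and the "lookup" is a placeholder; a real version would make its hal call (and possibly its dispatching) on the owner thread.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

struct FakeContext { int queries = 0; };  // hypothetical stand-in for the context

// One thread owns the context; other threads submit requests to it.
class ContextOwner {
public:
    ContextOwner() : thread_([this] { run(); }) {}
    ~ContextOwner() {
        { std::lock_guard<std::mutex> lock(mutex_); stopping_ = true; }
        cv_.notify_one();
        thread_.join();  // the owner drains queued jobs before exiting
    }
    // Callable from any worker; the lookup itself runs on the owner thread.
    std::future<std::string> get_property(const std::string& key) {
        using Task = std::packaged_task<std::string(FakeContext&)>;
        auto task = std::make_shared<Task>([key](FakeContext& ctx) {
            ++ctx.queries;  // placeholder for the real property query
            return "value-of-" + key;
        });
        std::future<std::string> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push([task](FakeContext& ctx) { (*task)(ctx); });
        }
        cv_.notify_one();
        return result;
    }
private:
    void run() {
        for (;;) {
            std::function<void(FakeContext&)> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job(ctx_);  // only this thread ever touches ctx_
        }
    }
    FakeContext ctx_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void(FakeContext&)>> jobs_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the members above exist first
};

// Starts n worker threads that each query once; returns how many succeeded.
int run_workers(int n_workers) {
    ContextOwner owner;
    std::atomic<int> ok{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < n_workers; ++i) {
        workers.emplace_back([&owner, &ok] {
            std::string v = owner.get_property("volume.mount_point").get();
            if (v == "value-of-volume.mount_point") ++ok;
        });
    }
    for (auto& w : workers) w.join();
    return ok;
}
```

Each worker blocks only on its own future, so the worker pool size stays user-configurable while every context call still happens on one thread.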

I find the hal documentation lacking on multi-threaded use. Really, the hal documentation amounts to plain doxygen output of the embedded source comments. Figuring out hal this far has been very easy and mostly self-explanatory, except that there is no mention of multi-threading.

Another example of the lacking hal docs: what is the hal main-loop integration callback? OK, obviously it somehow helps you integrate hal with your application's main loop and probably with dispatching. But how? There is no sample or further description beyond the canonical one-liner, and I have looked REAL HARD!

My experience with programming is pretty extensive. I am comfortable with programming backend services, protocols, and linux device drivers. I've also made hardware peripheral cards using FPGAs and CPLDs. So it's ok to get real technical on me! (some of this is on my website.)

Thanks again,
Colin
