LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

12-16-2011, 08:10 AM   #1
wriswith
LQ Newbie
 
Registered: Dec 2010
Posts: 5

Rep: Reputation: 0
Dpkg hangs indefinitely


Hi,

I am working on a project in a fairly advanced Ubuntu environment. The machine I am working on is part of a cluster and runs an OpenNMS monitoring server. Since I started on this project I have noticed a dpkg process running in the background:
Code:
root     14265     1  0 Dec08 ?        00:00:00 /usr/bin/dpkg --status-fd 24 --unpack --auto-deconfigure /var/cache/apt/archives/libsmi2ldbl_0.4.8+dfsg2-2_amd64.deb
This process is unkillable and doesn't even flinch at a kill command.
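A process that ignores every signal, SIGKILL included, is often in uninterruptible sleep (state D), blocked inside the kernel on I/O that never completes. The ps output above does not show the state column, so this is an assumption to check rather than a diagnosis. A minimal sketch for checking it on Linux; the helper name proc_state is made up for illustration:

```shell
# proc_state: read the contents of /proc/<pid>/status on stdin and print
# the one-letter process state from its "State:" line.
#   D = uninterruptible sleep (usually blocked I/O)
#   Z = zombie, S = sleeping, R = running
proc_state() {
    awk '/^State:/ { print $2; exit }'
}

# Example with the PID from the ps output above:
#   proc_state < /proc/14265/status
# The kernel function the process is sleeping in (wchan) can be shown with:
#   ps -o pid,stat,wchan:32,args -p 14265
```

If the state turns out to be D, no signal will be delivered until the blocked I/O returns, which would explain why kill has no effect.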

This in itself is not a problem, since I assume it will disappear when the server is rebooted, but now every dpkg install command hangs while unpacking the package.

Code:
root@opennms0302:~# apt-get install libsmi2ldbl
Reading package lists... Done
Building dependency tree
Reading state information... Done
Suggested packages:
  snmp-mibs-downloader
The following NEW packages will be installed:
  libsmi2ldbl
0 upgraded, 1 newly installed, 0 to remove and 95 not upgraded.
Need to get 359kB of archives.
After this operation, 1,053kB of additional disk space will be used.
Get:1 http://be.archive.ubuntu.com/ubuntu/ lucid/universe libsmi2ldbl 0.4.8+dfsg2-2 [359kB]
Fetched 359kB in 0s (862kB/s)
Selecting previously deselected package libsmi2ldbl.
(Reading database ... 60221 files and directories currently installed.)
Unpacking libsmi2ldbl (from .../libsmi2ldbl_0.4.8+dfsg2-2_amd64.deb) ...
There are no error messages whatsoever. I spent the better part of the day researching this problem and tried various dirty techniques to solve it. I tried to remove the broken packages by hand. I also used "dpkg --remove --force-remove-reinstreq", "apt-get clean" and "dpkg --configure -a".

I have yet to restart the server, but I doubt this will resolve the problem and it might not be easy to get the server up and running again in the cluster. The problem is getting critical though, because new plugins need to be installed for the OpenNMS implementation.

Any suggestions are welcome. I know I gave limited information, but that is because no error messages are thrown anywhere.

Last edited by wriswith; 12-16-2011 at 08:41 AM.
 
12-16-2011, 09:12 AM   #2
evo2
LQ Guru
 
Registered: Jan 2009
Location: Japan
Distribution: Mostly Debian and CentOS
Posts: 6,724

Rep: Reputation: 1705
Have you tried running it under strace so that you can see where it hangs?

Cheers,

Evo2.
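One way to follow this suggestion, as a sketch only: wrap the failing command in strace and send the trace to a file. The wrapper name trace_to and the log path are placeholders, not something taken from the thread.

```shell
# trace_to FILE CMD...: run CMD under strace and write the trace to FILE.
#   -f   also trace child processes (apt-get starts dpkg, which starts helpers)
#   -tt  prefix every line with the time of day, including microseconds
#   -o   write the trace to FILE instead of stderr
trace_to() {
    trace_out=$1
    shift
    strace -f -tt -o "$trace_out" "$@"
}

# Example (as root):
#   trace_to /tmp/apt.trace apt-get install libsmi2ldbl
# To attach to a process that is already running instead:
#   strace -tt -p 14265
```

With -f each line of the log is prefixed by the PID that made the call, which helps to tell apt-get, dpkg and their helpers apart.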

 
12-19-2011, 02:10 AM   #3
wriswith
LQ Newbie
 
Registered: Dec 2010
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by evo2
Have you tried running it under strace so that you can see where it hangs? - Evo2.
I just tried that and it confirmed my suspicion: this is a problem I had read about but did not find a solution for. Here is the part dpkg hangs on:
Code:
08:55:53.141345 read(21, "Unpacking replacement libsmi2ldb"..., 1024) = 39
08:55:53.141385 write(1, "Unpacking replacement libsmi2ldb"..., 39) = 39
08:55:53.141430 read(19, "status: man-db: triggers-pending"..., 1024) = 69
08:55:53.141502 wait4(15783, 0x7fff2499ed58, WNOHANG, NULL) = 0
08:55:53.141528 pselect6(22, [0 19 21], NULL, NULL, {1, 0}, {[], 8}) = 0 (Timeout)
08:55:54.142013 wait4(15783, 0x7fff2499ed58, WNOHANG, NULL) = 0
08:55:54.142068 pselect6(22, [0 19 21], NULL, NULL, {1, 0}, {[], 8}) = 0 (Timeout)
08:55:55.143190 wait4(15783, 0x7fff2499ed58, WNOHANG, NULL) = 0
08:55:55.143253 pselect6(22, [0 19 21], NULL, NULL, {1, 0}, {[], 8}) = 0 (Timeout)
08:55:56.144372 wait4(15783, 0x7fff2499ed58, WNOHANG, NULL) = 0
.... #Here I try to kill the process
08:57:58.680830 wait4(15783, 0x7fff2499ed58, WNOHANG, NULL) = 0
08:57:58.680885 pselect6(22, [0 19 21], NULL, NULL, {1, 0}, {[], 8}) = ? ERESTARTNOHAND (To be restarted)
08:57:59.584270 --- SIGINT (Interrupt) @ 0 (0) ---
08:57:59.584310 pselect6(22, [0 19 21], NULL, NULL, {0, 96689776}, {[], 8}) = 0 (Timeout)
08:57:59.681234 wait4(15783, 0x7fff2499ed58, WNOHANG, NULL) = 0
It just repeats the last 2 lines forever. I moved the strace output file to keep it from getting too big.

The problem is that dpkg is waiting for some kind of I/O, but it is not receiving anything and just sits there forever. As for why it won't let me kill it, I do not know. I will continue to research the problem, but I appreciate any help I can get.
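In the trace above, the repeating pair is a non-blocking wait4 on PID 15783 followed by a pselect6 that times out after one second. That pattern suggests the traced process is polling for a child to exit, so the child may be the process that is actually stuck. A small sketch for pulling the polled PIDs out of a saved trace; the function name polled_children is hypothetical:

```shell
# polled_children: read strace output on stdin and print "PID COUNT" for
# every PID that was polled with wait4(PID, ..., WNOHANG, ...) and had not
# exited yet (return value 0).
polled_children() {
    awk '/wait4\([0-9]+, .*WNOHANG.*\) = 0$/ {
             pid = $0
             sub(/.*wait4\(/, "", pid)   # drop everything up to the PID
             sub(/,.*/, "", pid)         # drop everything after the PID
             count[pid]++
         }
         END { for (p in count) print p, count[p] }'
}

# Example, assuming the trace was saved to /tmp/apt.trace:
#   polled_children < /tmp/apt.trace
# Then look at what the child itself is doing:
#   ps -o pid,ppid,stat,wchan:32,args -p 15783
```

A child shown in state D by ps would point at blocked I/O in the kernel, not at anything the polling parent is doing wrong.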

PS: Thanks for the suggestion, Evo2.

Edit:
As most people having this problem seem to blame their file system, I will post my /etc/mtab:
Code:
/dev/sda1 / ext4 rw,errors=remount-ro 0 0
proc /proc proc rw,noexec,nosuid,nodev 0 0
none /sys sysfs rw,noexec,nosuid,nodev 0 0
none /sys/fs/fuse/connections fusectl rw 0 0
none /sys/kernel/debug debugfs rw 0 0
none /sys/kernel/security securityfs rw 0 0
none /dev devtmpfs rw,mode=0755 0 0
none /dev/pts devpts rw,noexec,nosuid,gid=5,mode=0620 0 0
none /dev/shm tmpfs rw,nosuid,nodev 0 0
none /var/run tmpfs rw,nosuid,mode=0755 0 0
none /var/lock tmpfs rw,noexec,nosuid,nodev 0 0
none /lib/init/rw tmpfs rw,nosuid,mode=0755 0 0
10.1.1.12:/vol/DATA /nasdata nfs rw,addr=10.1.1.12 0 0
/dev/drbd0 /cluster ext4 rw 0 0
After some inquiry I found out that nothing has been installed since the cluster volume was set up, so I assume my problems originate from there. Now the question is, how do I solve this?
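To narrow this down, it may help to single out the mounts in the list above that depend on the network or on the cluster layer. A minimal sketch, assuming that NFS file systems and /dev/drbd* devices are the only candidates; the function name remote_mounts is made up for illustration:

```shell
# remote_mounts: read mtab-style lines on stdin (device, mount point,
# type, options, ...) and print the mount point of every NFS file system
# and every file system that sits on a DRBD device.
remote_mounts() {
    awk '$3 ~ /^nfs/ || $1 ~ /^\/dev\/drbd/ { print $2 }'
}

# Example:
#   remote_mounts < /etc/mtab
# For a DRBD 8.x device, connection and disk state are reported in:
#   cat /proc/drbd
# A quick responsiveness check for one mount point (note that on a badly
# wedged file system this check can block as well):
#   timeout 5 stat /cluster
```

For the mtab shown above this would single out /nasdata and /cluster as the mounts to look at first.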

Last edited by wriswith; 12-20-2011 at 06:48 AM.
 
12-19-2011, 08:54 PM   #4
evo2
LQ Guru
 
Registered: Jan 2009
Location: Japan
Distribution: Mostly Debian and CentOS
Posts: 6,724

Rep: Reputation: 1705
From the strace it seems to be man-db related. I'm just guessing here, but does man look in /cluster? I.e., did you edit /etc/manpath.config to include /cluster? Also, what happens if you try to install/configure the package with /cluster unmounted?
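Whether man searches below /cluster can be checked without editing anything. A rough sketch, assuming the manpath command from man-db is installed; the filter name manpath_hits is hypothetical:

```shell
# manpath_hits PREFIX: read a colon-separated man search path on stdin
# and print every entry that is PREFIX itself or lies below it.
manpath_hits() {
    tr ':' '\n' | awk -v p="$1" 'index($0, p "/") == 1 || $0 == p'
}

# Example:
#   manpath | manpath_hits /cluster
# The configuration file can be searched directly as well:
#   grep -n cluster /etc/manpath.config
```

No output from either command would suggest that man-db does not reach /cluster through its search path.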

Evo2.
 
12-20-2011, 02:16 AM   #5
wriswith
LQ Newbie
 
Registered: Dec 2010
Posts: 5

Original Poster
Rep: Reputation: 0
Quote:
From the strace it seems to be man-db related. I'm just guessing here, but does man look in /cluster? I.e., did you edit /etc/manpath.config to include /cluster?
I did not change manpath.config, but to share config between the main and the failover OpenNMS server there are symlinks from some folders in /etc to the cluster. So I guess that is how dpkg ends up accessing the cluster, but I don't see why that should cause any problems.
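To see exactly which paths resolve onto the cluster volume, the symlinks can be listed together with their resolved targets. A minimal sketch, assuming GNU readlink (for -f) is available; the function name links_into is made up for illustration:

```shell
# links_into TOP TARGET: print "link -> resolved path" for every symbolic
# link below TOP whose fully resolved target is TARGET or lies below it.
links_into() {
    find "$1" -type l 2>/dev/null | while IFS= read -r link; do
        # Skip links that cannot be resolved at all.
        target=$(readlink -f "$link") || continue
        case "$target" in
            "$2"|"$2"/*) printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}

# Example:
#   links_into /etc /cluster
# The paths a package would unpack can be listed from the archive itself:
#   dpkg -c /var/cache/apt/archives/libsmi2ldbl_0.4.8+dfsg2-2_amd64.deb
```

Comparing the two lists shows whether a given package touches any path that ends up on the cluster volume.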

Quote:
Also, what happens if you try to install/configure the package with /cluster unmounted?
Some, if not most, of the installs will want to write at least some data to the cluster (through symlinks). Also, unmounting the cluster would mean taking the server out of production, which might be a hard sell to my PM.

I think I will continue by looking for a solution to get dpkg to follow symlinks and access the cluster.
 
  


All times are GMT -5.
