LinuxQuestions.org
Old 01-02-2013, 09:08 AM   #1
eantoranz
Senior Member
 
Registered: Apr 2003
Location: Colombia
Distribution: Kubuntu, Debian, Knoppix
Posts: 1,982
Blog Entries: 1

Rep: Reputation: 83
pacemaker - no checking for iscsi status?


Hi!

I just figured out how to set up an iSCSI resource in Pacemaker. As a first test, I shut down the iscsitarget service on the SAN server, expecting Pacemaker to notice within a few seconds that the target was gone. So far, though, Pacemaker reports that everything is fine, which is not what should happen.

Why is this failure not detected? Thanks in advance.
 
Old 01-03-2013, 07:03 AM   #2
vishesh
Member
 
Registered: Feb 2008
Distribution: Fedora,RHEL,Ubuntu
Posts: 658

Rep: Reputation: 66
Hi

Can you please share the output of the following command?

root# crm status

You can also try running the following command:

root# crm resource cleanup <resource>

If that doesn't work, please share your Pacemaker configuration so that we can help more effectively.
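One way to collect the configuration asked for here is the crm shell's `configure show` subcommand. The wrapper below is only a sketch: the function name `show_cluster_config` is made up for illustration, and the resource name `san` is taken from the status output posted later in this thread.

```shell
#!/bin/sh
# Print the cluster configuration so it can be pasted into a reply.
# show_cluster_config is a hypothetical helper name, not part of Pacemaker.
show_cluster_config() {
    # Fail early with a clear message on nodes without the crm shell.
    if ! command -v crm >/dev/null 2>&1; then
        echo "crm shell not found on this node" >&2
        return 127
    fi
    # With no arguments this prints everything; with resource IDs
    # it prints only those objects.
    crm configure show "$@"
}

# Whole configuration:        show_cluster_config
# Only the iSCSI primitive:   show_cluster_config san
```

In the output, the thing to look for on the iSCSI primitive is an `op monitor` entry with a non-zero interval.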

Thanks

Last edited by vishesh; 01-03-2013 at 07:06 AM.
 
Old 01-03-2013, 07:33 AM   #3
eantoranz
Senior Member
 
Registered: Apr 2003
Location: Colombia
Distribution: Kubuntu, Debian, Knoppix
Posts: 1,982
Blog Entries: 1

Original Poster
Rep: Reputation: 83
Well, let's see.

Before I shut down the iscsitarget service on my SAN server, everything was fine and dandy. From cluster1 (the only active node at the time):

Code:
cps@cluster1:~$ netstat -ntp | grep 192.168.55.11
(No info could be read for "-p": geteuid()=1000 but you should be root.)
tcp        0      0 192.168.55.12:56798     192.168.55.11:3260      ESTABLISHED -
Then I shut down iscsitarget, and almost immediately messages about it started to appear in syslog on cluster1:

Code:
Jan  3 08:53:26 cluster1 kernel: [  338.605211]  connection1:0: detected conn error (1020)
Jan  3 08:53:27 cluster1 iscsid: Kernel reported iSCSI connection 1:0 error (1020) state (3)
Jan  3 08:53:30 cluster1 iscsid: connect to 192.168.55.11:3260 failed (Connection refused)
Jan  3 08:54:03 cluster1 iscsid: last message repeated 9 times
Jan  3 08:55:04 cluster1 iscsid: last message repeated 16 times
Jan  3 08:55:27 cluster1 iscsid: last message repeated 6 times
Jan  3 08:55:27 cluster1 kernel: [  458.856309]  session1: session recovery timed out after 120 secs
Jan  3 08:55:27 cluster1 kernel: [  458.856512] sd 2:0:0:0: [sdb] Unhandled error code
Jan  3 08:55:27 cluster1 kernel: [  458.856519] sd 2:0:0:0: [sdb] Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
Jan  3 08:55:27 cluster1 kernel: [  458.856550] sd 2:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 00 7c 00 00 10 00
Jan  3 08:55:27 cluster1 kernel: [  458.856571] end_request: I/O error, dev sdb, sector 124
Jan  3 08:55:27 cluster1 kernel: [  458.870106] Buffer I/O error on device sdb5, logical block 0
Jan  3 08:55:27 cluster1 kernel: [  458.876521] lost page write due to I/O error on sdb5
Jan  3 08:55:27 cluster1 kernel: [  458.876529] Buffer I/O error on device sdb5, logical block 1
Jan  3 08:55:27 cluster1 kernel: [  458.881319] lost page write due to I/O error on sdb5
Jan  3 08:55:27 cluster1 kernel: [  458.881334] sd 2:0:0:0: [sdb] Unhandled error code
Jan  3 08:55:27 cluster1 kernel: [  458.881336] sd 2:0:0:0: [sdb] Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
Jan  3 08:55:27 cluster1 kernel: [  458.881338] sd 2:0:0:0: [sdb] CDB: Write(10): 2a 00 00 54 00 84 00 00 10 00
Jan  3 08:55:27 cluster1 kernel: [  458.881344] end_request: I/O error, dev sdb, sector 5505156
Jan  3 08:55:27 cluster1 kernel: [  458.885717] Buffer I/O error on device sdb5, logical block 688129
Jan  3 08:55:27 cluster1 kernel: [  458.890340] lost page write due to I/O error on sdb5
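The initiator-side events in a log like the one above can be separated from the resulting block-layer I/O errors with a small filter. This is a minimal sketch: the function names are invented for illustration, and the match patterns are copied from the messages quoted here, so other kernel or open-iscsi versions may word them differently.

```shell
#!/bin/sh
# Keep only iSCSI initiator events from syslog-style input on stdin.
# Patterns are taken from the messages in this thread (assumption: your
# kernel and iscsid use the same wording).
iscsi_events() {
    grep -E 'iscsid: |detected conn error|session recovery timed out'
}

# Print the timestamp at which the kernel gave up on the session, if any.
recovery_timeout_time() {
    awk '/session recovery timed out/ { print $3; exit }'
}

# Three lines copied from the log above.
sample='Jan  3 08:53:26 cluster1 kernel: [  338.605211]  connection1:0: detected conn error (1020)
Jan  3 08:55:27 cluster1 kernel: [  458.856309]  session1: session recovery timed out after 120 secs
Jan  3 08:55:27 cluster1 kernel: [  458.856571] end_request: I/O error, dev sdb, sector 124'

printf '%s\n' "$sample" | iscsi_events
printf '%s\n' "$sample" | recovery_timeout_time
```

On a live node the same functions could read the log file directly, for example `iscsi_events < /var/log/syslog` (the path is an assumption and depends on the distribution). In the log above, the connection error appears at 08:53:26 and the I/O errors only at 08:55:27, after the 120-second session recovery timeout.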
However, Pacemaker doesn't seem to care about it.

Code:
cps@cluster1:~$ sudo crm_mon -1
============
Last updated: Thu Jan  3 08:57:28 2013
Stack: openais
Current DC: cluster1 - partition WITHOUT quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ cluster1 ]
OFFLINE: [ cluster2 ]

 Resource Group: sanos
     ip_flotante        (ocf::heartbeat:IPaddr2):       Started cluster1
     san        (ocf::heartbeat:iscsi): Started cluster1
     sanprobedelay      (ocf::heartbeat:Delay): Started cluster1
     datapostgres       (ocf::heartbeat:Filesystem):    Started cluster1
     wwwsanos   (ocf::heartbeat:Filesystem):    Started cluster1
     wwwsesion  (ocf::heartbeat:Filesystem):    Started cluster1
     postgres   (lsb:postgresql-8.4):   Started cluster1
     pgbouncer  (lsb:pgbouncer):        Started cluster1
     apache     (lsb:apache2):  Started cluster1

Failed actions:
    pgbouncer_monitor_0 (node=cluster1, call=9, rc=1, status=complete): unknown error
pgbouncer always complains, but the cluster normally works fine, so I don't think that's something to worry about.

crm status shows the same thing:

Code:
cps@cluster1:~$ sudo crm status
============
Last updated: Thu Jan  3 08:58:48 2013
Stack: openais
Current DC: cluster1 - partition WITHOUT quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ cluster1 ]
OFFLINE: [ cluster2 ]

 Resource Group: sanos
     ip_flotante        (ocf::heartbeat:IPaddr2):       Started cluster1
     san        (ocf::heartbeat:iscsi): Started cluster1
     sanprobedelay      (ocf::heartbeat:Delay): Started cluster1
     datapostgres       (ocf::heartbeat:Filesystem):    Started cluster1
     wwwsanos   (ocf::heartbeat:Filesystem):    Started cluster1
     wwwsesion  (ocf::heartbeat:Filesystem):    Started cluster1
     postgres   (lsb:postgresql-8.4):   Started cluster1
     pgbouncer  (lsb:pgbouncer):        Started cluster1
     apache     (lsb:apache2):  Started cluster1

Failed actions:
    pgbouncer_monitor_0 (node=cluster1, call=9, rc=1, status=complete): unknown error
Next I tried the cleanup on the san resource. After a few seconds it ended up like this:
Code:
cps@cluster1:~$ sudo crm status
============
Last updated: Thu Jan  3 09:01:13 2013
Stack: openais
Current DC: cluster1 - partition WITHOUT quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ cluster1 ]
OFFLINE: [ cluster2 ]

 Resource Group: sanos
     ip_flotante        (ocf::heartbeat:IPaddr2):       Started cluster1
     san        (ocf::heartbeat:iscsi): Started cluster1 FAILED
     sanprobedelay      (ocf::heartbeat:Delay): Started cluster1
     datapostgres       (ocf::heartbeat:Filesystem):    Started cluster1
     wwwsanos   (ocf::heartbeat:Filesystem):    Started cluster1
     wwwsesion  (ocf::heartbeat:Filesystem):    Started cluster1
     postgres   (lsb:postgresql-8.4):   Started cluster1 (unmanaged) FAILED
     pgbouncer  (lsb:pgbouncer):        Stopped 
     apache     (lsb:apache2):  Stopped 

Failed actions:
    pgbouncer_monitor_0 (node=cluster1, call=9, rc=1, status=complete): unknown error
    postgres_stop_0 (node=cluster1, call=24, rc=1, status=complete): unknown error
    san_monitor_0 (node=cluster1, call=21, rc=1, status=complete): unknown error
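A note on reading the failed actions above. Pacemaker names operations `<resource>_<action>_<interval in ms>`, and a monitor with interval 0 is a one-off probe rather than a recurring health check. The sketch below (the function name is made up for illustration) classifies the entries in the "Failed actions:" section on that basis.

```shell
#!/bin/sh
# Classify entries from the "Failed actions:" section of crm_mon / crm status.
# classify_failed_actions is a hypothetical helper name.
classify_failed_actions() {
    awk '
        /^Failed actions:/ { in_failed = 1; next }
        in_failed && NF > 0 {
            key = $1                       # e.g. san_monitor_0
            n = split(key, parts, "_")
            if (n < 3) next                # not an operation key
            interval = parts[n]            # last field: interval in ms
            action = parts[n - 1]          # second to last: action name
            # Everything before "_<action>_<interval>" is the resource ID,
            # which may itself contain underscores.
            rsc = substr(key, 1, length(key) - length(action) - length(interval) - 2)
            if (action == "monitor" && interval == "0")
                kind = "probe"
            else if (action == "monitor")
                kind = "recurring-monitor"
            else
                kind = action
            print rsc, kind
        }
    '
}

# The failed actions from the last status output above.
sample='Failed actions:
    pgbouncer_monitor_0 (node=cluster1, call=9, rc=1, status=complete): unknown error
    postgres_stop_0 (node=cluster1, call=24, rc=1, status=complete): unknown error
    san_monitor_0 (node=cluster1, call=21, rc=1, status=complete): unknown error'

printf '%s\n' "$sample" | classify_failed_actions
```

On a live node this could be fed from `sudo crm_mon -1 | classify_failed_actions`. Here the san failure shows up as a probe, and only after the cleanup forced a re-probe. That would be consistent with the san primitive having no recurring monitor operation configured, although the configuration is not shown in this thread, so this is a guess. If so, adding one, for instance with `crm configure monitor san 30s:30s` where your crm shell provides that subcommand (interval and timeout are placeholder values), would be the thing to try.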
 
  


Tags
disconnection, iscsi, pacemaker

