Old 03-20-2018, 08:53 AM   #1
zenichev
LQ Newbie
 
Registered: Mar 2018
Posts: 2

Rep: Reputation: Disabled
TCP failover doesn't work as expected


Hi community.

I'm trying to build a TCP failover cluster. The idea is to save the active TCP sessions that belong to the master and restore them on the slave when the master fails, so that the TCP sockets that were originally opened on the master are also open on the slave.

The main goal is to make this work for the kamailio (SER) daemon. I want a real-time HA cluster that keeps calls in progress alive when the master fails.

What I've already done:

1. Created a well-formed rule set for iptables:

    -A INPUT -i eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
    -A INPUT -i eth0 -d 10.100.100.27/32 -j DROP
    -A INPUT -m state --state INVALID -j LOG

    -P FORWARD DROP
    -A FORWARD -i eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
    -A FORWARD -i eth0 -d 10.100.100.27/32 -j DROP
    -A FORWARD -m state --state INVALID -j LOG

where 10.100.100.27 is the VIP address. The same rule set is stored on the slave side.

2. Configured conntrackd:

    Sync {
        Mode FTFW {
            DisableExternalCache Off
            CommitTimeout 1800
            PurgeTimeout 5
        }

        UDP {
            IPv4_address 10.100.100.28
            IPv4_Destination_Address 10.100.100.29
            Port 3780
            Interface eth0
            SndSocketBuffer 1249280
            RcvSocketBuffer 1249280
            Checksum on
        }

        Options {
            ExpectationSync {
                sip
                ftp
            }
        }
    }

    General {
        Nice -20
        HashSize 32768
        HashLimit 131072
        LogFile on
        Syslog on
        LockFile /var/lock/conntrack.lock
        UNIX {
            Path /var/run/conntrackd.ctl
            Backlog 20
        }
        NetlinkBufferSize 2097152
        NetlinkBufferSizeMaxGrowth 8388608
        Filter From Userspace {
            Protocol Accept {
                TCP
                UDP
                ICMP
            }
            Address Ignore {
            }
        }
    }

where 10.100.100.28 is the master and 10.100.100.29 is the slave. The same config file is stored on the slave side, but with the addresses in the UDP section swapped. I tried using the Address Ignore block and added the IP addresses that belong to the node, but with that in place it didn't work at all: there was no exchange of conntrackd traffic between the cluster nodes. So I left it empty.

3. Configured keepalived:

    vrrp_instance E1 {
        interface eth0
        state BACKUP
        virtual_router_id 61
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass herepassword
        }
        virtual_ipaddress {
            10.100.100.27/27 dev eth0
        }
        nopreempt
        garp_master_delay 1

        notify_master "/etc/conntrackd/primary-backup.sh primary"
        notify_backup "/etc/conntrackd/primary-backup.sh backup"
        notify_fault "/etc/conntrackd/primary-backup.sh fault"
    }

where primary-backup.sh is the script that is provided with the conntrackd libraries. You may ask why I don't use a dedicated link for conntrackd. I used one for a while, but as a matter of fact it didn't change anything, so I simplified the setup and dropped it.

How the failover process looks at the moment:

1. I use telnet/ssh/ftp to connect to the VIP address, which at that moment is located on the master side;
1.1. The master side experiences a failure: I bring down the eth0 link;
2. The backup node sees the problem and executes:

    /etc/conntrackd/primary-backup.sh primary

so the following sequence of conntrackd commands is executed:

    /usr/sbin/conntrackd -C /etc/conntrackd/conntrackd.conf -c
    /usr/sbin/conntrackd -C /etc/conntrackd/conntrackd.conf -f
    /usr/sbin/conntrackd -C /etc/conntrackd/conntrackd.conf -R
    /usr/sbin/conntrackd -C /etc/conntrackd/conntrackd.conf -B

3. I can see the needed telnet/ssh/ftp session on the backup node with the command:

    conntrackd -i

It is in the ESTABLISHED state (I'm confident that this is the session I need, because I remember the client port that was used for the connection on the master node).

4. But when I try to send packets (commands) from my client, the server resets the TCP session with the [R] flag. The tcpdump output on the backup node shows only 2 rows:

    11:12:26.621001 IP telnet.client.test.56238 > 10.100.100.27.telnet: Flags [P.], seq 1380562259:1380562261, ack 1731331297, win 237, options [nop,nop,TS val 43239685 ecr 81905017], length 2
    11:12:26.621083 IP 10.100.100.27.telnet > telnet.client.test.56238: Flags [R], seq 1731331297, win 0, length 0

As you can see, the firewall accepts the traffic (in the INPUT and FORWARD chains), which means that the session exists in the backup's internal cache/kernel table (otherwise iptables would drop the packet). But then the session is reset. Why? I tested with ssh, telnet and ftp, with no success at all. I also tried removing the flush command, so that the sequence became:

    /usr/sbin/conntrackd -C /etc/conntrackd/conntrackd.conf -c
    /usr/sbin/conntrackd -C /etc/conntrackd/conntrackd.conf -R

and it also didn't work.
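
The two tcpdump rows above can be read a little more closely. As a rough sketch in Python: under the reset generation rules of RFC 793, a TCP stack that receives a segment carrying an ACK for a connection it does not know replies with a bare RST whose sequence number is copied from that segment's ack field. The helper names (parse_segment, is_no_socket_reset) are made up for illustration, and the regular expressions assume tcpdump's default text output as quoted in the post.

```python
import re

FLAGS_RE = re.compile(r"Flags \[([^\]]*)\]")
SEQ_RE = re.compile(r"\bseq (\d+)")
ACK_RE = re.compile(r"\back (\d+)")


def parse_segment(line):
    """Extract flags, first sequence number and ack number from a tcpdump line."""
    flags = FLAGS_RE.search(line)
    seq = SEQ_RE.search(line)
    ack = ACK_RE.search(line)
    return {
        "flags": flags.group(1) if flags else "",
        "seq": int(seq.group(1)) if seq else None,
        "ack": int(ack.group(1)) if ack else None,
    }


def is_no_socket_reset(request_line, reply_line):
    """True if the reply is a bare RST whose seq equals the request's ack."""
    request = parse_segment(request_line)
    reply = parse_segment(reply_line)
    return (
        reply["flags"] == "R"
        and request["ack"] is not None
        and reply["seq"] == request["ack"]
    )
```

For the rows above the check holds (the RST seq 1731331297 equals the client's ack 1731331297), which is consistent with the backup's TCP stack having no socket for the session. It is not proof on its own: an iptables REJECT rule with --reject-with tcp-reset would put a similar-looking packet on the wire.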

So if someone has the relevant experience, please don't be indifferent and help a bit. At the very least, I need a hint about where to look for the problem.

-- BR, Donat Zenichev
 
Old 03-27-2018, 11:48 AM   #2
erabaddosi-2116
LQ Newbie
 
Registered: Mar 2018
Posts: 20

Rep: Reputation: Disabled
Hi zenichev,

Is the network stack (iptables, conntrackd, etc.) rejecting this, or is the application (telnet) rejecting it? The network can be perfectly configured, but when the failover happens, an application that doesn't have an open socket for the connection would (correctly) cause the network stack to reject it, resulting in a reset packet on the line (the [R] flag).
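
One way to test the "no open socket" explanation on the backup node, as a minimal sketch: conntrack entries and TCP sockets are separate kernel tables, so a flow can be ESTABLISHED in conntrack while no socket exists for it. The Python below looks for an ESTABLISHED socket with a given port pair in the text of /proc/net/tcp. The function names are hypothetical, the ports 23 and 56238 are taken from the tcpdump trace in the first post, and only IPv4 is covered (a daemon bound to an IPv6 socket would show up in /proc/net/tcp6 instead).

```python
TCP_ESTABLISHED = "01"  # state code used in the "st" column of /proc/net/tcp


def established_port_pairs(proc_net_tcp_text):
    """Return {(local_port, remote_port)} for ESTABLISHED IPv4 TCP sockets."""
    pairs = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != TCP_ESTABLISHED:
            continue
        # Addresses look like HEXIP:HEXPORT; only the port part is decoded.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        pairs.add((local_port, remote_port))
    return pairs


def has_socket(proc_net_tcp_text, local_port, remote_port):
    """True if an ESTABLISHED socket with this port pair is present."""
    return (local_port, remote_port) in established_port_pairs(proc_net_tcp_text)
```

On the backup node after failover, something like has_socket(open("/proc/net/tcp").read(), 23, 56238) would show whether the telnet session has a socket there. If it returns False while conntrackd -i still lists the flow, the reset is coming from the TCP layer rather than from the firewall.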

But...I have little experience with conntrackd and friends. A few well-placed "-j LOG" entries in iptables on both nodes might help confirm that iptables is working as expected. Perhaps "conntrackd -i > /tmp/$(/bin/date +%s)_state.txt" on each node before and immediately after the test might reveal interesting information, from conntrackd's perspective.
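
To compare the snapshot files from before and after the test, a small Python sketch like the following could pull out the lines for one client port and report the TCP state word found on each. It assumes that the conntrackd -i dump uses a layout similar to conntrack -L (one flow per line, with sport=/dport= tokens and a state keyword); that layout is an assumption and should be checked against the real files. The function name flow_states is made up.

```python
TCP_STATES = (
    "SYN_SENT", "SYN_RECV", "ESTABLISHED", "FIN_WAIT",
    "CLOSE_WAIT", "LAST_ACK", "TIME_WAIT", "CLOSE",
)


def flow_states(snapshot_text, client_port):
    """Return the TCP state word of each line that mentions sport=<client_port>."""
    token = f"sport={client_port}"
    states = []
    for line in snapshot_text.splitlines():
        words = line.split()
        if token not in words:
            continue
        # Take the first whole word that is a known TCP state name.
        state = next((w for w in words if w in TCP_STATES), "UNKNOWN")
        states.append(state)
    return states
```

Running it on the "before" and "after" files of each node for the client port in question shows whether the flow was replicated and in which state it ended up after the failover.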
 
Old 04-02-2018, 10:21 AM   #3
zenichev
LQ Newbie
 
Registered: Mar 2018
Posts: 2

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by erabaddosi-2116
Hi zenichev,

Is the network stack (iptables, conntrackd, etc.) rejecting this, or is the application (telnet) rejecting it? The network can be perfectly configured, but when the failover happens, an application that doesn't have an open socket for the connection would (correctly) cause the network stack to reject it, resulting in a reset packet on the line (the [R] flag).

But...I have little experience with conntrackd and friends. A few well-placed "-j LOG" entries in iptables on both nodes might help confirm that iptables is working as expected. Perhaps "conntrackd -i > /tmp/$(/bin/date +%s)_state.txt" on each node before and immediately after the test might reveal interesting information, from conntrackd's perspective.

The server side rejects the connection with the RST flag. I will try logging to find where the problem appears.
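
For going through the resulting log entries, a sketch along these lines might help. It splits a kernel log line written by the iptables LOG target into KEY=value fields plus bare TCP flag words, and checks whether the line belongs to the flow being tested. The function names are hypothetical, and the field names (PROTO, SPT, DPT) are those of the LOG target's usual output as far as I know it; verify them against the actual syslog lines.

```python
TCP_FLAG_WORDS = {"SYN", "ACK", "FIN", "RST", "PSH", "URG"}


def parse_log_line(line):
    """Split an iptables LOG line into (fields, tcp_flags)."""
    start = line.find("IN=")  # skip the syslog header and any log prefix
    if start == -1:
        return {}, []
    fields, flags = {}, []
    for token in line[start:].split():
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
        elif token in TCP_FLAG_WORDS:
            flags.append(token)
    return fields, flags


def belongs_to_flow(fields, client_port):
    """True if the logged packet is TCP and uses client_port on either side."""
    port = str(client_port)
    return fields.get("PROTO") == "TCP" and port in (
        fields.get("SPT"),
        fields.get("DPT"),
    )
```

Filtering the log with belongs_to_flow(fields, 56238) keeps only the packets of the telnet session from the trace, and the flags list shows whether a logged packet was the client's push or the reset.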
 
  

