LinuxQuestions.org
01-15-2014, 12:35 AM   #1
thirupathi
LQ Newbie
 
Registered: May 2012
Location: Singapore
Distribution: RHEL
Posts: 22

Rep: Reputation: Disabled
RHEL 5 two-node cluster getting errors


Hi,

I'm running a two-node RHEL 5 cluster, and it is logging the errors below.

[root@nocidsdb02 ~]# uname -a
Linux nocidsdb02.nlb.gov.sg 2.6.18-371.el5 #1 SMP Thu Sep 5 21:21:44 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux


Jan 15 13:20:18 nocidsdb02 kernel: RPC: bad TCP reclen 0x47455420 (non-terminal)
Jan 15 13:20:18 nocidsdb02 kernel: RPC: bad TCP reclen 0x00620103 (large)
Jan 15 13:20:18 nocidsdb02 kernel: RPC: bad TCP reclen 0x4a524d49 (non-terminal)
Jan 15 13:20:30 nocidsdb02 ccsd[6603]: Unexpected communication type (542393671)... ignoring.
Jan 15 13:20:30 nocidsdb02 ccsd[6603]: Unexpected communication type (542393671)... ignoring.
Jan 15 13:20:30 nocidsdb02 ccsd[6603]: Unexpected communication type (50422400)... ignoring.
Jan 15 13:20:30 nocidsdb02 ccsd[6603]: Unexpected communication type (1229804106)... ignoring.
Jan 15 13:20:31 nocidsdb02 kernel: RPC: bad TCP reclen 0x47455420 (non-terminal)
Jan 15 13:20:31 nocidsdb02 kernel: RPC: bad TCP reclen 0x00620103 (large)
Jan 15 13:20:31 nocidsdb02 kernel: RPC: bad TCP reclen 0x4a524d49 (non-terminal)
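
One way to read these values is as raw bytes rather than numbers. The sketch below prints each 32-bit value as four ASCII characters. The helper names decode_be and decode_le are made up for illustration. It assumes the kernel logs the RPC record marker in network byte order, and that ccsd on x86_64 read the same four bytes as a little-endian integer; neither assumption is confirmed anywhere in this thread.

```shell
#!/bin/bash
# Print a 32-bit value (hex or decimal) as four ASCII characters,
# most significant byte first. Non-printable bytes are shown as '.'.
decode_be() {
    local val out s byte ch
    val=$(( $1 ))
    out=""
    for s in 24 16 8 0; do
        byte=$(( (val >> s) & 255 ))
        if [ "$byte" -ge 32 ] && [ "$byte" -le 126 ]; then
            # %b turns the \0ooo octal escape into the actual character.
            ch=$(printf '%b' "\\0$(printf '%03o' "$byte")")
            out="$out$ch"
        else
            out="$out."
        fi
    done
    printf '%s\n' "$out"
}

# Same, but for a value that was read as a little-endian integer:
# swap the byte order first, then decode.
decode_le() {
    local v swapped
    v=$(( $1 ))
    swapped=$(( ((v & 255) << 24) | (((v >> 8) & 255) << 16) | (((v >> 16) & 255) << 8) | ((v >> 24) & 255) ))
    decode_be "$swapped"
}

decode_be 0x47455420    # prints "GET " (with a trailing space)
decode_be 0x4a524d49    # prints "JRMI"
decode_le 542393671     # prints "GET " (with a trailing space)
decode_le 1229804106    # prints "JRMI"
```

Under those assumptions, the kernel and ccsd values decode to the same bytes: "GET " looks like the start of an HTTP request and "JRMI" like the start of a Java RMI handshake. That would be consistent with some other program (a monitoring tool or port scanner, for instance) connecting to the ports where the kernel RPC service and ccsd listen. This is an interpretation of the numbers, not a confirmed diagnosis.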


[root@nocidsdb02 ~]# ifconfig
bond0 Link encap:Ethernet HWaddr 44:1E:A1:4A:1A:98
inet addr:192.168.1.52 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:3301089 errors:0 dropped:0 overruns:0 frame:0
TX packets:1838407 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:579770741 (552.9 MiB) TX bytes:341855214 (326.0 MiB)

eth0 Link encap:Ethernet HWaddr 3C:D9:2B:FD:1B:38
inet addr:172.30.13.110 Bcast:172.30.13.127 Mask:255.255.255.192
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:29846872 errors:0 dropped:0 overruns:0 frame:0
TX packets:56428596 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3006200476 (2.7 GiB) TX bytes:80404189768 (74.8 GiB)
Interrupt:170 Memory:f8000000-f8012800

eth2 Link encap:Ethernet HWaddr 44:1E:A1:4A:1A:98
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3300295 errors:0 dropped:0 overruns:0 frame:0
TX packets:1838407 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:579713163 (552.8 MiB) TX bytes:341855214 (326.0 MiB)

eth3 Link encap:Ethernet HWaddr 44:1E:A1:4A:1A:98
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:794 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:57578 (56.2 KiB) TX bytes:0 (0.0 b)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:1256105 errors:0 dropped:0 overruns:0 frame:0
TX packets:1256105 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:404512516 (385.7 MiB) TX bytes:404512516 (385.7 MiB)

[root@nocidsdb02 ~]# ifconfig eth2 | grep MTU
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
[root@nocidsdb02 ~]# ifconfig eth3 | grep MTU
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1

[root@nocidsdb02 ~]# clustat
Cluster Status for ids_db @ Wed Jan 15 14:32:43 2014
Member Status: Quorate

Member Name        ID   Status
------ ----        ---- ------
nocidsdbsvr01         1 Online, rgmanager
nocidsdbsvr02         2 Online, Local, rgmanager

Service Name      Owner (Last)      State
------- ----      ----- ------      -----
service:metalib   nocidsdbsvr01     started
service:primo     nocidsdbsvr01     started


When I fail the service back, node 2 reboots.
Can anyone help me check this?
 
01-16-2014, 09:49 AM   #2
Habitual
Senior Member
 
Registered: Jan 2011
Location: Pelican Bay Correctional
Distribution: I can't wait for Linux to be 'underground' again.
Posts: 4,016
Blog Entries: 1

Rep: Reputation: Disabled
Please use code tags.
 
01-16-2014, 09:55 AM   #3
TB0ne
Guru

Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack, CentOS
Posts: 15,855

Rep: Reputation: 2957
Quote:
Originally Posted by thirupathi
Hi,
I'm running a two-node RHEL 5 cluster, and it is logging errors.
When I fail the service back, node 2 reboots. Can anyone help me check this?
You've been using RHEL clustering since 2012:
http://www.linuxquestions.org/questi...up-4175427784/
http://www.linuxquestions.org/questi...lp-4175427826/

Please see some of the answers in your other threads, where you were directed to the clustering documentation:
https://access.redhat.com/site/docum...dministration/
https://access.redhat.com/site/solutions/22484

....and to Red Hat support. If you're using RHEL, you need to PAY FOR IT, which entitles you to support. Also, you only say RHEL5...not which version. The ONLY currently-supported version of RHEL5 is 5.9...and that's only with paid-for extended support. This is covered in the RHEL solutions guide, which you have access to since you're paying for RHEL. There are bugfixes which address such things...and again, they're available if you PAY for RHEL.
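
For reference, the minor release being asked about here can usually be read straight off the machine. A minimal sketch, assuming a stock RHEL install where /etc/redhat-release exists (the function name show_release is made up for illustration):

```shell
#!/bin/bash
# Print the distribution release string followed by the running kernel version.
show_release() {
    local release_file=${1:-/etc/redhat-release}
    if [ -r "$release_file" ]; then
        cat "$release_file"
    else
        echo "cannot read $release_file" >&2
    fi
    uname -r
}

show_release
```

Posting both lines alongside the logs lets readers match the kernel build to a specific minor release.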
 
02-13-2014, 04:04 AM   #4
thirupathi
LQ Newbie
 
Registered: May 2012
Location: Singapore
Distribution: RHEL
Posts: 22

Original Poster
Rep: Reputation: Disabled
Hi,

I'm still getting the same errors. Red Hat suggested checking multicast, the NIC drivers, and the MTU on both the network switch and the servers.
1. I have updated the NIC drivers.
2. I'm not using jumbo frames.
Below are the NIC statistics.

# ethtool -S eth3

NIC statistics:
rx_crc_errors: 0
rx_alignment_symbol_errors: 0
rx_pause_frames: 0
rx_control_frames: 0
rx_in_range_errors: 0
rx_out_range_errors: 0
rx_frame_too_long: 0
rx_address_mismatch_drops: 12224 <-------- Packets dropped
rx_dropped_too_small: 0
rx_dropped_too_short: 0
rx_dropped_header_too_small: 0
rx_dropped_tcp_length: 0
rx_dropped_runt: 0
rxpp_fifo_overflow_drop: 0
rx_input_fifo_overflow_drop: 0

# ethtool -S eth2

NIC statistics:
rx_crc_errors: 0
rx_alignment_symbol_errors: 0
rx_pause_frames: 0
rx_control_frames: 0
rx_in_range_errors: 0
rx_out_range_errors: 0
rx_frame_too_long: 0
rx_address_mismatch_drops: 657 <------ Packets dropped
rx_dropped_too_small: 0
rx_dropped_too_short: 0
rx_dropped_header_too_small: 0
rx_dropped_tcp_length: 0
rx_dropped_runt: 0
rxpp_fifo_overflow_drop: 0
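
Both interfaces with a nonzero rx_address_mismatch_drops counter are slaves of bond0, so it may be worth ruling out that the counter simply reflects the bonding mode. In active-backup mode, for example, the inactive slave is expected to discard most of what it receives, and eth3 shows TX packets:0 in the ifconfig output from the first post. The thread does not say which mode bond0 uses, so this is only a sketch of how to check, assuming the standard /proc/net/bonding/bond0 status file (bond_summary is a made-up helper name):

```shell
#!/bin/bash
# Show the bonding mode, the currently active slave, and each slave's link state.
bond_summary() {
    local status_file=${1:-/proc/net/bonding/bond0}
    grep -E '^(Bonding Mode|Currently Active Slave|Slave Interface|MII Status):' "$status_file"
}

# Only run against the real status file when it exists on this host.
if [ -r /proc/net/bonding/bond0 ]; then
    bond_summary
fi
```

If the mode turns out not to be active-backup, the drops would need a different explanation.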


Drivers

# ethtool -i eth2

driver: be2net
version: 4.2.116r
firmware-version: 4.6.247.5
bus-info: 0000:02:00.0

# ethtool -i eth3
driver: be2net
version: 4.2.116r
firmware-version: 4.6.247.5

# ethtool -k eth2
Offload parameters for eth2:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
generic-receive-offload: on <-----------

# ethtool -k eth3
Offload parameters for eth3:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
generic-receive-offload: on

Red Hat suggested disabling "generic-receive-offload" on the NICs. After that change I still get the same errors.

Can anyone help me overcome these errors?

thanks,
 
02-13-2014, 09:19 AM   #5
TB0ne
Guru

Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack, CentOS
Posts: 15,855

Rep: Reputation: 2957
Quote:
Originally Posted by thirupathi
Hi,
I'm still getting the same errors. Red Hat suggested checking multicast, the NIC drivers, and the MTU on both the network switch and the servers.
1. I have updated the NIC drivers.
2. I'm not using jumbo frames.

Red Hat suggested disabling "generic-receive-offload" on the NICs. After that change I still get the same errors. Can anyone help me overcome these errors?
So, Red Hat told you to disable a feature to overcome the problem...and you left it on, and are now asking how to solve the problem??? Why haven't you done what Red Hat told you to do?

Start by doing that, and see what happens.
 
02-14-2014, 03:57 AM   #6
thirupathi
LQ Newbie
 
Registered: May 2012
Location: Singapore
Distribution: RHEL
Posts: 22

Original Poster
Rep: Reputation: Disabled
I have disabled "generic-receive-offload" on the NICs, as Red Hat suggested, but the error still appears.

[root@nocidsdb01 ~]# ethtool -k eth2
Offload parameters for eth2:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
generic-receive-offload: off
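
For anyone following along, the setting shown above is normally changed with ethtool -K (capital K); the lowercase -k used in this thread only displays it. A minimal sketch, assuming the interface names from this thread and an ethtool build that accepts the gro keyword. The disable_gro function and the ETHTOOL override are made up for illustration:

```shell
#!/bin/bash
# Command to run; set ETHTOOL=echo for a dry run that only prints the arguments.
ETHTOOL=${ETHTOOL:-ethtool}

# Turn generic receive offload off on each interface given as an argument.
disable_gro() {
    local nic
    for nic in "$@"; do
        "$ETHTOOL" -K "$nic" gro off || return 1
    done
}

# Example (as root):  disable_gro eth2 eth3
# Then confirm with:  ethtool -k eth2
```

A change made this way does not survive a reboot, so it would need to be reapplied at boot if it turned out to help.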

Feb 14 17:42:15 nocidsdb01 kernel: RPC: bad TCP reclen 0x47455420 (non-terminal)
Feb 14 17:42:15 nocidsdb01 kernel: RPC: bad TCP reclen 0x00620103 (large)
Feb 14 17:42:15 nocidsdb01 kernel: RPC: bad TCP reclen 0x4a524d49 (non-terminal)
Feb 14 17:42:24 nocidsdb01 kernel: RPC: bad TCP reclen 0x47455420 (non-terminal)
Feb 14 17:42:24 nocidsdb01 kernel: RPC: bad TCP reclen 0x00620103 (large)
Feb 14 17:42:24 nocidsdb01 kernel: RPC: bad TCP reclen 0x4a524d49 (non-terminal)
 
02-14-2014, 09:31 AM   #7
TB0ne
Guru

Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack, CentOS
Posts: 15,855

Rep: Reputation: 2957
Quote:
Originally Posted by thirupathi
I have disabled "generic-receive-offload" on the NICs, as Red Hat suggested, but the error still appears.

Ok...so what did Red Hat say after you told them this? If you're paying for support from them, then you should be using it. I'm surprised they haven't had you run some more diagnostics...or have they? Did you call them back and tell them their suggested solution didn't work, and ask for someone who supports clustering specifically?
 
02-17-2014, 02:03 AM   #8
thirupathi
LQ Newbie
 
Registered: May 2012
Location: Singapore
Distribution: RHEL
Posts: 22

Original Poster
Rep: Reputation: Disabled
Hi,

I have given sosreports to Red Hat, and they are still checking with their experts. My case has been open with Red Hat for 50 days, and they have not found the cause.
I have changed some settings as they suggested but am still getting the error.

Thanks.
 
02-17-2014, 10:30 AM   #9
TB0ne
Guru

Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack, CentOS
Posts: 15,855

Rep: Reputation: 2957
Quote:
Originally Posted by thirupathi
Hi,
I have given sosreports to Red Hat, and they are still checking with their experts. My case has been open with Red Hat for 50 days, and they have not found the cause.
I have changed some settings as they suggested but am still getting the error.
Sorry, but I have to doubt this. You are PAYING FOR support, and for Red Hat not to escalate the issue seems very odd. And if this is a production system, how has your employer not escalated this issue further? Leaving a production system with problems for almost two months would be a bad thing.

Also, there are some mentions of bugfixes in the RHEL customer portal (which you can access with your RHEL subscription; ask RHEL support for help) which address things that sound similar. You STILL don't say what version of RHEL5, but again we will tell you that only 5.9 is currently supported under extended support.
 
  

