09-26-2008, 08:53 AM   #1
marafa (LQ Newbie; Registered: Sep 2008; Posts: 26)
ssh_exchange_identification: Connection closed by remote host


I have a setup with about 30 SUSE Linux Enterprise Server 10 SP1 machines, and I have a backup script that, in pseudo-code, looks like this:

for server in 1 to 30
do
    ssh $server tar -cz $logdir && scp $logdir.tar.gz $central_log_repo &
done

The problem is that while all the servers get their logs tarred, the return trip doesn't always work. Random servers log this error:
ssh_exchange_identification: Connection closed by remote host
where the remote host is, of course, $central_log_repo. There is no mention in $central_log_repo's logs of any connection attempt by the offending server(s).


In summary, on one run server5 might scp the tarfile to $central_log_repo successfully, and on another run it might fail with the ssh error. How do I fix this?
 
09-26-2008, 10:48 AM   #2
trickykid (LQ Guru; Registered: Jan 2001; Posts: 24,149)
Probably need some more verbose output. Any way to add a -v to see where it might be failing?
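For example, you could run a single transfer by hand with full client-side debugging (the destination path here is just a placeholder):

# repeating -v makes scp print every stage of the connection,
# including the identification exchange that is failing here
scp -v -v -v $logdir.tar.gz $central_log_repo:/backup/ 2> /tmp/scp-debug.log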
 
09-26-2008, 10:56 AM   #3
tredegar (LQ 5k Club; Registered: May 2003; Location: London, UK; Distribution: Debian "Testing"; Posts: 6,111)
Quote:
for server in 1 to 30
do
    ssh $server tar -cz $logdir && scp $logdir.tar.gz $central_log_repo &
done
In your "script" you are starting 30 tarring jobs across 30 servers, all simultaneously. They then try to send back the tarred logs.
Maybe the 30 servers are all trying to make the scp transfer at the same time.
Perhaps you have a limit set somewhere, and after a certain number of connections is reached, further connections are refused. If the remote servers all take different times to finish the tar job, the code works. If they all finish at (nearly) the same time, it breaks.

Try man limits.conf and this link: http://www.linuxweblog.com/limit-users-pam
I believe iptables can also set connection limits. You should look into that too. Two examples of the kind of limit I mean are sketched below.
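Both of these are only illustrations of where such a cap could hide, not a diagnosis (the user name and the threshold of 10 are placeholders):

# /etc/security/limits.conf -- a per-user cap on simultaneous logins:
backupuser    hard    maxlogins    10

# an iptables rule capping concurrent connections to sshd
# (needs the connlimit match):
iptables -A INPUT -p tcp --syn --dport 22 -m connlimit --connlimit-above 10 -j REJECT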
 
09-26-2008, 11:15 AM   #4
trickykid (LQ Guru; Registered: Jan 2001; Posts: 24,149)
Quote:
Originally Posted by tredegar
In your "script" you are starting 30 tarring jobs across 30 servers, all simultaneously.
Actually no, that's not correct. The script he posted would do them one at a time. But you shouldn't put a & at the end of the scp command, so you can ensure the scp copy finishes.

If you're just copying logs, you should either set up a loghost to capture them, and/or create a script that's run via cron on each server. A sketch of both follows.
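For instance (hostnames, paths, and the schedule are placeholders, just to show the shape of it):

# on each of the 30 servers, /etc/syslog.conf can forward everything
# to a central loghost:
*.*     @loghost

# or a nightly cron entry on each server that pushes its own logs:
0 2 * * *  tar -czf /tmp/logs.tar.gz /var/log/myapp && scp /tmp/logs.tar.gz central_log_repo:/backup/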

Last edited by trickykid; 09-26-2008 at 11:22 AM.
 
09-26-2008, 11:49 AM   #5
tredegar (LQ 5k Club; Registered: May 2003; Location: London, UK; Distribution: Debian "Testing"; Posts: 6,111)
Quote:
Actually no, that's not correct. The script he posted would do them one at a time. But you shouldn't put a & at the end of the scp command, so you can ensure the scp copy finishes.
Thanks for clearing that up.
 
09-26-2008, 01:08 PM   #6
marafa (Original Poster; LQ Newbie; Registered: Sep 2008; Posts: 26)
The script does throw each job into the background. Actually, I have to honestly say that:
1. The ssh part is more like
ssh $server remote_backup.sh $central_log_repo &
I didn't want to complicate the issue, but all the "master" script on $central_log_repo does is call a secondary remote_backup script and throw that script into the background.

2. The script is not mine. It came with a commercial application. The vendor will listen to my recommendations and may code them into the script.

3. I use clusterssh to manage these 30 servers, and if I do scp $central_log_repo/file /tmp/. I see the same error pop up on some servers.

4. /etc/security/limits.conf is all hashed out. Nothing out of the ordinary as far as I can see.

5. You are both right: from the output on screen, I see that it does start 30 scp processes all at the same second, more or less, but one after the other.
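One way to check how hard the repo host is actually being hit (a sketch, run on $central_log_repo while the backup loop fires; the log path is the SLES default and may differ):

# count connections to sshd once per second during the backup run
watch -n1 'netstat -tn | grep ":22 " | wc -l'

# sshd usually logs something when it drops a connection itself
tail -f /var/log/messages | grep -i sshd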
 
09-26-2008, 02:10 PM   #7
trickykid (LQ Guru; Registered: Jan 2001; Posts: 24,149)
Quote:
Originally Posted by marafa
5. You are both right: from the output on screen, I see that it does start 30 scp processes all at the same second, more or less, but one after the other.
I seriously doubt it's an issue with the number of connections. Like I mentioned before, get more verbose output to see what's causing the connection failure.

I did a similar test to yours: ssh into one box that kicks off an scp to another. It seems putting a & at the end of the whole connection string puts the initial ssh into the background, not the remote scp.
 
09-26-2008, 04:26 PM   #8
marafa (Original Poster; LQ Newbie; Registered: Sep 2008; Posts: 26)
Quote:
Originally Posted by trickykid
I seriously doubt it's an issue with the number of connections. Like I mentioned before, get more verbose output to see what's causing the failure in connection.

I did a similar test to yours: ssh into one box that kicks off an scp to another. It seems putting a & at the end of the whole connection string puts the initial ssh into the background, not the remote scp.
Yes, that's right. And part of the script that runs on the remote machine does the scp, so the scp is also in the background inside that remote script.

And from my tests (reducing the number of servers to 8, for example, among other things), the only thing left is the number of connections, but I don't know where else to look.

I would love to fix this and am open to other suggestions.

PS: -v -v -v was put on the remote script and I didn't get anything extra, neither to the tty nor to the log.

PPS: By the way, this error also shows up when I use clusterssh to scp from the repo machine to the 30 servers at the same instant, without any script being used (see point 3 above).
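If it does turn out to be a concurrency ceiling, one workaround the vendor could adopt is batching the loop so only a few transfers run at once. A sketch of the idea, not the vendor's script ($servers and the batch size are placeholders):

# run the per-server backups in batches of 5; "wait" blocks until the
# current batch has finished before the next one starts
batch=5; i=0
for server in $servers
do
    ssh "$server" remote_backup.sh "$central_log_repo" &
    i=$((i + 1))
    [ $((i % batch)) -eq 0 ] && wait
done
wait   # catch the final partial batch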

Last edited by marafa; 09-26-2008 at 09:26 PM. Reason: cssh
 
09-30-2008, 05:06 AM   #9
marafa (Original Poster; LQ Newbie; Registered: Sep 2008; Posts: 26)
Solution found at: http://archive.netbsd.se/?ml=openssh...7-10&t=5430083
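For readers following along: the classic cause of "ssh_exchange_identification: Connection closed by remote host" under a burst of simultaneous clients is sshd's MaxStartups throttle on the receiving host, which caps concurrent unauthenticated connections (the default of that era was 10) and drops the excess before the identification exchange. Assuming that is what the linked thread describes, the fix is along these lines:

# /etc/ssh/sshd_config on $central_log_repo -- raise the cap on
# concurrent unauthenticated connections (the value is illustrative)
MaxStartups 50

# then reload sshd so the setting takes effect (SLES init script)
/etc/init.d/sshd reload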
 
  

