LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 11-18-2009, 04:42 AM   #1
grissiom
Member
 
Registered: Apr 2008
Location: China, Beijing
Distribution: Slackware
Posts: 423

Rep: Reputation: 45
A script that could mirror slackware tree from multiple servers


It uses lftp to get the names of the files to download and aria2 to download them from multiple servers. (A SlackBuild for aria2 can be found on SBo.) It opens only _one_ connection per server at a time, so it should raise your download speed without adding much to each server's load. The script is here:
Code:
#!/bin/zsh

usage="mmirror-slack.sh: mirror the Slackware tree from multiple servers.

usage: mmirror-slack.sh [-vfn]
    v: be verbose. Display the commands that are going to run.
    f: final mode. Remove the files that are not present on the remote server.
    n: dry run. Display the commands that would run but do not execute them.
       Implies v.

mmirror-slack.sh also receives parameters from environment variables:
    VERSION: the version you want to mirror. -current, -13.0 etc.
             Default is -current. Don't forget the leading '-'.
    LOCALMIRROR: where your mirror is on the disk. Be sure to adjust
             it before running this script.
    ARCH: i386, x86_64 etc.
    FOLDER: the folder under the tree you want to mirror. slackware/, extra/ etc.
             Don't forget the trailing '/'.
    LEXTRAAGRS: extra arguments that are passed to lftp.
    AEXTRAAGRS: extra arguments that are passed to aria2.
"

MAINMIRROR='ftp://ftp.osuosl.org/pub/slackware/'

# add your favorite mirrors here
MIRRORS=(ftp://darkstar.ist.utl.pt/pub/slackware/
ftp://slackware.mirrors.tds.net/pub/slackware/
ftp://ftp.slackware.no/pub/linux/slackware/
ftp://ftp.slackware.at/
ftp://ftp.ntua.gr/pub/linux/slackware/
http://mirror.switch.ch/ftp/mirror/slackware/
ftp://ftp.heanet.ie/mirrors/ftp.slackware.com/pub/slackware/
ftp://ftp.belnet.be/mirror/ftp.slackware.com/
ftp://ftp.slackware.org.uk/slackware/
http://slackware.cs.utah.edu/)
MIRRORS+=$MAINMIRROR
#http://mirrors.163.com/slackware/

# -current or -13.0 etc. Don't forget the leading '-'.
VERSION=${VERSION:-'-current'}

# where your local mirror is located. In that folder you should have something
# like:
#   slackware64-current/
#   slackware-current/
#   slackware-13.0/
LOCALMIRROR=${LOCALMIRROR:-'/ext4/slackware_rsync'}
# use ARCH to determine which branch to mirror.
case $ARCH in
	'x86_64' )
	SBASE='slackware64'
	TBASE=$SBASE
	;;
	'i386' )
	SBASE='slackware'
	TBASE=$SBASE
	;;
	* )
	echo "ARCH=[x86_64|i386] mmirror-slack.sh"
	echo "see source file for more parameters."
	exit 1
	;;
esac

TDIR=${LOCALMIRROR}/${TBASE}${VERSION}/${FOLDER}
SDIR=${SBASE}${VERSION}/${FOLDER}

on_exit() {
	kill 0
	exit
}

exec_cmd() {
	[ $VERBOSE -ne 0 ] && echo $@
	if [ $DRYRUN -eq 0 ]; then
		eval $@
	fi
	[ $? -ne 0 ] && CMDFAIL+="\n""$@"
}

fetch_cmd() {
	LEXTRAAGRS=${LEXTRAAGRS}' --verbose=3 --script=- '
	# Some mirrors have symbolic links, others do not. So for compatibility,
	#   use --dereference to download symbolic links as files (it is not set
	#   by default here; pass it through LEXTRAAGRS).
	#   Hope this won't make the local mirror too large...
	# If you are behind a good router and use a good mirror, set ftp:sync-mode off
	lftp -c "set ftp:sync-mode on
		 open $MAINMIRROR &&
		 mirror ${LEXTRAAGRS} \
		   ${SDIR} ${TDIR}"
}

dispatch_cmd() {
	while read -u 0 cmdline; do
		case ${cmdline[1,3]} in
			"get" )
			cmd=${cmdline//"$MAINMIRROR"/}
			file=$(echo "$cmd" | rev | cut -f 1 -d ' ' | rev)
			folder=$(echo "$cmd" | rev | cut -f 2 -d ' ' | rev)
			exec_cmd aria2c ${AEXTRAAGRS} --summary-interval=0 \
			    --allow-overwrite=true --remote-time=true \
			    --dir="${folder}" --split=${NMIRROR} \
			    ${MIRRORS[@]/%/${file}}
			;;
			'rm ' )
			cmd=${cmdline//"file:"/}
			if [ $FINAL -eq 1 ]; then
				exec_cmd $cmd
			fi
			;;
			* )
			if [ "${cmdline[1,5]}" = 'chmod' ]; then
				cmd=${cmdline//"file:"/}
				exec_cmd $cmd
			elif [ "${cmdline[1,5]}" = 'shell' ]; then
				cmd=${cmdline//"shell "/}
				exec_cmd $cmd
			else
				echo "$cmdline"
			fi
			;;
		esac
	done
}

#############
# Main body #
#############
trap on_exit 1 2 3 6

FINAL=0
VERBOSE=0
DRYRUN=0
NMIRROR=$((${#MIRRORS[@]}-1))
while getopts ':nvf' opt; do
	case $opt in
		'v' )
		VERBOSE=1
		;;
		'f' )
		FINAL=1
		LEXTRAAGRS+=" --delete "
		MIRRORS=$MAINMIRROR
		NMIRROR=1
		;;
		'n' )
		DRYRUN=1
		;;
		'?' )
		unkopt+=$OPTARG' '
		;;
	esac
done
[ $DRYRUN -eq 1 ] && VERBOSE=1

[ -n "$unkopt" ] && { echo "$usage"; echo "Unknown option: $unkopt"; exit 2; }

echo "Mirror ${MAINMIRROR}${SDIR} to $TDIR :"
if [ $FINAL -eq 1 ]; then
	MIRRORS=$MAINMIRROR
	echo "FINAL mode"
fi
fetch_cmd | dispatch_cmd

[ -n "$DFAILFILE" ] && echo "failed to download:" $DFAILFILE
[ -n "$CMDFAIL" ] && echo "failed to run command(s):" $CMDFAIL

exit 0
The code is hosted at http://gitorious.org/slack-utils/slack-utils . Clones and suggestions are very welcome~
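To make the multi-server step easier to follow, here is a minimal sketch of how one relative file path can be expanded into one URI per mirror, which is roughly what the script's `${MIRRORS[@]/%/${file}}` expression hands to aria2c. The helper name `build_uris` and the shortened mirror list are assumptions made for illustration; they are not part of mmirror-slack.sh.

```shell
#!/bin/sh
# Sketch only: MIRROR_LIST is a shortened copy of the list in the script,
# and build_uris is a hypothetical helper, not part of mmirror-slack.sh.
MIRROR_LIST='ftp://ftp.osuosl.org/pub/slackware/
ftp://ftp.slackware.at/
http://slackware.cs.utah.edu/'

# Print one URI per mirror for a path relative to the mirror root.
build_uris() {
    relpath=$1
    printf '%s\n' "$MIRROR_LIST" | while IFS= read -r base; do
        printf '%s%s\n' "$base" "$relpath"
    done
}

# Example (prints three URIs, nothing is downloaded):
# build_uris slackware64-current/ChangeLog.txt
```

Because every mirror base ends in '/', the relative path can simply be appended. aria2c then treats all the resulting URIs as sources for the same file.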

Last edited by grissiom; 11-21-2009 at 01:56 AM.
 
Old 11-19-2009, 08:12 PM   #2
MS3FGX
Guru
 
Registered: Jan 2004
Location: NJ, USA
Distribution: Slackware, Debian
Posts: 5,852

Rep: Reputation: 351
OK, I'll bite. What exactly is the advantage of this? Just to lower load on the mirror servers by distributing the download?

Couldn't you do the same thing by running a different rsync operation against each directory? Or in other words, rsync "slackware" from one server and "source" from another?
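The per-directory rsync idea could look roughly like the sketch below. Both rsync hosts are placeholders under example.org, and `plan_rsync` is a hypothetical helper; it only prints the commands so that they can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: both rsync hosts are placeholders (example.org) and
# plan_rsync is a hypothetical helper. It prints commands, it does not run them.
plan_rsync() {
    tree=$1      # e.g. slackware64-current
    dest=$2      # local mirror root
    # One "directory server" pair per line.
    while read -r dir server; do
        [ -n "$dir" ] || continue
        printf 'rsync -av %s/%s/%s/ %s/%s/%s/ &\n' \
            "$server" "$tree" "$dir" "$dest" "$tree" "$dir"
    done <<EOF
slackware64 rsync://mirror-a.example.org/slackware
source rsync://mirror-b.example.org/slackware
EOF
    printf 'wait\n'
}

# Example: plan_rsync slackware64-current /srv/mirror
```

Each directory is assigned to exactly one server, so the two rsync jobs never write into the same local folder.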
 
Old 11-20-2009, 07:16 AM   #3
grissiom
Member
 
Registered: Apr 2008
Location: China, Beijing
Distribution: Slackware
Posts: 423

Original Poster
Rep: Reputation: 45
First, thanks for your comment. Here are my answers:

1, Yes, one advantage is a lower load on the server side, but this is not the most important feature. I mentioned it in the very first post because I don't want the script to look like a threat to them. The servers are always powerful, right?

2, Not every mirror has an rsync service (most of them do, but not all). This script uses lftp, which can get files over ftp, http, ftps... whatever lftp supports. The more mirrors you use, the faster you can download.

3, rsync on my network is _very_ slow, about 0.xKB/s. I don't know the reason, but that is how it is. So I can only use the ftp/http protocols to update my copies. But the connections to foreign servers are also slow, about 10~20KB/s per connection. So I have to think about solutions to boost the speed --- get files from more servers. The total speed is now bearable, about 100~200KB/s (roughly equal to 10*speed_per_server). I'm satisfied with it. So the third advantage could be: if you are slow with one server, you can get more speed with this script.

Your solution has a disadvantage: you cannot run two rsyncs in the same folder, as they may overwrite each other. There is only a limited number of folders, and a given change will not affect all of them. Say PatV upgrades firefox: only slackware64/xap and source/xap/mozilla-firefox may change, so you could launch just two rsync instances. That is not comparable with downloading from 10 servers at a time. Although rsync can download text files very efficiently, I doubt it helps much with binary files, which make up most of the tree.
 
Old 11-20-2009, 07:57 AM   #4
GazL
Senior Member
 
Registered: May 2008
Posts: 3,502

Rep: Reputation: 1024
Thanks for posting, but I don't think I'd be inclined to use anything like that as I'm never in that much of a hurry. I'm curious though regarding consistency. If you're pulling from all over the place, what happens if the mirrors are out of step with the main one? At best you'll get some sort of file not found error, at worst, you could end up with some files having the wrong contents.
 
Old 11-20-2009, 08:46 AM   #5
grissiom
Member
 
Registered: Apr 2008
Location: China, Beijing
Distribution: Slackware
Posts: 423

Original Poster
Rep: Reputation: 45
In my experience, Slackware never has two different packages with the same package name (i.e., file name). So if one of the mirrors is out of date, the new file name simply does not exist there, and aria2 will handle that. As for SlackBuild scripts and other stuff without a version number, they are very small and aria2 won't split them into many parts, so they won't be downloaded from multiple servers. But my script can't guarantee that. I may add this feature in the future. At least there are the checksums. Thanks for the advice ~
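Since the checksums are the last line of defence here, this is a minimal sketch of checking a downloaded tree against its CHECKSUMS.md5. The helper name `verify_tree` is hypothetical, and the sketch assumes GNU md5sum and checksum entries of the form "<32 hex digits>  ./path"; any explanatory header in the file is skipped.

```shell
#!/bin/sh
# Sketch only: verify_tree is a hypothetical helper. It assumes GNU md5sum and
# a CHECKSUMS.md5 whose entries look like "<32 hex digits>  ./path".
verify_tree() {
    tree=$1
    (
        cd "$tree" || exit 1
        # Skip the explanatory header; keep only the checksum entries.
        # --quiet prints only the files that fail verification.
        grep -E '^[0-9a-f]{32}  \./' CHECKSUMS.md5 | md5sum -c --quiet
    )
}

# Example: verify_tree /ext4/slackware_rsync/slackware64-current
```

A non-zero exit status means at least one file is missing or does not match. Those files could then be fetched again from the main server only.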

Last edited by grissiom; 11-20-2009 at 08:49 AM.
 
Old 11-21-2009, 02:11 AM   #6
grissiom
Member
 
Registered: Apr 2008
Location: China, Beijing
Distribution: Slackware
Posts: 423

Original Poster
Rep: Reputation: 45
I mailed the author of aria2 to ask what happens when several URIs point to different files. He answered that aria2 checks the file size and, if it differs, drops some of the URIs. However, it cannot guarantee which URIs will be dropped and which will be kept. So it is unlikely to corrupt files, although the downloaded file may not be the one on the main server. I have two solutions:

1, Wait 1~2 days. The mirrors listed in the script are very active -- 1~2 days is enough for them to synchronize with each other.

2, Run "mmirror-slack.sh -f"; then it will download files only from the main server. This could be slow, but if you have already downloaded most of the stuff from multiple servers (i.e., you ran mmirror-slack.sh first), it won't take too much time. It won't even hurt to run rsync afterward, because after running mmirror-slack.sh the out-of-sync files should be the ones without a version number (SlackBuilds, txts, CHECKSUMS.md5), and these are all very small.
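The second solution amounts to a two-pass run, which could be wrapped as in the sketch below. The function `two_pass_mirror` and the `MMIRROR` override are hypothetical names used for illustration; `MMIRROR` only selects which command is run and defaults to mmirror-slack.sh on the PATH.

```shell
#!/bin/sh
# Sketch only: two_pass_mirror and the MMIRROR override are hypothetical;
# MMIRROR just names the command to run (default: mmirror-slack.sh on PATH).
two_pass_mirror() {
    runner=${MMIRROR:-mmirror-slack.sh}
    # Pass 1: bulk download from all mirrors.
    "$runner" "$@" || return 1
    # Pass 2: final mode, main server only; also removes stale local files.
    "$runner" -f "$@"
}

# Example:
#   export ARCH=x86_64 VERSION=-13.0 LOCALMIRROR=/ext4/slackware_rsync
#   two_pass_mirror -v
```

The settings still come from the environment variables described in the usage text, so they need to be exported before the call.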

I updated the script in the very first post. If anyone uses this script, please update your local copy. Thanks.

Last edited by grissiom; 11-21-2009 at 03:05 AM.
 
Old 11-21-2009, 03:49 AM   #7
Petri Kaukasoina
Member
 
Registered: Mar 2007
Posts: 242

Rep: Reputation: 86
Quote:
Originally Posted by grissiom View Post
But the connections to foreign servers are also slow, about 10~20KB/s per connection. So I have to think about solutions to boost the speed --- get files from more servers.
Hi

I notice that you are from China.

I analyzed the log file of my Slackware mirror (between 15th and 21st November). There were 1882 failed downloads of Slackware ISO files, from 205 unique IP addresses. According to whois, 164 of those were from China. And there were 43 successful ISO downloads, from 31 unique addresses. None from China. Most successful downloads were from Europe, but some were from countries like Malaysia, Argentina and Colombia, which are far from my location (Finland, Europe).

The downloads from China look like this:

Sat Nov 21 07:15:34 2009 [pid 30133] [ftp] FAIL DOWNLOAD: Client "XXX.XXX.XXX.XXX", "/slackware-13.0-iso/slackware-13.0-install-dvd.iso", 161424 bytes, 1.72Kbyte/sec
Sat Nov 21 07:17:19 2009 [pid 30153] [ftp] FAIL DOWNLOAD: Client "XXX.XXX.XXX.XXX", "/slackware-13.0-iso/slackware-13.0-install-dvd.iso", 192888 bytes, 3.01Kbyte/sec
Sat Nov 21 07:23:44 2009 [pid 30184] [ftp] FAIL DOWNLOAD: Client "XXX.XXX.XXX.XXX", "/slackware-13.0-iso/slackware-13.0-install-dvd.iso", 161424 bytes, 2.12Kbyte/sec
Sat Nov 21 07:24:38 2009 [pid 30187] [ftp] FAIL DOWNLOAD: Client "XXX.XXX.XXX.XXX", "/slackware-13.0-iso/slackware-13.0-install-dvd.iso", 112176 bytes, 2.16Kbyte/sec

I hid the IP address. The same IP address tried to download the same file 93 times over a period of five hours. It always fails immediately, after about 100 kilobytes.
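Counts like the ones above could be reproduced with a short awk pass over the FTP log. This is only a sketch: `summarize_iso_log` is a hypothetical helper, it assumes lines shaped like the FAIL DOWNLOAD samples quoted above, and it assumes that successful transfers are logged as "OK DOWNLOAD", which the quoted samples do not show.

```shell
#!/bin/sh
# Sketch only: summarize_iso_log is a hypothetical helper. It assumes log lines
# shaped like the FAIL DOWNLOAD samples above, and that successful transfers
# are logged as "OK DOWNLOAD" (an assumption; only failures are quoted).
summarize_iso_log() {
    # Split on double quotes: $2 is the client address, $4 is the file path.
    awk -F'"' '
        /DOWNLOAD: Client/ && $4 ~ /\.iso$/ {
            if ($1 ~ /FAIL DOWNLOAD/)    { fail++; failip[$2] = 1 }
            else if ($1 ~ /OK DOWNLOAD/) { ok++;   okip[$2] = 1 }
        }
        END {
            nf = 0; for (ip in failip) nf++
            no = 0; for (ip in okip) no++
            printf "failed=%d failed_ips=%d ok=%d ok_ips=%d\n", fail, nf, ok, no
        }
    ' "$@"
}

# Example: summarize_iso_log /var/log/vsftpd.log   (log path is an assumption)
```

Mapping the unique addresses to countries would still need a separate whois lookup per address, as described in the post.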

So, I think there is something wrong in the Chinese net.
 
Old 11-21-2009, 10:54 AM   #8
MS3FGX
Guru
 
Registered: Jan 2004
Location: NJ, USA
Distribution: Slackware, Debian
Posts: 5,852

Rep: Reputation: 351
China uses extensive firewalling and QoS systems to control and monitor their access to the Internet, so that is very possible.
 
Old 11-21-2009, 08:05 PM   #9
grissiom
Member
 
Registered: Apr 2008
Location: China, Beijing
Distribution: Slackware
Posts: 423

Original Poster
Rep: Reputation: 45
OK, I admit that Chinese networks have firewalls and many limitations.... So in one respect, my script can be considered a kind of "workaround" for the problem. Besides, not every network in the world is as fast as those in Europe, the USA or Japan; I think many developing countries don't have very fast networks yet, so they may benefit from my script. People on fast networks could also use my script to go faster, although there is less room for improvement...
 
  


All times are GMT -5.