LinuxQuestions.org

By geoff_f at 2006-06-09 00:30
I have more than one computer running Linux at our place, so I've always been on the lookout for a way of saving the distribution's updates locally so that all the other computers could update from there, rather than having to download the same updates many times. With updates to distributions typically amounting to hundreds of Megabytes, downloading multiple times is not the way to go. A lot of distributions have ways of creating local updates repositories, but the ones I've used have been difficult to set up or cumbersome to use. So I wrote a few scripts to automate the download process to a local updates directory which, in SUSE 10.0, is very easily used by YAST Online Update (YOU) to install the updates to all computers on your local network.

The scripts have been designed to allow user control over how selective the download process is. For example, you can download everything from the updates site, you can restrict the selections to just those rpms installed on your local machines, and you can further restrict that selection by avoiding large files that would not be used on your machines. Some examples here are the OpenOffice.org language files, kernels not used by your system, etc.

The scripts I've written are:

SyncDirs10.0 - The main script to synchronise the local updates directory with the ftp site.

ExportRpms - A script to make other machines on your network report their installed rpms.

CreateLocalFilesUpdated - Used in Testing mode; simulates files having been downloaded.

SyncDirs10.0-Test - Used to evaluate SyncDirs10.0 in Testing mode, before running it in Live mode.

Because different machines can have differing installed programs, I found that downloading updates based on my main machine did not get all the rpms needed by the other machines. What was needed was a method of discovering the installed rpms of the other machines, so they could be aggregated into the main list. The ExportRpms script does that.

I will start with a description of the Bash code that makes up the main script, SyncDirs10.0. It will be a somewhat blow-by-blow description that delves into basic concepts at times, so that beginners in Bash scripting are not left behind. Some of this will be about generic Bash code and its syntax, while other bits will be about specific Bash commands, the algorithms used, and their rationale. You should download the script and view it on the screen as you read this. I have largely avoided using quote marks when quoting text, because both single and double quotes feature heavily in Bash code. Having quote marks in their vicinity could be very confusing. Instead, I have used blue text to show quoted text (red in some places where extra differentiation is needed). I have also made all script code appear in blue, to make it stand apart from the description.

Part 1 - Script Description

Preamble

The first line begins with #!/bin/bash. Normally the # character starts a comment, but the special combination of #! (think of it as Sha-Bang) signifies that this is an executable script, and the program that runs it is /bin/bash. The next several lines are indeed comments, as they start with just a # character. Comments are special lines: Bash does not process them at all. They are merely there for your own information. They can contain anything - even what would normally be Bash code - and it will be ignored, as long as the # character precedes the text. This provides a very useful method of disabling code for testing, by putting a # character at the start of the line. This is referred to as commenting out the line, as it is taken out of the sequence of execution by becoming a comment.
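As a minimal sketch of these points (not taken from SyncDirs10.0; the variable name RAN is hypothetical), the following shows the sha-bang line, an ordinary comment, and a line of code that has been commented out:

```shell
#!/bin/bash
# This whole line is a comment; Bash ignores it.

RAN=yes                     # text after a # on a code line is also a comment
# RAN=no                    # commented out: this assignment never happens

echo "RAN is: $RAN"
```

Removing the leading # from the commented-out line would bring it back into the sequence of execution.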

The first item in the preamble is a copyright and licence message, followed by an overview of the method used to discover the files needed for download, and then to download them. Take a moment to read steps 1 to 11 before you move on here, as they are the essence of the script's operation.

The next section of comments describes the SUSE updates directories, the files contained there, and a table of which ones are needed and which aren't. As the main reason for SyncDirs10.0 was to save downloading files unnecessarily, I took the opportunity of culling files here as well. The big savings come from avoiding files from other cpu architectures. The script's default settings select i586 files, and avoid ppc, ppc64 and x86_64 files in the deltas and rpm directories (it selects only from the /rpm/i586/ sub-directory). The patches, patches.obsolete and scripts directories are cpu-independent, so no attempt is made to limit files from those. The deltas, i586 and noarch directories contain .rpm files and language-specific .info files, so some savings can be made in those. The <program>.rpm files listed in the ftp web pages are identical to the 'real' <program>.<version>.<release>.<architecture>.rpm files, so the former can be safely omitted. The script selects between _en.info and _de.info files to make further savings. Many large files are not used by most users. Examples are the OpenOffice.org language files and kernel files for other architectures. You might use one of the OOo language files, but not all of them, so an opportunity to avoid downloading large files can be taken here. Finally, many files in the SUSE ftp repository might not be installed on your computer, so these needn't be downloaded either. If you install one of these programs later, a subsequent run of SyncDirs10.0 will detect its update file and download it; so these can be safely avoided for now.

I make mention of a few files that must be downloaded. directory.3 (in patches and patches.obsolete directories) and INDEX and INDEX.de in the sub-directories of rpm (i586, noarch, ppc, etc) are unlike other files held in the updates repository. They contain information about files in the directory and are updated with each change to the directories, but their names do not change from one update to another. Since the algorithm is something akin to 'if it's already in the local updates directory, don't download it', special attention is given to these files to make sure that the current version in the directory is downloaded every time.

The 'pseudo-code' in the next block of text is an outline of the method of extracting the required files, based on the considerations listed above it.

Section 1: Declarations

The Declarations section contains a lot of definitions of program values that are used throughout the script. Defining them here means that the label can be used throughout the script, and being generally shorter, saves lots of typing. Also, if you need to change the value of the definition, it only needs to be changed in one place. Variable definitions take the form:

LABEL=value

If the definition contains spaces, it needs to be enclosed in double quotes:

LABEL="value with lots of spaces included"

Note that no spaces appear between the label, the equals sign and the value. This form is used when declaring variables. When using variables in the script, they need to be preceded with the dollar ($) sign. This tells Bash that it's dealing with a defined variable value; it knows what the current definition is and will substitute that value whenever the variable is encountered. LABEL refers to the variable's name, whereas $LABEL refers to the variable's value.
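A minimal sketch of both assignment forms, and of the difference between a label and its value. The names GREETING and TARGET_DIR are hypothetical and do not come from SyncDirs10.0:

```shell
#!/bin/bash
# Hypothetical variables, for illustration only.
GREETING=hello                      # no spaces around the equals sign
TARGET_DIR="/tmp/my updates dir"    # a value containing spaces needs double quotes

# With the $ prefix, Bash substitutes the variable's current value.
echo "GREETING holds: $GREETING"
echo "TARGET_DIR holds: $TARGET_DIR"

# Without the $ prefix, the label is just ordinary text.
echo "This prints the word GREETING, not its value"
```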

Throughout the script, I have used the usual programming convention that Constants (in Bash's case, variables with a static value) are comprised of all upper case letters, whereas real Variables (ie, those whose value changes in operation) use a mixture of upper and lower case. As the Label cannot have spaces, an underscore (_) is used to enhance readability with all upper case forms. My convention for variables is to use a Capital letter for each word. I find that MyReallyLongVariableName is much easier to read than myreallylongvariablename. I avoid the underscore where I can, as the Shift key is easier for me than the underscore. Therefore, I would use the MixedCase version rather than my_really_long_variable_name. Personal preference is the key here; you can use whatever scheme you want in your scripts. Some people might prefer My_Really_Long_Variable_Name.

The first part of the Declarations section is for user customisation of values to tailor the script to your local environment - directory and user names, etc. This part is where you will configure the script to run on your computer. The default values are examples only; you will need to change them to suit your own circumstances. (Don't worry about the actual settings now. There will be a more detailed explanation later in Part 2 about user-customisation aimed at getting this script working on your computer).

Testing. The first declaration of TESTING provides a means of automatically changing various values for Testing mode where some commands are inhibited until you're sure the script is working correctly (as against Live mode where all commands are operational). Any value (say, yes) will cause Testing mode to be followed. A null value (ie, nothing following the equals sign, or "") will invoke Live mode, where all commands are run. The commands that are not executed in Testing mode are those that download the files, and others that copy or delete files on the local computer (with a couple of exceptions). This is to eliminate danger to your system from any mis-configured variable that could delete, say, some completely unrelated directory somewhere else on the filesystem. Again, more detail will come later about Testing.

Filtering of installed programs. The FILTER_NOT_INSTALLED setting determines if programs not installed on the system are filtered out. A yes setting will ensure that programs not installed on the system will not be downloaded. Read this as 'I do want to filter out programs that are not installed on the system'. A null value will cause all programs to be downloaded, regardless of whether they are installed on the system. As this represents a great many files which most users will not need, the default has been set to yes. A further opportunity to filter out files which are installed on the system comes later (in the section on the function Filter_rpms).

Remote hosts and users. SyncDirs10.0 is designed to poll any computer on your local network for its installed rpms, and to shape the download lists accordingly. The ExportRpms script (detail later) does that for each remote host. The remote host names, plus a username for each of the remote hosts, are defined here. The username is normally the main user of that computer (it could be any user). If you have fewer computers on your local network and need to delete any of the entries here, simply delete their lines or comment them out. You will also need to delete or comment out the lines in the Preparations section of the script that call the Get_remote_installed_rpms() function using those settings. The comments following these declarations tell you how to add any extra computers, should you need to.

Store Directory. The remote computers need somewhere to store their installed rpms data, plus somewhere for the main computer to access these files. The STORE_DIR setting is a shared folder on the main computer - the computer running this SyncDirs10.0 script - that the remote hosts have access to and in which they can save their installed.rpms file. Each nominated remote user needs to have write permissions to this directory.

CreateLocalFilesUpdated Directory. CreateLocalFilesUpdated is a companion script for SyncDirs10.0 that creates a series of temporary files in the working directory that are used by SyncDirs10.0 to simulate the downloading of files from the SUSE ftp directory, and the deleting of obsolete files from the local updates directory. It is used in Testing mode to allow completion of the whole script, without actually doing any downloading or deleting. The setting here for CREATE_LOCAL_FILES_UPDATED_DIR points SyncDirs10.0 to the directory on your system where the CreateLocalFilesUpdated script is stored.

Architecture. This refers to the CPU architecture of your computer, and can be one of i586, ppc, ppc64 or x86_64. This setting determines which sub-directory under the /rpm directory is accessed for the main .rpm files appropriate to your computer's CPU architecture.

Language. SUSE provides for the update process to be conducted in English or German. A LANG_EN_DE setting of en will enable selection of the _en.info files and exclude the _de.info files. It will be vice versa if de is set.

Ftp Directory. The FTP_STEM setting specifies the parent directory on the SUSE ftp repository where the updates are stored. Ftp sites are preferred over http sites, as they seem to have more consistency in format and thus a greater chance of being compatible with all the code in SyncDirs10.0, and thus being able to reliably extract the filenames from the raw HTML page structure. Most of the http sites checked had an associated ftp site; just changing the http prefix to ftp gave the correct address.

Updates Directory. The UPDATES_STEM setting specifies the parent directory where the SUSE updates are stored on the local computer. The downloaded updates are saved into the sub-directories (deltas, patches, patches.obsolete, /rpm/i586, /rpm/noarch and scripts) under this directory. The specified directory - and all the sub-directories - must exist before this script is run. Also, the root user must have write access to this directory.

Backups Directory. The BACKUPS_STEM setting specifies the parent directory where backups of the SUSE updates are stored on a remote computer on your local network. The downloaded updates are copied into the sub-directories (deltas, patches, patches.obsolete, /rpm/i586, /rpm/noarch and scripts) under this directory, to maintain a valid set of backup files. The specified backups directory - and all the sub-directories - must exist on the remote computer before this script is run, and it must be a shared directory that the main computer can access through the local network. Also, the root user must have write access to this directory.

Working Directory. A working directory is needed to hold lots of intermediate temporary files used in calculating which files need to be downloaded. The Live mode setting for $WK_DIR is /var/log/SyncDirs, which uses the normal parent directory for log files. A different setting is used for Testing mode, as it was useful to be able to compare the two sets after Live mode was enabled, and to be able to more easily identify the Testing set of files for deletion after testing was finished. The /tmp/SyncDirs location signifies the more temporary nature of these files. Regardless of these considerations, any directory could be used for either setting.

The if..then..else construct

The previous few declarations used an if..then..else statement to make the settings. This construct takes the general form:

if <test condition1> ; then
    perform action(s) pertinent to <test condition1> being True
elif <test condition2> ; then
    perform action(s) pertinent to <test condition2> being True
else
    perform action(s) pertinent to both <test condition1> and <test condition2> being False
fi

The <test condition> following the initial if keyword is evaluated, and if True, the indented lines of code following it are executed. Once these lines are completed, execution commences after the final fi keyword. If <test condition> evaluates as False, the lines are skipped and execution proceeds to the next keyword, elif (a contraction of 'else if').

The <test condition2> following elif is then evaluated, and if True, the indented lines of code following it are executed. Again, once these lines are completed, execution commences after the final fi keyword. If <test condition2> evaluates as False, the lines are skipped and execution proceeds to the next keyword, else.

Having reached the else keyword, no test conditions are evaluated. The remaining lines of code are executed on the basis of the preceding test conditions having evaluated as False. Execution then resumes following the final fi keyword.

The elif and else keywords are optional. That is, the construct could consist of only if..then..fi. Multiple elif keywords are allowed, but only one else keyword can be used. The semi-colon (;) between the if and then keywords is a command separator. It allows two commands to be put on the same line. It is important to have at least one space between the semi-colon and then.

Bash uses a consistent convention for endings of its 'test' constructs: its ending keyword is the reverse of its starting keyword - fi in the case of if, not end if. Similarly, the case statement ends with esac.

Looking to the test condition used in forming the above variables, it uses the form [ -n "$TESTING" ]. The square brackets form a synonym for the inbuilt command test, which causes an evaluation of -n "$TESTING". The -n operator asks the question: 'does the variable TESTING have non-zero contents?'. In other words, 'does it contain any characters?'. If the answer is in the affirmative, the test evaluates to True; if not, it evaluates to False. Since the default setting for TESTING is yes, it does contain characters, and the test evaluates as True. The lines setting the variables in Testing mode are therefore executed.

Note that having any string of text in TESTING would cause an evaluation to True - even the string no would do that. It requires a Null string - nothing - to cause the -n operator to evaluate to False.

The variable TESTING within the square brackets is double-quoted for good reason: variables with spaces in them will cause problems unless double quotes are used. Double-quoting preserves the entire length of the string, including any spaces. It is a good habit to use double quotes around variables within the square brackets test construct.
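The -n test described above can be sketched as follows. The two directory values are the ones quoted in this article for Testing and Live mode; the variable MODE is hypothetical and only there to show the result:

```shell
#!/bin/bash
TESTING=yes          # any non-empty value selects Testing mode; "" selects Live mode

if [ -n "$TESTING" ]; then
    WK_DIR=/tmp/SyncDirs
else
    WK_DIR=/var/log/SyncDirs
fi
echo "Working directory: $WK_DIR"

# Even the string no is non-empty, so it still selects Testing mode.
TESTING=no
if [ -n "$TESTING" ]; then
    MODE=Testing
else
    MODE=Live
fi
echo "TESTING=no gives: $MODE mode"
```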

Other test operators used in if..then..else constructs in SyncDirs10.0 are:
-z    zero ('does the variable have no characters?'; ie, 'is it empty?')

-e    file exists ('does the file named by the variable exist?')

-s    file exists and is not zero size ('does the file contain valid data?') This will prove False if the file does not exist.

=     equals ('are the two variables equal?' or 'is this variable equal to <some value>?')

==    is equal to (equivalent to =)
Tests are negated by the ! keyword. That is, a True result evaluates as False and vice versa. For example, consider the 'nested if' statement following the variable assignments executed in the [ -n "$TESTING" ] test above. Its if test of [ ! -e "$WK_DIR" ] translates to 'if the file represented by $WK_DIR does not exist, then execute the command mkdir $WK_DIR'. Strictly, the -e "$WK_DIR" condition is evaluated; if the directory doesn't exist, the ! operator converts the False result to True, and that code section is executed. If the directory exists, the True result is converted to False, and that code section is skipped.
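The operators listed above can be tried out safely in a scratch directory. This is only a sketch: every name in it (SCRATCH, DEMO_DIR, EMPTY_FILE and so on) is hypothetical, and mktemp is used so that nothing outside the scratch directory is touched:

```shell
#!/bin/bash
SCRATCH=$(mktemp -d)                 # throw-away directory for the demonstration
DEMO_DIR="$SCRATCH/SyncDirsDemo"
EMPTY_FILE="$SCRATCH/empty.list"
: > "$EMPTY_FILE"                    # create a zero-length file

# ! negates the test: make the directory only if it does not exist yet.
DIR_CREATED=no
if [ ! -e "$DEMO_DIR" ]; then
    mkdir "$DEMO_DIR"
    DIR_CREATED=yes
fi

# -e is True for the empty file, but -s is False because it has no contents.
FILE_STATE=unknown
if [ -e "$EMPTY_FILE" ] && [ ! -s "$EMPTY_FILE" ]; then
    FILE_STATE="exists but empty"
fi

# -z is True for an empty string; = compares two strings.
NOTHING=""
STRING_STATE=unknown
if [ -z "$NOTHING" ]; then
    STRING_STATE=empty
fi
if [ "$DIR_CREATED" = "yes" ]; then
    echo "Directory created; file $FILE_STATE; string is $STRING_STATE"
fi

rm -rf "$SCRATCH"                    # tidy up the scratch directory
```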

Many other test operators are available under Bash. See the references for those.

The comment line with lots of dashes signifies the end of the user-customisation part of the Declarations section. The variable declarations following that should not need changing. Because of their sheer number, I will mention only the more significant ones.

GET_RPMS. The GET_RPMS variable is assigned the name of the command on the remote computers that will report their installed rpms. ExportRpms is another companion script to SyncDirs10.0 that is installed on each remote computer, is executed by SyncDirs10.0 issuing a command to those computers, and which saves its installed rpms file to the $STORE_DIR location (defined earlier) in the form <remotehostname>.installed.rpms. The content of ExportRpms will be covered at the end of this description.

CreateLocalFilesUpdated. The next two declarations provide a name and a full pathname for the CreateLocalFilesUpdated script. Assignments up to now have taken the simple form LABEL=value. The second declaration here is a compound form of LABEL=$LABEL1$LABEL2, where the values of two labels are joined to form a third label. Note that any character in between these two labels (in this case, a / character) is also added to the formed label. Thus $CREATE_LOCAL_FILES_UPDATED, made up of the directory name, the / character, and the filename, becomes /home/fred/SyncDirs/CreateLocalFilesUpdated. This method of creating variable strings makes it easy to change names during script development by altering only one or two variables, and having it reflected in many locations throughout the script. This technique is used many times in SyncDirs10.0 and its companion scripts.
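A sketch of the compound form, using the example path given above. The label CREATE_LFU_NAME is an assumption made for this illustration; the script's actual label for the filename may differ:

```shell
#!/bin/bash
CREATE_LOCAL_FILES_UPDATED_DIR=/home/fred/SyncDirs
CREATE_LFU_NAME=CreateLocalFilesUpdated      # hypothetical label for the filename

# Two values joined with a literal / between them form the full pathname.
CREATE_LOCAL_FILES_UPDATED=$CREATE_LOCAL_FILES_UPDATED_DIR/$CREATE_LFU_NAME

echo "$CREATE_LOCAL_FILES_UPDATED"
```

Changing only the directory variable is then enough to change every pathname built from it.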

DATE_TIME. In order to give the Log File name a Date-Time reference, DATE_TIME is assigned the current date and time in the format yymmddHHmm. This is done by putting the date command in the $(command) format. Enclosing the command in brackets with a leading $ character is called command substitution. It's a mechanism that places the output of the command at the current location. In this case, the current location is an assignment to DATE_TIME. Another notation for command substitution is to use backticks: eg, `command` (the backtick key is above the Tab key).
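A short sketch of command substitution in both notations. The format string +%y%m%d%H%M is an assumption that produces the yymmddHHmm layout described above; the script's own date arguments may be written differently. KERNEL_A and KERNEL_B are hypothetical names:

```shell
#!/bin/bash
# $(command) form: the output of date becomes the value of DATE_TIME.
DATE_TIME=$(date +%y%m%d%H%M)
echo "DATE_TIME is: $DATE_TIME"

# The same command written in both notations gives the same result.
KERNEL_A=$(uname -s)
KERNEL_B=`uname -s`
echo "Kernel name: $KERNEL_A (dollar-bracket form), $KERNEL_B (backtick form)"
```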

LOG_FILE. This is where the $DATE_TIME value is added to the name of the Log File. In comprising the LOG_FILE assignment, Bash first expands the $WK_DIR and $DATE_TIME variables to their current value, then concatenates the characters to form the value of LOG_FILE, which in the default case, if the script were run at 1343 on 15 April 2006, would be /tmp/SyncDirs/syncdirs10.0_0604151343.log. Each log file is thus given a unique date-time reference to prevent being over-written by subsequent runs of the script. You can follow the logic:

1. $TESTING is equal to yes, which causes WK_DIR to be assigned /tmp/SyncDirs;
2. $DATE_TIME is equal to 0604151343 because of the date command assignment; and
3. That all gets concatenated to /tmp/SyncDirs/syncdirs10.0_0604151343.log.

If the script had been run in Live mode, the string would have been /var/log/SyncDirs/syncdirs10.0_0604151343.log. Can you see how that would have occurred? (Hint: see the line if [ -n "$TESTING" ]; then and the discussion on the if..then..else construct above.)
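The logic can be traced with a small sketch. DATE_TIME is fixed here to the worked example's value rather than taken from the date command, and TEST_LOG and LIVE_LOG are hypothetical names used only to hold the two results:

```shell
#!/bin/bash
DATE_TIME=0604151343       # fixed for the example; normally set from the date command

# Testing mode: TESTING is non-empty.
TESTING=yes
if [ -n "$TESTING" ]; then
    WK_DIR=/tmp/SyncDirs
else
    WK_DIR=/var/log/SyncDirs
fi
TEST_LOG=$WK_DIR/syncdirs10.0_$DATE_TIME.log

# Live mode: TESTING is null.
TESTING=""
if [ -n "$TESTING" ]; then
    WK_DIR=/tmp/SyncDirs
else
    WK_DIR=/var/log/SyncDirs
fi
LIVE_LOG=$WK_DIR/syncdirs10.0_$DATE_TIME.log

echo "Testing mode log: $TEST_LOG"
echo "Live mode log:    $LIVE_LOG"
```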

Temporary Files. The majority of other variable declarations create Constants for easier reference throughout the script. Those central to the download files' calculation are:

ftp.files.< dir >       a file holding the list of files in the ftp updates directory
local.files.< dir >     a file holding the list of files in the local updates directory
ftp.add.< dir >         a file holding a list of ftp updates files missing from the local updates directory
local.remove.< dir >    a file holding a list of local files missing from the ftp updates directory
ftp.fil.< dir >         a file holding the list of files from ftp.add.< dir >, after unwanted files have been filtered

where < dir > represents one file for each of the deltas, patches, patches.obsolete, i586, noarch and scripts directories. This part gets allocated in later sections of the script.

Arrays. Arrays are declared using the declare -a < ArrayName > statement. Unfortunately, Bash arrays can only be one-dimensional. The first array member is array number zero, or ArrayName[0]. Note that the array size is not mentioned in the declaration. The array size is dynamic: it will be as big as there are members assigned during running of the script. Array members need not be contiguous; empty members are allowed. For example, in a 20-member array ArrayName[0] to ArrayName[19], five members ArrayName[10] to ArrayName[14] could be empty and unused. Members 0 to 9 could be assigned at the start of the script, while members 15 to 19 could be assigned during the running of the script.
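A small sketch of a non-contiguous array. The array name Demo is hypothetical; the ${#Demo[@]} and ${!Demo[@]} forms are standard Bash notations for the number of assigned members and the list of indexes in use:

```shell
#!/bin/bash
declare -a Demo            # no size is given in the declaration

Demo[0]=deltas             # assigned at the start
Demo[1]=patches
Demo[15]=scripts           # members 2 to 14 are simply left unassigned

echo "Members assigned: ${#Demo[@]}"
echo "Indexes in use:   ${!Demo[@]}"
echo "Member 15 holds:  ${Demo[15]}"
echo "Member 7 holds:   '${Demo[7]}'"
```

An unassigned member expands to nothing, which is why the last line prints an empty pair of quotes.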

SyncDirs10.0 uses four arrays, two to hold static data related to directory names, and two to record some results of the download calculations:
UpdateDirs[ ]    an array to hold directory names of the SUSE updates directories, including their leading / character
IdxDir[ ]        an array used to hold suffixes for Index.html files returned by wget on the SUSE update directories
DloadCount[ ]    an array to hold numbers of files to download
IdxSuccess[ ]    an array to hold the result of index.html download by wget
All of these arrays have six members (0-5), for each of the deltas, patches, patches.obsolete, i586, noarch and scripts directories. The first two have their array members assigned in the initial steps, while the last two have theirs assigned in the body of the script. UpdateDirs is used to build path/file names pointing to files in the update directories. IdxDir is used in loops, keeping track of which directory is being processed.

Section 2: Preparations

Section 2 performs some preparatory work before the actual work of the script begins. The first is a question to the user that is dependent on whether it is being started in Testing or Live mode. In Testing mode, it allows you to specify to not download index.html files from the ftp site. With repetitive testing cycles, this can save time when you know the contents of the site haven't changed in the last few minutes. If it's been a few days, then you can answer y and get a fresh set. In Live mode, the question traps an inadvertent entry to Live mode. If you had run the script in Live mode, and had made some changes but had forgotten to change TESTING to equal yes, this will prompt you and give you the opportunity to abort and change the TESTING variable to the correct value. Once you're happy that Testing mode is no longer needed, this can be disabled by commenting out the line containing else and the next five lines.

The means of asking the question is the echo command, which simply prints the text that follows to standard out, which is the console used to launch the script. The means of getting the user's input is the read command, which reads from standard in (the keyboard) and when the Enter key is pressed, places all the keyed text into the variable following the read command (cont in this case). The script then conducts an if..then..else test on $cont to determine what to do. Note that in this sort of simple test, only the precise answer yes will cause an affirmative response to register. Entering Yes, Y or y will cause the == test to fail. However, an exact no response is not needed to register a negative response. Answers of No, N, n, no thank you or even yes please will all fail, because they are not equal to yes.
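The strictness of this test can be seen in a short sketch. So that it runs without anyone at the keyboard, a here-document stands in for standard in and supplies several answers in turn; the counter Accepted is hypothetical:

```shell
#!/bin/bash
Accepted=0

# Each pass of the loop reads one line into cont, as read would from the keyboard.
while read -r cont; do
    if [ "$cont" == "yes" ]; then
        echo "'$cont' -> continue"
        Accepted=$((Accepted + 1))
    else
        echo "'$cont' -> abort"
    fi
done <<EOF
yes
Yes
Y
y
yes please
EOF

echo "Answers accepted: $Accepted"
```

Only the first answer matches; all the others fall through to the else branch.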

You will see lots of echo commands sprinkled throughout SyncDirs10.0's code. As Bash has only limited built-in debugging support (such as the set -x trace option), echo commands are the next best thing. As they are relied on extensively during Testing, I have left the echo commands in place wherever Testing mode is enabled. Other echo commands have been commented out to ensure a cleaner output during Live mode. I deliberately left them in place to show their use in debugging/testing, and to be a potential source of information if someone had problems after installing the script.

Change directory. The cd command is used to change the working directory to the current value of $WK_DIR, whose contents will depend on whether the script was started in Testing or Live mode. As you proceed through the script's code, note the number of times that $WK_DIR is repeated, and how laborious it would be to change each one if it wasn't represented by a Constant. Multiply this by the number of other similarly-used Constants and you can see the value of defining a Constant in one place and using it many times throughout the code.

The script then announces the name of its Log File, then prints a message about its commencement date and time. This time, the $(command) construct is used to return the output of the date command to the end of the argument to the echo command. There are two other interesting features in this line: the pipe symbol (|) and the tee command.

Pipe symbol. The output of a command is normally sent to standard out (the console). The pipe symbol redirects the output of the previous command from standard out to standard in (ie, the input) of the command following the pipe. The output is thus piped from the first command to the input of the second. In this case, the following command is tee.

The tee command. The purpose of the tee command is to duplicate the input it receives into two output streams. One is directed to standard out, and the other is directed to the argument that follows. The argument given here is the Log File. Later uses of the tee command use a 'switch' (-a) to make sure the output is appended to the Log File, rather than over-writing it.

The combined effect of the pipe symbol and the tee command is to have whatever output is produced be repeated to the Log File. This is used in nearly every output in SyncDirs10.0 to allow the user to see the output as the script is running, but to have a permanent record in a file as well.
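A minimal sketch of the pipe and tee combination, writing to a throw-away log file. The names LOG_DIR, LAST_LINE and LINE_COUNT are hypothetical, and the message text is not the script's own:

```shell
#!/bin/bash
LOG_DIR=$(mktemp -d)                 # throw-away directory for the demonstration
LOG_FILE="$LOG_DIR/demo.log"

# First message: tee creates (or over-writes) the Log File.
echo "Run commenced at $(date)" | tee "$LOG_FILE"

# Later messages: the -a switch appends instead of over-writing.
echo "Second message" | tee -a "$LOG_FILE"

# Both lines appeared on the console and are also in the file.
LINE_COUNT=$(grep -c '' "$LOG_FILE")
LAST_LINE=$(tail -n 1 "$LOG_FILE")
echo "Log file holds $LINE_COUNT lines; the last is: $LAST_LINE"

rm -rf "$LOG_DIR"                    # tidy up
```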

Backing up installed.rpms. The next few lines check if an installed.rpms file exists in the working directory. The -s parameter tests whether the file exists and is non-empty. If it exists, it is copied to a backup filename of installed.rpms.orig in the same directory.

Deleting temporary files. The next block of code deletes old temporary files before new ones get created, so as to avoid the situation where, if a file is not created because of an error, the script continues on using the data from an old file. It would be much less confusing in troubleshooting to have no file present, and to see the results of that, than to chase a few false trails. The main tool for deleting these files is the rm -f command (remove, with the force switch). All of the arguments to the rm command are formed from predefined Constants, with additions from variables or fixed text, as seen previously. The asterisk (*) forces a match on any sequence of characters that may follow. This means that $FTP_FILES* will match on /tmp/SyncDirs/ftp.files.deltas, /tmp/SyncDirs/ftp.files.patches, etc, and so delete all the < dir > files in the working directory beginning with ftp.files (example applies to Testing mode).

The only other significant point in this block of code is the 2>> $LOG_FILE notation on every line with the rm command. The >> operator belongs to the redirect operator family. Its simpler relative, the > operator, redirects the output of a command from the console to the file named after the operator. Instead of being displayed on the console, the output becomes the contents of the file. If the file doesn't exist, a new file of that name will be created; if it does exist, it will be overwritten. The >> operator is similar to the > operator, except that the output is appended to the file (much as we saw with the -a switch to the tee command earlier).

The 2 in the 2>> notation is a file descriptor, or fd. fd0 to fd2 are regular assignments as follows:
fd0    stdin     (standard in: the keyboard)
fd1    stdout    (standard out: the console)
fd2    stderr    (standard error: the console)
fd3 to fd9 are also available for use within Bash code. Later on, you will see SyncDirs10.0 use fd7 when reading lists from files. The effect of 2>> then, is to redirect stderr to be appended to the Log File, so that any errors will be recorded, and can be examined later.
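The deletion and error-logging pattern can be sketched in a scratch directory. This is an illustration only: the directory comes from mktemp rather than the script's working directory, and REMOVED and ERRORS_LOGGED are hypothetical names. The ls of a missing file is there purely to produce an error message for the log:

```shell
#!/bin/bash
WK=$(mktemp -d)                      # stands in for the working directory
LOG_FILE="$WK/demo.log"
FTP_FILES="$WK/ftp.files"            # common prefix of the files to delete

touch "$FTP_FILES.deltas" "$FTP_FILES.patches"

# The * matches whatever follows the prefix; any errors are appended to the log.
rm -f "$FTP_FILES"* 2>> "$LOG_FILE"

REMOVED=no
if [ ! -e "$FTP_FILES.deltas" ] && [ ! -e "$FTP_FILES.patches" ]; then
    REMOVED=yes
fi

# A command that fails writes its message to stderr, which lands in the log.
ls "$WK/no.such.file" 2>> "$LOG_FILE" || true

ERRORS_LOGGED=no
if [ -s "$LOG_FILE" ]; then
    ERRORS_LOGGED=yes
fi
echo "Files removed: $REMOVED; error recorded in log: $ERRORS_LOGGED"

rm -rf "$WK"                         # tidy up
```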

FreshIndexes. The fetching of fresh index.html files occurs inside a for..in..do..done loop that takes the general form:
for Variable in [list]; do
    code
    code
    code
done
where [list] is a list of text words, separated by spaces. In each iteration of the for..in..do..done loop, successive text words are assigned to Variable. This is the equivalent of a Variable= statement, but with the assigned value being automatically selected from the next text word with each iteration. (This is why, unlike the if and case statements, the for statement does not use the $ symbol in its definition line.) When the lines of code are executed, the variable value is accessed by reference to $Variable. When the done keyword is reached, execution loops back to the for line, where the next word in the list is assigned to Variable. Execution continues in this loop until the last word in the list has been processed. With no words left to assign, the loop is exited and execution continues after the done keyword; Variable keeps the last word that was assigned to it.

The script's fresh index loop is headed by: for Dir in "${IdxDir[@]}". The list is formed from the IdxDir array. The @ member of the array is a special case that means 'all members of the array'. Bash will expand the array variable to:
for Dir in deltas patches patches.obs i586 noarch scripts
The curly brackets are required when accessing the contents of arrays. The brackets have to enclose the array name, and the array member notation ([n]), but not the $ symbol.

Within the loop, the index.< dir > name is built from $INDEX_STEM and $Dir (the current directory) and assigned to INDEX_DIR. This is used as an argument to the rm command, which causes files related to each directory to be deleted. For each iteration of the loop, expansion of $INDEX_DIR will cause /tmp/SyncDirs/index.deltas, /tmp/SyncDirs/index.patches, /tmp/SyncDirs/index.patches.obs, /tmp/SyncDirs/index.i586, /tmp/SyncDirs/index.noarch and /tmp/SyncDirs/index.scripts to be deleted from the working directory. Examples given are for Testing mode.
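The loop can be tried safely with the rm replaced by an echo. This is only a sketch: the value given to INDEX_STEM is my assumption, chosen to reproduce the Testing-mode paths listed above, and Count is a name added for the demonstration.

```bash
#!/bin/bash
# Array contents taken from the expansion shown above.
IdxDir=(deltas patches patches.obs i586 noarch scripts)
INDEX_STEM="/tmp/SyncDirs/index."      # assumed value, matching the Testing-mode paths

Count=0
for Dir in "${IdxDir[@]}"; do
    INDEX_DIR="${INDEX_STEM}${Dir}"    # eg, /tmp/SyncDirs/index.deltas
    echo "Would remove: $INDEX_DIR"    # echo instead of rm, so nothing is deleted
    Count=$((Count + 1))
done
echo "Processed $Count directories; Dir still holds: $Dir"
```

The final echo shows that Dir still holds the last word of the list once the loop has finished.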

The nested if statements equate to:
if in Testing mode,
    and if $FreshIndexes is non-zero (eg, contains yes)
        remove the index.< dir > file.
    otherwise, print a message that FreshIndexes are not wanted.
otherwise, if in Live mode,
    remove the index.< dir > file.
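In Bash, that outline could look something like the sketch below. Mode, IndexFile and Action are stand-in names of my own, and the tests in the real script may be written differently. Nothing is deleted here; the decision is only printed.

```bash
#!/bin/bash
# Stand-in names and values; the real script's variables and tests may differ.
Mode="Testing"
FreshIndexes="yes"
IndexFile="index.deltas"    # only named in the message, never removed

if [ "$Mode" = "Testing" ]; then
    if [ -n "$FreshIndexes" ]; then       # -n is True when the string is non-empty
        Action="remove $IndexFile"
    else
        Action="keep $IndexFile (fresh indexes not wanted)"
    fi
else
    Action="remove $IndexFile"            # Live mode always removes the file
fi
echo "$Action"
```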
Save_backup_file() function. Up to now, the script's code has executed in a direct sequence, line by line. However, it will now skip the function definition for Save_backup_file() and re-commence on the line after it. The function definition is denoted by the closed brackets () after the function name, and the curly brackets that enclose the function's code. The function's code will not execute until the function is called later from normal code lines. Fundamental to the use of functions is that they must be defined in the script before the point at which they are called. Functions are used when similar actions in in-line code would cause excessive repetition, or when complicated or lengthy pieces of code might disrupt the flow of the script. Save_backup_file() falls into the former category, whereas Filter_rpms() falls into the latter.

Functions can take arguments, and they can return an exit status, just like normal Bash commands. Arguments are simply appended to the function name when it is called. Inside the function, the arguments are referenced by the order that they were appended. The first is allocated to the special variable name 1, the second to 2 and so on. Their contents are then used within the function by referring to $1, $2, etc. Throughout SyncDirs10.0, I have assigned the passed arguments to other variable names more suggestive of their role, which improves code readability. For Save_backup_file() the variables are FileToBackup, RemoteBackupsFile and BackupFile. This simple function moves $FileToBackup to $BackupFile and removes $RemoteBackupsFile. It does not create an explicit return value.

directory.3 and INDEX files. The Save_backup_file() function is called immediately after it is defined. A function is called by invoking its name, without the () closed brackets: Save_backup_file in this case. The arguments passed in the first call are directory.3 in the local updates patches directory, and the same file in both the remote backups patches directory and the working directory. The purpose is to clear the directory.3 file from the patches directory (but keep a backup in the working directory) so that the ftp.add mechanism will include it in its list. The next three calls to Save_backup_file do the same for directory.3 in the patches.obs directories, and for the INDEX file in both the i586 and noarch directories. In Testing mode, these files will be restored at the end of the script, thus the need for the backups.
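The sketch below shows the define-then-call pattern with three arguments. The function body is my reconstruction from the description above, not a copy of the script's code, and the directories are throwaway ones created under mktemp.

```bash
#!/bin/bash
WorkDir=$(mktemp -d)     # throwaway directory for the demonstration

# A guess at the shape of the function, based only on the description above.
Save_backup_file() {
    FileToBackup=$1          # first argument appended to the call
    RemoteBackupsFile=$2     # second argument
    BackupFile=$3            # third argument
    mv "$FileToBackup" "$BackupFile"
    rm -f "$RemoteBackupsFile"
}

mkdir "$WorkDir/patches" "$WorkDir/remote" "$WorkDir/work"
echo "demo" > "$WorkDir/patches/directory.3"
echo "demo" > "$WorkDir/remote/directory.3"

# Called by name, without the () brackets; the arguments follow the name.
Save_backup_file "$WorkDir/patches/directory.3" "$WorkDir/remote/directory.3" "$WorkDir/work/directory.3"

[ -e "$WorkDir/work/directory.3" ] && BackupMade=yes || BackupMade=no
[ -e "$WorkDir/patches/directory.3" ] && OriginalLeft=yes || OriginalLeft=no
echo "Backup made: $BackupMade; original still present: $OriginalLeft"
rm -rf "$WorkDir"
```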

Installed rpms. The installed rpms on the main machine running SyncDirs10.0 are then discovered by using the rpm -qa (query all) command, which produces a list of all rpms installed on the system. It is piped to the sort command, then on to the sed command. sed is a stream editor that conducts its editing operations on each line of input it receives (via the pipe from sort here). The arguments to sed determine what editing is done. Each line it receives will be the name of an rpm file, such as:
java-1_4_2-sun-1.4.2.10-2.1.i586.rpm
To be able to match to other versions, architectures and other file types (eg, en.info and .patch.rpm), what is needed from each line is the java-1_4_2-sun- part. That is where sed comes in. The form used here is sed -e < instruction >, where -e tells sed that the following string is an instruction, and < instruction > is the string which does all the work. The input stream is a sequence of filenames, each ended by a newline character (equivalent to the Enter key). sed edits each line using these instructions in the same way as regular expression matching. The form of the instruction is 's/< match string >/< replacement string >/' (including the single quotes), the initial s meaning substitute. sed will examine each line, and if it finds a match for < match string > in the line, it will substitute < replacement string > in its place, send the line to the output stream, then loop to do the next line, until all lines are done. If it doesn't find a match, it sends the line unaltered to the output stream. Now, how do we specify the instruction?

All .rpm files have the naming convention of < program name >-< version >-< release >.< architecture >.rpm. To extract the < version >-< release >.< architecture >.rpm, we need to look for a dash (-), followed by < any combination of a lower or upper case letter, any number, a dot (.) or an underscore (_) >, followed by another dash, followed by < any combination of a lower or upper case letter, any number, a dot (.) or an underscore (_), followed by the end of the line >. Looking at the command that creates installed.rpms:
rpm -qa | sort | sed -e 's/-[a-zA-Z0-9\._]*-[a-zA-Z0-9\._]*$/-/' > $INSTALLED_RPMS
the sed instruction string is: 's/-[a-zA-Z0-9\._]*-[a-zA-Z0-9\._]*$/-/'

Breaking this down between the slashes, we can see that:
< match string > = -[a-zA-Z0-9\._]*-[a-zA-Z0-9\._]*$, and
< replacement string > = -
Included in < match string > are metacharacters or special characters. These have a special meaning in that context that takes precedence over the literal meaning. For example, Bash has metacharacters, and treats $ as more than just a unit of currency. To Bash, $ means that it is dealing with the value of a variable. Many Linux commands, including sed, recognise some characters as metacharacters; they take their special meaning by default, and need to be told otherwise if you want their literal meaning to be taken. sed gives special meaning to these characters:
[..]    = any single character from those enclosed by the square brackets (followed by *, any combination of them)
*       = the previous character (or bracketed set) repeated any number of times, including none
$       = the end of the line
.       = any character
\       = treat the following as a literal character, not the metacharacter it would normally represent
          (thus \. means treat this as a dot, not the metacharacter . that matches any character)
a-z     = any lower case letter
A-Z     = any upper case letter
0-9     = any number from 0 to 9 inclusive
Breaking down < match string >, it specifies:

1.  a dash, followed by:
2.  [..] = any combination of:
      a-z  = any lower case letter;
      A-Z  = any upper case letter;
      0-9  = any number;
      \.   = a dot;
      _    = an underscore;
      *    = repeated any number of times, followed by:
3.  a dash, followed by:
4.  [..] = any combination of:
      a-z  = any lower case letter;
      A-Z  = any upper case letter;
      0-9  = any number;
      \.   = a dot;
      _    = an underscore;
      *    = repeated any number of times, followed by:
      $    = the end of the line.

The < replacement string > is simply: a dash.

Taking our example, java-1_4_2-sun-1.4.2.10-2.1.i586.rpm, the section matched is:
-1.4.2.10-2.1.i586.rpm
This is removed, leaving java-1_4_2-sun, to which a dash is added, giving:
java-1_4_2-sun-
which is what we want. sed will process all the other lines in the same way, and leave just the filename stem for all of them. This is then redirected via the > operator to the installed.rpms file in the working directory. sed is used many more times in SyncDirs10.0. A few more sed features will be covered then.
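You can try the instruction without touching the rpm database. In this sketch, printf stands in for rpm -qa, and the second and third filenames are made-up examples in the same naming convention; Stems is a name of my own.

```bash
#!/bin/bash
# printf stands in for 'rpm -qa' so the sketch runs on any system.
# The alsa and php4-exif names are made-up examples.
Stems=$(printf '%s\n' \
    'java-1_4_2-sun-1.4.2.10-2.1.i586.rpm' \
    'alsa-1.0.9-23.i586.rpm' \
    'php4-exif-4.4.0-6.noarch.rpm' \
  | sort | sed -e 's/-[a-zA-Z0-9\._]*-[a-zA-Z0-9\._]*$/-/')
echo "$Stems"
```

Each line comes out as its stem with a trailing dash: alsa-, java-1_4_2-sun- and php4-exif-.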

Get_remote_installed_rpms(). The Get_remote_installed_rpms() function does the work of getting an equivalent installed.rpms file from other computers on the local network (referred to as remote hosts). Furthermore, it calculates which rpms are unique to each remote host and adds them to the list of installed rpms extracted previously. After processing all remote hosts, the installed.rpms file constitutes the list of all rpms installed on all computers on the local network. By basing the files to be downloaded on this list, the needs of all local computers are met.

Get_remote_installed_rpms() first assigns the arguments passed to $User and $Remote. It then defines a few filenames in the working directory: < hostname >.installed.rpms, < hostname >.installed.rpms.old and < hostname >.installed.rpms.uniq. It also defines a filename in user fred's shared directory (/home/fred/Common) of < hostname >.installed.rpms. The previous version of < hostname >.installed.rpms is first backed up to < hostname >.installed.rpms.old. A command is then issued via the ssh command for $User at $Remote (ie, < hostname >) to run the command $GET_RPMS (/usr/bin/ExportRpms) on that computer. As you will see later, ExportRpms performs the same function of extracting a list of installed rpms as was done above for the local computer, but additionally, saves it in user fred's /home/fred/Common shared directory, from where this function accesses it as $REMOTE_INSTALLED_RPMS_TMP. The sleep command waits for one second (approx), just to make sure the remote host has finished its work. After checking that the file actually exists and is not empty, it is moved to < hostname >.installed.rpms in the working directory. If in Live mode, its ownership is changed to user root and group root via the chown (change owner) command. The line:
if [ ! -s "$REMOTE_INSTALLED_RPMS" ]; then
conducts a confidence check that the < hostname >.installed.rpms file actually exists and is not empty. If not, a message is printed to the console and the backup of the previous file is copied to < hostname >.installed.rpms to be used instead. This could occur if, for example, one of the remote computers was switched off and was unable to respond. The only significant consequence would be if the remote host had had a program installed since the last updates were retrieved, and a critical update for that program was subsequently added to the ftp site. For that instance, it would be advisable to switch the remote host on and run SyncDirs10.0 again. For most occasions though, it's likely that no extra programs have been installed, and it's OK to proceed, using the previous list of installed rpms. For this reason, the default action is to continue on.

A similar check is performed on installed.rpms, and the same actions will be taken if it is missing.
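Here is a small sketch of that fall-back. The hostname host1, the .old variable name and the message text are my own placeholders; only the -s test itself is taken from the script.

```bash
#!/bin/bash
WorkDir=$(mktemp -d)
REMOTE_INSTALLED_RPMS="$WorkDir/host1.installed.rpms"            # hypothetical hostname
REMOTE_INSTALLED_RPMS_OLD="$WorkDir/host1.installed.rpms.old"    # hypothetical variable name

echo "alsa-" > "$REMOTE_INSTALLED_RPMS_OLD"    # the list saved on the previous run
: > "$REMOTE_INSTALLED_RPMS"                   # simulate an empty reply from the remote host

# -s is True when the file exists and is not empty, so ! -s catches both
# a missing file and an empty one.
if [ ! -s "$REMOTE_INSTALLED_RPMS" ]; then
    echo "No list received from host1; using the previous list"
    cp "$REMOTE_INSTALLED_RPMS_OLD" "$REMOTE_INSTALLED_RPMS"
fi
Restored=$(cat "$REMOTE_INSTALLED_RPMS")
echo "List now contains: $Restored"
rm -rf "$WorkDir"
```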

comm command. The comm command compares two sorted files line by line. With no options, it will produce a three-column output, showing the following information:
Column 1:    lines unique to file1
Column 2:    lines unique to file2
Column 3:    lines common to both files
Adding the options -1, -2 or -3, singly or in combination (eg, -13), will suppress the output of the respective columns. So, comm -1 file1 file2 will suppress the column showing lines unique to file1 and only print columns 2 and 3. The command used in Get_remote_installed_rpms() is:
comm -13 $INSTALLED_RPMS $REMOTE_INSTALLED_RPMS > $REMOTE_INSTALLED_RPMS_UNIQ
comm -13 suppresses output of lines unique to file1 ($INSTALLED_RPMS) and lines common to both files, thus leaving one column showing lines unique to file2 ($REMOTE_INSTALLED_RPMS). These lines unique to $REMOTE_INSTALLED_RPMS represent those rpms installed on the remote host that are not installed on the local host. They are redirected to the file $REMOTE_INSTALLED_RPMS_UNIQ (< hostname >.installed.rpms.uniq).

cat command. The cat command comes from concatenate, which means to join. Its action is to send each file in its argument list, character by character, to stdout. The output will be seen as one continuous stream, all files joined together, printed to the console. With only one file as its argument, the action simply prints the characters to the console. As the output of cat can be redirected to another file, these actions can be seen as joining multiple files in one newly created file, or copying a file to another file. In this case, cat is used as:
cat $REMOTE_INSTALLED_RPMS_UNIQ >> $INSTALLED_RPMS
The $REMOTE_INSTALLED_RPMS_UNIQ file is redirected and appended to $INSTALLED_RPMS by the >> operator, thus adding the lines unique to < hostname >.installed.rpms to installed.rpms. The file installed.rpms now contains all of the rpms installed on the remote host and on the local computer.

The comm command can be confusing to grasp at first; so to press the point, and to make sure this is fully understood, I'll explain it in a different way:

Suppressing lines unique to $INSTALLED_RPMS and common to both it and $REMOTE_INSTALLED_RPMS (comm -13) has no effect on the content of $INSTALLED_RPMS, as both these sets of lines are contained within it. All that's needed to encompass all the lines in both files is to add the lines unique to $REMOTE_INSTALLED_RPMS (which is what's left when comm -13 is run) to $INSTALLED_RPMS. So the equation becomes: all lines in $INSTALLED_RPMS + lines unique to $REMOTE_INSTALLED_RPMS = all lines in both files.
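The equation can be checked with two small made-up lists. In this sketch the package stems and the hostname host1 are invented, and LC_ALL=C is my addition so that sort and comm agree on the ordering; Unique and Merged are names used only here.

```bash
#!/bin/bash
export LC_ALL=C            # keep sort and comm agreeing on the collation order
WorkDir=$(mktemp -d)
INSTALLED_RPMS="$WorkDir/installed.rpms"
REMOTE_INSTALLED_RPMS="$WorkDir/host1.installed.rpms"            # hypothetical hostname
REMOTE_INSTALLED_RPMS_UNIQ="$WorkDir/host1.installed.rpms.uniq"

# Made-up, already sorted lists for the local and remote machines.
printf '%s\n' alsa- bash- kernel-default- > "$INSTALLED_RPMS"
printf '%s\n' alsa- gimp- kernel-default- xmms- > "$REMOTE_INSTALLED_RPMS"

# Keep only column 2: the lines unique to the remote list.
comm -13 "$INSTALLED_RPMS" "$REMOTE_INSTALLED_RPMS" > "$REMOTE_INSTALLED_RPMS_UNIQ"
# Append them to the local list, which then covers both machines.
cat "$REMOTE_INSTALLED_RPMS_UNIQ" >> "$INSTALLED_RPMS"

Unique=$(cat "$REMOTE_INSTALLED_RPMS_UNIQ")
Merged=$(sort "$INSTALLED_RPMS")
echo "$Merged"
rm -rf "$WorkDir"
```

Only gimp- and xmms- are appended, so the merged list holds each of the five stems exactly once.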

Two instances of the cat command are used to sort the contents of $INSTALLED_RPMS to a temporary file, then to copy them back to $INSTALLED_RPMS. Get_remote_installed_rpms() then exits without explicitly assigning a return value.

Calling Get_remote_installed_rpms(). Following its definition, Get_remote_installed_rpms() is called twice to add the installed rpms from $REMOTE1 and $REMOTE2 to the list in installed.rpms. Note: for Get_remote_installed_rpms() to work correctly, all computers will need to be set up to run ssh (a program for logging into a remote machine and for executing commands on a remote machine). This will be covered later in the section on configuring SyncDirs10.0 to run on your computer.

Having done all the preparatory work, the script now gets down to doing some actual work in getting the ftp page contents.

Section 3: Ftp page contents

Each ftp site has an index page for each of the updates directories with an HTML format that contains the detail of the files available for download. This section uses wget to download the index.html file for each of the updates directories. If successful, these index.html files are then saved as index.< dir > files (index.deltas, index.patches, etc).

k=0 sets a directory counter for use in accessing the array $IdxSuccess[n]. A for DirPath in..do..done loop processes each directory. Within the loop, a True flag is set for IdxSuccess[$k] ($k = the current directory), INDEX_HTML is defined as /tmp/SyncDirs/index.html, and FtpPage is built up to point to the ftp site's page for this directory. Because a directory path is being constructed, the $DirPath from the relevant $UpdateDir[ ] array member is used. The trailing / in this string is very important, because without it, wget will try to download files from the updates directory's parent directory, not the index.html data that constitutes the content of the updates directory itself.

If in Testing mode, and the user wants fresh index.html files, wget is run with $FtpPage as its argument. This downloads the index.html file for that directory. If the user doesn't want fresh index.html files, the touch command creates a dummy one to satisfy an upcoming test for it. In Live mode, an index.html file is fetched regardless.

The directory is retrieved from the $IdxDir[ ] array using the directory counter $k, and assigned to Dir. This is used to build the filename for the index file (/tmp/SyncDirs/index.< dir >) which is assigned to IndexDir. After checking that the index.html file was successfully downloaded, that Testing mode is active, and that the user wanted fresh indexes, the index.html file is renamed with the mv command to index.< dir >. In Live mode, the renaming happens regardless. If the download of the index.html file was not successful, an error is printed to that effect, and the success of the download is registered as False in IdxSuccess[$k]. This will be used in deciding if this particular directory should be processed in later sections of code.

The directory counter is then incremented using Bash's built-in command let, with its ability to use arithmetic evaluation as per the C language. The let k+=1 notation here means add 1 to the present value of k. Another way to achieve that would be let k++. Once the index.< dir > files have been downloaded, the script moves on to building lists of files available for download.
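A minimal sketch of the counter walking through an array alongside the loop follows. The array is shortened for the demonstration; it is not the script's full list.

```bash
#!/bin/bash
IdxDir=(deltas patches patches.obs)     # shortened list for the demonstration
k=0
for Dir in "${IdxDir[@]}"; do
    echo "Directory $k is ${IdxDir[$k]}"
    let k+=1                            # arithmetic evaluation; same effect as k=$((k + 1))
done
echo "Counter finished at $k"
```

One thing to be aware of: let returns a failing exit status when its expression evaluates to 0, so let k++ with k at 0 reports failure even though k is incremented. That only matters if the exit status is tested, or if the script runs under set -e.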

Section 4: Build lists of files available for download

The job of Section 4 is to decipher the contents of the index.< dir > files and extract the filenames listed there for download. The files holding these lists of files are named ftp.files.< dir >, to represent all the files in that ftp directory applicable to your computer. The main body of Section 4 is a for Dir in "${IdxDir[@]}"..do loop that calls a number of functions that are defined before the loop. These functions are:
Get_filenames()
Get_arch_delta_rpm()
Get_rpm()
Get_index()
Get_noarch_rpm()
Get_info()
Get_all()
The last six functions create selection criteria for the types of files they need, then call the Get_filenames() function with the selection criteria as arguments, which then extracts the required filenames from the index.< dir > files, based on those selection criteria. Let's skip over these functions for the moment. We'll revisit them once we've looked at the main loop of Section 4.

The Section 4 for..in..do..done loop. After resetting the directory counter (k), the loop progressively evaluates Dir as one of the directories, then the variables IndexDir and FtpFilesDir are built from their component parts to become /tmp/SyncDirs/index.< dir > and /tmp/SyncDirs/ftp.files.< dir > respectively. A check is made on whether the previous download of index.< dir > was successful, and if not, a message is printed, the k directory counter is incremented, and the continue command is run to cause the remaining code in the loop to be skipped, and the next loop to be started. If the test succeeds, a case..in..esac statement is entered.

The case..in..esac statement. The case..in..esac statement is designed to test the value of a variable, and to take actions pertinent to different values the variable may have. It takes the general form:
case "$Variable" in
value1)
    < commands to run if $Variable == value1 >
    ;;
value2)
    < commands to run if $Variable == value2 >
    ;;
*)
    < commands to run if $Variable does not equal those previously tested >
    ;;
esac
Essentially, $Variable is tested, and if its value matches any of the strings appearing before one of the right bracket ')' markers, the code between that marker and the next double semi-colon marker (;;) is executed. Execution of code then continues on the next line after the esac keyword. If a match is not made with one of the explicitly specified strings, the code following the *) string is executed. *) is a catch-all for all values that do not match those previously tested, and is optional. The value tested can be either simple text, as shown above, or it could be a variable, using the form:
"$Value")
A case..in..esac statement with only one tested value and a *) catch-all is equivalent to an if..then..else statement.
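The sketch below runs one case statement against three words, so that a simple string, a variable and the catch-all each get matched once. Value, Result and Handled are names invented for the demonstration, as are the messages.

```bash
#!/bin/bash
Value="i586"     # hypothetical variable used as a tested value
Handled=""
for Dir in deltas i586 scripts; do
    case "$Dir" in
    deltas)                  # simple text
        Result="delta rpms"
        ;;
    "$Value")                # a variable as the tested value
        Result="architecture rpms"
        ;;
    *)                       # catch-all for anything not matched above
        Result="everything"
        ;;
    esac
    echo "$Dir -> $Result"
    Handled="$Handled$Result;"
done
```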

The case statement in Section 4 tests the value of $Dir and takes varying actions depending on whether it matches deltas, patches, etc. All tested values are simple strings, except for "$ARCH", which could evaluate to one of i586 (the default), ppc, ppc64 or x86_64. Most of the tested values call only one of the previously defined functions; however, "$ARCH" and noarch each call three. Let's now trace the called functions if deltas was matched. Since these functions are similar in principle, this will serve to illustrate them all.

Get_arch_delta_rpm() function. Like most of the functions that call Get_filenames(), Get_arch_delta_rpm() works by specifying one string to match the general type of file required, then specifying another string (or strings) that further narrow the selection until just the required files are left. These strings equate to the $MatchText and $BlockText variables. For this function, files of type .noarch.delta.rpm and .i586.delta.rpm are the targets when .i586 is set as the default architecture. (.noarch.delta.rpm files are needed for all architectures). MatchText is therefore set to match on .delta.rpm and then a number of BlockText variables are defined to exclude those that aren't .i586.delta.rpm or .noarch.delta.rpm files. This is because .delta.rpm will also match on ppc.delta.rpm, ppc64.delta.rpm, and x86_64.delta.rpm. For the i586 architecture, .ppc.delta.rpm and 64.delta.rpm are set to be excluded by BlockText1 and BlockText2 respectively. 64.delta.rpm will block both ppc64.delta.rpm and x86_64.delta.rpm. The other architectures will have different architectures excluded, depending on which one is required to remain.

sed matching. The eventual destination for the $MatchText and $BlockText strings is a set of sed commands in the Get_filenames() function. The MatchText definition:
MatchText='/\.delta\.rpm/p'
has a < match string > specification between the two slashes, and a p at the end. The p stands for print. This whole expression, as an instruction to a sed command, says 'match on .delta.rpm, then only print lines containing this match'. (Remember that the \ leading the . character turns it into a literal dot character, not the metacharacter meaning any character).

The first BlockText definition:
BlockText1='/\.ppc\.delta\.rpm/!p'
is different to the MatchText definition: the p is preceded with an exclamation mark (!), which is the negation operator. This expression says 'match on .ppc.delta.rpm and do not print the lines which contain a match to this string'. This will therefore filter out any line that contains the string .ppc.delta.rpm. Similarly, $BlockText2 will filter out any lines containing .ppc64.delta.rpm and x86_64.delta.rpm. To see how this is all achieved, we need to examine the Get_filenames() function.
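The two kinds of instruction can be tried in isolation before looking at the function. In this sketch the two definitions are the ones shown above; the list of names is a small sample, with the plain .i586.rpm name made up to show a line that MatchText rejects. Names, Matched and Kept are my own variable names.

```bash
#!/bin/bash
MatchText='/\.delta\.rpm/p'
BlockText1='/\.ppc\.delta\.rpm/!p'

Names=$(printf '%s\n' \
    'alsa-1.0.9-23_23.2.i586.delta.rpm' \
    'alsa-1.0.9-23_23.2.ppc.delta.rpm' \
    'alsa-1.0.9-23.i586.rpm')

Matched=$(echo "$Names" | sed -n "$MatchText")     # keeps the two .delta.rpm names
Kept=$(echo "$Matched" | sed -n "$BlockText1")     # then drops the .ppc.delta.rpm name
echo "$Kept"
```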

Get_filenames() function. On entry, two temporary files ($FTP_FILES_TMP and $FTP_FILES_TMP2) are cleared of any contents. A > redirection with no command in front of it reduces the named file to zero bytes (creating it if it doesn't exist). The first argument passed to the function ($1) is then assigned to the Select variable. The shift built-in command dumps the first argument, and shifts the remaining arguments one position up the argument list, such that $2 becomes $1, $3 becomes $2, etc. This means that the previously passed $MatchText is now in $Select, $BlockText1 is sitting in $1, and $BlockText2 is in $2. The line beginning with cat performs the general matching function with the $Select variable, while the for..do..done loop performs the selective filtering with the BlockText variables. Looking at the cat line:
cat $IndexDir | sed -ne '/href=/p' | sed -e 's/^.*">//' -e 's/<.*$//' -ne "${Select}" | sort > $FTP_FILES_TMP
The output of cat is piped to the input of sed, whose output is piped to the input of the second sed command, whose output is piped to the input of sort, whose output is then redirected to the first temporary file $FTP_FILES_TMP (/tmp/SyncDirs/ftp.files.tmp).

The effect of the cat command is to send the stream of data in the file index.deltas to the first sed command. The meaning of that sed command's arguments is:
-n = suppress the automatic printing of all lines
p  = print the matched lines (ie, those containing href=)
The only lines printed therefore, are those containing href=. This filters out the ftp page's higher HTML structure, leaving just the lines with the page's listed files. These lines are then fed to the second sed command.

A feature of sed is that it can have more than one instruction passed, provided that they are preceded with the -e switch. This sed command can be represented by:
sed -e <first instruction> -e <second instruction> -ne <third instruction>
where:
<first instruction> = 's/^.*">//'
<second instruction> = 's/<.*$//'
<third instruction> = "${Select}" (which, for this example = '/\.delta\.rpm/p' (including single quotes))
The first two instructions are intended for 'substitution', as they begin with the s character. The <replacement string> for both is nothing, as there is nothing between the last two slashes in the instruction. The third instruction only has a <match string>, as it is not a 'substitution' instruction. To see the effect of sed filtering, let's look at a representative sample of the lines being fed from the first sed command into the second:

2006 Feb 08 03:03 File <a href="<path>/deltas/alsa-1.0.9-23_23.2.i586.delta.rpm">alsa-1.0.9-23_23.2.i586.delta.rpm</a> (123,862 bytes)
2006 Feb 08 03:03 File <a href="<path>/deltas/alsa-1.0.9-23_23.2.ppc.delta.rpm">alsa-1.0.9-23_23.2.ppc.delta.rpm</a> (143,610 bytes)
2006 Feb 08 03:03 File <a href="<path>/alsa-1.0.9-23_23.2.x86_64.delta.rpm">alsa-1.0.9-23_23.2.x86_64.delta.rpm</a> (127,734 bytes)


(I shortened these lines for display purposes by removing extra spaces and substituting <path> for ftp://mirror.pacific.net.au:21/linux/suse/i386/update/10.0. Incidentally, the above output was obtained by taking the first three lines from the command cat index.deltas | sed -ne '/href=/p' | sed -ne '/alsa/p'.)

The <first instruction> means: 'match on text from the start of the line (^), through any character, any number of times, finishing with ">, and replace it with nothing'. If that were applied to the first sample line, it would match everything up to and including the "> that follows the link address:

2006 Feb 08 03:03 File <a href="<path>/deltas/alsa-1.0.9-23_23.2.i586.delta.rpm">

replacing that text with nothing gives:
alsa-1.0.9-23_23.2.i586.delta.rpm</a> (123,862 bytes)
The <second instruction> means 'match on a < character, followed by any character, any number of times, up to the end of the line, and replace it with nothing'. Applied to the output from the first instruction, it would match the trailing text:
</a> (123,862 bytes)
replacing that text with nothing gives:
alsa-1.0.9-23_23.2.i586.delta.rpm
Remember, this is only one of the sample lines. When applied to them all, the output will look like:
alsa-1.0.9-23_23.2.i586.delta.rpm
alsa-1.0.9-23_23.2.ppc.delta.rpm
alsa-1.0.9-23_23.2.x86_64.delta.rpm
The first two instructions didn't have a -n switch to the sed command. They weren't aimed at suppressing any lines, they were more about stripping sections of unwanted text from all of the lines, and keeping the wanted sections (ie, the filenames).

The <third instruction> however, has a -n switch. Combined with the $Select text of '/\.delta\.rpm/p', it has the effect of printing only the lines that contain a match with .delta.rpm. As all of the sample lines end with .delta.rpm, they will all be passed to the output to the temporary file /tmp/SyncDirs/ftp.files.tmp. In another directory, say i586, the $Select value of '/_en\.info/p' will collect all files ending in _en.info and reject those ending in _de.info. Code execution then moves on to the for Filter do..done loop, where the $BlockText filters are applied to the contents of the first temporary file.
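The whole pipeline can be run against a few lines held in a variable instead of a downloaded index file. In this sketch, the two file lines are the first two sample lines above, with <path> left as the placeholder, and the heading line is a made-up piece of page markup with no href= in it. Page and FileNames are names of my own.

```bash
#!/bin/bash
# Two of the sample lines above, plus one made-up line of page markup.
Page=$(cat <<'EOF'
<h1>Index of deltas</h1>
2006 Feb 08 03:03 File <a href="<path>/deltas/alsa-1.0.9-23_23.2.i586.delta.rpm">alsa-1.0.9-23_23.2.i586.delta.rpm</a> (123,862 bytes)
2006 Feb 08 03:03 File <a href="<path>/deltas/alsa-1.0.9-23_23.2.ppc.delta.rpm">alsa-1.0.9-23_23.2.ppc.delta.rpm</a> (143,610 bytes)
EOF
)
Select='/\.delta\.rpm/p'

# Same chain of commands as the cat line, reading from the variable.
FileNames=$(echo "$Page" | sed -ne '/href=/p' \
    | sed -e 's/^.*">//' -e 's/<.*$//' -ne "${Select}" | sort)
echo "$FileNames"
```

The heading line is dropped by the first sed command, and the two remaining lines are reduced to bare filenames by the second.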

for Filter do..done loop. This loop is a special case of the for..in..do..done statement. When the in keyword is missing, the variable following the for keyword is assigned the contents of $@, which is the collection of all command line arguments. Since $1 now equals $BlockText1 and $2 equals $BlockText2, Filter will be assigned these respective values in successive iterations of the loop. In the first iteration, the cat command will be run as:
cat /tmp/SyncDirs/ftp.files.tmp | sed -n '/\.ppc\.delta\.rpm/!p' > /tmp/SyncDirs/ftp.files.tmp2
This will filter out all lines containing .ppc.delta.rpm.

The next command moves the contents of the second temporary file to that of the first, then the loop is repeated.

In the second iteration, the cat command will be run as:
cat /tmp/SyncDirs/ftp.files.tmp | sed -n '/64\.delta\.rpm/!p' > /tmp/SyncDirs/ftp.files.tmp2
This will filter out all lines containing 64.delta.rpm. The sample lines, filtered by these two exclusions, would produce only the line containing .i586.delta.rpm:
alsa-1.0.9-23_23.2.i586.delta.rpm
On exiting the loop, the final command in Get_filenames() removes any blank lines from the list. The >> (append) operator is required, as this function may be called many times to compile the list of files listed in a directory. Exiting Get_filenames() takes us back to the Section 4 for Dir in "${IdxDir[@]}" main loop.
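To see shift and the for Filter do loop working together, here is a cut-down sketch. Apply_filters is a hypothetical stand-in for Get_filenames(): it works on a shell variable rather than the two temporary files, and it takes its input from the three sample filenames rather than from an index file.

```bash
#!/bin/bash
# Apply_filters is a hypothetical, cut-down stand-in for Get_filenames().
Apply_filters() {
    Select=$1
    shift                                   # $2 becomes $1, $3 becomes $2, etc
    List=$(printf '%s\n' \
        'alsa-1.0.9-23_23.2.i586.delta.rpm' \
        'alsa-1.0.9-23_23.2.ppc.delta.rpm' \
        'alsa-1.0.9-23_23.2.x86_64.delta.rpm' | sed -n "$Select")
    for Filter do                           # no 'in': loops over the remaining arguments
        List=$(echo "$List" | sed -n "$Filter")
    done
    echo "$List"
}

Remaining=$(Apply_filters '/\.delta\.rpm/p' '/\.ppc\.delta\.rpm/!p' '/64\.delta\.rpm/!p')
echo "$Remaining"
```

Adding a third or fourth exclusion needs no change to the function, as the loop simply runs once for each argument left after the shift.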

Back in the main loop, the remaining commands sort the file list into the first temporary file, then rename the sorted file to /tmp/SyncDirs/ftp.files.<dir>. The directory counter k is incremented, then the loop is repeated. Once all directories have been done, the code moves on to build lists of files to download.

Section 5: Build lists of files to download

Having retrieved the ftp pages' contents and assembled lists of available updates for each of the directories, this section calculates and builds the lists of files that your particular system needs for its updates. The functions called from the main Section 5 loop are:
Check_installed_file()
Filter_rpms()
Copy_patches()
Calc_ftp_add()
All of these functions use global variables; no return values are made.

Check_installed_file(). Check_installed_file() checks whether a file available for download is installed on the system. The $FtpAdd variable represents the file available for download. Its name is parsed with the same sed command that created the installed.rpms file to remove the <version>-<release>.<architecture>.rpm section, leaving the stem name in $ThisFile. Up to now the cat command has been used with sed. Note that since $FtpAdd contains a string representing the file's name, the echo command is used to output that string to the sed command. It's the name that's being dealt with here, not the contents of a file by that name. That's why echo is used instead of cat. The next command (which does use cat) streams the list of files in installed.rpms to the sed command, which uses $ThisFile as part of its <instruction> string, which translates to 'print only the line that contains the value of $ThisFile from the start (^) to the end of the line ($)' (ie, an exact match). For example, if the value of $ThisFile were alsa-, and a match were made, it would be placed in $InstalledFile. This matching scheme prevents a match on, say, sun-alsa-, as the alsa- part isn't at the start of the line. It also prevents, say, php4- making multiple matches on itself, php4-exif and a number of other php4-<nnnn> files, as the php4- part isn't at the end of the line. If no match is made (ie, the file is not installed on the system) a Null ("") is returned in $InstalledFile. Finally, if the file available for download is INDEX or INDEX.de, then it is included explicitly, as it will always need to be selected (not being an rpm file, it will not be in the installed.rpms list).
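The exact-match idea can be sketched as follows. The installed list and both filenames are made up, the form of the second sed instruction is my reading of the description above rather than a copy of the script's line, and the variables ending in 2 are added for the demonstration.

```bash
#!/bin/bash
WorkDir=$(mktemp -d)
INSTALLED_RPMS="$WorkDir/installed.rpms"
printf '%s\n' php4- php4-exif- sun-alsa- > "$INSTALLED_RPMS"    # made-up installed list

FtpAdd='php4-4.4.0-6.2.i586.rpm'                                 # made-up file name
# echo, not cat: it is the name itself that is being edited.
ThisFile=$(echo "$FtpAdd" | sed -e 's/-[a-zA-Z0-9\._]*-[a-zA-Z0-9\._]*$/-/')
# ^ and $ anchor the stem to the whole line, so php4-exif- is not matched.
InstalledFile=$(cat "$INSTALLED_RPMS" | sed -n "/^${ThisFile}\$/p")

FtpAdd2='alsa-1.0.9-23.i586.rpm'                                 # made-up file name
ThisFile2=$(echo "$FtpAdd2" | sed -e 's/-[a-zA-Z0-9\._]*-[a-zA-Z0-9\._]*$/-/')
InstalledFile2=$(cat "$INSTALLED_RPMS" | sed -n "/^${ThisFile2}\$/p")

echo "php4 match: '$InstalledFile'; alsa match: '$InstalledFile2'"
rm -rf "$WorkDir"
```

The php4- stem matches only its own line, while alsa- finds nothing because sun-alsa- does not start with it. Bear in mind that the stem is used as a regular expression, so a dot in a stem would match any character.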

Filter_rpms(). Filter_rpms() filters certain rpms from the deltas and i586 directories, to save on downloading unneeded large files. It will also filter out files not installed on the system if $FILTER_NOT_INSTALLED is set to yes. Otherwise, all files available for download - minus those to be explicitly excluded in the next case statement - will be downloaded. Be careful with this setting: that's a lot of files! If set to filter out files not installed, AND the file is not installed, then no action is taken except to print a message 'Skipping non-installed file $FtpAdd'. The logical AND function is performed by the double-ampersand (&&) operator between the two test conditions enclosed by the square brackets. A set of cascading case statements then decides if the file is to be included in the download list. If the file is not an OpenOffice_org-, kernel- or MozillaFirefox- file, then it will be included by default by the final *) test. Otherwise, more case statements decide whether specific versions of these files get included - or not - with an appropriate Adding... or Skipping... message. The default settings select version 2 (or higher) of OpenOffice_org- but not the 1.9 version, nor any OpenOffice_org- language file; kernel-default is selected, while all others are rejected; and any MozillaFirefox-translations files are excluded, while all other MozillaFirefox- files are included. The settings can be changed by editing, adding to, or deleting from, the lines of code. An example of how to include an OpenOffice language file is given in the comments. During this process, a count of the number of files to be downloaded (kept in $i) is incremented with each 'adding' action.

Copy_patches(). Copy_patches() copies files from the local updates patches directory - that also exist in the ftp patches.obs directory - to the local updates patches.obs directory, to save having to download them from the ftp site. The method used is to:

1. Calculate which files are common to both local.files.patches and ftp.add.patches.obs listings.
2. Copy the actual files from the local updates patches directory to the local updates patches.obs directory.
3. Remove the listing of these files from those listed in ftp.add.patches.obs
4. Add the listing of these files to those listed in local.files.patches.obs

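Step 1 is achieved by applying comm -12 to ftp.add.patches.obs and local.files.patches. The -12 switch suppresses columns 1 and 2 of comm's output (the lines unique to each file), leaving column 3: the files common to both. This list of filenames is redirected to local.files.patches.copy.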

In Step 2, the total number of files to be copied is calculated using:
TotCopy=$(cat $LOCAL_FILES_PATCHES_COPY | wc -l)
The contents of $LOCAL_FILES_PATCHES_COPY (local.files.patches.copy) are piped to the wc -l command, whose output is stored in $TotCopy. The wc command, depending on the options used, prints the number of newlines (-l), words (-w), bytes (-c) or characters (-m) in a file. It can even print the length of the longest line with the -L switch. With the -l switch, we are asking for the number of lines in the file, which equates to the number of files to be copied. The if test asks 'is the total number of files to be copied greater than zero?'. If so, these files are processed for copying. The script then has to read the filenames in from local.files.patches.copy. As covered earlier, the read command usually gets its input from the keyboard, the default stdin. stdin needs to be redirected to the file for the read command to operate on it. These two commands do that:
exec 7<&0                            # Link file descriptor #7 with stdin (fd #0)
exec < $LOCAL_FILES_PATCHES_COPY     # redirect stdin to the file
The first command duplicates the default stdin (fd #0: the keyboard) onto fd #7, which saves it there. The second command redirects stdin to the file, after which each read command will read characters from the file, up to the next newline character, when it will return the string (just as it would have if input was coming from the keyboard and the user had pressed Enter). Reading from the file, and actions taken on that input, occurs inside an until..do..done loop, which is similar to loops covered previously, except that the lines of code are executed repeatedly until the test condition evaluates as True. In this case, for as long as $PatchCopy holds a string of some sort, the loop will repeat. Once the lines in the file have been exhausted, $PatchCopy will receive a null string (""), and code execution will exit the loop. Within the loop, the two copy commands (cp) copy each file, in turn, to both the local updates and remote backups patches.obs directories, thus completing Step 2. A message is also printed about the file being copied. Note that before the until loop, $PatchCopy is given dummy contents of test. If this were not done, and $PatchCopy were to be uninitialised on entry to the loop, the test would evaluate to True, and the loop would be bypassed. When finished reading from the file, these two commands restore the keyboard as standard input:
exec 0<&-            # Close input from file
exec 0<&7 7<&-       # Restore stdin from fd #7 & close fd #7
First, input from the file is closed; then the keyboard input saved on fd #7 is duplicated back onto fd #0 (stdin), and fd #7 is closed. This restores the status quo. This method of redirecting stdin to a file, then using the read command to interrogate the file, will be used several more times in SyncDirs10.0.
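
The whole save, redirect, read and restore sequence can be tried on its own. The following is a sketch under stated assumptions: the list file and its contents are hypothetical, the cp commands are replaced by a message, and the "|| true" after read is my addition so the sketch also survives being run under set -e.

```shell
#!/bin/bash
# Hypothetical list of filenames, one per line, with no blank lines.
ListFile=$(mktemp)
printf '%s\n' patch-10001 patch-10002 patch-10003 > "$ListFile"

TotCopy=$(cat "$ListFile" | wc -l)
Copied=0

if [ "$TotCopy" -gt 0 ]; then
    exec 7<&0            # save the current stdin on fd 7
    exec < "$ListFile"   # stdin now comes from the file

    PatchCopy=test       # dummy value so the loop body runs at least once
    until [ -z "$PatchCopy" ]; do
        # read fails at end of file and leaves PatchCopy empty;
        # "|| true" keeps that failure from stopping a set -e script
        read -r PatchCopy || true
        if [ -n "$PatchCopy" ]; then
            Copied=$((Copied + 1))
            echo "Copying $PatchCopy ($Copied of $TotCopy)"
        fi
    done

    exec 0<&-            # close input from the file
    exec 0<&7 7<&-       # restore stdin from fd 7, then close fd 7
fi
```

Note that a blank line in the list file would end the loop early, since an empty string is also what read leaves behind at end of file.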

Step 3 requires the files listed in local.files.patches.copy to be removed from the listing in ftp.add.patches.obs. This is done by:
comm -23 ftp.add.patches.obs local.files.patches.copy
The output of which is moved back to ftp.add.patches.obs. I'll explain the comm command because its effect is not obvious at first glance. comm -23 suppresses lines unique to local.files.patches.copy and those common to both, leaving lines unique to ftp.add.patches.obs. However, there should be no lines unique to local.files.patches.copy, because it was formed from lines contained in ftp.add.patches.obs in the first place. Therefore, only lines common to both files (which, by definition, will be all the lines in local.files.patches.copy) will be removed from ftp.add.patches.obs.

Step 4 is much simpler. The cat command joins the two filename listings, and after the sort command, local.files.patches.obs will have had the lines from local.files.patches.copy added to it, listed in the correct order.
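
Steps 1, 3 and 4 can be tried on small sample listings. This is a sketch only: the file contents and the two temporary file names (ftp.add.tmp, local.files.tmp) are hypothetical, and Step 2 (the actual cp) is omitted. comm expects both inputs to be sorted, so the locale is pinned to keep sort and comm in agreement.

```shell
#!/bin/bash
# All file contents below are hypothetical; comm needs sorted input.
cd "$(mktemp -d)" || exit 1
export LC_ALL=C   # keep sort and comm in agreement about ordering

printf '%s\n' patch-1 patch-2 patch-3 | sort > local.files.patches
printf '%s\n' patch-2 patch-3 patch-9 | sort > ftp.add.patches.obs
printf '%s\n' patch-0                 | sort > local.files.patches.obs

# Step 1: names common to both listings (suppress columns 1 and 2)
comm -12 ftp.add.patches.obs local.files.patches > local.files.patches.copy

# Step 3: drop the copied names from the download list (keep column 1 only)
comm -23 ftp.add.patches.obs local.files.patches.copy > ftp.add.tmp
mv ftp.add.tmp ftp.add.patches.obs

# Step 4: merge the copied names into the local listing, sorted
cat local.files.patches.obs local.files.patches.copy | sort > local.files.tmp
mv local.files.tmp local.files.patches.obs
```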

Calc_ftp_add(). Calc_ftp_add() calculates which files need to be added from the ftp site. It compiles the list by comparing the files available for download, against those already in the local updates directory. It's also able to narrow the selection to only those programs installed on the system. It starts by defining the MatchText variable, which will be used in a sed command to strip the directory path from a list of files, leaving only the filenames. The $MatchText string is built by including $Dir (the current updates directory being examined) in between two fixed-character strings, as follows:
MatchText='s/^.*'$Dir'\///'
If the directory were deltas, then the string assigned to $MatchText would look like this:
s/^.*deltas\///
Note that the single quotes are now missing from the contents of $MatchText. Bash removed them (quote removal) as it evaluated the strings for assignment to $MatchText. That's OK, because when the $MatchText value is later expanded as an argument to sed, Bash does not parse its contents for quotes or backslashes a second time. (An unquoted expansion is still subject to word splitting and filename expansion, so writing "$MatchText" in double quotes is the safer habit, although the string used here contains no spaces.) If you had entered the same instruction at the command line, you would have had to include the single quotes:
sed -e 's/^.*deltas\///'
This is because Bash processes quotes and backslashes in text typed in at the command line (or written in the script), but not in text that results from expanding a variable. In the latter case, that processing occurred when the string was assigned to the variable. It's important to note this distinction when trying to understand how Bash handles strings in its various stages of operations.

To sed, the $MatchText variable means 'substitute all characters from the start of the line, up to and including deltas/ with nothing'. As a slash character is needed at the end of this <match string>, it needs to be escaped by the backslash character, giving the \/ combination after deltas. If it wasn't backslashed, sed would think it was defining the end of the <match string>, not part of it. A further complication arises when dealing with the patches.obs directory. As its real name is patches.obsolete, that needs to be substituted for $Dir, which is why the if statement uses the predefined $IDX_PATCHES_OBS2. The . in patches.obsolete needs to be backslashed so that sed thinks it is a dot character, and not a metacharacter. So why is $IDX_PATCHES_OBS2 defined with two backslashes before the . character? Well, the answer lies in how $MatchText is defined. Note that, in the case of the patches.obs directory, the $MatchText definition takes the form:
'<first string between single quotes>'$IDX_PATCHES_OBS2'<second string between single quotes>'
Having single quotes around these two strings prevents their contents from being evaluated as metacharacters. They are said to be 'protected' from shell expansion (when Bash evaluates metacharacters and expands them to their special meaning in that context). Every metacharacter is treated as a literal character inside single quotes, except the single quote character itself. When surrounded by double-quote (") characters, most metacharacters are protected from shell expansion. The only ones that remain a metacharacter inside double quotes are the $, \ and the back-tick character (`). If you want to use these characters in their literal sense inside double quotes, they need to be escaped by a backslash. For example, \\ represents a literal \ character inside double quotes.

Looking at the '<first string between single quotes>', with the exception of the s, they are all metacharacters. The action of evaluating them for assignment to MatchText would attempt a shell expansion, were it not for the single quotes. They need to be preserved in their literal meaning for insertion in MatchText, so that they will be presented to sed as metacharacters. For $IDX_PATCHES_OBS2 though, it is unprotected, so shell expansion will occur when it's being evaluated and assigned to MatchText. If we were to put patches\.obsolete in $IDX_PATCHES_OBS2, the Bash interpreter would think that \. is defining a literal ., not the metacharacter meaning any character, and patches.obsolete would be inserted into MatchText. sed would then interpret that as patches<any character>obsolete, not patches<literal dot>obsolete. $IDX_PATCHES_OBS2 therefore needs to be defined as patches\\.obsolete. With \\., the first backslash will protect the second backslash as a literal \, and the string patches\.obsolete will be inserted into MatchText. sed will then interpret that as patches<literal dot>obsolete, which is what we want. As it happens, either way of defining $IDX_PATCHES_OBS2 works in this script, because no character other than <literal dot> appears between patches and obsolete anywhere in an index listing, so no confusion arises from matching on <any character>. I make mention of this point at some length here because it is just the subtle error that will catch you out with other text groupings, and it would be very hard to track down.
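
The effect of one backslash versus two can be checked directly. This sketch assumes the value is written unquoted in the assignment, which is when Bash consumes one level of backslashes; how SyncDirs10.0 itself quotes the definition is not shown here. The variable names and the sample path (with an X where the dot should be) are made up for the test.

```shell
#!/bin/bash
# Unquoted assignments: Bash consumes one level of backslashes.
OneSlash=patches\.obsolete     # value stored: patches.obsolete
TwoSlash=patches\\.obsolete    # value stored: patches\.obsolete

# Hypothetical path with an X where the dot should be.
Sample='/updates/patchesXobsolete/patch-10001'

LooseMatch='s/^.*'$OneSlash'\///'
StrictMatch='s/^.*'$TwoSlash'\///'

Loose=$(echo "$Sample" | sed -e "$LooseMatch")    # . matches X: path stripped
Strict=$(echo "$Sample" | sed -e "$StrictMatch")  # \. needs a real dot: unchanged
echo "loose=$Loose"
echo "strict=$Strict"
```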

Having constructed $MatchText as a sed instruction, it is used in the next line:
ls -x $UpdatesDir/* | sed -e $MatchText > $LocalFilesDir
In the deltas directory, ls -x would be operating on /backups/susetest/i386/update/10.0/deltas/*. The * character at the end of the directory path is important: it produces a single-column listing (albeit with the directory path included), whereas just / at the end produces a two-column output of just filenames. I thought the single-column format was easier to work with so chose that. One line from this listing might be:
/backups/susetest/i386/update/10.0/deltas/alsa-1.0.9-23_23.2.i586.delta.rpm
sed filters out the directory path (/backups/susetest/i386/update/10.0/deltas/), leaving the filename alsa-1.0.9-23_23.2.i586.delta.rpm, which is redirected to local.files.deltas (in this example). Processing of the ls -x listing in this way will add all the other filenames from the deltas directory to local.files.deltas.
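
The path stripping can be tried against a throwaway directory. In this sketch the directory tree and the second file name are hypothetical. I have also used ls -1, which guarantees one name per line, and put "$MatchText" in double quotes; both are choices of the sketch, not of the script.

```shell
#!/bin/bash
# Hypothetical directory tree standing in for the local updates directory.
Top=$(mktemp -d)
Dir=deltas
UpdatesDir=$Top/update/10.0/$Dir
mkdir -p "$UpdatesDir"
touch "$UpdatesDir/alsa-1.0.9-23_23.2.i586.delta.rpm" \
      "$UpdatesDir/zlib-1.2.3-3_3.2.i586.delta.rpm"

MatchText='s/^.*'$Dir'\///'
LocalFilesDir=$Top/local.files.$Dir

# ls -1 forces one name per line; quoting "$MatchText" guards against
# word splitting. Both are choices of this sketch, not of the script.
ls -1 "$UpdatesDir"/* | sed -e "$MatchText" > "$LocalFilesDir"
cat "$LocalFilesDir"
```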

Obsolete files in local updates directory. Obsolete files that are no longer listed in the ftp site are easily identified by comparing the ftp.files.<dir> and local.files.<dir> listings. The filenames unique to local.files.<dir> are the ones no longer needed. The comm command finds them and redirects the result to $LocalRemoveDir (eg, local.remove.deltas).

Files to add to local updates directory. A similar comparison finds out which files need to be added to the local updates directory, but this time by identifying the files unique to the listing in ftp.files.<dir>, and storing the result in $FtpAddDir (eg, ftp.add.deltas).
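
Both comparisons can be sketched with comm on two small sorted listings. The listings are hypothetical, and the particular switches (-13 and -23) are my assumption of how 'unique to one file' would be selected; the description above does not name them.

```shell
#!/bin/bash
# Hypothetical listings; which comm switches the script uses is an assumption.
cd "$(mktemp -d)" || exit 1
export LC_ALL=C
printf '%s\n' a.rpm b.rpm c.rpm   > ftp.files.deltas     # on the ftp site
printf '%s\n' b.rpm c.rpm old.rpm > local.files.deltas   # held locally

# Lines unique to the local listing: obsolete files to remove
comm -13 ftp.files.deltas local.files.deltas > local.remove.deltas
# Lines unique to the ftp listing: files to download
comm -23 ftp.files.deltas local.files.deltas > ftp.add.deltas
```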

Calling Copy_patches() function. Having just compiled the lists of files in the local updates directories and the ftp directories, now is the appropriate time to use the Copy_patches() function to save on having to download any ftp.files.patches.obs files that already exist locally. The if test ensures that this function is only called for the patches.obs directory. The PatchCopied flag is reset to False beforehand. If all the files needed to be downloaded were instead simply copied from the local updates patches directory, that would leave just the directory.3 file needing to be downloaded. The PatchCopied flag will be used later in ensuring that the directory.3 file is downloaded if Copy_patches() actually copies some files to the local updates patches.obs directory.

Adding the filenames. Most of the work in adding filenames to the download lists occurs in the next block of code. A files counter ($i) is set to zero and if the $IdxSuccess[ ] array indicates that a valid index.<dir> file exists for this directory ($k) the $FtpAddDir file is prepared for reading. Each line of the file is read into the $FtpAdd variable. If in the deltas or $ARCH (eg, i586) directories, the Filter_rpms() function is called; if in any of the other directories, the files are added to the list held in $FtpFilDir (eg, ftp.fil.deltas). After processing the list in $FtpFilDir, the file is closed.

Section 5 main loop: build download lists. The for Dir in "${IdxDir[@]}" loop cycles through the updates directories and builds lists of files to download, using the functions defined previously in Section 5. A sed filter is used to extract the http or ftp text from the start of $FTP_STEM and store it in $IndexType for later use in deciding whether to update each downloaded file's date-time information. The sed instruction of 's/:.*$//' (replace from the colon to the end of the line with nothing) matches everything from the colon onwards (://mirror.pacific.net.au/linux/suse/i386/update/10.0) in the default $FTP_STEM setting:

ftp://mirror.pacific.net.au/linux/suse/i386/update/10.0

thus leaving ftp for $IndexType. The directory counter ($k) is set to zero and the loop is entered. Quite a number of path and file names, which will be used in building the download lists, are defined. Their purpose and example names are:
DirPath             name of the update directory path. eg, /deltas
UpdatesDir          full pathname for the local updates directory. eg, /backups/suse/i386/update/10.0/deltas
IndexDir            name of the ftp directory index.<dir> file. eg, index.deltas

FtpFilesDir         filename for storing names of ftp files. eg, ftp.files.deltas
LocalFilesDir       filename for storing names of local updates files. eg, local.files.deltas
LocalRemoveDir      filename for storing names of local files to be removed. eg, local.remove.deltas
FtpAddDir           filename for storing names of files to be added from the ftp site. eg, ftp.add.deltas
FtpFilDir           Ditto to $FtpAddDir, but unwanted files filtered out. eg, ftp.fil.deltas
FtpFilesDirDates    filename for storing names of files and their date/time stamps. eg, ftp.files.deltas.dates
FtpFilesDirAll      filename for storing names of all files in the ftp directory. eg, ftp.files.deltas.all

File date-time stamps. Since the ftp pages contain each file's date and time information, they are all extracted and saved in a temporary file (ftp.files.<dir>.dates) for use later, when each file is downloaded. The command for generating this file is:
cat $IndexDir | grep 'href=' | tr -s " " | cut -d" " -f 2-5 > $FtpFilesDirDates
The grep command prints lines that match a pattern. In the simple mode used here, grep prints any line containing the 'href=' string, and removes any line without it. This action limits the lines to those with filenames, excluding the ftp page's higher HTML structure. The cut command removes columns from each line of a file or text output. The -d" " switch specifies a space as the delimiter between columns, and -f 2-5 specifies the columns to remain after the cut is made. This setting will delete column 1 and column 6 and upwards. However, the cut command presents difficulties when a line has multiple spaces between columns, because it will treat each space as a separate column. Since many commands pad with spaces to align columns, cut will have problems parsing their content into like columns. The tr command comes to the rescue here. tr is used to translate or delete characters. Its -s " " switch is used to 'squeeze' multiple instances of the given character (a space) to one. With only one space now between columns in output, cut will work correctly. The command above thus takes each line of index.<dir>, excludes non-file data, truncates multiple spaces to one space, cuts columns out such that only columns 2 to 5 remain, then stores the result in ftp.files.<dir>.dates. To see how this works, this is the initial part of a line from index.deltas:

  2006 Feb 08 03:03  File        <a href="ftp://mirror.pacific.net.au

There are two spaces before 2006 and File, and eight spaces after File. This is how it looks after tr -s " " has squeezed the spaces to one:

 2006 Feb 08 03:03 File <a href="ftp://mirror.pacific.net.au

The cut command extracts columns 2 to 5:

2006 Feb 08 03:03

Notice how cut treats the first space as column 1. That's why 2006 is in column 2 and 03:03 is in column 5.

All that's created is a four-column list of YYYY Mon dd HH:mm with no filenames attached. The filenames couldn't be extracted along with the dates and times because they are not separated by spaces in the index.<dir> file. Since the listing of dates-times extracted just now is comprised of all files listed in the ftp directory, it follows that we need to extract a list of all filenames - not the selective lists derived to now - to marry the filenames with their corresponding dates-times. We can pluck them out with sed, using this line:
cat $IndexDir | sed -ne '/href=/p' | sed -e 's/^.*">//' -e 's/<.*$//' > $FtpFilesDirAll
The first sed command is equivalent to the grep 'href=' expression used in the previous command; it restricts printed lines to those containing filenames. The sed instructions following the second sed command are the same as those used previously in Get_filenames() when extracting filenames from the index.<dir> files. The result is an ftp.files.<dir>.all file that lists all files in index.<dir>, and whose filenames correspond exactly with the dates assembled in ftp.files.<dir>.dates, as they used the same index.<dir> file, extracted in the same order. These filenames now need to be matched to their dates-times. This is done with the line:
paste -d" " $FtpFilesDirDates $FtpFilesDirAll > $FTP_FILES_TMP
The paste command merges corresponding lines of files into a single line and writes the combined line to stdout. The -d" " switch instructs paste to separate the text from the two files by a space. In the order used here, each line will be composed of:
<date-time><space><filename>
After renaming to $FtpFilesDirDates (eg, ftp.files.deltas.dates), this file will serve as a source of dates and times for when the files are downloaded later.
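
The three commands can be run end to end on a tiny index file. This is a sketch under assumptions: the index contents, the example.invalid host name and the tail of the href line are made up; only the column layout follows the sample line shown earlier.

```shell
#!/bin/bash
# Hypothetical two-line index file; the host name and the tail of the
# href line are assumptions, the column layout follows the sample above.
cd "$(mktemp -d)" || exit 1
IndexDir=index.deltas
FtpFilesDirDates=ftp.files.deltas.dates
FtpFilesDirAll=ftp.files.deltas.all
FTP_FILES_TMP=ftp.files.tmp
F=alsa-1.0.9-23_23.2.i586.delta.rpm
{
    echo '<html><head><title>Index of deltas</title></head>'
    echo "  2006 Feb 08 03:03  File        <a href=\"ftp://example.invalid/deltas/$F\">$F</a>"
} > $IndexDir

# Dates: keep href lines, squeeze spaces, keep columns 2 to 5
cat $IndexDir | grep 'href=' | tr -s " " | cut -d" " -f 2-5 > $FtpFilesDirDates
# Names: keep href lines, strip everything around the link text
cat $IndexDir | sed -ne '/href=/p' | sed -e 's/^.*">//' -e 's/<.*$//' > $FtpFilesDirAll
# Marry each date-time with its filename, then rename the result
paste -d" " $FtpFilesDirDates $FtpFilesDirAll > $FTP_FILES_TMP
mv $FTP_FILES_TMP $FtpFilesDirDates
cat $FtpFilesDirDates
```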

The Calc_ftp_add() function is then called, where most of the work is done in compiling the download lists. A check is then carried out on the number of files flagged for download. If only one file is due from the patches, patches.obs, i586 or noarch directories, a message is printed about ignoring the directory.3 or INDEX file, depending on the directory. The logic here is that previously, these files were removed to force a new copy to be downloaded if the directory contents had changed. Since the only file flagged for download is one of these, it follows that no other file in the directory has changed, and so neither has this one. It's also simpler to recover the removed version instead of downloading it. The only exception to this logic is the directory.3 file in the patches.obs directory. If files have previously been copied from the local updates patches directory to save downloading them, it's most likely that directory.3 has changed and needs to be downloaded. So the if test on $PatchCopied decides whether to download it. If so, the download count ($i) is set to 1, which will enable its download later. The number of files to be downloaded is then printed in a message, and also stored in the $DloadCount[ ] array member for that directory. The directory counter ($k) is incremented, and the loop repeats to build download lists for the next directory.

Note the expression $(($k)) used in the if statement and the second case statement. The double-parenthesis construct ((..)) permits arithmetic expansion and evaluation of the enclosed expression. A translation of $(($k)) is 'the value of the arithmetic evaluation of the value of the variable k'. This is to make sure that the quantity is being treated as a number. (Strictly, an integer: a floating point value will cause an error with this notation.) Although not used in this script, the double-parenthesis construct allows C-type manipulation of variables (eg, (( i = 25 )), (( i++ )), (( --i )), etc).
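
A few lines are enough to see the double-parenthesis construct at work. The variable values here are arbitrary and chosen only for the demonstration.

```shell
#!/bin/bash
k=3
if [ $(($k)) -eq 3 ]; then      # $(( )) yields the integer value of k
    echo "directory counter is 3"
fi

(( i = 25 ))    # assignment, C style
(( i++ ))       # post-increment: i is now 26
(( --i ))       # pre-decrement: i is back to 25
Next=$(( k + 1 ))
echo "i=$i Next=$Next"
```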

Section 5 has another for Dir in "${IdxDir[@]}" loop that provides a summary of files to be downloaded for all directories, and restores directory.3 and INDEX files when no files are to be downloaded. Beforehand, a Restore_backup_file() function is defined, which is similar to Save_backup_file() described earlier, except that it (quite obviously) restores files instead. It also has an extra $RestoreFlag variable used to record the success, or otherwise, of the restoration. It's followed by four other variables ($Dir3Restored, $Dir3ObsRestored, $IndexRestored and $IndexNoarchRestored) where this record of success is stored for the patches, patches.obs, i586, and noarch directories. The directories counter ($k) is then reset, and a $Download flag is set to zero (0 = no files to download, 1 = there are files to download). Inside the second loop, the download count for each directory is loaded into the $Count variable. If the download count is zero, then the case statement directs execution to another case statement that tests the current directory ($k) being examined. The directory.3 or INDEX file appropriate to that directory is then restored by a call to the Restore_backup_file() function. Outside these case statements, an if..then..else..fi statement assigns a singular or plural flavour to the $Files variable, then a message is printed regarding the number of files to be downloaded in the directory. The directory counter ($k) is incremented, and the loop is repeated for all other directories. Following the second loop, a Cleanup() function is defined.

Cleanup() function. The Cleanup() function is designed, for Testing mode only, to remove files from the local updates patches.obs directory that were copied from the patches directory in the Copy_patches() function, and to restore the directory.3 and INDEX files mentioned previously, if their restore flags are still set to False.

The final part of Section 5 checks if any files are to be downloaded, and if not, prints a message for the user, and asks if the script should continue on to synchronise the remote backups with the local updates directories. If you had some reason to believe that they had become different, you might answer y here. Otherwise, answering n will call the Cleanup() function, print a finishing date and time, and exit the script. If there were files to download, the script will take you to Section 6.

Section 6: Download the Files

The script has discovered which files need to be downloaded from which ftp directories, and it knows where they are hosted, so it only remains to download them. It uses yet another for Dir in "${IdxDir[@]}" loop to process these downloads. Before that though, a Touch_date_time() function is defined.

Touch_date_time() function. This function runs the touch command on a downloaded file ($FtpAdd), setting its date and time information to the value stored in the appropriate $FtpFilesDirDates file (eg, ftp.files.deltas.dates). The touch command is normally used to change file timestamps. When used in the simple form touch <filename>, the effect is to change the file's date-time data to the current date and time. If the file doesn't exist, a new, empty one is created. In the mode used here, the -d switch sets a specific date and time for an existing file, as specified by the string following the switch. The first command extracts the date and time from the $FtpFilesDirDates file:
FileData=$(cat $FtpFilesDirDates | grep $FtpAdd)
The grep command ensures that only a line that matches the file $FtpAdd is assigned to the variable $FileData. If alsa-1.0.9-23_23.2.i586.delta.rpm were the file being processed, then this string would be inserted into $FileData:
2006 Feb 08 03:03 alsa-1.0.9-23_23.2.i586.delta.rpm
Note that the year is in column 1, the month in column 2, the date in column 3, the time is in column 4, and they are all separated by a single space. The Year is extracted from this line by this command:
Year=$(echo $FileData | cut -d" " -f 1)
The string of characters in $FileData is echoed to the cut command, which uses the -f 1 switch to select column 1. Thus, 2006 gets allocated to Year. Similar commands extract the remaining data, except that their cut commands select column 2 for MonthText, column 3 for Date and column 4 for Time. Some files don't have a time recorded for them, so the word File appears in its place. For these, a token 00:01 time is allocated. The touch command needs the month to be a number, not the Feb style used in the ftp page. The case $MonthText in statement does that conversion to MonthNum. In Live mode, the touch command is run on the file $FtpAdd:

touch -d "$Year-$MonthNum-$Date $Time" $FtpAdd
with the -d switch specifying to parse the YYYY-MM-DD HH:mm string format. In Testing mode, a downloaded file doesn't exist, but the date-time information is printed to the console, to check the integrity of the extraction process.
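
The date handling can be sketched as follows, in Live-mode form. It assumes GNU touch and date, uses a temporary file to stand in for the downloaded file, and takes the $FileData line directly instead of grepping it from ftp.files.<dir>.dates.

```shell
#!/bin/bash
# Sketch of the date handling; assumes GNU touch and date.
FtpAdd=$(mktemp)   # stands in for a downloaded file
FileData="2006 Feb 08 03:03 alsa-1.0.9-23_23.2.i586.delta.rpm"

Year=$(echo $FileData | cut -d" " -f 1)
MonthText=$(echo $FileData | cut -d" " -f 2)
Date=$(echo $FileData | cut -d" " -f 3)
Time=$(echo $FileData | cut -d" " -f 4)

# No time recorded on the ftp page: use a token value
if [ "$Time" = "File" ]; then Time="00:01"; fi

case $MonthText in
    Jan) MonthNum=01 ;; Feb) MonthNum=02 ;; Mar) MonthNum=03 ;;
    Apr) MonthNum=04 ;; May) MonthNum=05 ;; Jun) MonthNum=06 ;;
    Jul) MonthNum=07 ;; Aug) MonthNum=08 ;; Sep) MonthNum=09 ;;
    Oct) MonthNum=10 ;; Nov) MonthNum=11 ;; Dec) MonthNum=12 ;;
esac

touch -d "$Year-$MonthNum-$Date $Time" "$FtpAdd"
Stamp=$(date -r "$FtpAdd" '+%Y-%m-%d %H:%M')   # read the timestamp back
echo "$Stamp"
```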

Before the main Section 6 loop, the directories counter ($k) is reset to zero.

Section 6 main loop: Download all selected files. The downloaded files counter ($i) is reset to zero, and the following variables are defined:
DirPath             name of the update directory path. eg, /deltas
UpdatesDir          full pathname for the local updates directory. eg, /backups/suse/i386/update/10.0/deltas
FtpDirPath          directory path for the ftp site updates directory. eg, ftp://mirror.pacific.net.au/linux/suse/i386/update/10.0/deltas
FtpFilDir           filename for storing names of files to be added from the ftp site. eg, ftp.fil.deltas
FtpFilesDirDates    filename for storing names of files and their date/time stamps. eg, ftp.files.deltas.dates
$UpdatesDir and $FtpDirPath are built from their predefined stem values, plus $DirPath, extracted from the $UpdateDir[ ] array member for the current directory ($k). $FtpFilDir and $FtpFilesDirDates are built from their predefined stem values, plus .$Dir, extracted from the $IdxDir[ ] array member for the current directory, with the latter having .dates appended as well. The cd command is used to change directory to the $UpdatesDir, so that downloaded files will be saved into the correct local directory. The Count variable is then loaded with the number of files to download from the $DloadCount[ ] array member for the current directory. After checking that a valid index.<dir> file exists for this directory, AND that there is at least one file to download (the test asks: 'is $Count greater than zero?') the file holding the download list ($FtpFilDir) is prepared for reading, and a message is printed announcing how many files are to be downloaded. The processing of these downloads is handled by the until [ -z "$FtpAdd" ]; do loop.

until [ -z "$FtpAdd" ]; do loop. The filename is read from the $FtpFilDir file into $FtpAdd. If a valid filename results (ie, the end of file has not yet been reached) the file is processed for downloading. The file counter ($i) is incremented (the first file will have $i incremented to 1) and the file's full path is generated by joining $FtpDirPath, / and $FtpAdd. This is stored in $FileAdd. The download sequence number and the filename are printed to the console, then, if in Live mode, the file is downloaded by the curl command:
$(curl -C - -O -L $FileAdd)
curl is designed to transfer files from a server without user interaction. The switches used in conjunction with curl mean:
-C -    resume a previous transfer - automatically.
-O      output to a local file, using only the filename part; discard the (server) path
-L      allow server to specify a different location
With these settings, curl will resume a previously interrupted download, which is a boon if your 98MB OpenOffice file got interrupted by a network problem at the 96MB point. It will also save the file to your local filesystem using the same name listed in the ftp page. An if test checks that the file was successfully downloaded, then if the download source was an ftp site, Touch_date_time() is called to update the file's date-time information to that shown on the ftp directory page. If the file proves to not exist, an error message is printed, except for Testing mode, where the Touch_date_time() function is called regardless, to check its workings.
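
The download step might be sketched like this. MODE is a hypothetical name for the Live/Testing switch, and the sketch is left in Testing mode so that it only prints the command instead of contacting the server.

```shell
#!/bin/bash
# MODE is a hypothetical name for the Live/Testing switch.
MODE=Testing
FtpDirPath="ftp://mirror.pacific.net.au/linux/suse/i386/update/10.0/deltas"
FtpAdd="alsa-1.0.9-23_23.2.i586.delta.rpm"
FileAdd=$FtpDirPath/$FtpAdd
i=1

echo "$i: $FtpAdd"
if [ "$MODE" = "Live" ]; then
    # -C - resume, -O keep the remote filename, -L follow redirects
    curl -C - -O -L "$FileAdd"
    if [ -f "$FtpAdd" ]; then
        echo "Downloaded $FtpAdd"
    else
        echo "Error: $FtpAdd was not downloaded"
    fi
else
    echo "Testing mode: would run curl -C - -O -L $FileAdd"
fi
```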

As an alternative to the double-barreled if statement in this loop, the final else keyword tests whether $IdxSuccess[$k] is False. If it is, a message is printed that the index file does not exist, and the script is skipping that directory. If $IdxSuccess[$k] is not False (ie, it is True) then this must also mean that $Count is zero. A message is then printed that this directory is being skipped. Finally, the directory counter is incremented, and the loop repeats.

Before leaving Section 6, and if in Testing mode, the CreateLocalFilesUpdated script is run with the argument 1. This creates local.files3.<dir> files that simulate the effect of downloading the files. The local.files3.<dir> files are internal to CreateLocalFilesUpdated. They are an intermediate step to producing the local.files4.<dir> files in Stage 2 of CreateLocalFilesUpdated, which is run as the last action of Section 7.

Section 7: Remove Obsolete Files From the Local Updates Directory

Section 5 identified the obsolete files in the local updates directory that needed to be deleted; this section deletes them, using the familiar for Dir in "${IdxDir[@]}" loop. After checking that this directory has a valid index.<dir> file, a message is printed announcing that obsolete files are being removed from the <dir> directory. If IdxSuccess[$k] is False, a message is printed about skipping this directory, then the next directory is processed.

The $UpdatesDir and $LocalFilesDir variables are defined as before, with one other - $LocalRemoveDir - constructed to point to a file holding filenames of local files to be removed. The total number of files to be deleted is calculated using:
TotDel=$(cat $LocalRemoveDir | wc -l)
The contents of $LocalRemoveDir are piped to the wc -l command, whose output - the number of lines in the file - is stored in $TotDel.

The file is prepared for reading, a message is printed announcing the total number of files to be removed, then two counters are reset to zero: $i is the normal files counter, and $j is the files skipped counter. Since the algorithm does not delete files starting with kernel-, the skipped files counter is incremented each time one of these is skipped. Inside the until [ -z "$LocalRemove" ]; do loop, the filename is read into $LocalRemove, and if it's deemed to be a valid filename, it is processed for deletion. The case $LocalRemove in statement singles out kernel- files to only have a message printed about skipping its deletion. The files counter ($i) and the skipped files counter ($j) are both incremented. Otherwise, any other file is dealt with under the *) test, where a message is printed that the file is being deleted. In Live mode, it is deleted using the rm command. In Testing mode, no files are deleted; only the message is printed. In either mode, the files counter is incremented.

After all the files are processed, the files counter is adjusted by subtracting the number of skipped files with the statement let i-=$j. Think of this as being i = $i-$j. $i now holds the number of files actually deleted. A message is then printed about the number of files skipped, using $j, and another message is printed about the number of files deleted, using $i. The $LocalRemoveDir file is closed, the directories counter is incremented and the loop is repeated for the remaining directories. If all went well, the number of files deleted, plus the number skipped, should equal the number announced for deletion ($TotDel) just before the start of this loop.
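
The counting logic can be checked with a short removal list. This sketch behaves like Testing mode (nothing is deleted); the list contents are hypothetical, and the "|| true" after read is my addition so that it also runs under set -e.

```shell
#!/bin/bash
# Hypothetical removal list; Testing-mode behaviour only (nothing is deleted).
LocalRemoveDir=$(mktemp)
printf '%s\n' alsa-old.rpm kernel-default-old.rpm zlib-old.rpm > "$LocalRemoveDir"

TotDel=$(cat "$LocalRemoveDir" | wc -l)
i=0   # files counter
j=0   # skipped files counter

exec 7<&0
exec < "$LocalRemoveDir"
LocalRemove=test
until [ -z "$LocalRemove" ]; do
    read -r LocalRemove || true
    if [ -n "$LocalRemove" ]; then
        case $LocalRemove in
            kernel-*) echo "Skipping $LocalRemove"; j=$((j + 1)) ;;
            *)        echo "Deleting $LocalRemove" ;;   # Live mode would rm here
        esac
        i=$((i + 1))
    fi
done
exec 0<&-
exec 0<&7 7<&-

let i-=$j    # i now counts only the files actually deleted
echo "$j skipped, $i deleted, $TotDel announced"
```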

After all directories have been processed, and if in Testing mode, CreateLocalFilesUpdated 2 is run. This creates the local.files4.<dir> files that simulate the effect of deleting these obsolete files.

Section 8: Synchronise the Remote Backups Directories with the Local Updates Directories

Another for Dir in "${IdxDir[@]}" loop controls this 'sync backups' section. The $UpdatesDir and $LocalFilesDir variables are defined as before, with five new ones being created:
BackupsDirPath      full pathname for the remote backups directory
LocalFiles2Dir      file holding names of files in the local updates directory after the new updates have been added
BackupsFilesDir     file holding names of files in the remote backups directory
BackupsAddDir       file holding names of files to be added to the remote backups directory
BackupsRemoveDir    file holding names of files to be removed from the remote backups directory
The MatchText variable is created in the same way as in Calc_ftp_add in Section 5, and it will be put to the same use in stripping file pathnames from directory listings to get the filenames. In Live mode, the command:
ls -x $UpdatesDir/* | sed -e $MatchText > $LocalFiles2Dir
takes a listing of the current updates directory, strips the pathnames, then stores the filenames in the $LocalFiles2Dir file. In Testing mode, the line:
LocalFiles2Dir=$LOCAL_FILES4.$Dir
reassigns the variable LocalFiles2Dir to point to the pre-prepared local.files4.<dir> file and a message is printed to that effect. Note that from now on, whenever the script refers to $LocalFiles2Dir, it is really referencing local.files4.<dir>, which will make it behave as though the updates were really downloaded. This use of LocalFiles2Dir is much like using pointers in the C language - maybe in a crude (and limited) sense - but its re-assignment here proves just as useful.
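The pathname stripping can be pictured with the short sketch below. The real $MatchText is built in Section 5 from the directory name; the generic sed expression used here, the scratch directory and the file names are all assumptions for illustration, and ls -1 is used in place of the script's ls -x so that each name is sure to land on its own line.

```shell
# Demo setup (assumed names): a scratch directory stands in for $UpdatesDir
UpdatesDir=$(mktemp -d)
LocalFiles2Dir=$(mktemp)
touch "$UpdatesDir/a-1.0.rpm" "$UpdatesDir/b-2.0.rpm"

# Assumed stand-in for $MatchText: delete everything up to the last slash
MatchText='s|^.*/||'

# List the directory with full pathnames, strip the paths, keep the filenames
ls -1 "$UpdatesDir"/* | sed -e "$MatchText" > "$LocalFiles2Dir"
cat "$LocalFiles2Dir"
```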

A listing of $BackupsDirPath/*, filtered through sed -e $MatchText, creates $BackupsFilesDir. Passing $BackupsFilesDir and $LocalFiles2Dir as arguments to the comm command produces:
$BackupsRemoveDir    files unique to $BackupsFilesDir
$BackupsAddDir       files unique to $LocalFiles2Dir
The total number of files to be added to the remote backups directory ($TotAdd) is found by piping the contents of the $BackupsAddDir file to the wc -l command. The filenames are then read in from $BackupsAddDir and the files are copied to the remote backups directory, using code similar to that used to delete the obsolete local updates files just described. Similarly, the contents of $BackupsRemoveDir are used to remove obsolete files from the remote backups directory. Again, any kernel- files are skipped. At the end of the loop, the directories counter is incremented, and the loop repeats for the remaining directories.
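The comm step can be tried in isolation with a sketch like the one below. The variable names follow the tutorial, but the working directory and the file names are invented for the demonstration, and the exact comm options used inside SyncDirs10.0 are not quoted here, so treat -23 and -13 as one plausible way of getting the two lists.

```shell
# Demo setup (assumed names): a scratch working directory and two sorted lists
WkDir=$(mktemp -d)
LocalFiles2Dir=$WkDir/local.files2
BackupsFilesDir=$WkDir/backups.files
BackupsAddDir=$WkDir/backups.add
BackupsRemoveDir=$WkDir/backups.remove

# comm needs sorted input; LC_ALL=C keeps sort and comm in agreement
printf '%s\n' a.rpm b.rpm c.rpm   | LC_ALL=C sort > "$LocalFiles2Dir"
printf '%s\n' b.rpm c.rpm old.rpm | LC_ALL=C sort > "$BackupsFilesDir"

# -23 keeps lines found only in the first file: files to add to the backups
LC_ALL=C comm -23 "$LocalFiles2Dir" "$BackupsFilesDir" > "$BackupsAddDir"
# -13 keeps lines found only in the second file: files to remove from the backups
LC_ALL=C comm -13 "$LocalFiles2Dir" "$BackupsFilesDir" > "$BackupsRemoveDir"

TotAdd=$(wc -l < "$BackupsAddDir")
echo "Files to add: $TotAdd"
```

Files present in both lists (b.rpm and c.rpm here) appear in neither output, which is what keeps unchanged files from being copied again.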

Section 9: Finishing

The Cleanup() function is called in Testing mode, to:
remove the files listed in local.files.patches.copy from the patches.obs local updates and remote backups directories,
restore the directory.3 files to the patches and patches.obs local updates and remote backups directories, and
restore the INDEX files to the $ARCH (i586) and noarch local updates and remote backups directories.
A finish date and time is recorded in the Log file, then a message is printed 'Directories now synchronised, exiting...'

That completes the description of SyncDirs10.0. Let's now turn our attention to the ExportRpms script.

ExportRpms script

ExportRpms is a relatively simple script that creates a list of installed rpms on the computer on which it is run, and copies that list to a shared directory on the main computer running SyncDirs10.0. It uses the same command to extract the installed rpms list as described earlier in Section 2. It's important to differentiate between the use of the terms local and remote in the ExportRpms script, versus any use they have received previously in this tutorial. In the ExportRpms script, local means the computer on which that script is run, and remote refers to the main computer running SyncDirs10.0.

First, LOCAL_HOST is assigned the name of the local host, which is a unique name on the network. Therefore, each computer running ExportRpms will necessarily have different settings for this variable. The default setting of remhost1 is an example, which will need to be changed to the actual hostname of the computer involved. If a hostname has not been set, then its IP address should do (eg, 192.168.0.2).

The REMOTE_DIR variable is assigned the value /mnt/fred_common, the local mount point for the directory on the main computer where SyncDirs10.0 will access each <hostname>.installed.rpms file. This directory name does not need to be the same for every computer, but it makes sense to have it that way, to make life easier. Again, the default setting is an example, which will need to be changed to the actual mount point used on the computer involved.

LOCAL_DIR is a local directory for which the nominated user has read-write permissions. The default setting of /tmp is a natural fit for this purpose, but it could be any other directory with the same user permissions.

INSTALLED_RPMS is a generic name common to all installed rpms files. The default setting of installed.rpms must agree with the corresponding setting in SyncDirs10.0.

The setting for LOCAL_INSTALLED_RPMS of $LOCAL_DIR/$INSTALLED_RPMS produces the full pathname /tmp/installed.rpms.

LOCAL_INSTALLED_RPMS_TMP is a temporary file used to create $LOCAL_INSTALLED_RPMS above. It is set to /tmp/installed.rpms.tmp.

REMOTE_INSTALLED_RPMS is the full pathname of the installed rpms file in its final location. Its setting will produce /mnt/fred_common/remhost1.installed.rpms for the examples given.

The rpm command saves the list of installed rpms into $LOCAL_INSTALLED_RPMS_TMP (ie, /tmp/installed.rpms.tmp), from where the cat command pipes its contents via the sort command to $LOCAL_INSTALLED_RPMS (ie, /tmp/installed.rpms). The last command copies it to $REMOTE_INSTALLED_RPMS (ie, /mnt/fred_common/remhost1.installed.rpms).
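Putting the settings above together, the flow of ExportRpms looks roughly like the sketch below. It is an approximation, not the script itself: the rpm query format is an assumption (the tutorial refers back to Section 2 for the real command), scratch directories stand in for /tmp and /mnt/fred_common so nothing real is touched, and placeholder package names are used when rpm is not available.

```shell
LOCAL_HOST=remhost1
REMOTE_DIR=$(mktemp -d)       # stands in for /mnt/fred_common in this demo
LOCAL_DIR=$(mktemp -d)        # stands in for /tmp in this demo
INSTALLED_RPMS=installed.rpms
LOCAL_INSTALLED_RPMS=$LOCAL_DIR/$INSTALLED_RPMS
LOCAL_INSTALLED_RPMS_TMP=$LOCAL_INSTALLED_RPMS.tmp
REMOTE_INSTALLED_RPMS=$REMOTE_DIR/$LOCAL_HOST.$INSTALLED_RPMS

if command -v rpm >/dev/null 2>&1; then
    # Assumed query: one package name per line
    rpm -qa --queryformat '%{NAME}\n' > "$LOCAL_INSTALLED_RPMS_TMP"
else
    # Placeholder data so the sketch still runs on a machine without rpm
    printf '%s\n' zlib bash > "$LOCAL_INSTALLED_RPMS_TMP"
fi

# Sort the temporary list into its final local form
sort "$LOCAL_INSTALLED_RPMS_TMP" > "$LOCAL_INSTALLED_RPMS"

# Copy it to the shared directory under a host-specific name
cp -p "$LOCAL_INSTALLED_RPMS" "$REMOTE_INSTALLED_RPMS"
echo "Saved $REMOTE_INSTALLED_RPMS"
```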

CreateLocalFilesUpdated and SyncDirs10.0-Test scripts. I will not describe these two scripts, as they do not cover any Bash commands additional to those already described; nor is an understanding of their operation vital to how SyncDirs10.0 and ExportRpms work, as they are only needed in Testing mode. Both scripts are well documented with comments, so if you need to understand how they work, please refer to the internal comments.

Before starting Part 2, I'll list some Bash guides/tutorials I consulted regularly when constructing SyncDirs10.0:

1. Bash Guide for Beginners http://www.tldp.org/LDP/Bash-Beginne...tml/index.html
2. Advanced Bash Scripting Guide http://www.tldp.org/LDP/abs/html/index.html
3. Sed - An Introduction and Tutorial http://www.grymoire.com/Unix/Sed.html
4. sed tutorial http://www.selectorweb.com/sed_tutorial.html
5. man pages for the various Bash commands (get these by entering man <command> in a console).

Part 2 - Configuring SyncDirs10.0 on your Computer

This part of the tutorial will explain how to configure SyncDirs10.0's various settings on your computer, and on others on your local network. Throughout these instructions, I'll refer to these three example configurations:

1. local host main               # Main machine running the SyncDirs10.0 script
   user fred                     # Primary user on main - most probably you
   /home/fred/SyncDirs           # Primary location of all scripts
   /home/fred/Common             # Shared folder accessible by remote machines
   /backups/suse                 # Primary directory for holding SUSE updates
   /backups/susetest             # Primary directory for holding SUSE updates (Testing mode)
   /backups/susetest2            # Directory for holding backups of SUSE updates (Testing mode)
   /mnt/remhost1_suse            # Local mount point for /backups/suse directory on remhost1
   /tmp/SyncDirs                 # Temp working directory (Testing mode)
   /var/log/SyncDirs             # Working directory (Live mode)

2. first remote host remhost1    # First remote machine needing updates
   user mary                     # Primary user on remhost1
   /backups/suse                 # Directory for backups of SUSE updates held on main (root-owned)
   /mnt/fred_common              # Local mount point for fred's shared directory /home/fred/Common on main
   /mnt/main_suse                # Local mount point for /backups/suse directory on main

3. second remote host remhost2   # Second remote machine needing updates
   user dave                     # Primary user on remote host remhost2
   /mnt/fred_common              # Local mount point for fred's shared directory /home/fred/Common on main
   /mnt/main_suse                # Local mount point for /backups/suse directory on main

You can base your own configurations on these examples. Where these instructions place a comment following a # character, they are meant to be an explanation for the reader here; they should not be included in any commands issued in a console. The above configurations mention directories beginning with /backups. I have a separate /backups partition on my machines which I use to store the updates. Because it's not one of the normal linux system partitions, it remains available after re-installations, etc. However, any root-owned directory would be suitable, provided it had sufficient space.

Some of the above directories appear in script definitions:

Testing mode:
UPDATES_STEM=/backups/susetest/i386/update/10.0     # Temp parent directory holding updates
BACKUPS_STEM=/backups/susetest2/i386/update/10.0    # Temp parent directory holding backups of updates
WK_DIR=/tmp/SyncDirs                                # Working directory - where lots of temp files get stored
Live mode:
UPDATES_STEM=/backups/suse/i386/update/10.0         # Parent directory holding updates
BACKUPS_STEM=/mnt/remhost1_suse/i386/update/10.0    # Parent directory holding backups of updates
WK_DIR=/var/log/SyncDirs                            # Working directory - where lots of temp files get stored
To create these directories, start a console session (Start-> System-> Terminal-> Terminal Program (Konsole)) on the main computer and enter the following commands:
cd                                                    # Make sure /home/fred is the current directory
mkdir SyncDirs                                        # Create the scripts' directory
mkdir Common                                          # Create the /home/fred/Common directory
mkdir -p /backups/susetest/i386/update/10.0/deltas    # Create the /backups/susetest/i386/update/10.0/deltas directory
mkdir /backups/susetest/i386/update/10.0/patches
mkdir /backups/susetest/i386/update/10.0/patches.obsolete
mkdir -p /backups/susetest/i386/update/10.0/rpm/i586
mkdir /backups/susetest/i386/update/10.0/rpm/noarch
mkdir /backups/susetest/i386/update/10.0/scripts
The -p switch to the mkdir command creates parent directories where they don't yet exist. Instead of typing in all the text for the commands involving /backups/susetest, try tapping the Up Arrow key to get the previous command from Bash's command history, then edit for the next command. You can use this trick to repeat for all the directories involving /backups/susetest2. Create the remaining directories with these commands:
mkdir /tmp/SyncDirs
su -                                                  # Note the dash following su. Enter root's password when prompted
mkdir -p /backups/suse/i386/update/10.0/deltas        # Create the /backups/suse/i386/update/10.0/deltas directory
mkdir /backups/suse/i386/update/10.0/patches
mkdir /backups/suse/i386/update/10.0/patches.obsolete
mkdir -p /backups/suse/i386/update/10.0/rpm/i586
mkdir /backups/suse/i386/update/10.0/rpm/noarch
mkdir /backups/suse/i386/update/10.0/scripts
mkdir /var/log/SyncDirs
On remhost1, enter these commands:
su -
mkdir -p /backups/suse/i386/update/10.0/deltas
mkdir /backups/suse/i386/update/10.0/patches
mkdir /backups/suse/i386/update/10.0/patches.obsolete
mkdir -p /backups/suse/i386/update/10.0/rpm/i586
mkdir /backups/suse/i386/update/10.0/rpm/noarch
mkdir /backups/suse/i386/update/10.0/scripts
Of course, if your $ARCH setting is not i586, then use whatever your $ARCH setting is.
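If typing each mkdir by hand gets tedious, the same tree can be built with a short loop. This is only a sketch: STEM here points at a scratch location invented for the demonstration (substitute your own stem, such as /backups/susetest/i386/update/10.0), and the list of sub-directories is taken from the commands above.

```shell
# Demo setup (assumed location): build the tree under a scratch directory
STEM=$(mktemp -d)/susetest/i386/update/10.0
ARCH=i586                     # change to suit your own $ARCH setting

for Dir in deltas patches patches.obsolete rpm/$ARCH rpm/noarch scripts; do
    # -p creates any missing parent directories and is harmless if they exist
    mkdir -p "$STEM/$Dir"
done

ls -R "$STEM"
```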

Copy scripts to local host main

The next thing to do is to download the scripts and copy them to the normal user's /home directory (/home/fred/SyncDirs for example) on the computer that is going to control the update process. I have made it the computer I use most, as I will see the SUSE Watcher icon turn red (or yellow), indicating that updates are available, and I can run the script directly from there.

Make the scripts executable

Before the scripts can be run directly as a Bash script, they need to have their executable flags set. In the same console on the main computer, enter the following commands:
exit                                # Exit from root session to fred's console session
cd SyncDirs                         # Change directory to scripts' location
chmod 755 SyncDirs10.0
chmod 755 ExportRpms
chmod 755 CreateLocalFilesUpdated
chmod 755 SyncDirs10.0-Test
ls -l
The last command should show all the files as having permissions of rwxr-xr-x, their executable flags (x) having been set by the chmod command.

Configuring the SyncDirs10.0 script

Initial settings. There are a few settings specific to each user's environment that will need modification before the script is run. These are identified at the start of the Declarations section, and include:
whether to run in Testing mode,
whether to filter out rpms not installed on your machines,
usernames and remote hosts on your local network that you want updates for,
the local directory for saving remote hosts' lists of installed rpms,
the directory holding the CreateLocalFilesUpdated script,
the cpu architecture,
the language used by YOU,
the ftp site's url, and
the local directories to be used.
I have left example settings there to guide you. A good place to get urls for SUSE's updates sites is through YOU. Start YOU (right-click SUSE Watcher in the system tray->Start YOU) and select an Installation source via the drop-down menu. The url is then shown under Location. Note that this url will end in suse. You will need to add /i386/update/10.0 to the end of the url in the script. The supplied FTP_STEM value in the script serves as a guide. (Note that you will need to supply the root password to launch SUSE Watcher). Any changes to the scripts can be made by editing them in your favourite text editor. (I use KDE, and can recommend Kate as a very good text editor.)

Remote backups. The script is based on my standard procedure of backing up the downloaded updates on another computer on the local network. As well as serving as a backup for the local host, it can also serve as a primary source for updates on the computer on which they are stored. In following this practice, the script makes provision for a Local Updates directory (declared as UPDATES_STEM) and a Remote Backups directory (declared as BACKUPS_STEM) in which the sub-directories of the SUSE updates site are stored. For this to work, the $BACKUPS_STEM directory should be a shared directory, mounted locally on the main machine running the SyncDirs10.0 script. For example, if the selected backup computer was remhost1 and its backup directory was /backups/suse, then a local mount point on the main machine could be made as, say, /mnt/remhost1_suse. Then, a suitable entry for the script would be:
BACKUPS_STEM=/mnt/remhost1_suse/i386/update/10.0

The locations in these instructions, and the default settings in the script, are merely suggestions. You are free to decide any names you wish.

SyncDirs10.0 outputs its activity to the console, and generates a log file of the form syncdirs10.0_yymmddHHMM.log for reference, should you ever need to check on what happened during the running of the script.
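A log file name of that form can be produced with the date command, as in the sketch below. The log() helper and the scratch log directory are assumptions for illustration only; the script's own logging code may well be organised differently.

```shell
# Demo setup (assumed location): keep the log in a scratch directory
LogDir=$(mktemp -d)

# yymmddHHMM: two digits each for year, month, day, hour and minute
LogFile="$LogDir/syncdirs10.0_$(date +%y%m%d%H%M).log"

# Hypothetical helper: print a message to the console and append it to the log
log() {
    echo "$@" | tee -a "$LogFile"
}

log "Started: $(date)"
log "Directories now synchronised, exiting..."
echo "Log written to $LogFile"
```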

ExportRpms

This script is installed in the /usr/bin directory of each remote host, with permissions for a nominated local user. You will need a local mount point on each of the remote hosts for fred's shared directory called, say, /mnt/fred_common, to which the ExportRpms script can save its <hostname>.installed.rpms file. To get our copy of ExportRpms to the remote hosts, first copy it from /home/fred/SyncDirs to /home/fred/Common, then copy it from that directory to the remote hosts. Enter this command on the local host main - as user fred:
cp /home/fred/SyncDirs/ExportRpms /home/fred/Common
On each remote machine, enter these commands:
remhost1 - as user mary:
cp /mnt/fred_common/ExportRpms /usr/bin
remhost2 - as user dave:
cp /mnt/fred_common/ExportRpms /usr/bin
This script will already have the necessary permissions to run, as it will adopt the permissions of the user doing the copying. The local host main will issue a command via ssh (secure shell) across the local network for the nominated user (not root) on the remote host to execute /usr/bin/ExportRpms.
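The ssh call from main to the remote hosts can be pictured as below. This is a sketch under assumptions: the run_export_rpms function and the DRY_RUN switch are invented for the example (they are not part of SyncDirs10.0), and the user@host pairs are the example accounts used in this tutorial. With DRY_RUN set, the commands are only printed, so the sketch can be tried without a network.

```shell
DRY_RUN=yes                   # set to blank to really contact the hosts

# Hypothetical helper: ask each remote host to run ExportRpms as a normal user
run_export_rpms() {
    for Target in mary@remhost1 dave@remhost2; do
        if [ -n "$DRY_RUN" ]; then
            echo "ssh $Target /usr/bin/ExportRpms"
        else
            ssh "$Target" /usr/bin/ExportRpms
        fi
    done
}

run_export_rpms
```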

Before use, each remote host will need to have the following lines in ExportRpms tailored to suit their own specific environment:
remhost1:
LOCAL_HOST=remhost1
REMOTE_DIR=/mnt/fred_common
remhost2:
LOCAL_HOST=remhost2
REMOTE_DIR=/mnt/fred_common
Using ssh. Because the local host main uses ssh to issue the command to the remote hosts, ssh will need to be configured on all machines. While not a tutorial on ssh, these commands will do what's required:

remhost1 - as user mary:
cd                    # Make sure /home/mary is the current directory
mkdir .ssh            # Create the .ssh directory
chmod 0700 .ssh       # Make sure only user mary can access the directory
remhost2 - as user dave:
cd                    # Make sure /home/dave is the current directory
mkdir .ssh            # Create the .ssh directory
chmod 0700 .ssh       # Make sure only user dave can access the directory
main - as user fred:
cd                    # Make sure /home/fred is the current directory
mkdir .ssh            # Create the .ssh directory
chmod 0700 .ssh       # Make sure only user fred can access the directory
ssh-keygen -t rsa     # Press the Enter key when prompted for the passphrase
chmod 0600 .ssh/*     # Make sure only user fred can access the files
ls -l .ssh            # Should see 2 files: id_rsa and id_rsa.pub
scp .ssh/id_rsa.pub mary@remhost1:~/.ssh/authorized_keys2    # scp = secure copy
                      # When prompted, enter mary's password - should get output confirming a successful copy
scp .ssh/id_rsa.pub dave@remhost2:~/.ssh/authorized_keys2
                      # Ditto for dave's password

# These next commands establish connection to the remote hosts:
ssh mary@remhost1MMMM# Confirm successful login as mary on remhost1
exitMMMMMMMMMMMMMMMMM# Confirm successful logout from remhost1
ssh dave@remhost2MMMM# Confirm successful login as dave on remhost2
exitMMMMMMMMMMMMMMMMM# Confirm successful logout from remhost2
The two scp commands copy fred's ssh public key to mary's and dave's machines to be used in authenticating any ssh connection that fred initiates. Do not confuse the public key (id_rsa.pub) with the private key (id_rsa) that must remain in fred's .ssh directory - and remain private (with the exception of the next step).

main - as user root:
su -                                # Note the dash following su. Enter root's password when prompted
cd                                  # Make sure in root's home directory (/root)
mkdir .ssh                          # Create .ssh directory in root's home directory
chmod 0700 .ssh                     # Make sure only root user can access the directory
scp /home/fred/.ssh/id_rsa .ssh     # Copy fred's private key to root's .ssh directory
chown fred:users .ssh/id_rsa        # Make sure fred remains the owner
ssh mary@remhost1                   # Confirm successful login as mary on remhost1
exit                                # Confirm successful logout from remhost1
ssh dave@remhost2                   # Confirm successful login as dave on remhost2
exit                                # Confirm successful logout from remhost2
For these commands to work, both remhost1 and remhost2 must appear in the /etc/hosts file on main. Either enter them directly with a text editor, or through YAST Control Center-> Network Services-> Hostnames.

Copying fred's private key to root's .ssh directory is unconventional, but practical. When using ssh on my network, I don't let root make any connections under root authority. That's because I don't want any future vulnerability to expose root authority for performing actions on the remote hosts. If any security issue should arise, the damage would be limited to what a normal user can do on the system. This means that if I want to act as root on one of the remote machines, I first ssh as a normal user, then su to root when logged in to the remote host. My main machine doesn't have root ssh keys like id_rsa or id_rsa.pub. None of the remote hosts even have a root .ssh directory. This effectively prevents root from making ssh connections with root authority. The arrangement I've described above does, however, allow a script being run by root to issue commands as though they were coming from ordinary users. The remote hosts don't see a request from root; all they see is an authentication from fred, check their authorized_keys2 file, then allow the connection. That's adequate for the purposes of the SyncDirs10.0 script, as the task to be performed - an rpm query - can be done quite adequately by a normal user. This arrangement does mean, however, that root cannot ssh to a remote host under root authority without generating proper root ssh keys. That's fine for my modus operandi, as I purposely avoid root having the ability to do that; but if it's not acceptable to you, then you will need to arrange your own ssh configurations.

CreateLocalFilesUpdated

The CreateLocalFilesUpdated script is only used for Testing, and is designed to generate files that simulate what the local updates directories' contents would be after the updates are applied, and obsolete files are deleted. It is called by SyncDirs10.0 in Testing mode, so does not need to be run separately by the user. The only user-configurable declaration is that involving WK_DIR (see Testing section below). This script can reside in the user's home directory - say, /home/fred/SyncDirs.

SyncDirs10.0-Test

The SyncDirs10.0-Test script is only used for Testing, and is run separately after SyncDirs10.0 is run in Testing mode. It is designed to analyse the various temporary files generated to see if any errors occurred. It performs a lot of error-checking, and provides detailed results as it is run, plus a summary at the end. A typo involving one of the directories, for example, should produce lots of errors. It generates a log file of the form test_files_update_yymmddHHMM.log for reference, should troubleshooting be required. This script can reside in the user's home directory - say, /home/fred/SyncDirs - and be run from there.

An important point is that all scripts that declare WK_DIR must have the same setting throughout. So the CreateLocalFilesUpdated and SyncDirs10.0-Test scripts need to have the same setting inserted before running in Testing mode. Similarly, the ARCH definition in SyncDirs10.0-Test needs to agree with the same definition in SyncDirs10.0 (default is ARCH=i586).
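One quick way to check that the WK_DIR settings agree is to pull the definitions out of each script and compare them, as sketched below. The scratch directory and the stand-in script files are invented for the demonstration; on a real system you would point grep at the scripts in /home/fred/SyncDirs. If a script holds more than one uncommented WK_DIR line (one for Testing and one for Live, say), the pattern would need narrowing.

```shell
# Demo setup (assumed content): stand-in files holding only a WK_DIR line
ScriptsDir=$(mktemp -d)
for Script in SyncDirs10.0 CreateLocalFilesUpdated SyncDirs10.0-Test; do
    printf 'WK_DIR=/tmp/SyncDirs\n' > "$ScriptsDir/$Script"
done

# Collect every uncommented WK_DIR definition and collapse duplicates
Settings=$(grep -h '^WK_DIR=' "$ScriptsDir"/* | sort -u)
Count=$(printf '%s\n' "$Settings" | wc -l)

if [ "$Count" -eq 1 ]; then
    echo "WK_DIR is consistent: $Settings"
else
    echo "WK_DIR differs between scripts:"
    echo "$Settings"
fi
```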

Testing

While getting lists of installed rpms from remote hosts may be able to be conducted under the auspices of normal users, the business of the main script - SyncDirs10.0 - is essentially one for root. The target files will be owned by root, and they will be used by root-executed processes (YOU). A necessary precaution when developing scripts is to not run them as root until all the functions performed are tested as doing what is intended. Otherwise, the potential for total system corruption is very high if, for example, the script goes awry and deletes critical system files. There could be many reasons that this could happen, including a quite simple typo which changes the target for deleting files, say. While I have exhausted most (hopefully, all) avenues for error in testing of these scripts on my system, the same can not be said for ones copied into your system and modified to suit your setup. I therefore strongly recommend that the Testing mode built in to the script be used to full advantage before releasing the script Live under the full authority of the root user. This testing should be conducted in two stages.

Stage 1. Firstly, run SyncDirs10.0 as user fred (example only - use your username in the actual case) making sure that the TESTING variable is set to yes. This will ensure that the script will use directories writable by fred (for example, WK_DIR=/tmp/SyncDirs) so that the script can be fully tested without running into problems caused by permissions. This also ensures that no downloading of updates, or copying or deleting of files occurs. The expected actions will only be printed to the screen, and to the log file. Also, create temporary local updates and remote backups directories for the UPDATES_STEM and BACKUPS_STEM declarations, as per the susetest and susetest2 examples in the User Customisation section of the script. It would be a good idea to seed these directories with a few files from the SUSE updates server to make sure that these particular files are not being identified for downloading or backing up. Both of these 'testing' directories need to be readable and writable by user fred, so that the script doesn't run into permissions problems when dealing with files named directory.3 and INDEX, which the script moves to a different directory to force new versions to be downloaded. Before running the script for the first time, run it as:
sh -n /home/fred/SyncDirs/SyncDirs10.0
sh is the command interpreter, or shell. The -n (noexec) switch causes sh to read the commands in the script, but not execute them. This allows for syntax error checking; it will pick up obvious errors, but not necessarily the subtle ones. The prompt returning on the next line means that no errors were detected. As a precaution, make sure files in your /home directories are backed up in case an error occurs. After fixing any syntax errors, run the script proper by entering the following command:
/home/fred/SyncDirs/SyncDirs10.0
The first time this script is run, you will need to answer yes to downloading a fresh set of index.html files. Thereafter - in Testing only - you can save time by answering no to use the previously obtained set. Once the script runs successfully, and the output suggests all is working correctly, run the SyncDirs10.0-Test script:
/home/fred/SyncDirs/SyncDirs10.0-Test
If you get any errors from this script, you will need to examine its log file to find the probable cause. This may have to be done in conjunction with examining the log file from the previous run of SyncDirs10.0, the output of which is more verbose in Testing mode. If you need to track errors, you can use the many echo commands throughout the script (which are presently commented out) to shed some light on what might be going wrong. To make it an active command, remove the # character from the start of the line. To disable it again, insert the # character back in to the start of the line. If the test passed all test conditions, move to Stage 2 of Testing.
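To see what the sh -n check from the start of Stage 1 does and does not report, it can be tried on two throwaway scripts, one sound and one with a missing fi. The file contents below are invented purely for the demonstration.

```shell
# Demo setup: one syntactically valid script and one with an unclosed if
Good=$(mktemp)
Bad=$(mktemp)
printf 'if true; then\n    echo ok\nfi\n' > "$Good"
printf 'if true; then\n    echo missing fi\n' > "$Bad"

# sh -n parses the file without running it; a non-zero status means a syntax error
if sh -n "$Good" 2>/dev/null; then GoodResult=ok; else GoodResult=error; fi
if sh -n "$Bad"  2>/dev/null; then BadResult=ok;  else BadResult=error;  fi

echo "Good script: $GoodResult"
echo "Bad script:  $BadResult"
```

Neither script is executed, so a command that is spelled correctly but does the wrong thing will still pass this check. That is why the Testing mode runs are still needed.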

Stage 2. The aim in Stage 2 testing is to verify that correct operation is maintained under root execution. As correct operation has already been established in Stage 1, this is effectively a confidence check that no further errors have been introduced by switching to the root user, before removing the protections afforded by Testing mode. To run the script as root (still in Testing mode), enter the following commands at the console:
su -                                # Again, note the dash character, and respond with the root password
/home/fred/SyncDirs/SyncDirs10.0
Check the log file to see that directory.3 is included in patches and patches.obs directories, and INDEX is included in deltas and i586 directories only when there are other files to download from these directories. If no problems are found here, there is only one more change to make: change the line in the User Customisation part of the Declarations section to say:
TESTING=
That is, remove the yes from the declaration, leaving it blank after the equals sign.
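The effect of leaving TESTING blank can be pictured with the sketch below. How SyncDirs10.0 actually tests the variable is not quoted in this tutorial, so the select_dirs function and its comparison are assumptions; the directory values are the example settings listed earlier.

```shell
# Hypothetical helper: choose the directory settings for the given mode
select_dirs() {
    if [ "$1" = "yes" ]; then
        # Testing mode: directories writable by a normal user
        UPDATES_STEM=/backups/susetest/i386/update/10.0
        BACKUPS_STEM=/backups/susetest2/i386/update/10.0
        WK_DIR=/tmp/SyncDirs
    else
        # Live mode: the real, root-owned directories
        UPDATES_STEM=/backups/suse/i386/update/10.0
        BACKUPS_STEM=/mnt/remhost1_suse/i386/update/10.0
        WK_DIR=/var/log/SyncDirs
    fi
}

TESTING=                      # blank after the equals sign = Live mode
select_dirs "$TESTING"
echo "Working directory: $WK_DIR"
```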

This marks that testing has been completed. To remove the 'special precaution' for ensuring deliberate entry into Live mode, disable these six lines from the first block in the Preparations section:
else # Special precaution to make sure use of Live mode is actually deliberate (for development/testing only)
    echo "In Live mode. Are you sure that you want to continue? (yes/no) "
    read cont
    if [ ! "$cont" == "yes" ]; then
        exit    # User wants to stop
    fi
by either deleting the lines or commenting them out. After making this change, save the file and place a copy of the script in /usr/sbin. This will be run as root, so enter these commands at the console to provide the necessary permissions:
# should still be logged in as root at the console
cp /home/fred/SyncDirs/SyncDirs10.0 /usr/sbin     # Change /home/fred/... to /home/<your username>/... as appropriate
chown root:root /usr/sbin/SyncDirs10.0            # Change ownership of the updates script to root
Live mode

To run the script in Live mode, enter the command:
# should still be logged in as root at the console
SyncDirs10.0
The /usr/sbin/ part is not necessary, since it is in root's $PATH environment variable. The script should operate as previously, except that the real directories will be used for UPDATES_STEM, BACKUPS_STEM and WK_DIR, not the 'testing' ones. (As a special precaution, double-check the accuracy of these definitions in the User Customisation section, as these particular ones have not yet been tested.) This time, the script will go straight to the ftp site and begin downloading the various directories' contents and start calculating what files need to be downloaded. As the script runs, you will see curl downloading the required files, and lots of messages flashing past about copying and deleting files. After it has run, you should see the updates sitting in your nominated local updates and remote backups directories.

Installing the updates

Local host main. SUSE 10.0 makes it really easy to install these updates with YOU. Start YOU from the system tray, and select User-Defined Location in the Installation source drop-down menu. Type in the location of the updates in the Location box, or select New Server->Directory->OK->Browse and navigate to the applicable directory. For the local host main, enter the value of $UPDATES_STEM up to the suse part. In the example here, it would be /backups/suse. The format required for manual entry in YOU's Location box is:
dir:///backups/suse                 # Note: three 'slashes' after the colon
Then click on Next. The procedure for applying the updates is then as per connecting directly to the SUSE updates site, except that accessing the files is much faster because they are being read from a local hard disk instead of the Internet.
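If you would rather not work out the Location string by hand, it can be derived from $UPDATES_STEM with a little parameter expansion. This is just a convenience sketch; the YouLocation variable is a name invented here and is not used by any of the scripts.

```shell
UPDATES_STEM=/backups/suse/i386/update/10.0

# Strip the trailing /i386/update/10.0 to get the path "up to the suse part"
SuseDir=${UPDATES_STEM%/i386/update/10.0}

# dir:// plus an absolute path gives the three slashes YOU expects
YouLocation="dir://$SuseDir"
echo "$YouLocation"
```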

Remote host remhost1. Installing the updates on remhost1 uses much the same procedure, except that you have the choice of using its own /backups/suse directory, or its /mnt/main_suse directory as the YOU User-Defined Location.

Remote host remhost2. Ditto for remhost2, except that you only use its /mnt/main_suse directory.

After installing the updates

After the updates are installed, YOU drops off the desktop unceremoniously, leaving it at its last-used configuration. This means that it is looking at your local updates repository, so it will never know about new updates appearing on SUSE's updates servers. YOU needs to be reconfigured to point to a live update server. To do this, re-start YOU and select one of SUSE's updates servers from the Installation source drop-down menu, and then select Next. YOU should then scan the available updates and show a window with no available updates, because they have already been installed in the previous step. Without making any further selections, click on Accept, then Close, ignoring the message about there being no patches selected for installation. Your machine will now be on the lookout for new updates from the selected SUSE updates server.

Finished

Congratulations! You now have automatic downloading of SUSE's updates, with selective filtering to match your local machines' installed rpms. The updates will have been saved locally, backed up, and easily installed with SUSE's standard tools.

Next updates

The next time you see the SUSE Watcher turn red or yellow on your machine, don't activate YOU, but instead run SyncDirs10.0. Then you can start YOU, point it towards your local updates directory, then start the installation process. After that, point it back to look at one of SUSE's updates sites. Do the same with YOU on the remote hosts.

Postscript: What if there's no network?

If your computers are not connected by a network, all is not lost. It's still possible to use SyncDirs10.0, but you will have to move the files around manually. In the first instance, modify the ExportRpms script to insert a # character at the start of the last line that reads:
cp -p $LOCAL_INSTALLED_RPMS $REMOTE_INSTALLED_RPMS
The next step is to go to each of your remote hosts and run ExportRpms there, under normal user authority. Then copy the files produced by that script (/tmp/remhost1.installed.rpms and /tmp/remhost2.installed.rpms respectively, in our example) onto a floppy, and transfer them to the local host main, into its /home/fred/Common directory. From there it's just a matter of running SyncDirs10.0 as normal. You may see an error line to the effect of no route to local host(1) but the script will still discover the newly placed files and use them. Once the updates have been downloaded, you will need to burn them to a CD-ROM, or similar, so as to transfer them to the remote hosts. On each remote host, copy them into a suitable directory (say, /backups/suse) and point YOU at it when installing the updates. Note that you will need to include the whole directory path /backups/suse/i386/update/10.0/ for the destination of these copied directories.
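The floppy-to-Common copy described above could be wrapped in a small function. This is a sketch under assumptions, not part of the tutorial's scripts: the name import_rpm_lists is invented for illustration, and it relies only on the hostname.installed.rpms file naming shown in the example.

```shell
#!/bin/bash
# Hypothetical helper: copy every *.installed.rpms list from the
# removable medium into the directory that SyncDirs10.0 reads.
# Prints the number of lists copied.

import_rpm_lists() {
    local src=$1 dest=$2 f count=0

    if [ ! -d "$src" ] || [ ! -d "$dest" ]; then
        echo "usage: import_rpm_lists SOURCE_DIR DEST_DIR" >&2
        return 1
    fi

    for f in "$src"/*.installed.rpms; do
        [ -f "$f" ] || continue        # the glob matched nothing
        cp -p "$f" "$dest"/ || return 1   # -p keeps the original timestamps
        count=$((count + 1))
    done

    echo "$count"
}
```

On main, with the floppy mounted at (say) /media/floppy, which is an assumed mount point, you would run import_rpm_lists /media/floppy /home/fred/Common and then start SyncDirs10.0 as normal.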

(1) If the error message bothers you, then place a # character at the start of the line in SyncDirs10.0 that reads:
ssh $USER@$REMOTE $GET_RPMS
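
Both of the edits mentioned in this postscript amount to putting a # in front of one known line, which is easy to do in a text editor. If you prefer to script it, something like the following might do; it is a sketch only, the function name comment_out_line is made up, and it assumes the line appears in the script exactly as quoted (leading indentation is ignored).

```shell
#!/bin/bash
# Hypothetical helper: comment out every line of FILE whose text,
# ignoring leading spaces or tabs, is exactly TARGET.
# The file is rewritten in place, so its permissions are kept.

comment_out_line() {
    local file=$1 target=$2 tmp status

    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    tmp=$(mktemp) || return 1

    # TARGET is passed through the environment so awk sees it literally,
    # with no interpretation of $ signs or backslashes.
    TARGET=$target awk '
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            if (line == ENVIRON["TARGET"]) print "#" $0
            else print
        }' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?

    rm -f "$tmp"
    return $status
}
```

For example, comment_out_line SyncDirs10.0 'ssh $USER@$REMOTE $GET_RPMS' silences the error message; note the single quotes, which stop the shell from expanding the variable names. Keeping a backup first (cp -p SyncDirs10.0 SyncDirs10.0.bak) would be a sensible precaution.
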
Conclusion

Apart from getting a script to automatically download our SUSE 10.0 updates, we've covered a lot in this tutorial. We've seen how to construct a Bash script from scratch, including comments, defining variables and functions, manipulating variables, using test constructs and test conditions, loop constructs, many commands, and much more intricate detail than you can expect to absorb at one sitting. Hopefully, it will give you enough to be able to start using Bash scripts to automate your computing experience.

by rkrishna on Mon, 2008-12-08 03:04
great resources to the community, thanks


  


