LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   CentOS (https://www.linuxquestions.org/questions/centos-111/)
-   -   Help with sporadic Yum Update issue from local repository (https://www.linuxquestions.org/questions/centos-111/help-with-sporadic-yum-update-issue-from-local-repository-4175589899/)

jjscott 09-21-2016 04:27 PM

Help with sporadic Yum Update issue from local repository
 
I’m having sporadic issues with satellite workstations downloading CentOS patches from a local repository that I set up. The satellite workstations are currently running CentOS v6.7, and I have v6.8 available on the local repository server.

The workstations have a 1.5 MB/sec downstream DSL connection and connect to the repository server, which sits on a 1 GB/sec central pipe. The total package set they will download contains 425 files and is about 650 MB.

I set up an on-demand script that runs on the satellite workstations during the night to call “yum -y update” with my repo file pointing to the repository server. The repository server is running Apache to serve up the updates to the workstations.

I tested this in our lab and it worked fine. I then configured one production workstation to run it in the middle of the night and it worked fine. Ran the same process on one more production workstation and it was fine as well. The entire process consistently took about 1.5 hours to complete.

Next, I ran the process on 22 workstations during the same night at approximately the same time. I found that 18 workstations patched with no problem, but 4 did not. The 4 problem workstations downloaded all but one rpm, which I’m guessing caused yum to fail its validation test before attempting to apply them. I determined the missing rpm file for each of the workstations by examining the Apache access_log file on the local repository server. One thing to note: the rpm file that failed to download was different on each of the 4 workstations.

I checked the /var/log/yum.log file and it didn’t contain any information about the patching process or any errors. The only information in the log file was for the last time we manually patched the system several months ago.

The next day, I manually re-ran “yum -y update” on all 4 workstations and it worked. It pulled down the missing rpm and applied all of the updates.

I checked performance statistics on the repository server by reviewing the sar log file and didn’t see any issues with CPU, memory, disk i/o or Ethernet.

I have about 500 workstations that will need to be patched on a quarterly basis so this is only going to get worse.

I’m guessing there was a brief DSL interruption on the 4 workstations during the download of the missing rpm. I’m unclear as to why it would have only skipped that one and continued to download the remaining rpms.

I reviewed the Yum documentation and found two settings in yum.conf I am curious about: “retries” and “timeout”.

retries - Set the number of times any attempt to retrieve a file should retry before returning an error. Setting this to '0' makes yum try forever. Default is '10'.

timeout - Number of seconds to wait for a connection before timing out. Defaults to 30 seconds. This may be too short of a time for extremely overloaded sites.
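Both settings go in the [main] section of /etc/yum.conf on each workstation. A fragment showing the values tried later in this thread (illustrative choices, not defaults):

```
[main]
# retry each file up to 30 times (default 10) before returning an error
retries=30
# wait up to 999 seconds (default 30) for a connection
timeout=999
```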


Questions:
1. If there was a DSL interruption, I’m wondering if I need to increase the “retries” value from 10 to, say, 30. As I stated earlier, I’m not sure why it would fail on only one file and then pick up fine on the remaining ones.

2. I read another post stating that setting the “timeout” value to 999 resolved issues with slow internet connections. This may be a stupid question, but does yum establish a new connection for each rpm it downloads during the “yum update” process? I can’t believe this would be true, but I have to ask. Do you think changing this value to 999 would help with the issue reported?

3. What are your thoughts on running “yum check-update” after the “yum -y update” in the script and then trapping on its return code? If it’s 100, then I know not all updates were received, and I could run “yum -y update” again.

4. Any other ideas?
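For reference, the exit-status convention question 3 relies on, per the yum man page: “yum check-update” exits 0 when no updates are pending, 100 when updates are available, and 1 on error. A minimal trap on that code might look like this (a sketch; the messages and function name are made up, and the check command is a parameter so the logic is testable without yum installed):

```shell
#!/bin/sh
# Map a check command's exit status (yum check-update's convention) to a
# human-readable result: 0 = fully patched, 100 = updates still pending,
# anything else = error.
check_patch_status() {
    "$@"                      # run the check command passed as arguments
    case $? in
        0)   echo "fully patched" ;;
        100) echo "updates pending" ;;
        *)   echo "check failed" ;;
    esac
}
# Production use would be: check_patch_status yum check-update
```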

Thanks

herkalurk 09-23-2016 06:43 PM

First question, I'm not sure what a DSL outage has to do with a local transfer from the repository?

Second, why are you scripting yum? By 'on demand script', do you just mean a cron job that runs yum -y update?

Third, if you're going to have 500+ workstations hit this server, you may want to stagger them throughout the night. When you set up the cron job, pick a random minute within the hour for each system to start its update. That should distribute the load on your local repo.
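One way to sketch that random stagger in the nightly script (POSIX sh, reading /dev/urandom via od since $RANDOM is a bash/ksh feature; the sleep and yum lines are commented out so this sketch doesn't actually block for up to an hour):

```shell
#!/bin/sh
# Pick a per-run random delay so workstations start their nightly update
# at staggered times instead of all hitting the repo server at once.
random_delay() {
    max=$1
    # two bytes from /dev/urandom give 0..65535; reduce modulo max
    echo $(( $(od -An -N2 -tu2 /dev/urandom) % max ))
}

# In the nightly script, before calling yum:
# sleep "$(random_delay 3600)"
# yum -y update
```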

jjscott 09-24-2016 07:39 AM

1. When I say "local" I am just trying to make a distinction between a local repository on our network as opposed to a third party hosted CentOS mirror. We have locations across the country that use a DSL connection to connect to our corporate network to download the patches.

2. I am scripting yum so I can control the process of what happens before & after the yum update process. I send down an update during a nightly polling process that triggers yum to run. It could be done via cron, but I like the process we have designed.

3. Yes, once we resolve the kinks in the existing process, I will be staggering the updates over several days. I would not contemplate patching 500+ locations in one night.

herkalurk 09-24-2016 01:52 PM

Quote:

Originally Posted by jjscott (Post 5609506)
1. When I say "local" I am just trying to make a distinction between a local repository on our network as opposed to a third party hosted CentOS mirror. We have locations across the country that use a DSL connection to connect to our corporate network to download the patches.

Ok, so your company is hosting a server somewhere that gets all of the packages you want, then your remote offices connect in on their ISP connection, whatever that is. That makes more sense with a DSL outage being a problem. What is the average size of your offices? How many workstations? Would it be feasible to have a workstation in each remote office download the packages during the day, then distribute to the rest of the office? It would reduce DSL usage and you could rate limit the download so it could be done during the day.

jjscott 09-24-2016 05:52 PM

They are not offices. They are retail store locations and it would not make sense to do what you are suggesting.

Thanks for your help

chrism01 09-30-2016 12:08 AM

Quote:

We have locations across the country that use a DSL connection to connect to our corporate network to download the patches.
In that case you have a lot of bits of network (& kit) that are not under your control. It could be almost anything - networks like that have the odd glitch all the time, and the fact that it was random rpms that failed indicates it's likely not your stuff that's at fault.

I'd just use an 'after' check to see if it's all come down.
If not, retry for just the missing stuff.

Oh, and start staggering now anyway.

jjscott 10-01-2016 07:18 AM

I was leaning toward an issue with the standard DSL service we use at our satellite locations for connecting to our network to pull down the patches. With that thought in mind, I set retries to 30 and timeout to 999 in the yum.conf file on all the satellite systems. I also added a retry loop to the script that initiates the "yum -y update". The loop runs "yum check-update" after "yum -y update" to determine whether it was successful; if not, I rerun the "yum -y update" command. I did this for 32 locations earlier this week and it worked perfectly. All locations patched without issue the first time around. The script never had to rerun "yum -y update" in the retry loop I added, so I'm guessing either the retries or the timeout change addressed the issue.
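For anyone finding this thread later, the retry loop described above can be sketched roughly like this (a sketch, not the poster's actual script: the pass limit and messages are made up, and the update/check commands are passed as parameters so the loop logic is visible and testable without yum):

```shell
#!/bin/sh
# Retry wrapper: run an update command, then a check command that follows
# yum check-update's convention (exit 0 = nothing pending, 100 = updates
# still available, anything else = error). Repeat until clean or the pass
# limit is reached.
# Production use: retry_update "yum -y update" "yum check-update" 3
retry_update() {
    update_cmd=$1
    check_cmd=$2
    max_passes=$3
    pass=1
    while [ "$pass" -le "$max_passes" ]; do
        $update_cmd
        $check_cmd
        rc=$?
        if [ "$rc" -eq 0 ]; then
            echo "done after $pass pass(es)"
            return 0
        elif [ "$rc" -ne 100 ]; then
            echo "check failed with rc=$rc" >&2
            return "$rc"
        fi
        pass=$((pass + 1))
    done
    echo "updates still pending after $max_passes passes" >&2
    return 100
}
```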
