LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Newbie (http://www.linuxquestions.org/questions/linux-newbie-8/)
-   -   Reading and writing a file with bash (http://www.linuxquestions.org/questions/linux-newbie-8/reading-and-writing-a-file-with-bash-646340/)

XeroXer 06-02-2008 03:06 AM

Reading and writing a file with bash
 
Hi all,
this is my first post so I would first like to say hi to everyone before asking my question...



I am at the beginning of learning bash scripting and haven't really gotten the hang of reading and writing files.

What I am trying to create is a bash script, run every hour from crontab, that checks a text file for links, wgets them all and then marks them as done.

Let's say I have this textfile /home/user/urls :
http://www.example.com/file_1.zip
#http://www.example.com/file_2.zip
http://www.example.com/file_3.zip
#http://www.example.com/file_4.zip
http://www.example.com/file_5.zip

Then my bash script is in /home/user/bash/wget.sh :
exec < /home/user/urls
while IFS= read -r line
do
  echo "$line"
done

That is all I have done, so it just prints each line. I could of course just make it wget $line, but I thought I'd get some more of it working before I do that.

Now what I was thinking is that it opens the file, loops through every line, and runs wget on the lines not starting with #.
After it has downloaded the url it adds a # before that line and continues the loop.

This way, if I wanted a file downloaded I would just put the url in this file, and within an hour (or any other crontab interval) the script would download the file and add a # at the start of the line so I know it's done.
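The loop described above can be sketched like this. This is a minimal sketch, not a finished script: fetch_urls is a made-up helper name, the file path is passed in as an argument, and a line is only marked done if wget actually succeeded.

```shell
# Sketch of the loop described above, written as a function so the
# file path can be passed in (a real cron script would hard-code it).
fetch_urls() {
  urlfile=$1
  while IFS= read -r line; do
    case $line in
      \#*|'') continue ;;            # skip already-done or empty lines
    esac
    # download, and only mark the line as done if wget succeeded;
    # '|' is used as the sed delimiter because URLs contain '/'
    wget -nv "$line" &&
      sed -i "s|^$line\$|#$line|" "$urlfile"
  done < "$urlfile"
}
```

Note that the redirection keeps reading from the original file contents even while sed -i replaces the file on disk, so the loop is not confused by its own edits.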


Any help would be great :)
Thanks...

Alien Bob 06-02-2008 03:46 AM

Something along these lines, perhaps:
Code:

#!/bin/sh
INFILE="/home/user/urls"
for line in $(grep -v '^#' "$INFILE"); do
  # download the file:
  wget -nv "$line"
  # mark the line as done by putting '#' in front:
  sed -i -e "s?^$line?#$line?" "$INFILE"
done

Eric

XeroXer 06-02-2008 04:22 AM

Quote:

Originally Posted by Alien Bob (Post 3171769)
Something along these lines, perhaps:
Code:

#!/bin/sh
INFILE="/home/user/urls"
for line in $(grep -v '^#' "$INFILE"); do
  # download the file:
  wget -nv "$line"
  # mark the line as done by putting '#' in front:
  sed -i -e "s?^$line?#$line?" "$INFILE"
done

Eric

Not that I understood all of that, but thank you Alien Bob (Eric).
It works just like I wanted it to and I can now add it to my crontab.

Alien Bob 06-02-2008 05:06 AM

Next on your list should be to understand this bit of shell script completely :-)

Eric

unSpawn 06-02-2008 05:37 AM

...which could start here:
http://www.tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html
http://www.tldp.org/LDP/Bash-Beginne...tml/index.html
http://www.tldp.org/LDP/abs/html/

XeroXer 06-02-2008 05:57 AM

Got another problem with this, more related to crontab though.

If I add this to crontab like this :
*/30 * * * * /home/user/bash/wget.sh

Then it starts the bash script every 30 minutes but the conflict comes if the first download isn't done by the second start.

I recently had three /home/user/bash/wget.sh and three wget of the same file running at the same time.
I guess this could create a few errors in the file or just a bunch of wgets running on the same file all the time.

Anyone have any idea on how to solve this?

Alien Bob 06-02-2008 06:22 AM

Quote:

Originally Posted by XeroXer (Post 3171876)
Got another problem with this, more related to crontab though.

If I add this to crontab like this :
*/30 * * * * /home/user/bash/wget.sh

Then it starts the bash script every 30 minutes but the conflict comes if the first download isn't done by the second start.

I recently had three /home/user/bash/wget.sh and three wget of the same file running at the same time.
I guess this could create a few errors in the file or just a bunch of wgets running on the same file all the time.

Anyone have any idea on how to solve this?

The solution would be to use a lock file. The script creates a lockfile when it starts and deletes it at the end of the script. But if it finds an existing lockfile, it knows that another instance of the script is still running, and it aborts.

Code:

#!/bin/sh

# Files we use:
INFILE=/home/user/urls
PIDFILE=/var/tmp/$(basename "$0").pid

# Check for existing lockfile:
if [ -e "$PIDFILE" ]; then
  echo "Another instance ($(cat "$PIDFILE")) still running?"
  echo "If you are sure that no other instance is running, delete the lockfile"
  echo "'${PIDFILE}' and re-start this script."
  echo "Aborting now..."
  exit 1
else
  # Create our new lockfile:
  echo $$ > "$PIDFILE"
fi

for line in $(grep -v '^#' "$INFILE"); do
  # download the file:
  wget -nv "$line"
  # mark the line as done by putting '#' in front:
  sed -i -e "s?^$line?#$line?" "$INFILE"
done

# Remove our lockfile:
rm -f "$PIDFILE"
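For completeness: on systems that ship util-linux, flock(1) can take the place of the hand-rolled PID file. This is a sketch under that assumption (the lockfile name /var/tmp/wgetjob.lock is made up), not part of the script above; flock has the nice property that the lock is released automatically when the script dies, so a crash cannot leave a stale lockfile behind.

```shell
#!/bin/sh
# Take an exclusive, non-blocking lock on a lock file via fd 9.
# If another instance already holds the lock, flock -n fails and we abort.
LOCKFILE=/var/tmp/wgetjob.lock
exec 9> "$LOCKFILE"
if ! flock -n 9; then
  echo "Another instance is still running, aborting." >&2
  exit 1
fi
# ... the wget/sed loop from the script above would go here ...
# The lock is released automatically when the script exits.
```

The same guard can even be applied from crontab itself, without touching the script, e.g. */30 * * * * flock -n /var/tmp/wgetjob.lock /home/user/bash/wget.sh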


unSpawn 06-02-2008 08:08 AM

AFAIK you don't even need a lock; there are enough envvars to go on, I think: cronjobs run at a certain $SHLVL and share the same $PPID (the $PID of the cron daemon), and if you run 'pgrep -f 'cron.*/path/tojobname'' the number of $PIDs should be exactly one, that $PID should equal $$, and fuser should show only the cronjob's wget process $PID accessing the wget file.
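A minimal sketch of that pgrep-based check (other_instance_running is a made-up helper name; it assumes a procps pgrep that supports -c for counting matches):

```shell
#!/bin/sh
# Returns 0 (true) if more than one process matches the given command-line
# pattern, i.e. if some instance other than the caller is already running.
# The caller itself is expected to be one of the matches, hence "> 1".
other_instance_running() {
  count=$(pgrep -cf "$1" || true)   # pgrep -c prints the match count
  [ "${count:-0}" -gt 1 ]
}

# In the cron job itself one would do something like:
# other_instance_running "$(basename "$0")" && exit 1
```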

