LinuxQuestions.org


wonderfullyrich 04-23-2011 02:38 AM

In a bash script, how do I move a line to a new file.
 
I've got a bash script that downloads a text-file list of links via axel. What I'd like to do is automate moving completed links out of the list inside the for loop, once axel has successfully finished the download.

This is what I've got. I can see that I can just append the line to a new file with echo, but what is the easiest way to delete the line containing the link I just downloaded?


Code:


#!/bin/bash

for i in $(cat "$1"); do
        axel --alternate --num-connections=6 "$i"
        export RC=$?
        if [ "$RC" = "0" ]; then
                echo
                echo "Succeeded downloading following link $i"
                echo "$i" >> downloaded-links.txt
                echo
                # remove $i line from $1 file...?
        else
                echo
                echo "Failed downloading following link $i"
                echo "$i" >> failed-links.txt
                echo
        fi
done


druuna 04-23-2011 02:49 AM

Hi,

Assuming that all lines are unique:
Code:

#remove $i line from $1 file...?
sed -i "/$i/d" $1

Hope this helps.
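One caveat worth noting: the links here are URLs full of / characters, which collide with sed's default delimiter and with regex metacharacters in general. A safer sketch (using a hypothetical links.txt, and grep instead of a sed regex) removes the exact literal line:

```shell
# Hypothetical link list for illustration.
printf '%s\n' 'http://example.com/a.iso' 'http://example.com/b.iso' > links.txt

i='http://example.com/a.iso'

# -x matches whole lines, -F treats $i literally, -v keeps everything else.
grep -vxF "$i" links.txt > links.tmp && mv links.tmp links.txt
```

Unlike "/$i/d", this cannot be tripped up by slashes or dots in the URL, and it only removes exact whole-line matches.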

grail 04-23-2011 03:09 AM

You do know that your export line is not required? In fact, you could just use the return value on its own:
Code:

if (( $? ))
The only difference is that you need to invert the logic, as 0 now tests false, so swap the then and else sections around.

ntubski 04-23-2011 08:05 AM

You don't even need to look at the return value at all; if already does that:
Code:

if axel --alternate --num-connections=6 $i ; then
  # success
else
  # failure
fi
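The point can be illustrated with a stand-in function in place of axel (the name download here is made up for the demo):

```shell
# download is a stand-in for axel; its argument becomes its exit status.
download() { return "$1"; }

# "if" branches directly on the command's exit status; no $? needed.
if download 0; then result_ok="success branch"; fi
if download 1; then :; else result_fail="failure branch"; fi

echo "$result_ok"
echo "$result_fail"
```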


konsolebox 04-23-2011 09:58 AM

Just an example; I would do things this way.

Code:

#!/bin/bash

remove=()
j=0

while read i; do
    (( j++ ))

    if axel --alternate --num-connections=6 "$i"; then
        echo
        echo "Succeeded downloading following link $i"
        echo "$i" >> downloaded-links.txt
        echo
        remove=("${remove[@]}" "-e" "${j}d")
    else
        echo
        echo "Failed downloading following link $i"
        echo "$i" >> failed-links.txt
        echo
    fi
done < "$1"

# run sed only if there is something to delete
(( ${#remove[@]} )) && sed -i "${remove[@]}" "$1"
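The batching trick above (collecting -e expressions in an array and handing them all to sed in one invocation) can be seen in isolation; this sketch uses a throwaway list.txt and assumes GNU sed's -i:

```shell
printf '%s\n' one two three four > list.txt   # throwaway sample file

remove=()
remove+=(-e 2d)   # schedule deletion of line 2
remove+=(-e 4d)   # and line 4

# Guard against an empty array: sed with no script would misread its args.
(( ${#remove[@]} )) && sed -i "${remove[@]}" list.txt
```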


wonderfullyrich 04-25-2011 11:27 PM

I just had a chance to change the script and try it out. I now see the duplication of the return value, and indeed the improved simplicity of konsolebox's example script. Thanks guys!

One problem that I didn't anticipate: although the example script does work, it doesn't apply the sed change until after all the links are downloaded, so if I break the script during a download (or during the while loop) and restart it, it starts over at the top of the list.

This is partly because I reorder the links as priorities change. Normally I just break the script and restart it to pick up the new priorities, but is there a way to create a loop that 1. removes the line once a link has finished downloading, 2. actively re-reads the file on every iteration and downloads the top line, and 3. stops at the end of the file?

I started on the idea below, but am not well versed enough in sed, loops, and bash to make it work. Specifically I'm struggling with the sed line, as the links are fully qualified URLs, so I can't use / for a delimiter.

Code:

#!/bin/bash

j=0

while [ -s "$1" ]; do
        if [ $j -le 0 ]; then
                (( j++ ))
                i=$(head -1 "$1")

                if axel --alternate --num-connections=6 "$i"; then
                        echo
                        echo "Succeeded downloading following link $i"
                        echo "$i" >> downloaded-links.txt
                        echo
                        #This isn't working... but it's an idea.
                        remove=("|$i|d")
                        sed -i '$remove' $1
                else
                        echo
                        echo "Failed downloading following link $i"
                        echo "$i" >> failed-links.txt
                        echo
                fi
        else
                (( j=0 ))
        fi
done
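For reference, the shape being aimed at (re-read the file on every pass, take the top line, delete it when done) can be sketched like this, with a throwaway queue.txt standing in for the link list, an echo into a log standing in for the axel call, and GNU sed's -i assumed:

```shell
printf '%s\n' url1 url2 url3 > queue.txt
: > log.txt                         # start with an empty log

while [ -s queue.txt ]; do          # stop once the file is empty
    i=$(head -n 1 queue.txt)        # always take the current top line
    echo "processing $i" >> log.txt # stand-in for the axel call
    sed -i '1d' queue.txt           # pop the line just handled
done
```

Because the file is re-read on every iteration, lines added or reordered between downloads are picked up, and deleting by position (1d) rather than by pattern sidesteps the delimiter problem entirely.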


konsolebox 04-26-2011 11:36 AM

I can't test this but I hope it works for you.

Code:

#!/bin/bash

FILE=$1
TOTALLINES=$(wc -l < "$FILE")   # redirect so wc prints only the count, not the file name
LINENUMBER=1

for (( I = 1; I <= TOTALLINES; I++ )); do
        if read < <(exec sed -n "${LINENUMBER}p" "$FILE"); then
                if [[ -n $REPLY ]]; then
                        if axel --alternate --num-connections=6 "$REPLY"; then
                                echo
                                echo "Succeeded downloading following link $REPLY"
                                echo "$REPLY" >> downloaded-links.txt
                                echo
                                sed -i "${LINENUMBER}d" "$FILE"
                                continue
                        else
                                echo
                                echo "Failed downloading following link $REPLY"
                                echo "$REPLY" >> failed-links.txt
                                echo
                        fi
                fi
        fi

        (( LINENUMBER++ ))
done


grail 04-26-2011 12:07 PM

I am not sure I see how the for loop construct would be better than a simple while loop reading from the file? (Not to say it doesn't work, of course.)

@OP - I am not sure how breaking the script at the point of using axel makes a difference if you are removing the lines from the file after each successful download. Obviously, if you cancel the script at any point while a download is incomplete, the corresponding entry will not have been removed, so that download will start afresh once the script is run again.

Maybe I am missing something but is it not enough to say:
Code:

#!/bin/bash

FILE="$1"

while read -r LINE
do
    if axel --alternate --num-connections=6 "$LINE"; then
        echo
        echo "Succeeded downloading following link $LINE"
        echo "$LINE" >> downloaded-links.txt
        echo
        sed -i '1d' "$FILE"
    else
        echo
        echo "Failed downloading following link $LINE"
        echo "$LINE" >> failed-links.txt
        echo
    fi
done<"$FILE"


wonderfullyrich 04-27-2011 12:08 AM

Thanks all. 'Tis much appreciated.

To answer your question, grail: ideally I'd just loop over the file, but I edit the url file on the fly in the background, so the number of lines in it goes up and down. A loop that reads the file once retains the original order and count without picking up new lines, deleted lines, or order changes, which is why I was trying to make the loop depend on reaching the end of the file. That also makes deleting by line number precarious, since the number may have changed; that's why I was trying to delete the actual entry rather than a line number or a count of lines.

I think this answers your second question too, but I'll add that yes, axel and the script do just leave an incomplete download listed. The reason I move completed links out is that if axel is given a download and finds a duplicate file without state information (i.e. an .st file), it restarts the download from scratch under a .0 extension (and increments from there). So once a file has finished downloading (and there's no longer an .st file), the script would keep re-downloading the first link on the list under new extensions, even though it already completed successfully. It's not quite as smart as wget with no-clobber, but the multi-connection support is more useful in this particular case.

Any ideas? Was my loop just missing the right sed delimiter, or was the whole thing malformed?

Many thanks guys!
Rich

grail 04-27-2011 01:28 AM

Code:

sed -i '$remove' $1
This does not work because the single quotes stop the shell from expanding the variable; use double quotes to get around this.
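Putting the two fixes together (double quotes so the shell expands the variable, and a leading \| so sed uses | as the address delimiter instead of the URL's slashes), a small sketch with a hypothetical links.txt:

```shell
printf '%s\n' 'http://host/a' 'http://host/b' > links.txt

i='http://host/a'

# Double quotes let the shell expand $i; "\|...|" tells sed to use | as
# the address delimiter, so the slashes in the URL are harmless.
sed -i "\|^$i\$|d" links.txt
```

Dots and other regex metacharacters in the URL are still live inside this address; for a strictly literal whole-line match, grep -vxF is the safer tool.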

Now that you have explained further I do understand what you intend, but I would caution you that it is indeed fraught with danger. Even though you only access the file using head to grab an entry, big problems will occur if you are changing the file at the moment head tries to grab the next line. Likewise, if a download is quick, the script may run sed on the file whilst you are still editing it. All of this chills me to the bone, as data will actively be added and deleted at the same time. Sounds like a recipe for disaster.

The only thing that springs to mind would be to create a lock on the file, so that while one party (you or the script) is making changes, the other has to wait until the lock is freed.

I'll be interested to see how this is solved!

konsolebox 04-27-2011 05:27 AM

You can try this one. It requires bash version 4.0 or newer.
Code:

#!/bin/bash

FILE=$1
declare -A SUCCESS=()
declare -A FAILED=()

for (( ;; )); do
        ACTIVE=false

        for (( ;; )); do
                if read; then
                        if [[ -z $REPLY ]]; then
                                echo >&3
                                continue
                        elif [[ -n ${SUCCESS[$REPLY]} ]]; then
                                continue
                        elif [[ -n ${FAILED[$REPLY]} ]]; then
                                echo "$REPLY" >&3
                                continue
                        elif axel --alternate --num-connections=6 "$REPLY"; then
                                echo
                                echo "Succeeded downloading following link $REPLY"
                                echo "$REPLY" >> downloaded-links.txt
                                echo
                                SUCCESS[$REPLY]=.
                        else
                                echo
                                echo "Failed downloading following link $REPLY"
                                echo "$REPLY" >> failed-links.txt
                                echo "$REPLY" >&3
                                echo
                                FAILED[$REPLY]=.
                        fi

                        while read; do
                                [[ -z ${SUCCESS[$REPLY]} ]] && echo "$REPLY" >&3
                        done

                        ACTIVE=true
                fi

                break
        done < "$FILE" 3>temp.txt

        cat temp.txt > "$FILE"

        [[ $ACTIVE = true ]] || break
done


grail 04-27-2011 06:25 AM

@konsolebox - whilst an interesting solution, how does this address the issues of the FILE being edited whilst, for example, you are catting the temp.txt back over the same FILE?

I am not saying this won't work, I am more just curious how this may / will stop this from occurring?

konsolebox 04-28-2011 02:24 AM

Sorry, I had to revise it. This is the best solution I know so far, since I can't consider file locking yet:

Code:

#!/bin/bash

FILE=$1
declare -A SUCCESS=()
declare -A FAILED=()

for (( ;; )); do
        # find a new link

        cat "$FILE" > temp.txt

        HASNEW=false

        while read; do
                [[ -z $REPLY || -n ${SUCCESS[$REPLY]} || -n ${FAILED[$REPLY]} ]] && continue
                HASNEW=true
                break
        done < temp.txt

        [[ $HASNEW = true ]] || break

        # download

        if axel --alternate --num-connections=6 "$REPLY"; then
                echo
                echo "Succeeded downloading following link $REPLY"
                echo "$REPLY" >> downloaded-links.txt
                echo
                SUCCESS[$REPLY]=.
        else
                echo
                echo "Failed downloading following link $REPLY"
                echo "$REPLY" >> failed-links.txt
                echo
                FAILED[$REPLY]=.
        fi

        # refresh file

        cat "$FILE" > temp.txt

        while read; do
                [[ -z ${SUCCESS[$REPLY]} ]] && echo "$REPLY"
        done < temp.txt > "$FILE"
done

@grail: i think of the solution as something that could work with a file monitor like a looping "cat" or an editor that automatically detects file changes
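The associative-array bookkeeping used in these scripts (bash 4.0+) is essentially a "seen" set; stripped to its essentials:

```shell
declare -A SUCCESS=()          # keys are the links themselves

handled=()
for link in url1 url2 url1 url3 url2; do
    [[ -n ${SUCCESS[$link]} ]] && continue   # skip links already recorded
    SUCCESS[$link]=.                         # any non-empty value marks it seen
    handled+=("$link")
done
```

Lookup by key avoids scanning a list on every iteration, which is why the membership tests stay cheap even as the set grows.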

grail 04-28-2011 03:48 AM

Hmmm ... so you have effectively replaced the sed work with a while loop. My thought here is that this would probably take longer to process than sed, and leads further into the issue of not locking the file (maybe).

Ignoring this constant issue, I am curious about this portion of code:
Code:

while read; do
    [[ -z $REPLY || -n ${SUCCESS[$REPLY]} || -n ${FAILED[$REPLY]} ]] && continue
    HASNEW=true
    break
done < temp.txt

I am not sure of the relevance of the null tests?

By virtue of the last while loop in the code, all non-null SUCCESSes will be removed, yes?
And the FAILED items, do we not wish to retry those?

Maybe I am following it wrong ... sorry :(

konsolebox 04-28-2011 04:09 AM

Quote:

Originally Posted by grail (Post 4339039)
hmmm ... so you have effectively replaced sed work with a while loop ... my thought here is that this would probably take longer to process than sed and lead more to
the issue of not locking the file (maybe)

I actually agree a bit about the longer processing; it's the obvious disadvantage, unless we use a faster scripting language that can also drive external programs.

Quote:

Originally Posted by grail (Post 4339039)
I am not sure of the relevance of the null tests?

By virtue of the last while loop in the code, all not null SUCCESSes will be removed, yes?
And the FAILED items, do we not wish to retry these?

Maybe I am following it wrong ... sorry :(

I just want the person editing the file to lose line positions as little as possible while editing. With the last loop, only the links that were successfully downloaded are excluded. They are checked again and again, since we never know whether the person editing the file will manage to overwrite the new version of the file with every save in the editor.

Btw, it's ok :)

ntubski 04-28-2011 01:01 PM

How about using the trap mechanism:
Code:

#!/bin/bash

FILE="$1"

onexit()
{
    grep -vFf downloaded-links.txt "$FILE" > remaining-links.txt
    # grep exits 1 when no lines are left over and 2 on real errors,
    # so accept status 0 or 1 before replacing the file
    [ $? -le 1 ] && mv remaining-links.txt "$FILE"
    exit
}

trap onexit EXIT

while read -r LINE
do
    if axel --alternate --num-connections=6 "$LINE"; then
        echo
        echo "Succeeded downloading following link $LINE"
        echo "$LINE" >> downloaded-links.txt
        echo
    else
        echo
        echo "Failed downloading following link $LINE"
        echo "$LINE" >> failed-links.txt
        echo
    fi
done < "$FILE"
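How the trap behaves can be checked in miniature; this toy (the proof file name is made up for the demo) runs its cleanup no matter how the subshell ends:

```shell
rm -f trap-proof.txt

# The EXIT trap fires when the subshell terminates, even on an early exit.
(
    trap 'echo "cleanup ran" > trap-proof.txt' EXIT
    exit 3                      # leave early; the trap still runs
)
echo "subshell exit status: $?"
```

This is what makes the trap approach attractive here: the bookkeeping happens exactly once, whether the loop finishes normally or the user interrupts the script.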


grail 04-28-2011 10:11 PM

@ntubski - I understand how you are waiting until exit to make the changes, but I believe the bigger issue is that the FILE you are reading from will be actively changed, according to the OP, whilst your loop is running. So your script will not be aware of the changes until it is started again.

ntubski 04-29-2011 12:44 PM

Oops, I was confused, second try:
Code:

#!/bin/bash

FILE="$1"

cp "$FILE" remaining-links.txt
touch downloaded-links.txt

while [ -s remaining-links.txt ]
do
    while read -r LINE
    do
        if axel --alternate --num-connections=6 "$LINE"; then
            echo
            echo "Succeeded downloading following link $LINE"
            echo "$LINE" >> downloaded-links.txt
            echo
        else
            echo
            echo "Failed downloading following link $LINE"
            echo "$LINE" >> failed-links.txt
            echo
        fi

        if ! diff -q remaining-links.txt "$FILE" ; then
            break
        fi
    done < remaining-links.txt

    grep -vFf downloaded-links.txt "$FILE" > remaining-links.txt

done
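The grep -vFf step (filter out every already-downloaded link) can be seen in isolation, with throwaway file names:

```shell
printf '%s\n' url1 url2 url3 > all-links.txt
printf '%s\n' url2 > done-links.txt

# -f done-links.txt supplies the patterns, -F takes them literally,
# -v inverts the match: only links NOT yet downloaded survive.
grep -vFf done-links.txt all-links.txt > remaining.txt
```

One subtlety: -F still does substring matching, so a link that is a prefix of another (say url1 and url10) would knock out both; adding -x forces whole-line matches if that matters for the link list.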


grail 04-29-2011 11:52 PM

Yes that looks like a better alternative :) This appears to address my concerns at least ... let us see what the OP thinks?


All times are GMT -5.