LinuxQuestions.org


jonnybinthemix 07-03-2014 08:10 AM

Differences between one file and another
 
Good afternoon chaps,

I hope you're all well.

I've been reading man page after man page, and article after article, to try my best not to ask for help on this one, but I think it best that I resort to the assistance of those more knowledgeable to aid in my learning.

I am still working on my SFTP Scripts and now am facing another hurdle.

I need to download certain files based on whether they already exist locally, so that only the new ones are downloaded.

I have come up with a theory that if I connect to the SFTP Server, and do ls to get a list, and store that list.. then do an ls on the local directory, then compare the two.. output the differences, pass that file into a loop and download the filenames that are different. (There's probably an easier way, but this was (I thought) a good idea, so I ran with it).

I hit some hurdles on the way, but the following is what I have:

Code:

/usr/bin/expect <<! > $FTPLIST
        spawn sftp -o$PORT $USER@$HOST
        expect "password:"
        send "$PASS\r"
        expect "sftp>"
        send "cd pics\r"
        expect "sftp>"
        send "ls -l *.jpeg.pgp\r"
        send "bye\r"
        expect eof
!

cat $FTPLIST | grep 'BAT_.*\.jpeg\.pgp'| awk '{print $9}' > $RLS

ls -l BAT_*.jpeg.pgp | awk '{print $9}' > $LLS

#echo "Remote"
#cat $RLS
#echo "Local"
#cat $LLS
sort $RLS $LLS

comm -13 $RLS $LLS

When I uncomment the test lines above, I do get a nicely formatted list of the files in each location.

The problem is with Comm or Diff..

As you can see I've been experimenting with 'Sort' as Comm apparently needs lists to be sorted first. But I've tried using diff, and I don't get the results I need.

I would like to be able to get a list of all files that are in the remote location, which are not in the local location, and then download them.
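For reference, `sort $RLS $LLS` only prints a merged, sorted listing to the terminal and leaves both files unchanged, and `comm -13` keeps the lines unique to the second file. A minimal sketch of the "remote but not local" comparison, assuming both lists hold one plain filename per line; the function name and the temporary file names are made up for illustration:

```shell
#!/bin/sh
# Print names that appear in the remote list ($1) but not in the local list ($2).
# Both arguments are text files with one filename per line.
remote_only() {
    remote_sorted=$(mktemp) || return 1
    local_sorted=$(mktemp) || return 1
    # comm needs sorted input; sort into temp files and leave the originals alone.
    sort "$1" > "$remote_sorted"
    sort "$2" > "$local_sorted"
    # -2 hides lines found only in the second file, -3 hides lines common to both,
    # so only lines unique to the first (remote) file are printed.
    comm -23 "$remote_sorted" "$local_sorted"
    rm -f "$remote_sorted" "$local_sorted"
}

# Example data (hypothetical file names, listings as in this post).
tmpdir=$(mktemp -d)
printf '%s\n' BAT_123456.jpeg.pgp BAT_234567.jpeg.pgp BAT_345678.jpeg.pgp > "$tmpdir/remote.txt"
printf '%s\n' BAT_123456.jpeg.pgp BAT_234567.jpeg.pgp > "$tmpdir/local.txt"
remote_only "$tmpdir/remote.txt" "$tmpdir/local.txt"
```

With the sample listings this prints only BAT_345678.jpeg.pgp.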

I was figuring I could generate this list using either comm or diff and then store that somewhere and run a loop on the lines within the file for FTP Download like:

Code:

# $NEW is the file holding the names to download, one per line
while IFS= read -r line; do
/usr/bin/expect <<! > $FTPLIST
        spawn sftp -o$PORT $USER@$HOST
        expect "password:"
        send "$PASS\r"
        expect "sftp>"
        send "cd pics\r"
        expect "sftp>"
        send "get $line\r"
        send "bye\r"
        expect eof
!
done < "$NEW"
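As an alternative to one login per file, the list of names could be turned into a single set of sftp commands. The sketch below is only an illustration: the function name is hypothetical, the `pics` directory is taken from the script above, and the names are assumed to contain no spaces.

```shell
#!/bin/sh
# Turn a file of names (one per line) into sftp commands for a single session.
build_get_commands() {
    echo "cd pics"
    # IFS= and -r keep each line exactly as written in the file.
    while IFS= read -r name; do
        # Skip blank lines so no bare "get" is emitted.
        if [ -n "$name" ]; then
            printf 'get %s\n' "$name"
        fi
    done < "$1"
    echo "bye"
}

# Example with a hypothetical list containing one new file.
new_list=$(mktemp)
printf '%s\n' BAT_345678.jpeg.pgp > "$new_list"
build_get_commands "$new_list"
```

The output could be saved to a file and passed to `sftp -b`, but batch mode does not prompt for a password, so that route assumes key-based authentication. With a password login, the same lines could be sent from the expect script instead.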

My first question is: am I doing something wrong with the comm or diff commands? And am I thinking about this the correct way? I have seen that I could use a sync-style command if I used LFTP, but I don't want to keep the two folders in sync, just download anything that is not in the local location.

The results of the test, using the cat $var in the above script displays the following:

Code:

Remote
BAT_123456.jpeg.pgp
BAT_234567.jpeg.pgp
BAT_345678.jpeg.pgp
Local
BAT_123456.jpeg.pgp
BAT_234567.jpeg.pgp

As always, I really do appreciate any help.

Thanks
Jon

schneidz 07-03-2014 08:19 AM

maybe these programs will help:
md5sum
sha256sum
sha1sum
sha384sum
sha224sum
sha512sum

instead of ls maybe you should use find.

YankeePride13 07-03-2014 08:46 AM

Would using Rsync work for you?

oneandoneis2 07-03-2014 08:47 AM

..is there a reason not to use rsync for this?

jonnybinthemix 07-03-2014 08:47 AM

Hi,

Thanks for your response.

I tried out find in place of ls but it's not a valid SFTP command, so I'm assuming you meant in place of the local ls command? Would that offer any benefits over ls the way it is?

I'll research the other commands you've posted to see if they will achieve the goal.

Thanks
Jon

---------- Post added 07-03-14 at 08:48 AM ----------

Rsync could be an option. But would I still need the list of filenames to download?

schneidz 07-03-2014 08:57 AM

^ i would log into the remote server using ssh and run something like:
Code:

find /whatever/floats/your/boat -type f -exec md5sum '{}' \; > local.md5
# quote the remote command so the remote shell does not treat the ; as a command separator
ssh user@host "find /whatever/floats/your/boat -type f -exec md5sum '{}' \;" > host.md5
diff local.md5 host.md5
# diff local.md5 host.md5 | grep \> # something like this might be helpful for identifying files that are different/missing from local.md5


jonnybinthemix 07-03-2014 09:00 AM

I have no SSH Access to the Server, SFTP Only...

schneidz 07-03-2014 09:53 AM

i guess you have to do things the hard way.

try doing ls -1 on both remote and local and diff the results ?

jonnybinthemix 07-03-2014 02:10 PM

Ahh now we're talking :)

I've been playing around at home on the mac... as I don't want to faff about with the VPN to play around with the actual script and I have got the following to work:

Code:

#!/bin/bash

tmp1="/tmp/tmp1"
tmp2="/tmp/tmp2"
tmp3="/tmp/tmp3"

cd one/

ls -1 > $tmp1

cd ../two/

ls -1 > $tmp2

diff $tmp1 $tmp2 | awk '{print $2}' > $tmp3

for i in $tmp3; do
        cat $i
done

Just a simple test script to see if I could get every aspect working, and it works like a dream :)
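One thing to watch with `awk '{print $2}'` on diff output: it also prints an empty line for markers such as `2a3` or `---`, and it prints lines from both sides of a change. A hedged sketch that keeps only the lines diff reports as present in the second listing; the function name and sample files are made up:

```shell
#!/bin/sh
# Print only the lines diff marks as present in the second file
# but missing from the first (the ones prefixed with "> ").
only_in_second() {
    diff "$1" "$2" | sed -n 's/^> //p'
}

# Example with two hypothetical listings.
demo=$(mktemp -d)
printf '%s\n' BAT_123456.jpeg.pgp BAT_234567.jpeg.pgp > "$demo/one.txt"
printf '%s\n' BAT_123456.jpeg.pgp BAT_234567.jpeg.pgp BAT_234755.jpeg.pgp > "$demo/two.txt"
only_in_second "$demo/one.txt" "$demo/two.txt"
```

Lines prefixed with `< ` (present only in the first file) are ignored here, which matches the goal of finding files that still need downloading.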

I'll try the above philosophy in the live script tomorrow :)

Thanks for your help

jonnybinthemix 07-04-2014 04:09 AM

Update:

Although it worked nicely at home on the mac, with the above test... it doesn't work on the Red Hat box at the office... I think it's something to do with the way I'm getting the ls data in the first place on the remote location..

The only way I could think of getting the ls data was to capture the entire output of the Expect Script and therefore SFTP Session, and then grep the file names out...

Code:

/usr/bin/expect <<! > $FTPLIST
        spawn sftp -o$PORT $USER@$HOST
        expect "password:"
        send "$PASS\r"
        expect "sftp>"
        send "cd pics\r"
        expect "sftp>"
        send "ls -1 *.jpeg.pgp\r"
        send "bye\r"
        expect eof
!
cat $FTPLIST | grep 'BAT_.*\.jpeg\.pgp' > $RLS

ls -1 > $LLS

So $LLS gets a good listing of just filenames from ls -1. $RLS also ends up as a nice list of filenames, but it comes from capturing the whole SFTP session, verbose nonsense included, and then grepping the filenames out of that.. so the two lists aren't produced in the same way.

So the upshot is, it doesn't work and still displays the same..

Anyone have any other ideas? :)
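One option for tightening the extraction step is to keep only the matching part of each line rather than the whole line, so nothing else from the session transcript comes along. This is a sketch under assumptions: `grep -o` is available in GNU and BSD grep but is not required by POSIX, the function name is hypothetical, and the transcript below is invented for the example.

```shell
#!/bin/sh
# Pull just the BAT_*.jpeg.pgp names out of a captured sftp session.
extract_names() {
    # -o prints only the text that matched, so prompts, the echoed command
    # and anything after ".pgp" on the line are left behind.
    grep -o 'BAT_[^[:space:]]*\.jpeg\.pgp' "$1"
}

# Invented transcript resembling a captured session.
log=$(mktemp)
printf 'sftp> cd pics\r\nsftp> ls -1 *.jpeg.pgp\r\nBAT_123456.jpeg.pgp\r\nBAT_234567.jpeg.pgp\r\nsftp> bye\r\n' > "$log"
extract_names "$log"
```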

jonnybinthemix 07-04-2014 05:35 AM

The following:
Code:

grep 'BAT_.*\.jpeg\.pgp' $FTPLIST > $RLS

ls -1 > $LLS

echo "Remote"
cat $RLS
echo "Local"
cat $LLS
echo -e "\n"

diff $LLS $RLS

Results in:

Code:

# ./img.sh
Remote
BAT_123456.jpeg.pgp
BAT_234567.jpeg.pgp
BAT_234755.jpeg.pgp
Local
BAT_123456.jpeg.pgp
BAT_234567.jpeg.pgp


1,2c1,3
< BAT_123456.jpeg.pgp
< BAT_234567.jpeg.pgp
---
> BAT_123456.jpeg.pgp
> BAT_234567.jpeg.pgp
> BAT_234755.jpeg.pgp

Whereas when I ran the tests locally last night, the last part just displayed the difference. In this test it displays the contents of each, and not what's different.
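When diff marks lines as changed even though they look identical, the lines usually differ in characters that don't show on screen. One possible cause, not confirmed anywhere in this thread, is that output captured from an expect session passes through a pseudo-terminal and so typically ends each line with a carriage return before the newline. A minimal sketch for checking and stripping them; the function and file names are hypothetical:

```shell
#!/bin/sh
# Exit status 0 if the file contains at least one carriage return.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Write a copy of file $1 to $2 with every carriage return removed.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Sample data: the remote list has CRLF line endings, the local one does not.
work=$(mktemp -d)
printf 'BAT_123456.jpeg.pgp\r\nBAT_234567.jpeg.pgp\r\nBAT_234755.jpeg.pgp\r\n' > "$work/rls"
printf 'BAT_123456.jpeg.pgp\nBAT_234567.jpeg.pgp\n' > "$work/lls"

if has_cr "$work/rls"; then
    strip_cr "$work/rls" "$work/rls.clean"
else
    cp "$work/rls" "$work/rls.clean"
fi
# diff exits with status 1 when the files differ, hence the "|| true".
diff "$work/lls" "$work/rls.clean" || true
```

Running `od -c` on the real $RLS file would show whether a `\r` sits before each `\n`.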

jonnybinthemix 07-04-2014 08:15 AM

I'm coming to the conclusion that there is something wrong with the DATA in the files. Maybe it doesn't like the _'s? Because everything I try just displays the contents of one file or the other.

I've tried, Comm, Diff, AWK, SED, and some crazy loops, nothing will work :(

schneidz 07-04-2014 09:25 AM

maybe if you supply a small example of both $LLS and $RLS someone would be able to hax something together.

jonnybinthemix 07-04-2014 09:37 AM

Hi,

Thanks for your response...

The contents are above:

$RLS
Code:

BAT_123456.jpeg.pgp
BAT_234567.jpeg.pgp
BAT_234755.jpeg.pgp

$LLS
Code:

BAT_123456.jpeg.pgp
BAT_234567.jpeg.pgp


schneidz 07-04-2014 09:45 AM

Code:

[schneidz@hyper jonnybinthemix]$ uname -a -m -p
Linux hyper 2.6.43.8-1.fc15.x86_64 #1 SMP Mon Jun 4 20:33:44 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
[schneidz@hyper jonnybinthemix]$ head *
==> lls <==
BAT_123456.jpeg.pgp
BAT_234567.jpeg.pgp

==> rls <==
BAT_123456.jpeg.pgp
BAT_234567.jpeg.pgp
BAT_234755.jpeg.pgp
[schneidz@hyper jonnybinthemix]$ diff *
2a3
> BAT_234755.jpeg.pgp

heres an alternate:
Code:

[schneidz@hyper jonnybinthemix]$ cat rls | while read jpg
do
 if [ -z "`grep $jpg lls`" ]
 then
  echo $jpg
 fi
done
BAT_234755.jpeg.pgp
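The loop above can also be written as a single grep call. In `grep $jpg lls` the dots in the filename act as regex wildcards and a match anywhere in a line counts, which is harmless with these names but can surprise with others. A sketch of a stricter variant, with a made-up function name and sample files:

```shell
#!/bin/sh
# Print lines of the second file that do not appear verbatim in the first.
# -F: treat patterns as fixed strings (dots are literal)
# -x: match whole lines only
# -v: invert, keeping lines that match none of the patterns
# -f: read the patterns from a file, one per line
missing_from() {
    grep -Fxv -f "$1" "$2"
}

# Sample lists as posted in this thread, written to a scratch directory.
lists=$(mktemp -d)
printf '%s\n' BAT_123456.jpeg.pgp BAT_234567.jpeg.pgp > "$lists/lls"
printf '%s\n' BAT_123456.jpeg.pgp BAT_234567.jpeg.pgp BAT_234755.jpeg.pgp > "$lists/rls"
# grep exits with status 1 when nothing is printed, hence the "|| true".
missing_from "$lists/lls" "$lists/rls" || true
```

Unlike comm, this does not need the lists to be sorted.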


