LinuxQuestions.org


ougk 10-02-2011 06:53 AM

Search and tar
 
Hello all,

I have a friend on a remote linux box and I want him to search and send me some files in the easiest way possible. Ideally I am looking for a bash shell command that will do the following :

- Search for all the files and directories on the machine that were created or modified in the last 5 months (hidden files included)
- Pipe the result to tar while preserving the path and directory structure

And then I can tell him to email me that file.

Can you help?

Thanks a lot!

ps. Incidentally, do I need to explicitly search for created and modified files? I was thinking whether only modified timestamp will suffice since a file created would have this as a modified timestamp right?

tronayne 10-02-2011 07:47 AM

It's a pretty straightforward process.

First, you need to decide the date you want to search from. That's done by touching a temporary file like this
Code:

touch -d "01 May 2011 00:00:00" /tmp/date_stamp
What this does is create the file /tmp/date_stamp with the date and time you specify.

Then you use find to locate all the files that have been modified since that date:
Code:

find / -type f -newer /tmp/date_stamp
Note that this will find every file that has time stamps newer than that date -- you probably don't want that because all the log files (in /var/log) will be found along with all the cache files (in /home/user/.mozilla and /home/user/.thunderbird and a bunch of others). So, you really want to think through just exactly what you need so you don't get "everything" that you may or may not want.

Additionally, you do not want his kernel files (in /usr/src) or much of anything system-specific, so really think through what you do want.

You probably don't want everything but more like a few directories, say
Code:

find /etc /home /usr/local -type f -newer /tmp/date_stamp
You can provide find with a list of directories as above.
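To see what -newer actually selects before running it on a real machine, here is a minimal sketch on a scratch directory. The directory and the file names old.txt and new.txt are made up for illustration, and touch -d assumes GNU coreutils.

```shell
#!/bin/sh
# Scratch directory (illustration only, nothing on the real system is touched)
WORK=$(mktemp -d)

# Reference file dated 1 May 2011, as in the post above
touch -d "01 May 2011 00:00:00" "$WORK/date_stamp"

# One file modified before the reference date, one after it
touch -d "01 Jan 2011 00:00:00" "$WORK/old.txt"
touch -d "01 Jun 2011 00:00:00" "$WORK/new.txt"

# -newer is a strict comparison, so the reference file itself is not listed
FOUND=$(find "$WORK" -type f -newer "$WORK/date_stamp")
echo "$FOUND"

# Clean up the scratch directory
rm -rf "$WORK"
```

Only new.txt should be printed; old.txt and the reference file are skipped.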

Once you've got that figured out, just combine all the above in a shell program something like this:
Code:

#!/bin/sh
# List of directories to search (quoted so all three end up in the variable)
DIR_LIST="/etc /home /usr/local"
# Create a time stamp file
touch -d "01 May 2011 00:00:00" /tmp/date_stamp
# Find files and create a tar archive; -T - makes tar read the file names from standard input
find ${DIR_LIST} -type f -newer /tmp/date_stamp | tar cf /tmp/files.tar -T -
# Compress the tar archive with gzip
gzip /tmp/files.tar
# Remove the time stamp file
rm /tmp/date_stamp

Your buddy will have to execute this as root and can send it to you via e-mail attachment, with scp, on a CD-ROM or DVD, whatever the two of you agree on. Let's say he sends you files.tar.gz and you save it in /tmp.

At your end, you probably want to look at what you got before you extract it onto your system. The archive records the full path of each file (GNU tar drops the leading / when it creates the archive), so extracting it from / will overwrite any file of the same name on your system -- be careful with this. You would look at what you got with
Code:

tar tzf /tmp/files.tar.gz | more
If you're happy with what you see, get to be root (su -, sudo, whatever method you want) and
Code:

tar xzf /tmp/files.tar.gz
and that'll do it.

I can't emphasize enough that you must determine exactly what directories you want avoiding system directories -- you don't want to blow up your system.

Hope this helps some.

[EDIT]
Ouch! Sometimes my head and my fingers get out of whack.

In the above, I've added -type f to the finds. You really only want to find files (not, as pointed out by @rknichols, below, directories).

The -newer reference option finds files that have been modified more recently than the reference file, /tmp/date_stamp in this case.

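You could also use the -newerXY reference option, where X says which time stamp of each file to test and Y says which time stamp of the reference to compare it with. With cm, for example, each file's change time is compared with the reference file's modification time. This would be a little finer than simply -newer.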

Either -newer option will find files newer than or modified since the date stamp.

Of course either of these options must be supported by the find utility on your systems -- a quick look at the manual page would let you know.

Sorry about missing the -type f option to find, duh.
[/EDIT]
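As a sketch of the -newerXY family, the mt variant compares each file's modification time with a time string, so no reference file is needed. This assumes a GNU find recent enough to support -newermt (check the manual page, as noted above); the scratch directory and file names are made up.

```shell
#!/bin/sh
# Scratch directory with one old and one recent file (illustration only)
WORK=$(mktemp -d)
touch -d "01 Jan 2011 00:00:00" "$WORK/old.txt"
touch -d "01 Jun 2011 00:00:00" "$WORK/new.txt"

# -newermt: compare each file's modification time (m) with a time string (t)
FOUND=$(find "$WORK" -type f -newermt "2011-05-01 00:00:00")
echo "$FOUND"

# Clean up the scratch directory
rm -rf "$WORK"
```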

unSpawn 10-02-2011 08:02 AM

...in addition another way to define a time range for files could be with '\( -atime -$((30*5)) -o -ctime -$((30*5)) \)' (the minus sign means less than that many days ago; a plus sign would select files older than that). Good point about directory exclusion. For tar you will also want '$(awk '/fs/ {print "--exclude "$2}' /proc/mounts|grep -v /$)'. As for size there's also split or mpack.
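The sign in front of the day count is easy to get backwards, so here is a small sketch on a scratch directory. It uses -mtime because a modification time can be backdated with touch, whereas a change time cannot; the file names are made up and touch -d with a relative date assumes GNU coreutils.

```shell
#!/bin/sh
WORK=$(mktemp -d)
DAYS=$((30 * 5))   # roughly five months

# One file last modified 200 days ago, one modified just now
touch -d "200 days ago" "$WORK/stale.txt"
touch "$WORK/fresh.txt"

# -mtime -N: modified less than N days ago
RECENT=$(find "$WORK" -type f -mtime -"$DAYS")
# -mtime +N: modified more than N days ago
OLDER=$(find "$WORK" -type f -mtime +"$DAYS")

echo "recent: $RECENT"
echo "older:  $OLDER"

# Clean up the scratch directory
rm -rf "$WORK"
```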


Quote:

Originally Posted by ougk (Post 4487984)
do I need to explicitly search for created and modified files? I was thinking whether only modified timestamp will suffice since a file created would have this as a modified timestamp right?

Change time means changes made to the inode like ownership and permissions. Modify time means change of contents. In UNIX there is no file creation time.
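A quick way to see the difference for yourself, assuming GNU stat (the -c format option) and a throwaway file: a chmod touches the inode only, so the change time moves while the modification time stays put.

```shell
#!/bin/sh
# Throwaway file, backdated so the two time stamps are easy to tell apart
F=$(mktemp)
touch -d "01 Jan 2011 00:00:00" "$F"

MTIME_BEFORE=$(stat -c %Y "$F")   # modification time, seconds since the epoch

# Changing permissions alters the inode, not the contents
chmod 640 "$F"

MTIME_AFTER=$(stat -c %Y "$F")
CTIME_AFTER=$(stat -c %Z "$F")    # change time, seconds since the epoch

echo "mtime before chmod: $MTIME_BEFORE"
echo "mtime after chmod:  $MTIME_AFTER"
echo "ctime after chmod:  $CTIME_AFTER"

rm -f "$F"
```

The two mtime values should be identical, while the ctime reflects the moment of the chmod.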


*BTW are you investigating something security-wise?

rknichols 10-02-2011 01:02 PM

In that 'find' command, you also should use a '! -type d' test to exclude directories. When you pass a directory name to 'tar', it will dutifully include everything in that subtree, including a lot of old files you didn't want to see.
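A rough sketch of that effect on a scratch directory (the names sub, old.txt and new.txt are made up, and -T - assumes GNU tar): the directory itself is newer than the date stamp, so without a type test it lands in the list and tar pulls in the old file along with it.

```shell
#!/bin/sh
WORK=$(mktemp -d)
mkdir "$WORK/sub"
touch -d "01 May 2011 00:00:00" "$WORK/date_stamp"
touch -d "01 Jan 2011 00:00:00" "$WORK/sub/old.txt"
touch -d "01 Jun 2011 00:00:00" "$WORK/sub/new.txt"
cd "$WORK" || exit 1

# No type test: "sub" is listed too, and tar archives its whole subtree
find sub -newer date_stamp | tar cf with_dirs.tar -T -

# Files only: tar gets exactly the files that passed the test
find sub -type f -newer date_stamp | tar cf files_only.tar -T -

# Count how often the old file shows up in each archive
WITH_DIRS=$(tar tf with_dirs.tar | grep -c 'old.txt' || true)
FILES_ONLY=$(tar tf files_only.tar | grep -c 'old.txt' || true)
echo "old.txt entries without type test: $WITH_DIRS"
echo "old.txt entries with -type f:      $FILES_ONLY"

# Clean up the scratch directory
cd / && rm -rf "$WORK"
```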

ougk 10-03-2011 12:24 PM

Hello guys,

Firstly let me thank all of you for the significant help you provided! thanks indeed!

Secondly, perhaps I owe to say a bit more about what we are trying to do, in case you have a better solution. Let's use same names as well ;-)

So Alice was programming something on a linux box and left the PC to Bob. The problem is that we now cannot contact Alice and ask her exactly which files/directories she was using, where her notes are, where her config files are etc, and Bob is a newbie and hasn't got a lot of time to help. So what I basically want is to give Bob a script that will create a tar file with all the files that Alice modified over the last 4-5 months. Here are some questions :

1) If Bob runs the script as root, will he get all users' files on that machine in the archive?

2) Will I have a problem opening these files in another machine under another user?

3) The following line doesn't work for me

Quote:

Originally Posted by tronayne (Post 4488008)
Code:

# Find files and create a tar archive
find /home -type f -newer /tmp/date_stamp | tar cf /tmp/files.tar -


it gives me the following :
Code:

tar: -: Cannot stat: No such file or directory
tar: Error exit delayed from previous errors

any ideas why?

Is the following equivalent?

Code:

tar -cvf temp.tar `find /home/ -type f -newer /tmp/date_stamp`
4) How can I make sure that the timestamps of the files in the tar are preserved both when creating the tar and also when extracting it?

5) Why do you say that when I extract the file it will overwrite my system's files? If I extract it in a directory, won't it create the tar file's structure under that directory?

6) I am trying to think whether the modified timestamp will suffice in getting all the files that Alice was using on Bob's machine. Going by what you told me, the scripts you suggested will give me all the files whose content has been modified. If Alice just read a file, though, that won't be included, right? What if I use the accessed timestamp?

Thanks very much! Looking forward to your replies!

anomie 10-03-2011 01:10 PM

Did Alice have an account on the system? (I hope so.) It would be easier to search by UID, if so.
Code:

# find / -uid alice_UID_here 2>/dev/null
Also, the Alice and Bob naming you're using (even if it reflects real names) is confusing. Those are classic crypto characters. :)

tronayne 10-03-2011 04:17 PM

As @anomie suggests, you probably want to look for only files that belong[ed] to Alice -- she must have had an account on the machine which means that she had a unique user identification number (UID) which you can find simply by looking in /etc/passwd. She should have had a home directory in something like /home/alice. She may have worked in other directories (and you ought to know what those are, if any).
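Looking up a UID in /etc/passwd can be scripted too. A minimal sketch, where uid_of is a made-up helper name and root is used as the example account simply because it exists on practically every system; you would put Alice's login name in its place.

```shell
#!/bin/sh
# Hypothetical helper: print the UID (third field) for a login name in /etc/passwd
uid_of() {
    awk -F: -v name="$1" '$1 == name { print $3 }' /etc/passwd
}

ROOT_UID=$(uid_of root)
echo "root has UID $ROOT_UID"

# For the case in this thread, the result would feed find, for example:
#   find / -uid "$(uid_of alice)" 2>/dev/null
```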

If you use the find utility with, for example, find /home/alice, you will get the absolute path name of any file you identify; i.e., a single file in a tar archive will look like /home/alice/stuff/nonsense.txt (GNU tar stores it without the leading /). That's probably not what you want simply because if you extract that file from a tar archive, it's going to get written to home/alice/stuff/nonsense.txt below wherever you unpack it, and to /home/alice/stuff/nonsense.txt if you unpack in / (and, well, Alice doesn't live here anymore, eh?).

So, now you've got a monkey-wrench in the works, don't you: what to do, what to do.

Identify Alice's home directory on Bob's machine. Then
Code:

cd /home/alice    # or wherever the heck it is
# -T - makes tar read the file names from find on standard input
find . -type f -user alice | tar cf /tmp/files.tar -T -

You really don't need to worry about what changed, just get 'em all.
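File names with spaces (or worse) are a common snag when a find result is handed to tar. One way around it, sketched here on a scratch directory with a made-up file name and assuming GNU find and GNU tar, is to pass the names null-delimited.

```shell
#!/bin/sh
WORK=$(mktemp -d)
mkdir "$WORK/home_dir"
echo "todo" > "$WORK/home_dir/my notes.txt"
cd "$WORK/home_dir" || exit 1

# -print0 ends each name with a NUL byte; --null tells tar to expect that
# (--null has to come before -T on the command line)
find . -type f -print0 | tar cf "$WORK/files.tar" --null -T -

# The member name keeps its space and stays relative to the starting directory
LISTED=$(tar tf "$WORK/files.tar")
echo "$LISTED"

# Clean up the scratch directory
cd / && rm -rf "$WORK"
```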

Then, compress the tar file:
Code:

gzip /tmp/files.tar
Then get it to the machine where you want it, or, simply unpack the thing in Bob's directory; e.g.,
Code:

# Log in as Bob first, then:
mkdir alice_stuff
cd alice_stuff
tar xzvf /tmp/files.tar.gz

He's the guy that has to work on it, probably ought to put it where he can get at it and go through what you actually have, moving what's worth keeping to somewhere else.
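The whole round trip can be tried out on scratch directories first. This sketch assumes GNU tar and GNU stat, and every path in it is made up; -C makes tar change directory before it archives or extracts, so the member names stay relative, and the modification time comes back as it was.

```shell
#!/bin/sh
SRC=$(mktemp -d)    # stands in for Alice's home directory
DEST=$(mktemp -d)   # stands in for Bob's alice_stuff directory
OUT=$(mktemp -d)    # where the archive is written
ARCHIVE="$OUT/files.tar.gz"

mkdir -p "$SRC/stuff"
echo "notes" > "$SRC/stuff/nonsense.txt"
touch -d "01 Jun 2011 12:00:00" "$SRC/stuff/nonsense.txt"
ORIG_MTIME=$(stat -c %Y "$SRC/stuff/nonsense.txt")

# Archive relative to SRC, then unpack below DEST
tar czf "$ARCHIVE" -C "$SRC" .
tar xzf "$ARCHIVE" -C "$DEST"

# tar restores modification times on extraction unless told otherwise (-m)
NEW_MTIME=$(stat -c %Y "$DEST/stuff/nonsense.txt")
CONTENT=$(cat "$DEST/stuff/nonsense.txt")
echo "original mtime:  $ORIG_MTIME"
echo "extracted mtime: $NEW_MTIME"

# Clean up the scratch directories
rm -rf "$SRC" "$DEST" "$OUT"
```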

While you're at it, it might not hurt to walk the entire system looking for anything that Alice did something to -- if she edited it and saved the edit, she'll own it (if, of course, she was working in a directory she had write permission in); use find to do that:
Code:

su -
# Whatever directories Alice could write in, but not her home directory (example values only)
DIR_LIST="/usr/local /opt /srv"
find ${DIR_LIST} -type f -user alice 2>/dev/null

That's going to give you a list of anything Alice owns that is not in her home directory -- send it to a printer and go through it or pipe the output into more or pg or whatever you like so you can peruse the list. There probably are not very many of these.

Then, do a little manual work: identify the directories Alice could write to outside of her own home directory, make a list, examine the files to see if they're worth saving, copy or move the files to a temporary directory, tar them, compress the archive and go from there.
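If the list turns out to be short, the review-then-archive routine could look roughly like this. It is only a sketch on a scratch directory: the current user's UID stands in for Alice's, the file and directory names are made up, and -T assumes GNU tar.

```shell
#!/bin/sh
WORK=$(mktemp -d)
mkdir "$WORK/shared"
echo "config" > "$WORK/shared/app.conf"

OWNER_UID=$(id -u)   # stand-in for Alice's UID
LIST="$WORK/owned_files.txt"

# Step 1: write the list to a file so it can be read and pruned by hand
find "$WORK/shared" -type f -uid "$OWNER_UID" 2>/dev/null > "$LIST"
COUNT=$(wc -l < "$LIST")
echo "files on the list: $COUNT"

# Step 2: once the list looks right, archive exactly those files
tar czf "$WORK/owned.tar.gz" -T "$LIST" 2>/dev/null
ARCHIVED=$(tar tzf "$WORK/owned.tar.gz" | grep -c 'app.conf' || true)
echo "app.conf entries in the archive: $ARCHIVED"

# Clean up the scratch directory
rm -rf "$WORK"
```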

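When you create the archive, use the --atime-preserve option so that tar does not disturb the access time stamps of the files it reads. Modification times are stored in the archive and restored on extraction by default.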

Now, if you guys did something really, really silly (like just changing "alice" to "bob" in /etc/passwd), well, good luck with that.

Hope this helps some.

ougk 10-07-2011 04:23 AM

Hello guys,

Thanks a lot for your elaborate support on this! I am amazed by your responses!

What you say makes a lot of sense. I'll grab all the /home/username and then find all the files under this user and do some manual sorting out.

By the way, I know that Alice and Bob are classic characters' names, that's why I used them ;-)

By the way (2), do you know how I can add e.g. the hostname and a timestamp to the tar filename from within a script?

Thanks a lot!


ps. I had to write my reply twice, because although the quick reply box was enabled, I was logged out and thus I lost my text :-/

tronayne 10-07-2011 05:40 AM

You get your host name, quick and dirty,
Code:

HOST=`uname -n`
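You get a time-date stamp (without colons, because GNU tar reads a colon in an archive name as "host:file" on a remote machine),
Code:

DATE=`date +%F-%H%M%S`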
You use those to create a file name
Code:

TARFILE=${HOST}-${DATE}.tar
or whatever rocks your boat -- look at the manual page for the date utility and, if you don't like the date-time stamp above...
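Put together in a script, it might look like the sketch below. The date format and the .tar.gz suffix are just one choice, and the scratch directory and notes.txt are made up.

```shell
#!/bin/sh
HOST=$(uname -n)
DATE=$(date +%Y%m%d-%H%M%S)   # digits and a dash only, no colons
TARFILE="${HOST}-${DATE}.tar.gz"
echo "$TARFILE"

# Try the name out on a scratch directory
WORK=$(mktemp -d)
echo "hello" > "$WORK/notes.txt"
tar czf "$WORK/$TARFILE" -C "$WORK" notes.txt
MEMBERS=$(tar tzf "$WORK/$TARFILE")
echo "$MEMBERS"

# Clean up the scratch directory
rm -rf "$WORK"
```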

Hope this helps some.

ougk 10-07-2011 03:12 PM

Thanks a lot, that helped!

