LinuxQuestions.org


SaintDanBert 05-19-2011 05:04 PM

how to find all files NOT part of the installed distro
 
I loaded a distro (which does not seem relevant) onto my laptop and used it for a while. Applications did whatever they do creating and saving files. I know that I have images and documents and videos and music and such on the laptop among other non-distro data files.

Is there a simple (straightforward) way to identify which files on disk are NOT part of the installed distro?

I know how to use find.

I know that find lets me locate files based on some date-time-stamp.
I know, too, that I can use any selected file as a benchmark date-time instead of some specific command line string. For example:
Code:

Find files whose modification date is before (or after) the date(s) associated with the file /path/foo.bar.
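As a minimal sketch of that idea: find's -newer test compares modification times against a reference file. The function names newer_than and not_newer_than are made up for illustration, and /path/foo.bar stands for whichever benchmark file is chosen.

```shell
# newer_than REF DIR: regular files under DIR modified after REF was.
newer_than() {
    find "$2" -type f -newer "$1"
}

# not_newer_than REF DIR: regular files under DIR whose modification time is
# the same as or earlier than REF's (REF itself is listed if it is under DIR).
not_newer_than() {
    find "$2" -type f ! -newer "$1"
}
```

For example, newer_than /path/foo.bar / would list everything modified after the benchmark file.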
Is there any one file that I could use to peg the distro install date?
Can I get that date from somewhere else, such as the file system details?

Thanks in advance,
~~~ 0;-Dan

Tinkster 05-19-2011 07:28 PM

Quote:

Originally Posted by SaintDanBert (Post 4361283)

Is there any one file that I could use to peg the distro install date?
Can I get that date from somewhere else, such as the file system details?

Thanks in advance,
~~~ 0;-Dan

Even if there were a file or method to find the installation time, there's
no guarantee that the packages don't ship files dated before the install, or
even in the future. So I really don't believe there's a distro-agnostic
way of doing it; the only "generic" way would be to install something like
tripwire, AIDE or samhain and take an inventory of the box using those; then
you can find what was changed/added at a later time quite easily.
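
A rough sketch of the inventory idea, and not a substitute for tripwire, AIDE or samhain: record a checksum for every file now, then compare against that baseline later. The function names and the baseline file are hypothetical, and sha256sum from GNU coreutils is assumed to be available.

```shell
# snapshot DIR: print "checksum  path" for every regular file under DIR.
snapshot() {
    find "$1" -type f -exec sha256sum {} + | LC_ALL=C sort
}

# changed_since BASELINE DIR: print entries that exist now but are absent
# from the saved BASELINE, i.e. files that were added or whose contents changed.
changed_since() {
    snapshot "$2" | LC_ALL=C comm -13 "$1" -
}
```

Usage might look like snapshot /etc > /root/etc.baseline right after the install, and changed_since /root/etc.baseline /etc some weeks later. The baseline should live outside the directory being scanned.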


The only sensible way to go about this w/o the above mentioned tools is to
generate a list of distro supplied files, and match that against your reality,
e.g., in RPM based distros:
Code:

rpm -ql $( rpm -qa ) | sort -u
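
To sketch the "match that against your reality" step: save the package manager's file list once, then let comm print the paths on disk that are not in it. The function name and the owned.txt file are hypothetical; the list is assumed to hold one absolute path per line, as rpm -ql prints.

```shell
# unowned_files OWNED_LIST DIR...: print regular files under the given
# directories that do not appear in OWNED_LIST (which need not be sorted).
unowned_files() {
    owned_sorted=$(mktemp) || return 1
    LC_ALL=C sort -u "$1" > "$owned_sorted"
    shift
    # comm -13 keeps only the lines unique to the second input (files on disk).
    find "$@" -type f | LC_ALL=C sort | LC_ALL=C comm -13 "$owned_sorted" -
    rm -f "$owned_sorted"
}
```

With the list saved as owned.txt, unowned_files owned.txt /usr /opt /etc would print the candidates. Paths are compared as plain strings, so a file reached through a symlinked directory will look unowned even when it is not.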


Cheers,
Tink

jefro 05-19-2011 07:48 PM

The time stamp won't work I'd guess.

I'd make a test install and diff the two.

SaintDanBert 05-23-2011 10:34 AM

Quote:

Originally Posted by jefro (Post 4361366)
The time stamp won't work I'd guess.
...

Timestamp works fine ... If you have a "date line."

Since I need to know this, the obvious thing to do is create that date line
weeks ago when I did the install ... NOT! Now I need to discover how to answer
these questions long after the fact.

Does anyone know an easy way to answer the question: How many files have XYZ attribute(s)? For example
  • How many files modified/accessed/created after/before XYZ timestamp?
  • How many files not/owned by XYZ user? group?
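
A minimal sketch for counting, assuming GNU find (the -newermt test is not in POSIX) and file names without embedded newlines. The function names are made up for illustration.

```shell
# count_modified_after DIR TIMESTAMP: files modified after e.g. "2011-05-01 00:00".
count_modified_after() {
    find "$1" -type f -newermt "$2" | wc -l
}

# count_modified_before DIR TIMESTAMP: files last modified at or before TIMESTAMP.
count_modified_before() {
    find "$1" -type f ! -newermt "$2" | wc -l
}

# count_owned_by DIR USER: files owned by USER (a name or a numeric ID).
count_owned_by() {
    find "$1" -type f -user "$2" | wc -l
}

# count_not_owned_by DIR USER: files owned by anyone else.
count_not_owned_by() {
    find "$1" -type f ! -user "$2" | wc -l
}
```

Swapping -newermt for -newerat or -newerct tests access time or inode change time instead, and -group works like -user. Note that the change time is not a creation time; it is updated whenever the file's metadata changes.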

Thanks in advance,
~~~ 0;-Dan

MTK358 05-23-2011 11:32 AM

Quote:

Originally Posted by SaintDanBert (Post 4364657)
How many files modified/accessed/created after/before XYZ timestamp?

man find

catkin 05-23-2011 11:41 AM

@SaintDanBert: Why do you want to identify the non-distro files? What do you want to achieve? Are you wanting to preserve user data? How much storage space is used on the system? How much storage space do you have for backup?

SaintDanBert 05-25-2011 09:46 AM

Quote:

Originally Posted by catkin (Post 4364724)
@SaintDanBert: Why do you want to identify the non-distro files? What do you want to achieve? Are you wanting to preserve user data? How much storage space is used on the system? How much storage space do you have for backup?

I'm going to do a distro upgrade (clean install) and a drive upgrade (more space).
I can grab /home/* and /wrk/* and /root/* and various other places where I know that I've put things over the past year(s). I don't remember everything that I've done and so I'm looking for non-distro files and folders as reminders of things I might have done that I also need to grab.

Now that you force me to think about things a bit more, when I say "non-distro" I'm really trying to indicate ... files that did not install when I spun the distro ISO
and were not a result of update-manager activity ...
I need to identify packages that I installed manually (blush) and have since forgotten were later additions, so that I can grab their data before the update and reinstall them afterwards.
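
One way this could be sketched, assuming two plain-text lists of package names exist: one from a fresh test install of the same release (as jefro suggested) and one from the current system. The function name and both list files are hypothetical. The current list might come from dpkg-query -W -f='${Package}\n', rpm -qa --qf '%{NAME}\n' or pacman -Qq, depending on the distro.

```shell
# extra_packages BASE_LIST CURRENT_LIST: print package names that are in
# CURRENT_LIST but not in BASE_LIST (one name per line in each file).
# -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
extra_packages() {
    grep -F -x -v -f "$1" "$2"
}
```

Running extra_packages base.txt current.txt would then print the packages added after the initial install, including dependencies pulled in along with them.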

Thanks for forcing me to think,
~~~ 8d;-Dan

MTK358 05-25-2011 10:24 AM

Quote:

Originally Posted by SaintDanBert (Post 4366619)
when I say "non-distro" I'm really trying to indicate ... files that did not install when I spun the distro ISO
and were not a result of update-manager activity ...

As far as I know, everything that comes with the distro is installed via the package manager. Also, most package managers have a command to see which package owns a file (it's "pacman -Qo" in Arch Linux). Here's what I would do:

Code:

# Print every regular file that no installed package claims to own.
find / -type f | while IFS= read -r file
do
    # replace "pacman -Qo" with the appropriate command for your package manager
    if ! pacman -Qo "${file}" > /dev/null 2>&1
    then
        echo "${file}"
    fi
done

Note that this could take a VERY long time to complete, since you are basically scanning your entire package database for each file on your hard drive. It might not be too hard to add a progress meter to the script, if you like :).

EDIT: with progress meter:

Code:

# Print a message on stderr so progress output does not mix with the file list.
function echo_err
{
        echo "$@" 1>&2
}

files=$(find / -type f)

total_files="$(echo "${files}" | wc -l)"
progress=0

echo "${files}" | while IFS= read -r file
do
    # replace "pacman -Qo" with the appropriate command for your package manager
    if ! pacman -Qo "${file}" > /dev/null 2>&1
    then
        echo "${file}"
    fi
    progress=$((progress + 1))
    echo_err "${progress}/${total_files} ($(( (progress * 100) / total_files ))%) ${file}"
done

echo_err 'Done!'


brianL 05-25-2011 10:34 AM

In Slackware, anything that isn't part of the install from CD or DVD usually has a suffix, examples:
htop-0.9-x86_64-1_SBo.tgz (SBo, from slackbuilds.org)
vlc-1.1.9-x86_64-1alien.txz (alien = Alien Bob = Eric Hameleers)
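
A hedged sketch for picking those out, assuming installed packages are recorded one file per package under /var/log/packages and that stock packages have a purely numeric build field. The function name is made up for illustration.

```shell
# third_party_tags: read package names on stdin, strip an optional .tgz/.txz
# style extension, and print only those whose final hyphen-separated field
# (the build number) carries a non-numeric tag such as _SBo or alien.
third_party_tags() {
    sed -E 's/\.t[a-z]z$//' | grep -E -- '-[0-9]+[^0-9-][^-]*$'
}
```

Something like ls /var/log/packages | third_party_tags would then list the candidates. Some official patch packages may also carry a tag, so the output is a starting point rather than a verdict.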

SaintDanBert 05-25-2011 01:58 PM

re: (BrianL) communication and equals

Never attempt a battle of wits with an unarmed opponent.

Never play leap frog with a unicorn. (grin)

Thanks for the tip,
~~~ 0;-Dan

chrism01 05-25-2011 07:12 PM

Can't resist posting this ....

"Never argue with an idiot. They'll drag you down to their level and beat you with experience"

:)

MTK358 05-25-2011 07:18 PM

@ SaintDanBert and chrism01

I don't understand, what is all this about?

brianL 05-25-2011 07:20 PM

Quote:

Originally Posted by SaintDanBert (Post 4366810)
re: (BrianL) communication and equals

My signature set them off. Sorry. :)

SaintDanBert 05-25-2011 09:03 PM

Quote:

Originally Posted by MTK358 (Post 4366662)
As far as I know, everything that comes with the distro is installed via the package manager. Also, most package managers have a command to see which package owns a file (it's "pacman -Qo" in Arch Linux). Here's what I would do...

I run *-buntu so I'd need to use apt-get or aptitude or synaptic
or dpkg but your "solution" provides a nice specification*.
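
A sketch of the same loop with the ownership check passed in as a parameter, so the package manager can be swapped without editing the script. The function name list_unowned is hypothetical. On Debian and Ubuntu the check would be dpkg -S, which exits non-zero when no package owns the path.

```shell
# list_unowned DIR CHECK...: print regular files under DIR for which the
# command CHECK, run with the file path appended, reports failure.
list_unowned() {
    dir=$1
    shift
    find "$dir" -type f | while IFS= read -r file
    do
        # stdin is redirected so the check cannot swallow find's output
        if ! "$@" "$file" > /dev/null 2>&1 < /dev/null
        then
            printf '%s\n' "$file"
        fi
    done
}
```

For example, list_unowned /usr/local dpkg -S on *-buntu, or list_unowned /usr/local pacman -Qo on Arch. dpkg -S treats its argument as a pattern, so file names containing wildcard characters may give misleading answers.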

Thanks,
~~~ 0;-Dan

____________________
* specification -- "Working code makes the best specification."
Corollary to Brooks's advice (The Mythical Man-Month), "Prepare to throw the first implementation away."

MTK358 05-26-2011 06:36 AM

Quote:

Originally Posted by SaintDanBert (Post 4367175)
I run *-buntu so I'd need to use apt-get or aptitude or synaptic
or dpkg but your "solution" provides a nice specification*.

Note that if I were really going to use it, I would make it skip directories that you already know you do or don't want to back up, such as /home, /sys, /var, /proc, etc.
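
A minimal sketch of that skipping, using find's -prune so the excluded trees are never descended into. The function name is made up for illustration, and GNU find is assumed for the -false test.

```shell
# find_files_skipping ROOT SKIP...: print regular files under ROOT without
# descending into any SKIP directory. Each SKIP is matched as a -path
# pattern against the path exactly as find would print it.
find_files_skipping() {
    root=$1
    shift
    n=$#
    # Rewrite the argument list into: -path SKIP1 -o -path SKIP2 -o ...
    while [ "$n" -gt 0 ]
    do
        set -- "$@" -path "$1" -o
        shift
        n=$((n - 1))
    done
    # The trailing -false closes the -o chain (and covers an empty skip list).
    find "$root" \( "$@" -false \) -prune -o -type f -print
}
```

The output of find_files_skipping / /home /sys /var /proc could then be piped into the ownership loop in place of find / -type f.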


All times are GMT -5.