LinuxQuestions.org
Old 08-04-2010, 11:36 PM   #1
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 3,052

Rep: Reputation: 45
Looking for the last five files created on the hard disk.


Kernel 2.6.21.5, GNU (Slackware 12.0).
Bash 3.1.17.

Hi:
I want to search an entire subtree of / for all files with the extension html. In addition, I only want the last five that were created.

I think I could split the problem into two parts: (a) Forget about the last condition. Then this is a job for the find command. (b) Sort the output of find using the date as the key, then use 'head' to print the desired output.

But even two such simple steps are enough to justify the writing of a shell script. And here lies my weakness. My script writing knowledge is rudimentary.

What's the final purpose? Well, I lately saved four or five LQ pages onto disk containing information I consider valuable to me. But I don't exactly remember where on the disk. So...

Then: either the problem posed is really of a very simple nature or it is not, in the latter case a script being mandatory. Any suggestion will be welcome. Thank you for reading.

EDIT: one drawback of the algorithm described above is that find may run for a very long time. My machine's resources are scarce (little RAM and a slow CPU), and there are possibly a large number of HTML files on the disk.

Last edited by stf92; 08-04-2010 at 11:48 PM.
 
Old 08-05-2010, 12:17 AM   #2
sag47
Senior Member
 
Registered: Sep 2009
Location: Philly, PA
Distribution: Kubuntu x64, RHEL, Fedora Core, FreeBSD, Windows x64
Posts: 1,413
Blog Entries: 33

Rep: Reputation: 355
Try this command (I'm away from my Linux machine until tomorrow, so this is off the top of my head). I'll be able to give you a more correct command tomorrow if this one is wrong...
Code:
find / -type f -name "*.htm*" -print0 | xargs -0 ls -t | head
It is better to run this as root, with the su or sudo command.

SAM

Last edited by sag47; 08-05-2010 at 12:19 AM.
 
Old 08-05-2010, 12:37 AM   #3
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 3,052

Original Poster
Rep: Reputation: 45
Hi:
and thanks. It began outputting file names until it was interrupted with this message:
xargs: ls: terminated by signal 13.

I see that the option 't' given to 'ls' is a key piece of the command. Unfortunately, signals are still beyond the scope of my Linux/Unix knowledge. Regards.

EDIT: I ran it as root.
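
For readers puzzled by the same message: signal 13 is SIGPIPE. Here head exits after printing its ten lines, the pipe closes, and ls is killed the next time it tries to write. The message is usually harmless. A minimal sketch, assuming a POSIX-style shell and the seq utility (neither is named in the thread):

```shell
# Ask the shell for the name of signal number 13.
sig_name=$(kill -l 13)
echo "signal 13 is SIG${sig_name}"

# 'seq' would print a million lines, but 'head' exits after five and closes
# the pipe; 'seq' is then stopped early. The five lines still arrive.
first_five=$(seq 1 1000000 | head -n 5)
echo "$first_five"
```

Note that when xargs splits a long file list across several ls invocations, each invocation sorts only its own batch, so the first lines of that pipeline are not guaranteed to be the newest files overall.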

Last edited by stf92; 08-05-2010 at 12:39 AM.
 
Old 08-05-2010, 02:54 AM   #4
Guttorm
Senior Member
 
Registered: Dec 2003
Location: Trondheim, Norway
Distribution: Debian and Ubuntu
Posts: 1,115

Rep: Reputation: 218
Hi

Try this:

find / -name "*.html" -type f -printf "%C+ %P\n" |sort |tail -n 5
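
A hedged note on how this works, assuming GNU find (the -printf action is a GNU extension): %C+ prints the file's last status-change time (ctime) in a sortable form and %P prints the path relative to the starting directory, so a plain sort orders the lines chronologically and tail keeps the newest five. This is not a true creation time; for freshly saved pages the modification time (%T+) is often the more natural key. The sketch below uses %T+ on a throwaway directory, with made-up file names, and assumes GNU touch for setting the timestamps:

```shell
# Build a scratch tree with six HTML files whose modification times differ.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b"
for i in 1 2 3 4 5 6; do
    f="$tmp/a/b/page$i.html"
    touch -d "2010-08-0$i 12:00:00" "$f"   # GNU touch: set mtime explicitly
done

# %T+ = modification time in sortable form, %P = path relative to "$tmp".
newest=$(find "$tmp" -name "*.html" -type f -printf "%T+ %P\n" | sort | tail -n 5)
echo "$newest"

rm -rf "$tmp"
```

Starting find at a narrower directory than / (a home directory, for example) keeps the run time down on a slow machine.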

Last edited by Guttorm; 08-05-2010 at 02:55 AM. Reason: Added -type f to exclude anything that's not a regular file
 
Old 08-05-2010, 03:16 AM   #5
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 3,052

Original Poster
Rep: Reputation: 45
Hi:
and welcome. I tried it and it worked fine. However, I won't get the full use out of it until I understand what kind of argument find's option 'name' takes. Is it a regexp? The manual does not say. Is it left to the shell alone to expand the argument? Or perhaps both things happen, one after the other. It's always been a mystery to me. Thanks and regards.
 
Old 08-05-2010, 07:07 AM   #6
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,458

Rep: Reputation: 1941
Actually it is not a regexp, nor is it expanded by the shell. The double quotes protect the asterisk from the shell, so that it is passed literally to the find command. Find then interprets it much as the shell would, matching any name that ends with the suffix .html.

In a regexp, by contrast, the asterisk means any number of repetitions (zero or more) of the previous expression, but in this case there isn't any previous expression.

If you instead let the shell expand it (no quotes and no backslash escape), the pattern is replaced by the names of the html files (if any) in the current working directory. If there are no html files there, the asterisk is passed literally. If there is exactly one, find searches only for files with that specific name. With two or more, find typically stops with an error, because the extra names are not valid expressions.
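
The difference can be seen in a small experiment. This is only a sketch on a scratch directory; the file names are invented for the demonstration:

```shell
# Scratch directory with one HTML file at the top and one in a subdirectory.
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/top.html" "$tmp/sub/deep.html"
cd "$tmp" || exit 1

# Quoted: find receives the literal pattern *.html and matches both files.
quoted=$(find . -name "*.html" | sort)

# Unquoted: the shell first expands *.html to top.html (the only match in
# the current directory), so find looks for that one name only.
unquoted=$(find . -name *.html | sort)

echo "quoted:   $quoted"
echo "unquoted: $unquoted"

cd / && rm -rf "$tmp"
```

The unquoted form silently misses sub/deep.html, which is why quoting the pattern is the safer habit.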
 
Old 08-05-2010, 11:27 AM   #7
sag47
Senior Member
 
Registered: Sep 2009
Location: Philly, PA
Distribution: Kubuntu x64, RHEL, Fedora Core, FreeBSD, Windows x64
Posts: 1,413
Blog Entries: 33

Rep: Reputation: 355
I know this is already solved, but since you were specifically looking for linuxquestions.org files, here is a command that lists all files containing one or more instances of the word you are looking for. I'll break it down for you since you seem to be new to the Unix/Linux shell. In the future, just replace "linuxquestions" with the search term you want. If you want to search more than just html files, modify the "*.htm*" pattern that follows -iname.
Code:
find -iname "*.htm*" -type f -print0 | xargs -0 grep -iH "linuxquestions" | cut -d: -f1 | sort -u
The pipe (|) character feeds the output of the command on its left as input to the command on its right. You can chain more than one pipe in a single command line. Just remember the Unix tool philosophy: make a tool that does one thing and does it well, then pipe several tools together to do what you want.

find -iname "*.htm*" -type f -print0 (for more information use "man find" in terminal)
  • Find finds files based on arguments you give it. If you give it no arguments then it outputs every file from the current directory and all sub directories (including hidden files).
  • -iname keeps only the files whose names match the pattern which follows it, ignoring case. In this case all .htm and .html files ("*.htm*")
  • -type f shows only regular files (not directories)
  • -print0 uses null characters to separate the results instead of new lines. This way file names containing spaces and other special characters are passed intact to xargs.

xargs -0 grep -iH "linuxquestions" (for more information use "man xargs" and "man grep" in terminal)
  • xargs reads items from its input and passes them as arguments to another command.
  • -0 tells xargs that the input items are separated by null characters instead of whitespace. This way file names containing spaces, new lines, and other special characters are handled correctly.
  • grep OPTIONS PATTERN FILE receives its FILE arguments from xargs (in this case the file paths found by find), so grep searches the contents of each of those files.
  • In grep -iH, the -i option makes the search PATTERN case insensitive. The -H option tells grep to print the path of the file in which the pattern is found. The separator between the file name and the contents of the matching line is a colon ( : ).

cut -d: -f1 (for more information use "man cut" in terminal)
  • cut is for splitting up a string or line of output. Similar to str.split() in other programming languages.
  • -d: tells cut to use a colon ( : ) as a delimiter and split the contents.
  • -f1 tells cut to only show the first field of the split. In our case it is just the file name and not the contents found by grep since that is what is useful in our case.

sort -u (for more information use "man sort" in terminal)
  • sort does what it says. It sorts a list alphabetically by default.
  • -u means sort as a unique list. This way when there's a hundred instances of our search term found in a file we just see a single file name.
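
As a possible simplification (a sketch, not a correction of the command above): grep's -l option prints only the names of files that contain a match, once per file, so the cut step becomes unnecessary and file names that themselves contain a colon are not truncated. The directory and file names below are invented for the demonstration:

```shell
tmp=$(mktemp -d)
printf 'Saved from LinuxQuestions.org\nlinuxquestions again\n' > "$tmp/saved page.html"
printf 'nothing relevant here\n' > "$tmp/other.htm"

# -i: ignore case, -l: list each matching file name once instead of the lines.
matches=$(find "$tmp" -iname "*.htm*" -type f -print0 | xargs -0 grep -il "linuxquestions" | sort)
echo "$matches"

rm -rf "$tmp"
```

The file with two matching lines is listed a single time, and the space in its name survives because of -print0 and -0.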

Last edited by sag47; 08-05-2010 at 07:00 PM.
 
Old 08-05-2010, 07:18 PM   #8
sag47
Senior Member
 
Registered: Sep 2009
Location: Philly, PA
Distribution: Kubuntu x64, RHEL, Fedora Core, FreeBSD, Windows x64
Posts: 1,413
Blog Entries: 33

Rep: Reputation: 355
Quote:
Originally Posted by stf92 View Post
Hi:
and welcome. I tried it and it worked fine. However, I won't get the full use out of it until I understand what kind of argument find's option 'name' takes. Is it a regexp? The manual does not say. Is it left to the shell alone to expand the argument? Or perhaps both things happen, one after the other. It's always been a mystery to me. Thanks and regards.
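Not quite: as colucix says, -name takes a shell-style pattern rather than a regular expression. Regular expressions are what grep (and find's -regex option) use, and they are similar to Perl's. Not exact though. Refer to this link...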
http://www.grymoire.com/Unix/Regular.html

Here's an example testing if a line starts with an ip address.

Code:
#IP shows up in output because the line starts with it
echo "192.168.1.1" | grep "^[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}"

#IP does not show up because the line doesn't start with it
echo "hello 192.168.1.1" | grep "^[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}"
So if you know your way around perl regex then you should be right at home.

Here are two ways of using the find command to search case-insensitively for names containing Windows (or windows or WINDOWS)...
Code:
#case insensitive argument in the find command
find -iname "*windows*"

#using bracket expressions in the pattern to define case insensitivity.
find -name "*[Ww][Ii][Nn][Dd][Oo][Ww][Ss]*"
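
To make the distinction between the two pattern languages concrete, here is a small sketch assuming GNU find: -name and -iname take shell-style patterns matched against the file name only, whereas -regex and -iregex take a regular expression that must match the whole path. The scratch files are hypothetical:

```shell
tmp=$(mktemp -d)
touch "$tmp/Windows.txt" "$tmp/my-WINDOWS-notes" "$tmp/linux.txt"

# Glob: * means "any run of characters"; matched against the base name.
by_glob=$(find "$tmp" -type f -iname "*windows*" | sort)

# Regex: .* means "any run of characters"; matched against the whole path,
# so the pattern has to account for the leading directories as well.
by_regex=$(find "$tmp" -type f -iregex ".*windows[^/]*" | sort)

echo "$by_glob"
echo "$by_regex"

rm -rf "$tmp"
```

Both commands select the same two files; only the pattern syntax differs.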

Last edited by sag47; 08-05-2010 at 07:24 PM.
 
Old 08-05-2010, 10:35 PM   #9
stf92
Senior Member
 
Registered: Apr 2007
Location: Buenos Aires.
Distribution: Slackware
Posts: 3,052

Original Poster
Rep: Reputation: 45
Thanks to everybody for the useful information and the explanation about the argument 'name' of 'find'. I can't leave the thread without saying this: Trondheim was mentioned in Harold Foster's Prince Valiant, and you, colucix, shouldn't put Italy after the name of Bologna, whose university was once illustrious in Europe.
 
  

