I have the following file:
www/home/pictures/bird.gif
www/home/pictures/home.gif
www/home/documents/home.gif
I want to filter out the records that have a file name containing "home". I don't want the records that only have a directory named "home" to be included. To pick out only the filename, I do this:
fn=`echo $tmp | grep '[^\ /]*$' -o`
where $tmp is the first record in the file.
So, can I do this operation using awk? I have tried something like the command shown below, but I only get syntax errors (invalid char '`' in expression). Am I not allowed to use grep inside awk?
awk '{ fn=`echo $1 | grep '[^\/]*$' -o` if ( $fn ~/'home'/ ) print $0 }' $myFile > newfile
It's really unclear what you're trying to do. You say you want to "filter out" filenames w/'home' in them, but that you don't want "records with a directory named 'home' to be included".
1) by filter out, do you mean to ignore files with 'home' in them or that you want to "work with" files with 'home' in them?
2) all of the directories contain directories named 'home'
3) your regex "[^\ /]*$" matches any input line (see end of this post for why). Also, you needn't escape '/' in a character class - [^/] works fine.
4) what about something with awk -F'/' '{blah blah}' to get at each component of the full pathname (whatever it is you're doing)?
The regex "[^/]*$" matches any line that contains zero or more occurrences of any character but '/', followed by an end-of-line marker. Note that even "//////" matches, because [^/]* may match the empty string, so it is in a sense "optional". Think about that one. Also, the -o option puts grep into an endless loop with that regex (at least for me it does). I've never used -o, so I'm not familiar with what can go wrong with it.
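To make the point about the "optional" match concrete, here is a small sketch. It assumes a grep that supports -c and uses a temporary file with the sample paths from the question plus the "//////" line; the variable names are made up for illustration. The second pattern is one possible way to test only the last path component, not necessarily what the poster intended.

```shell
# Sample paths from the question, plus the "//////" line, in a temporary file.
paths=$(mktemp)
printf '%s\n' 'www/home/pictures/bird.gif' \
              'www/home/pictures/home.gif' \
              'www/home/documents/home.gif' \
              '//////' > "$paths"

# [^/]* may match nothing, so this pattern selects every line.
matching_lines=$(grep -c '[^/]*$' "$paths")

# Requiring "home" with no "/" anywhere after it tests only the file name.
home_files=$(grep '/[^/]*home[^/]*$' "$paths")

echo "lines matched by [^/]*\$: $matching_lines"
echo "$home_files"
rm -f "$paths"
```

The first pattern counts all four lines; the second keeps only the two paths whose last component contains "home".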
1) I want to work with the records whose file name actually contains "home". I don't want to work with www/home/pictures/bird.gif, and to avoid this I use the regular expression. If I don't use a regular expression, this record will be selected because it has "home" in it.
2) Yes, all the records contain "home", and that's the problem, as mentioned above.
3) As far as I can see, my regular expression is correct because it only returns the filename. I want to use this filename for further testing in the awk:
awk '{ fn=`echo $1 | grep '[^\/]*$' -o` if ( $fn ~/'home'/) print $0 }' $myFile > newfile
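You are trying to start the "grep" program from inside an awk script. awk has no backtick command substitution, which is why you get the syntax error. Running an external command from awk is possible (for example with system() or a "command" | getline pipe), but it is awkward. It also doesn't make a lot of sense here, since awk is more powerful than grep: it can do the same and a lot more, just with different syntax and semantics.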
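As a rough illustration of awk doing the whole job without grep, the sketch below splits each line on "/" and tests only the last component ($NF). It assumes the simple three-line file from the first post; home_named_files and the temporary file are hypothetical names used only for this example.

```shell
# Print the lines of FILE whose last "/"-separated component contains "home".
# (home_named_files is a hypothetical helper name, not from the thread.)
home_named_files() {
    awk -F/ '$NF ~ /home/' "$1"
}

# Demo on the three sample paths from the first post.
sample=$(mktemp)
printf '%s\n' 'www/home/pictures/bird.gif' \
              'www/home/pictures/home.gif' \
              'www/home/documents/home.gif' > "$sample"
home_named_files "$sample"
rm -f "$sample"
```

Because the test looks at $NF only, a directory named "home" earlier in the path does not cause a match.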
Thank you for the help! I realise that I should have told you the whole scenario instead of simplifying it too much. Sorry.
This is my actual file:
anna;www/home/pictures/bird.gif;23
arna;www/pictures/animal.gif;4
emma;www/home/documents/home.gif;333
kim;www/nothome/somebitmap.gif;123
sarah;www/somedir/pictures/next.gif;43
alf;www/home/documents/home.gif;1
It is semicolon-separated, so I can't use the field separator you suggested (-F/).
Is it possible to put your awk-command inside mine? Like this:
awk -F';' '{ fn=`your awk` if ( $fn ~/'home'/ ) print $0 }' $myFile > newfile
Again: I want to print out the records that have a filename with "home" in it.
Now (I think) I understand the "home" part of the problem, but it's still unclear to me which part of an input line you want to see in the output.
So here are 3 options:
Code:
# Entire line in output. In this case you can also use grep.
awk '/\/[^/]*home[^/]*;/{print $0}' files.txt
grep '/[^/]*home[^/]*;' files.txt
# Only the file in output, including the path:
awk -F\; '/\/[^/]*home[^/]*;/{print $2}' files.txt
# Only the file in output, with the path stripped:
awk -F\; '/\/[^/]*home[^/]*;/{sub(".*/","",$2) ; print $2 }' files.txt
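If a single regex over the whole line feels hard to read, a possible alternative is to let awk isolate the file name first and then test it, which is close to what the "fn=..." attempt was aiming for. This is only a sketch: it assumes the semicolon-separated layout shown above with the path in the second field, and filter_home_files is a hypothetical name.

```shell
# Print records whose file name (last "/" component of field 2) contains "home".
# (filter_home_files is a hypothetical helper name, not from the thread.)
filter_home_files() {
    awk -F';' '{
        n = split($2, parts, "/")      # parts[n] holds the file name
        if (parts[n] ~ /home/) print   # print the whole record
    }' "$1"
}

# Demo on the records posted above.
records=$(mktemp)
printf '%s\n' 'anna;www/home/pictures/bird.gif;23' \
              'arna;www/pictures/animal.gif;4' \
              'emma;www/home/documents/home.gif;333' \
              'kim;www/nothome/somebitmap.gif;123' \
              'sarah;www/somedir/pictures/next.gif;43' \
              'alf;www/home/documents/home.gif;1' > "$records"
filter_home_files "$records"
rm -f "$records"
```

On the sample records this prints only the emma and alf lines; the kim record is skipped because "nothome" is a directory, not the file name.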