use Awk to isolate a specific directory level...
hello
I have used Awk in the past to isolate the file name from a given path. That is to say, I may have a list of files contained in list.txt:
FIG. 1.
dir1/dir2/dir3/file1.dat
dir4/dir5/dir6/file2.dat
dir7/dir8/dir9/file3.dat
dir10/dir11/dir12/file4.dat
...and so on...
and I used the Awk command:
Code:
cat list.txt | awk -F "/" '{print $NF}'
which gives:
FIG. 2.
file1.dat
file2.dat
file3.dat
file4.dat
...and so on...
I now want to do almost the exact opposite: instead of isolating the file name, I want to isolate, say, the middle directory in the list shown in Fig. 1. That is to say, I want to end up with output that reads:
FIG. 3.
dir2
dir5
dir8
dir11
...and so on...
Can someone please post the Awk command that would do this? (I assume it will be very similar in form to the Awk command I showed above.) The point is, sometimes I may want to isolate the second directory, sometimes the third or tenth or whatever, so I am hoping that if someone posts the Awk command to isolate the second-level directory (to produce the output shown in Fig. 3), it will be fairly obvious from the form of the command how to alter it to isolate any other directory I want. I hope I've been clear in what I'm asking! |
Yes, very similar, replace $NF with $2.
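To make the field number easy to change, it can also be passed in as an awk variable. This is only a minimal sketch: the helper name nth_field and the variable n are made up for illustration, and it assumes the paths have no leading slash, as in Fig. 1.

```shell
# nth_field N [FILE...]: print the Nth "/"-separated field of each line.
# (nth_field is a hypothetical helper name, not a standard command.)
nth_field() {
    n=$1
    shift
    # With no FILE arguments, awk reads standard input.
    awk -F/ -v n="$n" '{ print $n }' "$@"
}

# Try it on two of the sample paths from Fig. 1.
printf '%s\n' dir1/dir2/dir3/file1.dat dir4/dir5/dir6/file2.dat | nth_field 2
```

With a real file this would be called as nth_field 2 list.txt. Note that a path starting with "/" has an empty first field, so every directory number shifts up by one.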
|
What about checking the man page of awk, section Fields?
|
Quote:
Code:
$ basename path/to/file
Code:
path=path/to/file
|
I would add that cat is a wasted command here as well. Just pass the file name to awk.
|
If, however, you want a list of the unique directory names, do something like this:
Code:
gawk -F'/' '{++directory[$3]} END {for (i in directory) {print i " (" directory[i] " files)"}}'
Here's what the output looks like:
Code:
$ ls -1 */*/*/* | gawk -F'/' '{++directory[$3]}; END {for (i in directory) {print i " (" directory[i] " files)"}}'
|
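The same counting idea can be tried on a small made-up list without needing a real directory tree. This is a sketch only: count_dirs is a hypothetical helper name, the sample paths are invented, and plain awk is assumed to be enough since the program uses no gawk-specific features. Because awk's for-in loop visits array keys in an unspecified order, the result is piped through sort.

```shell
# count_dirs N: count how often each Nth-level directory occurs on stdin.
# (count_dirs is a hypothetical helper name.)
count_dirs() {
    awk -F/ -v n="$1" '
        { ++count[$n] }                       # tally the Nth field
        END {
            for (d in count)                  # key order is unspecified
                print d " (" count[d] " files)"
        }
    ' | sort
}

# Invented sample data: two files under a/b/c, one under a/b/d.
printf '%s\n' a/b/c/f1 a/b/c/f2 a/b/d/f3 | count_dirs 3
```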
Quote:
Code:
var=0
Code:
$ bash test.sh
|
+1 to kurumi's post as my sentiments exactly.
|
Bash v4.2 has introduced the lastpipe shell option, which makes the last command in a pipe chain run in the current environment, ksh-style. So the variable-scope problem can now be avoided, at least. However, I think it's still better to use bash's built-in file access instead of forking off a process for the external cat.
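A minimal sketch of the redirection form, assuming a throwaway sample file made with mktemp (the names list, count and line are illustrative only). Since no pipe is involved, the loop runs in the current shell and the counter keeps its value afterwards, with or without lastpipe.

```shell
# Build a small sample list in a temporary file (illustrative data).
list=$(mktemp)
printf '%s\n' dir1/dir2/dir3/file1.dat dir4/dir5/dir6/file2.dat > "$list"

count=0
# Redirecting the file into the loop avoids both cat and the pipe,
# so the loop body is not run in a subshell.
while IFS= read -r line; do
    count=$((count + 1))
done < "$list"

echo "lines read: $count"
rm -f "$list"
```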
As for the OP's request, there are also several ways we can go about it inside bash. The first and probably best is to use an array to separate the name into fields.
Code:
IFS=/
Code:
while read dirname; do
Code:
re='([^/]+)/([^/]+)/([^/]+)/([^/]+)'
|
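A minimal sketch of the array idea, assuming bash (read -a and local are not POSIX). The function name nth_dir and the array name parts are made up for illustration; setting IFS only as a prefix to read keeps the change from leaking into the rest of the script.

```shell
#!/usr/bin/env bash
# nth_dir N: print the Nth "/"-separated component of each path on stdin.
# (nth_dir is a hypothetical helper name; this sketch needs bash.)
nth_dir() {
    local n=$1 parts
    # IFS=/ applies to this read only; -a splits the line into an array.
    while IFS=/ read -r -a parts; do
        printf '%s\n' "${parts[n-1]}"   # arrays are zero-based
    done
}

printf '%s\n' dir1/dir2/dir3/file1.dat dir4/dir5/dir6/file2.dat | nth_dir 2
```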
Quote:
If it’s important to have the name of the file in question at the beginning of the statement, I would suggest defining a function for it. Inside the function you can put it at the end to feed the while loop, but in the function call it’s the argument. |
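One way this might look, as a sketch only: print_second_dirs is a hypothetical function name, and the sample file is a temporary one created just for the demonstration.

```shell
# print_second_dirs FILE: print the second path component of each line.
# (print_second_dirs is a hypothetical name used for illustration.)
print_second_dirs() {
    # read splits on "/" into the first field, the second, and the remainder.
    while IFS=/ read -r first second rest; do
        printf '%s\n' "$second"
    done < "$1"     # the file name sits at the end here...
}

# Illustrative sample data in a temporary file.
list=$(mktemp)
printf '%s\n' dir1/dir2/dir3/file1.dat dir4/dir5/dir6/file2.dat > "$list"

print_second_dirs "$list"   # ...but comes first, as the argument, in the call
rm -f "$list"
```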
David, you forgot one of my favourite array-style options :)
Code:
while read -r dirs; do |
My bad. :(
Actually I don't like recommending the positional parameters, at least not without a warning for the newbies. Since set overwrites any previous values, you might mess up your script if they're already in use for other things. Still, it does have the benefit of not needing to set IFS. BTW, the UUOC award text demonstrates how you can list the filename first without the use of cat. I don't know if it's any more readable, though. Code:
<list.txt awk -F "/" '{print $NF}' |
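One way to sidestep the overwriting concern is to do the set inside a function, since a function has its own positional parameters and the caller's are restored on return. The sketch below is not the variant discussed above: it does set IFS, but inside a subshell so neither IFS nor the noglob setting leaks out. The name second_dir is hypothetical.

```shell
# second_dir PATH: print the second "/"-separated component of PATH.
# (second_dir is a hypothetical helper name.)
second_dir() {
    (
        IFS=/           # split on slashes, inside this subshell only
        set -f          # disable globbing for the unquoted expansion
        set -- $1       # path components become $1, $2, $3, ...
        printf '%s\n' "$2"
    )
}

set -- keep these          # pretend the script already uses $1 and $2
second_dir dir1/dir2/dir3/file1.dat
echo "caller still has: $1 $2"
```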