I am just now learning awk, and I really want to do some basic things in a script.
I have, essentially, a key-value pair file: each line holds a text key, then a text value, separated by a tab. I want to search the file for the key, then return the value.
The "TWO=$2" is there because the whole awk program has to sit inside the same quoted string: I need KEY_VAL to be expanded, but I need the literal text $2.
This worked fine, until I realized that keys could contain the '/' character. Now I'm stuck, as I don't know how to escape the '/' character to prevent the expansion of KEY_VAL from artificially ending the search block.
Ideally, I would also like to only return the entry from the first line the key is found on.
And yes, I swear I did search this first, but I couldn't find anything on escaping the '/'.
How is the goofy KEY_VAL being read in? Perhaps the simplest thing is to edit the input, so that an argument passed like 'goofy/key' becomes 'goofy\/key' before awk ever sees it, like `echo "$ORIG_KEY_VAL" | sed 's,/,\\/,g'`.
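A minimal sketch of that escaping step, assuming the key arrives in a shell variable. The names escape_slashes and ESCAPED_KEY_VAL are only illustrative. The g flag makes sed escape every '/', not just the first, and printf is used instead of echo so that any backslashes in the key pass through untouched. Other regex metacharacters in the key (such as '.' or '+') are left alone and still act as regex syntax inside /.../.

```shell
# Hypothetical helper: escape every '/' so the key can sit inside /.../ in awk.
escape_slashes() {
    printf '%s\n' "$1" | sed 's,/,\\/,g'
}

ORIG_KEY_VAL='goofy/key'
ESCAPED_KEY_VAL=$(escape_slashes "$ORIG_KEY_VAL")
printf '%s\n' "$ESCAPED_KEY_VAL"    # goofy\/key

# Inside double quotes, \$2 reaches awk as the literal text $2.
printf 'foo\tbar\ngoofy/key\tfnord\n' |
    awk -F'\t' "/$ESCAPED_KEY_VAL/ {print \$2}"    # fnord
```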
In particular, my file holds the output of file in the first column and an executable handler in the second column. When I feed it the key "PNG image data, 253 x 338, 8-bit/color RGBA, non-interlaced" I get the error message
but this works fine for keys not containing "/". If I escape the '/' before sending it, what will awk see in argv, "...bit\/color..." or just "...bit/color..."?
Okay, so a quick test script confirms that awk will see ...bits/color...
----
But alas, this leaves me with the same problem. The '\/' escape sequence prevents /bin/sh from reading any special meaning into the '/', but it's awk that is having trouble with it.
Last edited by PatrickNew; 08-19-2007 at 02:00 AM.
No, awk will see 'bits\/color' if it's quoted. This isn't directed at you, as such, but at *everybody*. Every question like this should have three code blocks. One for input, one for the script, one for expected or actual output.
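The two observations differ only in shell quoting, which a quick test can show. This is a small sketch rather than either poster's actual test script; the helper name show_arg is made up, and it simply prints the first argument awk receives.

```shell
# Hypothetical helper: print the first command-line argument exactly as awk sees it.
# The program has only a BEGIN block, so awk never tries to open the argument as a file.
show_arg() {
    awk 'BEGIN { print ARGV[1] }' "$1"
}

# Unquoted: the shell consumes the backslash, so awk receives a plain '/'.
show_arg 8-bit\/color      # 8-bit/color

# Single-quoted: the backslash survives and awk receives '\/'.
show_arg '8-bit\/color'    # 8-bit\/color
```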
Code:
:cat test.txt
foo bar
baz mu
goofy/key fnord
PNG image data, 253 x 338, 8-bit/color RGBA, non-interlaced picture wtf
Now, I have to do that absolutely daft bit of assignment to KEY_VAL because I don't know how you're getting it. But that's one way to illustrate a demonstration case that extracts 'wtf' from the file, where it's $2 attached to the PNG junk. If that's not what you're doing, throw three code blocks back at me.
-- Oh, and I should note that the delimiter for cut is a literal tab, though it looks like a space.
All right, the KEY_VAL is the exact output of `file -b` on a file, so I cannot trust its formatting, except that I know it does not contain tabs, since /etc/magic is tab-delimited.
~/rundb.gz (zcat-ed)
Code:
Rich Text Format data, version 1, ANSI oowriter
OpenDocument Text oowriter
ASCII C program text gedit
ASCII text gedit
PNG image data, 253 x 338, 8-bit/color RGBA, non-interlaced gthumb
run.sh
Code:
KEY_VAL=`file -b "$1"`
#to prevent expansion of $2
TWO='$2'
#look it up (note: a '/' in KEY_VAL still ends the /.../ pattern early)
CMD=`zcat ~/rundb.gz | awk -F'\t' "/$KEY_VAL/ {print $TWO}"`
#echo the command instead of executing while developing
echo "$CMD"
Yes, I know better implementations of a file-type recognizing launcher have been done, but I'm writing this as a learning experience, not to produce a new tool.
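One way to sidestep both the '/' problem and the TWO='$2' workaround is to keep the awk program in single quotes and hand the key to awk as data rather than splicing it into the program text. A rough sketch, assuming an exact match on the first tab-separated field is acceptable. The function name lookup_handler and the sample file are made up for illustration, and a plain-text copy of the database stands in for the output of zcat ~/rundb.gz.

```shell
# Hypothetical helper: print the handler for an exact key match, first hit only.
# Usage: lookup_handler KEY DBFILE   (DBFILE is tab-separated: key<TAB>handler)
lookup_handler() {
    # ENVIRON passes the key as a plain string, so '/' and '\' need no escaping.
    KEY="$1" awk -F'\t' '$1 == ENVIRON["KEY"] { print $2; exit }' "$2"
}

# Sample data in the same shape as rundb (illustrative only).
db=$(mktemp)
printf '%s\t%s\n' \
    'ASCII text' 'gedit' \
    'PNG image data, 253 x 338, 8-bit/color RGBA, non-interlaced' 'gthumb' > "$db"

lookup_handler 'PNG image data, 253 x 338, 8-bit/color RGBA, non-interlaced' "$db"   # gthumb
```

Because the program is single-quoted, $1 and $2 reach awk untouched and no shell variable is needed to protect them.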
Glad it works for you, but I thought we were learning awk. If you can change that much, you should probably change a whole lot more. For instance, within the context of creating a 'rundb' I might use file's '-i' option because how many 253 x 338 PNGs are you going to have? I just feel that most of the work should be done getting a rundb in a useful format in the first place.
But my brain is obviously *completely* non-functional these days, so I'll leave you in the hands of the rest of LQ.
Anyway - FWIW, for my 'open files with programs' script, I also used file, but just used the shell's case statement to match stuff:
Code:
TYPE="$(file -Lb -- "$1")"
case $TYPE in
...
JPEG*|PNG*|X\ pixmap*|GIF*) f_pictureapp "$1" ;;
...
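For comparison, a self-contained sketch of the same case-based dispatch. The function pick_handler and the program names it prints are hypothetical stand-ins (f_pictureapp and the elided branches above are not reproduced), and the type string is passed in directly so the example does not depend on file being installed.

```shell
# Hypothetical dispatcher: map a `file -b` style description to a program name.
pick_handler() {
    case $1 in
        JPEG*|PNG*|X\ pixmap*|GIF*) echo gthumb ;;      # image types
        ASCII*text*)                echo gedit ;;       # plain text and source code
        OpenDocument\ Text*)        echo oowriter ;;
        *)                          echo unknown ;;     # no handler known
    esac
}

pick_handler 'PNG image data, 253 x 338, 8-bit/color RGBA, non-interlaced'   # gthumb
pick_handler 'ASCII C program text'                                           # gedit
```

A '/' has no special meaning in a case pattern, so the escaping problem never comes up with this approach.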
Quote:
Glad it works for you, but I thought we were learning awk.
Alas, you are correct. My instinct to get something working overpowered my common sense. Learning a bit of awk is indeed one of the major goals of this project.
Quote:
If you can change that much, you should probably change a whole lot more. For instance, within the context of creating a 'rundb' I might use file's '-i' option because how many 253 x 338 PNGs are you going to have?
Actually, MIME types would have been my first choice, but they have the opposite problem from file's native descriptions: they are too coarse. My 'file' identifies OpenDocument Text files as simply "application/x-zip". The inability to recognize ODT files is a deal-breaker for me.
Quote:
I just feel that most of the work should be done getting a rundb in a useful format in the first place.
And if this silly little script had an intended user base beyond myself, I would agree. However, this is little more than a toy that I can design and redesign at will, no legacy users to support. If I have to start over, at least I'll have learned a bit of awk.
And in future versions of this script, it will probably use a more intelligent matching algorithm, perhaps only matching before the comma, as file's format seems to use that.
Many thanks, the match() function was exactly what I wanted. Using it instead of the /some_stuff/ syntax allowed awk to ignore the '/' in the string. I also implemented a way of ensuring that only one match is made. Here it is now:
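A sketch of what such a script might look like. This illustrates the approach described and is not the poster's actual listing. It assumes the key is handed to awk with -v and that exit is what stops the search after the first hit. The function name find_handler and the sample data are made up. Note that match() treats the key as a regular expression, so characters such as '.' or '+' in a key still act as metacharacters, and -v interprets backslash escapes in the value; index($1, key) would be a purely literal substring test.

```shell
# Hypothetical helper: first line whose key field matches, via match().
# Usage: find_handler KEY DBFILE   (DBFILE is tab-separated: key<TAB>handler)
find_handler() {
    # -v hands the key to awk as a variable, so a '/' in it cannot end a /.../ pattern.
    # match() returns a non-zero position on success; exit stops after the first hit.
    awk -F'\t' -v key="$1" 'match($1, key) { print $2; exit }' "$2"
}

# Sample data (illustrative only).
db=$(mktemp)
printf '%s\t%s\n' \
    'ASCII C program text' 'gedit' \
    'PNG image data, 253 x 338, 8-bit/color RGBA, non-interlaced' 'gthumb' \
    'PNG image data' 'eog' > "$db"

find_handler '8-bit/color' "$db"       # gthumb
find_handler 'PNG image data' "$db"    # gthumb (first matching line wins)
```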
And by adding this line before the search for the command,
Code:
KEY_VAL=`echo $KEY_VAL | awk -F, '{print $1}'`
I can search for only the part preceding a comma, if there is one. While I don't know this for certain, it appears that the part of file's output preceding the comma is sufficient to identify the file type, and whatever follows the comma is merely additional information that file could extract. So one rundb entry for PNGs can match all PNGs, etc.
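The same trimming can also be done in the shell itself with parameter expansion, which avoids the extra awk process. A small sketch with both forms side by side; the sample value is the PNG description from earlier in the thread, and the variable names SHORT_AWK and SHORT_SH are only illustrative.

```shell
KEY_VAL='PNG image data, 253 x 338, 8-bit/color RGBA, non-interlaced'

# awk form, following the post: keep the text before the first comma.
SHORT_AWK=$(printf '%s\n' "$KEY_VAL" | awk -F, '{ print $1 }')

# Shell form: %%,* strips the longest suffix that starts with a comma.
SHORT_SH=${KEY_VAL%%,*}

printf '%s\n' "$SHORT_AWK"   # PNG image data
printf '%s\n' "$SHORT_SH"    # PNG image data
```

A value with no comma in it passes through both forms unchanged.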