My question here is,
Which one is more efficient, the cut command or awk? Which one works better in all situations?
Right now I can't think of a situation where they would differ. However, I would like to know from your experience whether you have found any difference.
Are there any basic guidelines for writing efficient shell scripts?
Thanks in advance.
Cheers,
Suresh
Last edited by suresh.chola; 01-21-2010 at 01:17 AM.
In my opinion, in this example, they are the same. You can check with the 'time' command. 'cut' is used when the separator is a single character such as a colon or a comma, while 'awk' is used when columns are separated by a varying number of spaces.
You cannot get reliable timing by doing something once. I wrapped each command in a for loop to repeat it 1000 times, and then ran 3 tests for each (to get a handle on the consistency). Cut averaged 18.735s for the thousand, while awk averaged 20.079s, which is 7% longer. That is a small difference; if one is seeking to improve performance, there are likely to be bigger gains elsewhere. (Starting with not using shell scripting!)
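A repeat-and-time test of this kind could look roughly like the sketch below. The sample record and the function names `bench_cut` and `bench_awk` are made up for illustration; the exact commands and input used for the figures above are not shown in the thread.

```shell
# Hypothetical sample record; the input used in the thread is not shown.
line='alpha,beta,gamma,delta'

# Extract the third field with cut, repeated $1 times.
# Every iteration starts a new cut process, so start-up cost is included.
bench_cut() {
  i=0
  while [ "$i" -lt "$1" ]; do
    printf '%s\n' "$line" | cut -d, -f3 >/dev/null
    i=$((i + 1))
  done
}

# The same loop with awk.
bench_awk() {
  i=0
  while [ "$i" -lt "$1" ]; do
    printf '%s\n' "$line" | awk -F, '{ print $3 }' >/dev/null
    i=$((i + 1))
  done
}

# In bash, `time` is a keyword and can time a function call directly:
#   time bench_cut 1000
#   time bench_awk 1000
```

Running each timing several times and averaging, as described above, helps smooth out noise from other activity on the machine.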
Last time I did a similar comparison, I compared a 10,000 iteration loop operating on a long silly string.
cut was faster than tr; tr was faster than sed; sed was faster than awk. cut and tr were very close. sed was significantly faster than awk when doing substring replacements.
Quote:
My question here is,
Which one is efficient(using cut cmd or awk).
If you are acting on one string, there is not much difference. If your task is to get fields 2 to 5, using cut's range option -f2-5 may be "cleaner", whereas in awk you have to use a loop (but that's not a big problem).
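As a rough illustration of the range case, the two approaches might look like this. The function names and the sample record are hypothetical.

```shell
# Hypothetical sample record.
line='a,b,c,d,e,f'

# cut: the field range is a single option.
fields_cut() {
  cut -d, -f2-5
}

# awk: loop over the wanted fields and put the separators back by hand.
fields_awk() {
  awk -F, '{
    out = $2
    for (i = 3; i <= 5; i++) out = out "," $i
    print out
  }'
}

printf '%s\n' "$line" | fields_cut   # prints: b,c,d,e
printf '%s\n' "$line" | fields_awk   # prints: b,c,d,e
```

The two are not identical on short records: with fewer than five fields, this awk loop still emits the separators (leaving trailing commas), while cut prints only the fields that exist.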
Quote:
Which one works better in all situations.
awk, of course. It's a programming language; cut is just a small tool that does one task. You seldom need cut (or grep, sed, etc.) when you know awk.
Quote:
Is there any basic guidelines to write a shell script on efficiency part.
If you are concerned with efficiency, always try to use the shell's internal commands. If you want to cut a string up, make use of IFS, set, etc. That way, there's no need to start external tools. However, if you want to process BIG files, the shell is not the way to go. Use awk, or a language such as Perl or Python, for that.
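A minimal sketch of the IFS and set approach for a single string is shown below. The function name `third_field`, the variable `third`, and the sample record are made up for illustration.

```shell
# Hypothetical sample record.
line='alpha,beta,gamma,delta'

# Split a comma-separated string with shell built-ins only.
# The result is left in the variable `third`, which avoids a command
# substitution (and therefore a fork) in the caller.
third_field() {
  old_ifs=$IFS
  IFS=,
  set -f            # turn off pathname expansion while splitting
  set -- $1         # unquoted on purpose: word-split on IFS
  set +f            # assumes globbing was enabled before the call
  IFS=$old_ifs
  third=$3
}

third_field "$line"
printf '%s\n' "$third"   # prints: gamma
```

Inside a function, `set --` replaces only the function's own positional parameters, so the caller's arguments are left alone. The `set -f` step matters when a field contains characters such as `*`, which would otherwise be expanded to file names.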
Quote:
Cut averaged 18.735s for the thousand, while awk averaged 20.079s, which is 7% longer.
You cannot compare them like that. awk is a bigger executable than cut, and cut performs only one simple task, i.e. cutting up a string. awk does a lot more and takes a little more time to "load" when executed. If you want to compare apples to apples, make cut do the same things awk does.
Quote:
Last time I did a similar comparison, I compared a 10,000 iteration loop operating on a long silly string.
cut was faster than tr; tr was faster than sed; sed was faster than awk. cut and tr were very close. sed significantly faster than awk, doing substring replacements.
FWIW
A lot also depends on how well one knows one's tools, and how well one understands the problem to solve.
Quote:
you cannot compare it like that. awk is a bigger executable than cut. And cut only performs a simple task, ie to cut up a string. awk does a lot more, takes a little more time to "load" when executed. If you want to compare apple to apple, make cut do the same things awk does.
One of the OP's questions was which is more efficient for the job of getting the third field from a comma separated line. I did the relevant test for that usage, taking 'efficient' as referring to computer time.
Obviously awk can do many things cut cannot. But for those things cut can do, I expect it will generally be a little faster than awk.
Of course, 'efficiency' can also refer to the programming/scripting stage. But that cannot be measured objectively, since as mentioned it depends on personal familiarity with the tools.
Thank you all for your valuable input on efficiency.
So, can I conclude the following?
"In my scenario above, where I need to get the 3rd field from a record, cut is a better option than awk.
However, awk is better in terms of options and flexibility."
It really helped me because
right now I need to loop through a file that has over 1 million records, get the 3rd field from each line, and pass it to a routine.
Look at it this way: 'cut' is a one-shot utility (like, e.g., wc). To process 1 million records, you'd have to invoke it 1 million times.
'awk' is a programming language and you should (I'm not an awk man, I'd use Perl) be able to invoke awk once(!) and write the entire (1 million records) process inside that one awk process (I believe...).
Quote:
Look at it this way, 'cut' is a one shot utility (Like eg wc). To process 1 million recs, you'd have to invoke it 1 million times.
'awk' is a programming lang and you should (I'm not an awk man, I'd use Perl) be able to invoke awk once(!) and write the entire (1 million recs) process inside that one awk process (I believe...).
That is wrong: cut works on files too, and it is run only once. In fact, I'd still bet that it would be somewhat faster than awk on large files, but I don't have any benchmarks.
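To make the single-invocation point concrete, here is a small sketch. The file contents, the function names, and the `process_field` routine are stand-ins invented for illustration; the OP's real file and routine are not shown in the thread.

```shell
# One cut process reads the whole file, however many records it has.
third_fields_cut() {
  cut -d, -f3 "$1"
}

# The awk equivalent is likewise a single process for the whole file.
third_fields_awk() {
  awk -F, '{ print $3 }' "$1"
}

# Hypothetical stand-in for the routine each third field is passed to.
process_field() {
  printf 'got %s\n' "$1"
}

# Hypothetical three-record input file.
datafile=$(mktemp)
printf '%s\n' 'a1,b1,c1,d1' 'a2,b2,c2,d2' 'a3,b3,c3,d3' > "$datafile"

# cut runs once; the shell loop only consumes its output line by line.
third_fields_cut "$datafile" | while IFS= read -r field; do
  process_field "$field"
done

rm -f "$datafile"
```

What should be avoided for a million records is calling cut (or awk) once per line inside the loop, since that starts a new process for every record.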