Count number of times ONE punctuation mark occurs in a file
My "add keywords" script uses both caret ("^") and comma (",") as delimiters. The text files for input should have one of each, but they often contain multiple periods ("."), and I want to check the files beforehand from the command line to make sure the period "." occurs no more than once.
Code:
grep -o . foo | wc -l
in one file returns 137, even though there is one and only one period in the file.
Code:
cat foo |echo $x | tr -d -c '.' | wc -m
returns 0.
I know I must be doing something wrong, but my question is, what am I doing wrong? This is one of those instances which proves Google is entirely useless; if it weren't I'd have found a solution there and wouldn't be asking this question.
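Grep match strings use regular expressions. In a regex, '.' matches any single character (much like '?' in a shell glob), so your command counts every character on every line. To match a literal '.' you need to escape it with a backslash ('\.') or put it in a bracket expression ('[.]').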
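Here is a minimal sketch of the difference, assuming GNU grep (for the -o option). The function names are made up for illustration and are not from the original post.

```shell
# count_literal_periods is a hypothetical helper name.
# It prints how many literal '.' characters the file "$1" contains.
count_literal_periods() {
    # '\.' escapes the period; '[.]' or 'grep -oF .' should work as well.
    grep -o '\.' "$1" | wc -l
}

# For comparison: an unescaped '.' matches every character on every line.
count_any_char() {
    grep -o '.' "$1" | wc -l
}

# Demo on a throwaway file that contains three periods.
sample=$(mktemp)
printf 'one.two\nthree..four\n' > "$sample"
echo "literal periods: $(count_literal_periods "$sample")"
echo "any character:   $(count_any_char "$sample")"
rm -f "$sample"
```

The second count is the kind of number the original command produced: it is the file's character count (newlines excluded), not its period count.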
I would have predicted just one newline (from the 'echo' command).
Where does the newline number come from in this example?
Thanks,
Dave
[No sooner did I post this than it occurred to me that the newlines probably come from the three instances of grep finding the three periods. Sorry for the clutter.]
@ Dave
The output from "grep -o" is (to quote) "Print the matched parts of a matching line, with each such part on a separate output line" so in this case you get three lines with "." on them.
My 'tr' example extracts only the matching characters and then uses 'wc -c' to count them. With 'grep -o' each match ends up on its own line (character + '\n'), so you have to count with 'wc -l' instead.
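The two approaches can be put side by side roughly like this. It is only a sketch: the function names are hypothetical, and GNU grep is assumed for -o. Note that tr treats '.' as a plain character, not a regex, so no escaping is needed there.

```shell
# Hypothetical helper: keep only the '.' characters and count them.
# -c complements the set and -d deletes, so everything except '.' is removed.
count_with_tr() {
    tr -cd '.' < "$1" | wc -c
}

# Hypothetical helper: one match per output line, so count lines.
count_with_grep() {
    grep -o '\.' "$1" | wc -l
}

# Counting grep's output with 'wc -c' instead would give twice the number,
# because every '.' is followed by a newline.
sample=$(mktemp)
printf 'a.b.c\nd.e\n' > "$sample"
echo "tr:   $(count_with_tr "$sample")"
echo "grep: $(count_with_grep "$sample")"
rm -f "$sample"
```

As for the pipeline in the first post, it presumably returned 0 because echo does not read standard input: it prints only the (apparently unset) $x plus a newline, and tr then deletes that newline.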
Pipe your file through and it should count the instances of the period. In regex a period signifies 'any character' so it has to be escaped.
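For the original goal of checking files before running the keyword script, something along these lines might do. check_periods is a made-up name, and the threshold of one period per file is taken from the first post.

```shell
# check_periods is a hypothetical helper, not part of any existing tool.
# It prints each file that contains more than one period and returns 1
# if any such file was found, 0 otherwise.
check_periods() {
    rc=0
    for f in "$@"; do
        # Count literal periods; the trailing tr strips the padding that
        # some wc implementations put in front of the number.
        n=$(tr -cd '.' < "$f" | wc -c | tr -d ' ')
        if [ "$n" -gt 1 ]; then
            echo "$f: $n periods"
            rc=1
        fi
    done
    return "$rc"
}
```

It could then be called as check_periods *.txt from the directory that holds the input files; files with zero or one period produce no output.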
Reading this and the other replies, I like this method the best, since it's done the job the few times I've applied it to the task I meant to find a method for. Sorry if that sounds loop-y and redundant.