LinuxQuestions.org
Old 01-23-2017, 10:16 AM   #1
L_Carver
Member
 
Registered: Sep 2016
Location: Webster MA USA
Posts: 243

Rep: Reputation: Disabled
Count number of times ONE punctuation mark occurs in a file


My "add keywords" script uses both the caret ("^") and the comma (",") as delimiters. The input text files should have one of each, but they often contain multiple periods ("."), and I want to check the files beforehand from the command line to make sure the period occurs no more than once.

Code:
grep -o . foo | wc -l
in one file returns 137, even though there is one and only one period in the file.

Code:
cat foo |echo $x | tr -d -c '.' | wc -m
returns 0.

I know I must be doing something wrong, but my question is, what am I doing wrong? This is one of those instances which proves Google is entirely useless; if it weren't I'd have found a solution there and wouldn't be asking this question.

Please help.

Carver
 
Old 01-23-2017, 10:24 AM   #2
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
Gives the answer 3.

Code:
echo "this. and. that." | tr -cd "\." | wc -c
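Pipe your file through and it should count the instances of the period. In a regex a period signifies 'any character', so it has to be escaped for grep; tr takes a plain character set rather than a regex, so the backslash is harmless here but not required.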
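Applied to a file instead of an echo, the same idea might look like the sketch below. The sample file and its contents are made up for illustration, standing in for "foo" from the question. The last part is a guess at why the second command in the question printed 0: echo does not read standard input, so the file contents never reach tr. That guess assumes $x was empty or unset, which the thread does not state.

```shell
# Hypothetical sample file standing in for "foo" from the question.
sample=$(mktemp)
printf 'one^two, three.\nextra. periods.\n' > "$sample"

# tr only reads standard input, so redirect the file into it.
# -c complements the set and -d deletes, leaving only the '.' characters.
count=$(tr -cd '.' < "$sample" | wc -c)
echo "periods: $count"

# The goal from the question: complain when there is more than one period.
if [ "$count" -gt 1 ]; then
    echo "more than one period in $sample"
fi

# echo ignores stdin, so the output of cat is discarded and only the
# (assumed empty) $x is printed; tr then has no '.' left to keep.
x=''
lost=$(cat "$sample" | echo $x | tr -cd '.' | wc -m)
echo "via echo: $lost"

rm -f "$sample"
```

Redirecting with `<` also avoids the extra cat process.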

Last edited by szboardstretcher; 01-23-2017 at 10:26 AM.
 
Old 01-23-2017, 11:06 AM   #3
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,573

Rep: Reputation: 2143
Quote:
Originally Posted by L_Carver View Post
Code:
grep -o . foo | wc -l
in one file returns 137, even though there is one and only one period in the file.
Grep match strings use regular expressions. In a regex, '.' matches any single character (much like '?' does in a shell glob), which is why every character in the file was counted. To match a literal '.' you need to escape it:
Code:
grep -o '\.' foo | wc -l
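For comparison, here is a small sketch of a few ways to make grep treat the period literally, next to the unescaped version. The test string is borrowed from the earlier reply and the variable names are only illustrative.

```shell
line='this. and. that.'

# Unescaped: '.' matches every character, so all 16 characters are counted.
any=$(printf '%s\n' "$line" | grep -o '.' | wc -l)

# Escaped with a backslash.
escaped=$(printf '%s\n' "$line" | grep -o '\.' | wc -l)

# Inside a bracket expression the period has no special meaning.
bracket=$(printf '%s\n' "$line" | grep -o '[.]' | wc -l)

# -F makes grep treat the whole pattern as a fixed string, not a regex.
fixed=$(printf '%s\n' "$line" | grep -oF '.' | wc -l)

echo "any=$any escaped=$escaped bracket=$bracket fixed=$fixed"
```

The last three should agree with each other; only the unescaped pattern counts everything.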
 
Old 01-23-2017, 11:48 AM   #4
dlb101010
Member
 
Registered: Dec 2016
Posts: 61

Rep: Reputation: 18
If you guys don't mind a related question, the output of 'wc' baffles me.
For example running 'wc' without any options,
Code:
$ echo "this. and. that." | grep -o '\.' | wc   
      3       3       6
I would have predicted just one newline (from the 'echo' command).

Where does the newline number come from in this example?

Thanks,
Dave

[No sooner did I post this than it occurred to me that the newlines probably come from the three instances of grep finding the three periods. Sorry for the clutter.]

Last edited by dlb101010; 01-23-2017 at 11:52 AM.
 
Old 01-23-2017, 11:58 AM   #5
DavidMcCann
LQ Veteran
 
Registered: Jul 2006
Location: London
Distribution: PCLinuxOS, Salix
Posts: 6,213

Rep: Reputation: 2344
@ Dave
The output from "grep -o" is (to quote) "Print the matched parts of a matching line, with each such part on a separate output line" so in this case you get three lines with "." on them.
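To see where each of the three numbers comes from, the counts can be taken one at a time. This is just a sketch reusing the command from the question above; the variable names are illustrative.

```shell
# Each match is written as a "." followed by a newline, so three matches
# give 3 lines, 3 words, and 3 * 2 = 6 bytes.
lines=$(echo "this. and. that." | grep -o '\.' | wc -l)
words=$(echo "this. and. that." | grep -o '\.' | wc -w)
bytes=$(echo "this. and. that." | grep -o '\.' | wc -c)
echo "lines=$lines words=$words bytes=$bytes"
```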
 
Old 01-23-2017, 12:36 PM   #6
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
My 'tr' example keeps only the matching characters, and 'wc -c' then counts them. If you use 'grep -o' you end up with each match followed by '\n' on separate lines, so you have to count with 'wc -l'.
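Side by side, the two approaches might be sketched like this. The last line is an extra not mentioned above: 'grep -c' counts matching lines rather than individual matches, so it is not a substitute for either method.

```shell
text='this. and. that.'

# tr keeps only the periods; wc -c counts the remaining characters.
by_tr=$(printf '%s\n' "$text" | tr -cd '.' | wc -c)

# grep -o prints one match per line; wc -l counts those lines.
by_grep=$(printf '%s\n' "$text" | grep -o '\.' | wc -l)

# grep -c reports how many input lines matched, not how many periods.
by_grep_c=$(printf '%s\n' "$text" | grep -c '\.')

echo "tr=$by_tr grep-o=$by_grep grep-c=$by_grep_c"
```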
 
Old 02-21-2017, 09:22 PM   #7
L_Carver
Member
 
Registered: Sep 2016
Location: Webster MA USA
Posts: 243

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by szboardstretcher View Post
Gives the answer 3.

Code:
echo "this. and. that." | tr -cd "\." | wc -c
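Pipe your file through and it should count the instances of the period. In a regex a period signifies 'any character', so it has to be escaped for grep; tr takes a plain character set rather than a regex, so the backslash is harmless here but not required.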
Reading this and the other replies, I like this method the best, since it's done the job the few times I've applied it to the task I meant to find a method for. Sorry if that sounds loop-y and redundant.

Carver
 
  

