LinuxQuestions.org
Old 10-05-2011, 08:52 AM   #1
sumeet inani
Member
 
Registered: Oct 2008
Posts: 881
Blog Entries: 26

Rep: Reputation: 47
replace 8 times successive spaces by |


Hi,
I am trying to make a GUI for cmdow, and I have attached its log here.
My aim is to separate the columns with |, which the 'autoit' software understands.
The problem is that I have to replace each run of one or more consecutive spaces with a single |, but the window title (the last column) may itself contain spaces, and those must be left alone.
Is there a way on the command line to replace ' \+' with '|' only the first 8 times in each line?

Thank You.
Attached Files
File Type: txt cmdow-log.txt (37.3 KB, 7 views)
 
Old 10-05-2011, 10:27 AM   #2
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,604

Rep: Reputation: 446
Hi,

a somewhat convoluted sed will do:
Code:
sed -r 'h;s/[[:blank:]]+/|/g;x;s/([^[:blank:]]+[[:blank:]]+){8}//;x;s/\|[^|]+//g8;G;s/\n/|/' cmdow-log
This also works if the Window Title (last field) contains runs of more than one consecutive space. If we can assume that the title never contains more than one consecutive space (or if we are happy to collapse such runs into a single space), things can be simplified:
Code:
sed -r 's/[[:blank:]]+/|/g;s/\|/ /g9' cmdow-log
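For anyone trying to decipher the longer command, here is the same script spread over several lines with a comment on each step. This is only a sketch: it assumes GNU sed (for -r and the combined g8 flag), the function name split_fields and the sample line are made up for illustration, and the sample deliberately has a double space inside the title.

```shell
# Made-up sample line in the log's format; note the two spaces in the title.
line='0x01006E 1 2880 Res Ina Ena Vis explorer Program  Manager'

split_fields() {
  sed -r '
    # keep a copy of the untouched line in the hold space
    h
    # pattern space: turn every run of blanks into a single |
    s/[[:blank:]]+/|/g
    # swap: the pattern space is the original line again
    x
    # drop the first 8 fields and their trailing blanks, leaving only the title
    s/([^[:blank:]]+[[:blank:]]+){8}//
    # swap back: pattern space is the piped line, hold space is the title
    x
    # delete the 8th "|field" and every later one, keeping the first 8 columns
    s/\|[^|]+//g8
    # append a newline plus the title, then turn that newline into the last |
    G
    s/\n/|/
  '
}

printf '%s\n' "$line" | split_fields
# prints: 0x01006E|1|2880|Res|Ina|Ena|Vis|explorer|Program  Manager
```

The title comes through untouched because it is cut from the original line in the hold space, not from the copy in which the blanks were already replaced.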
 
Old 10-05-2011, 10:57 AM   #3
lithos
Senior Member
 
Registered: Jan 2010
Location: SI : 45.9531, 15.4894
Distribution: CentOS, OpenNA/Trustix, testing desktop openSuse 12.1 /Cinnamon/KDE4.8
Posts: 1,144

Rep: Reputation: 217
If you had written:

I have a file:
Code:
0x010064 1 2880 Res Ina Ena Hid explorer WorkerW
0x030032 1 2880 Res Ina Ena Hid explorer WorkerW
0x030052 1 2880 Res Ina Ena Hid explorer DDEMLEvent
0x03004E 1 2880 Res Ina Ena Hid explorer DDEMLMom
0x01008E 1 2372 Min Ina Ena Hid msseces  GDI+ Window
0x01007E 1 2880 Res Ina Ena Hid explorer tooltips_class32
0x01006E 1 2880 Res Ina Ena Vis explorer Program Manager
in which I would like to replace the " " spaces with "|"

it would have been understood.
But when you say "The problem is I have to replace one or more consecutive spaces by a single | "

I understand it as: you have runs of 8 spaces (" ...8x... ") and need to replace each run with a single "|".

Of course it can be done with many commands like "sed" or "awk".
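For what it is worth, an awk version could look roughly like the following. It is a sketch only: it assumes there are 8 columns before the title, as in the sample above, it collapses repeated spaces inside the title to a single space, and the function name pipe_fields is made up.

```shell
pipe_fields() {
  awk '{
    out = $1
    # fields 2..9 are joined with |, anything after that is title text joined with a space
    for (i = 2; i <= NF; i++)
      out = out (i <= 9 ? "|" : " ") $i
    print out
  }'
}

printf '%s\n' '0x01008E 1 2372 Min Ina Ena Hid msseces  GDI+ Window' | pipe_fields
# prints: 0x01008E|1|2372|Min|Ina|Ena|Hid|msseces|GDI+ Window
```

Because awk splits on runs of whitespace by default, the double space before the title needs no special handling.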


@crts NICE WORK !

Last edited by lithos; 10-05-2011 at 11:06 AM.
 
Old 10-05-2011, 10:42 PM   #4
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,489

Rep: Reputation: 1891
Not sure what you wanted to do about the Header line, but the following will have the rest the way you want:
Code:
ruby -ane '(0..($F.length - 1)).each{|i| $F[i]+=(i<=7)?"|":" "};puts $F.join' cmdow-log
 
Old 10-05-2011, 11:33 PM   #5
sumeet inani
Member
 
Registered: Oct 2008
Posts: 881
Blog Entries: 26

Original Poster
Rep: Reputation: 47
to crts
Code:
sed -r 'h;s/[[:blank:]]+/|/g;x;s/([^[:blank:]]+[[:blank:]]+){8}//;x;s/\|[^|]+//g8;G;s/\n/|/' cmdow-log
It works like a charm.

Actually, I am using unxutils on Windows, which has no binary for 'ruby'.

I will try to decipher the code.

Last edited by sumeet inani; 10-05-2011 at 11:51 PM.
 
Old 10-05-2011, 11:55 PM   #6
sumeet inani
Member
 
Registered: Oct 2008
Posts: 881
Blog Entries: 26

Original Poster
Rep: Reputation: 47
Can you explain this to me?
Code:
echo "123 abc" | sed 's/[0-9]*/& &/'
123 123 abc
That is understandable.
echo "abc 123" | sed 's/[0-9]*/& &/'
 abc 123
In this case 123 should have matched, so I was expecting 'abc 123 123'.
Since 'abc ' does not match the search pattern, it should have been left unchanged.
 
Old 10-06-2011, 02:08 AM   #7
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,489

Rep: Reputation: 1891
The issue is where the match starts. Because you are using the asterisk, meaning zero or more, sed looks
from left to right and finds that there are zero digits at the front of the string, so it replaces that empty match with
itself, a space and itself. Two lots of nothing with a space between them leaves you with a space at the start.
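One way to see the empty match for yourself is to wrap whatever the pattern matched in brackets. This is just a small sketch; the brackets and the variable names are markers added for illustration.

```shell
# [&] puts brackets around whatever [0-9]* matched first.
digits_first=$(echo '123 abc' | sed 's/[0-9]*/[&]/')
letters_first=$(echo 'abc 123' | sed 's/[0-9]*/[&]/')

echo "$digits_first"    # [123] abc
echo "$letters_first"   # []abc 123
```

The empty pair of brackets in the second line is the zero-length match at the very start of the string; sed never gets as far as the 123 because, without the g flag, it stops after the first match.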

What makes less sense to me however is:
Code:
$ echo "abc 123" | sed 's/[0-9][0-9]*/& &/'
abc 123 123
#understandable as now you have asked for a digit followed by zero or more

$ echo "abc 123" | sed 's/[0-9]+/& &/'
abc 123
# this asks for one or more (which I would interpret the same as above) but does not work
 
Old 10-06-2011, 05:18 AM   #8
sumeet inani
Member
 
Registered: Oct 2008
Posts: 881
Blog Entries: 26

Original Poster
Rep: Reputation: 47
I get it.
I was reading this sed tutorial by Bruce Barnett.
I haven't used the g flag, so only the first occurrence is substituted and the rest is printed as it is.
So 'abc 123' is actually ^abc 123$, and the output is '^ ^abc 123', where ^ stands for the empty match at the start of the line.
 
Old 10-06-2011, 05:29 AM   #9
sumeet inani
Member
 
Registered: Oct 2008
Posts: 881
Blog Entries: 26

Original Poster
Rep: Reputation: 47
to grail
I think this gives the expected result:
Code:
echo abc 123 | sed 's/[0-9]\+/& &/'
abc 123 123
NOTE: \+ not +
 
Old 10-06-2011, 05:31 AM   #10
sumeet inani
Member
 
Registered: Oct 2008
Posts: 881
Blog Entries: 26

Original Poster
Rep: Reputation: 47
Also, I found a solution to the question I asked.
It is simple and workable, though it removes extra spaces from the last column.
Code:
sed -e "s/ \+/|/g" -e "s/|/ /9g" cmdow-log.txt
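To see what losing the extra spaces means in practice, here is the same pair of expressions run on a made-up line whose title contains a double space. The sample line is an assumption modelled on the log format, and GNU sed is assumed for \+ and the 9g flag.

```shell
line='0x01008E 1 2372 Min Ina Ena Hid msseces GDI+  Window'

# First -e: every run of spaces becomes |.
# Second -e: the 9th | and all later ones are turned back into a space.
simple=$(printf '%s\n' "$line" | sed -e 's/ \+/|/g' -e 's/|/ /9g')

echo "$simple"   # 0x01008E|1|2372|Min|Ina|Ena|Hid|msseces|GDI+ Window
```

The two spaces between "GDI+" and "Window" come back as one, which is harmless for most window titles but is the difference from the longer hold-space command.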

Last edited by sumeet inani; 10-06-2011 at 05:32 AM.
 
Old 10-06-2011, 10:13 AM   #11
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,489

Rep: Reputation: 1891
Or you could just use the '-r' switch. What I presented does confuse me a little, but I do know that it needs extended regular expressions (I was trying to direct you to this):
Code:
$ echo "abc 123" | sed -r 's/[0-9]+/& &/'
abc 123 123
 
Old 10-06-2011, 12:27 PM   #12
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,604

Rep: Reputation: 446
about '-r' option

So, reading the last few posts, there appear to be some misconceptions about the '-r' option and extended RegEx in GNU sed.
The most important thing to notice is that GNU sed understands the extended RegEx operators by default, as GNU extensions to its basic RegEx syntax. Supplying the '-r' option does not add any functionality; it simply removes the need to escape those operators. E.g., "+" is an extended RegEx operator. Without '-r', sed treats a bare "+" as a literal plus sign, so to have it interpreted as "one or more" you have to prepend a backslash, like "\+". Using the -r option simply makes the backslash unnecessary in most cases:
Code:
echo 'word' | sed 's/w\+/C/'
echo 'word' | sed -r 's/w+/C/' # same as above
Same goes for parentheses:
Code:
echo 'word word' | sed 's/\(word\) \1/\1 CHANGE/'
echo 'word word' | sed -r 's/(word) \1/\1 CHANGE/' # same as above
However, this is not true for "\<" and "\>":
Code:
echo 'word' | sed 's/\<w.*\>/CHANGE/'
echo 'word' | sed -r 's/\<w.*\>/CHANGE/' # same as above
echo 'word' | sed -r 's/<w.*>/CHANGE/' # not same as above; expects literal '<' and '>' in input string.
So word boundary symbols "\<" and "\>" need to be escaped in any case. This behavior is a bit inconsistent.
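The escaping rule for "+" described above can be checked with a line that contains both a run of letters and a literal plus sign. This is a small sketch assuming GNU sed; the sample string and variable names are made up for illustration.

```shell
# Without -r: a bare + is a literal plus sign, \+ means "one or more".
bre_plain=$(echo 'aab a+b' | sed 's/a+/X/')
bre_escaped=$(echo 'aab a+b' | sed 's/a\+/X/')

# With -r the roles swap: + is the operator, \+ is the literal character.
ere_plain=$(echo 'aab a+b' | sed -r 's/a+/X/')
ere_escaped=$(echo 'aab a+b' | sed -r 's/a\+/X/')

echo "$bre_plain"     # aab Xb
echo "$bre_escaped"   # Xb a+b
echo "$ere_plain"     # Xb a+b
echo "$ere_escaped"   # aab Xb
```

This is also why the earlier 's/[0-9]+/& &/' without -r left "abc 123" unchanged: it was looking for a digit followed by an actual "+" character.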

@OP: You said that you are reading the tutorial by Bruce Barnett. I suppose you mean the tutorials on this site:
http://www.grymoire.com/Unix

You might get a bit confused when you read the tutorial about Regex in general on that site, especially this chapter:
http://www.grymoire.com/Unix/Regular.html#uh-12

It says that "\{" and "\}" are basic RegEx and that they cannot be used as extended RegEx. However, in the table further down it is marked as extended RegEx. This is contradictory.

Anyway, RegExes are a great source of confusion, since every language/program seems to add its own small modifications to them.

BTW, this is how sed handles "{}":
Code:
echo 'hello' | sed 's/l\{2\}/CC/'
echo 'hello' | sed -r 's/l{2}/CC/' # same as above

Last edited by crts; 10-06-2011 at 12:29 PM.
 
Old 10-06-2011, 06:31 PM   #13
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,489

Rep: Reputation: 1891
Thanks for the clarity crts
 
  

