LinuxQuestions.org
Old 07-06-2006, 03:09 AM   #16
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928

Quote:
Originally Posted by patrokov
6. Token searching should be an addition to and complement to regular expression searching. It already works flawlessly in Notetab. Why should FOSS software be more limited? Well the answer resides in the ignorance expressed in the quote below.
Um... no offense, but that's like comparing a toddler's tricycle
to a Kawasaki and saying the Kawa is limited because one has
to learn how to shift gears.

/me shakes the head.


Cheers,
Tink
 
Old 07-07-2006, 01:47 AM   #17
patrokov
Member
 
Registered: Jan 2006
Location: Riviera Beach
Distribution: Slackware -current, ArchLinux
Posts: 59

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by Tinkster
that's like comparing a toddler's tricycle to a Kawasaki and saying the Kawa is limited because one has to learn how to shift gears.
Just smile at the nice moderator with the faulty analogy trying to argue against the straw man...
But seriously, what you're telling me is, "You already have a blowtorch; why do you want a soldering iron?"

Well, it's simpler to operate, less dangerous, and more appropriate for some tasks. But to go back to my original point, the regular expression search and replace in KDE and OpenOffice cannot do what simple little token searching can: replace text across multiple lines. Does no one see this as a problem other than me? Does it not bother anyone that you have to run a sed script just to do a multi-line search and replace? And even then, you have to use the 'N'ext command and build up a multiline pattern space (jschwial).

Perhaps a concrete example will help. Sometimes we get test banks from textbook publishers that look something like this:

Code:
____	1.	Which kidney function is most affected by the administration of diuretics?
a.
Cleansing of extracellular fluid (ECF)
b.
Excretion of metabolic wastes
c.
Maintenance of extracellular (ECF) volume
d.
Control of acid-base balance


____	2.	With the knowledge of where each class of diuretics works in the kidney, which agent would the nurse expect to produce the greatest volume of diuresis?
a.
Hydrochlorothiazide
b.
Furosemide
c.
Spironolactone
d.
Triamterene
In order to import them into our online testing system, I need to strip out the question numbers (and the underscores in front of them), remove the extra line breaks after a., b., c., d., and add an extra line break between the question and the first answer. In NoteTab, I would use a regular expression to remove the numbers. Then I would search with tokens for "^pa.^p" and replace it with "^p^pa. " Then search for "^pb.^p" and replace it with "^pb. " Repeat with c and d.

In less than three minutes I would have the entire file formatted for import looking like:

Code:
Which kidney function is most affected by the administration of diuretics?

a. Cleansing of extracellular fluid (ECF)
b. Excretion of metabolic wastes
c. Maintenance of extracellular (ECF) volume
d. Control of acid-base balance


With the knowledge of where each class of diuretics works in the kidney, which agent would the nurse expect to produce the greatest volume of diuresis?

a. Hydrochlorothiazide
b. Furosemide
c. Spironolactone
d. Triamterene
How would you suggest that I replicate this functionality? So far, it has thwarted all my efforts, both by trial and error and by searching for an existing answer.
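For readers who have not used token searching, the NoteTab steps described above can be sketched in Python. This is only an illustration: expand_tokens and token_reformat are hypothetical helper names, ^p is assumed to stand for a single newline, and the prefix is assumed to be underscores, a tab, the number, a period and a tab, as in the sample.

```python
import re


def expand_tokens(pattern: str) -> str:
    """Turn NoteTab-style ^p tokens into literal newlines (hypothetical helper)."""
    return pattern.replace("^p", "\n")


def token_reformat(text: str) -> str:
    # Step 1: a regex strips the "____<TAB>1.<TAB>" prefix from question lines.
    text = re.sub(r"(?m)^_+\t\d+\.\t", "", text)
    # Step 2: plain literal replacements, one per answer label.
    steps = [
        ("^pa.^p", "^p^pa. "),  # also adds the blank line before answer a
        ("^pb.^p", "^pb. "),
        ("^pc.^p", "^pc. "),
        ("^pd.^p", "^pd. "),
    ]
    for search, replace in steps:
        text = text.replace(expand_tokens(search), expand_tokens(replace))
    return text


sample = (
    "____\t1.\tWhich kidney function?\n"
    "a.\nCleansing\nb.\nExcretion\nc.\nMaintenance\nd.\nControl\n"
)
print(token_reformat(sample))
```

The second step uses no regular expressions at all; the search strings simply contain the line breaks, which is what makes the multi-line match trivial.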

Last edited by patrokov; 07-07-2006 at 02:18 AM.
 
Old 07-07-2006, 02:55 AM   #18
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Code:
$ sed -e 's/\(^__*\t*[0-9][0-9]*\.\t*\)\(.*\)/\2\n/g' -e '/[a-d]\./ {
N
s/ *\n/ /}' testing
Which kidney function is most affected by the administration of diuretics?

a. Cleansing of extracellular fluid (ECF)
b. Excretion of metabolic wastes
c. Maintenance of extracellular (ECF) volume
d. Control of acid-base balance


With the knowledge of where each class of diuretics works in the kidney, which agent would the nurse expect to produce the greatest volume of diuresis?

a. Hydrochlorothiazide
b. Furosemide
c. Spironolactone
d. Triamterene
Something like that? I admit, it took me 5 minutes to write that up and
test it (2 minutes of that I spent on trying to figure out why my original
" *" didn't match the whitespace in the first expression until I realised
that you were using TABs).
But it sits in my bash-history (or I could put it in a script or bash-function
if I felt so inclined) and the next hundred times I'd be finished before you
managed to start your first editing adventure in goatpad, which, as far as
I'm concerned, is a tremendous gain of efficiency.

And
Quote:
Just smile at the nice moderator with the faulty analogy trying to argue against the straw man...
But seriously, what you're telling me is, "You already have a blowtorch; why do you want a soldering iron?"
who are you to talk of bad analogies?! :D


Cheers,
Tink

Last edited by Tinkster; 07-07-2006 at 04:04 PM.
 
Old 07-07-2006, 10:06 AM   #19
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
Using nedit, which is a text editor, not a word processor, since this is just plain text:

search & replace
search for: ^([abcd]\.)\s*\n(.+$)
replace with: \1 \2

Once again, search & replace
search for: ^_+\s[0-9]+\.\s+(.+$)
replace with: \1\n

Result:

Which kidney function is most affected by the administration of diuretics?

a. Cleansing of extracellular fluid (ECF)
b. Excretion of metabolic wastes
c. Maintenance of extracellular (ECF) volume
d. Control of acid-base balance


With the knowledge of where each class of diuretics works in the kidney, which agent would the nurse expect to produce the greatest volume of diuresis?

a. Hydrochlorothiazide
b. Furosemide
c. Spironolactone
d. Triamterene


Okay, I needed two separate steps, so not as elegant as Tinkster's solution, but using regular expressions reduced the repetition of "replace it with "^p^pa. " Then search for "^pb.^p" replace "^pb. " Repeat with c and d." down to a single step, and could have been generalized to include steps 'e', 'f', ... 'z'.
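The two search & replace steps above carry over almost unchanged to Python's re module, which may help anyone who wants to try them outside nedit. This is a sketch, not a description of nedit itself: the function name reformat and the labels parameter are made up here, and the labels parameter only illustrates the generalization to 'e', 'f', ... 'z' mentioned above.

```python
import re


def reformat(text: str, labels: str = "a-d") -> str:
    """Apply the two search & replace steps; `labels` is a character-class body."""
    # Step 1: join an answer label such as "a." with the line that follows it.
    join = re.compile(rf"^([{labels}]\.)\s*\n(.+$)", re.MULTILINE)
    text = join.sub(r"\1 \2", text)
    # Step 2: strip the "____ 1." prefix and add a blank line after the question.
    text = re.sub(r"^_+\s[0-9]+\.\s+(.+$)", r"\1\n", text, flags=re.MULTILINE)
    return text


sample = "____\t1.\tWhich kidney function?\na.\nCleansing\nb.\nExcretion\n"
print(reformat(sample))
```

Calling reformat(text, labels="a-z") would cover questions with more than four answers. Note that re.MULTILINE is needed so that ^ and $ match at every line, which nedit appears to do by default.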

Once I determined exactly what your objective was (a step that was automatic for you), this took all of 2 minutes, including cutting and pasting between browser and editor.

I will admit that I did try this experiment using the OpenOffice word processor, and the regex implementation there is, uh, primitive. Having now done that, I hope that someone does indeed lobby the developers to build in full regex search & replace support. I would hope that they would do that in lieu of any kind of orphan 'token-search' functionality, which is a non-standard idiom in Unix land.

I have used other visual style editors, the names of which escape me, that could also have done this. Certainly vi could also have been used, and being as ubiquitous as it is, might be the definitive tool, in terms of editors. A command line perl script would probably have been my weapon of choice, but that's just me.

--- rod.


edit: fix [benign] error in regex

Last edited by theNbomr; 07-07-2006 at 05:30 PM.
 
Old 07-07-2006, 05:15 PM   #20
patrokov
Member
 
Registered: Jan 2006
Location: Riviera Beach
Distribution: Slackware -current, ArchLinux
Posts: 59

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by Tinkster
And who are you to talk of bad analogies?!
Hey, as long as we're all guilty...

I guess my main problem was expecting kate and openoffice's implementations of regular expressions to be more robust than they are. Now, I'll have to take half an hour to grok the respective solutions (unless SundialCVS wants to explain to the gentle readers again) and then several hours looking at text editor alternatives. (If only my parents had gotten me a Unix workstation instead of a Vic-20 as a kid.)
 
Old 07-07-2006, 05:20 PM   #21
webazoid
Member
 
Registered: Jun 2004
Posts: 224

Rep: Reputation: 30
on a sidenote, in msword, if u wanted to delete all instances of a word, such as "car" in a list, how do you set it up so that it inserts a backspace so that the entire line is deleted and not just the word and doesn't leave a blank line? i.e.

1. cat
2. car
3. cog
4. cup

how do i get it so the new list becomes:
1. cat
2. cog
3. cup
?

This is what i'd generally get if I leave the replace column blank:
1. cat
2.
3. cog
4. cup

or, how do i optimize a page to remove empty lines w/o text, such as if someone triple spaced between paragraphs and I want to make it single spaced instead?

===
line1
line2
line3
return (empty space)
return (empty space)
return (empty space)
line4
line5
line6

desired format:
line1
line2
line3
return (empty space)--deleted two empty returns/paragraphs.
line4
line5
line6
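This is not an answer for Word itself, but on plain text both requests reduce to regular expressions that include the line break in the match. A minimal sketch in Python, with hypothetical function names; it does not renumber the list, which Word's automatic numbering would do on its own.

```python
import re


def drop_lines_containing(text: str, word: str) -> str:
    """Delete every whole line containing `word`, including its line break."""
    pattern = re.compile(rf"^.*\b{re.escape(word)}\b.*\n?", re.MULTILINE)
    return pattern.sub("", text)


def collapse_blank_lines(text: str) -> str:
    """Reduce any run of two or more empty lines to a single empty line."""
    return re.sub(r"\n{3,}", "\n\n", text)


print(drop_lines_containing("1. cat\n2. car\n3. cog\n4. cup\n", "car"))
print(collapse_blank_lines("line3\n\n\n\nline4\n"))
```

Because the newline is part of what gets replaced, no blank line is left behind.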

Last edited by webazoid; 07-07-2006 at 05:28 PM.
 
Old 07-07-2006, 05:56 PM   #22
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
Quote:
Now, I'll have to take half an hour to grok the respective solutions
Here, I'll help, and explain mine...

Code:
search for: ^([abcd]\.)\s*\n(.+$)

Anchoring at beginning of line                    ^ 
find any lower case a,b,c or d                    [abcd] 
followed by a literal period                      \. 
and save all of that as component 1               () 
Continue to find zero or more whitespace char's   \s* 
followed by a newline                             \n 
then save as component 2,                         () 
everything (non-empty) up to the end-of-line      .+$


replace with: \1 \2

Whatever was found as component 1                 \1
space
Whatever was found as component 2                 \2
That does the joining of lines. Next do the left alignment.

Code:
search for: ^_+\s[0-9]+\.\s+(.+$)

Anchoring at beginning of line                    ^
Find 1 or more underscores                        _+
Followed by a single whitespace (hmmm)            \s
Followed by 1 or more digits                      [0-9]+
Followed by one period                            \.
Followed by one or more whitespace char's         \s+
Save as component 1...                            ()
....the rest of the line                          .+$

replace with: \1\n

Component 1, newline
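To see what ends up in "component 1" and "component 2", the same two patterns can be written with inline comments and inspected directly. This sketch uses Python's re.VERBOSE mode; the names JOIN_ANSWER and STRIP_PREFIX are invented for the example.

```python
import re

JOIN_ANSWER = re.compile(
    r"""
    ^ ([abcd]\.)   # component 1: a, b, c or d at line start, then a literal period
    \s* \n         # zero or more whitespace characters, then the newline
    (.+) $         # component 2: the non-empty line that follows
    """,
    re.MULTILINE | re.VERBOSE,
)

STRIP_PREFIX = re.compile(
    r"""
    ^ _+           # one or more underscores at line start
    \s             # a single whitespace character (the tab)
    [0-9]+ \.      # the question number and a literal period
    \s+            # one or more whitespace characters
    (.+) $         # component 1: the rest of the line
    """,
    re.MULTILINE | re.VERBOSE,
)

m = JOIN_ANSWER.search("a.\nCleansing of extracellular fluid (ECF)\n")
print(m.group(1), "|", m.group(2))

q = STRIP_PREFIX.search("____\t1.\tWhich kidney function?\n")
print(q.group(1))
```

In the replacement strings, \1 and \2 refer to exactly these groups.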

Hope this is tasting like cool-aid

--- rod.

Last edited by theNbomr; 07-07-2006 at 06:17 PM.
 
Old 07-07-2006, 07:29 PM   #23
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Quote:
Originally Posted by patrokov
Now, I'll have to take half an hour to grok the respective solutions (unless SundialCVS wants to explain to the gentle readers again) and then several hours looking at text editor alternatives.
Code:
  's/\(^__*\t*[0-9][0-9]*\.\t*\)\(.*\)/\2\n/g'
\(^__*\t*[0-9][0-9]*\.\t*\)
Search for lines that begin with at least one underscore,
followed by zero or more tabs, followed by a number of one or
more digits, a literal period and zero or more tabs,

\(.*\)
mark the rest of the line to the newline and put it in memory;

\2\n
replace whole line with the bit in memory and add a newline

Code:
  '/[a-d]\./ {
 N
 s/ *\n/ /}' testing
If a line contains a, b, c or d followed by a literal period, N appends
the next input line to the pattern space; the substitution then replaces
any trailing spaces plus the embedded newline with a single space
(effectively joining the [a-d]. label with the answer text after it).
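For anyone more at home in a scripting language, the pattern-space mechanics can be mimicked line by line. The following Python sketch is an emulation written for this explanation, not something sed provides; sed_like is a hypothetical name, and the input is assumed to be already split into lines without their trailing newlines, which is how sed sees them.

```python
import re


def sed_like(lines):
    """Mimic the two sed expressions on a list of lines (no trailing newlines)."""
    out = []
    it = iter(lines)
    for line in it:  # each iteration is one sed cycle
        # s/\(^__*\t*[0-9][0-9]*\.\t*\)\(.*\)/\2\n/
        line = re.sub(r"^_+\t*[0-9]+\.\t*(.*)", r"\1\n", line)
        # /[a-d]\./ { N; s/ *\n/ / }
        if re.search(r"[a-d]\.", line):
            nxt = next(it, None)  # N: pull the next input line into the pattern space
            if nxt is not None:
                line = re.sub(r" *\n", " ", line + "\n" + nxt, count=1)
        out.append(line)  # end of cycle: the pattern space is printed
    return "\n".join(out)


print(sed_like(["____\t1.\tWhich kidney function?", "a.", "Cleansing", "b.", "Excretion"]))
```

The line pulled in by N is consumed, so the next cycle starts with the line after it, just as in sed.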

I think that your greatest problem here is that you think
of interactive solutions to repetitive tasks, a very common
problem of windows-victims. They think they have freedom
if they are allowed to do chores (reasonably quickly). I
think I have freedom if I can solve issues like that with little
thought once off and have the computer do the rest ;}


Quote:
Originally Posted by patrokov
(If only my parents had gotten me a Unix workstation instead of a Vic-20 as a kid.)
Heh - I've only been doing Linux since 97, grew up on programmable
calculators and a C-64 ;P


Cheers,
Tink

Last edited by Tinkster; 07-07-2006 at 07:30 PM.
 
Old 07-07-2006, 07:45 PM   #24
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
Tink:

I notice your tendency to use '__*' and '[0-9][0-9]*' as a way of expressing 'one or more', where I always use '_+', which amounts to the same thing. Is there some desirable side effect that I don't know about, using your method, or is it just a personal tendency? Nice one-liner, BTW.

--- rod.
 
Old 07-07-2006, 08:37 PM   #25
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
The effect you don't know about is that sed doesn't know about '+' ;}


Cheers,
Tink
 
Old 07-08-2006, 12:41 AM   #26
spirit receiver
Member
 
Registered: May 2006
Location: Frankfurt, Germany
Distribution: SUSE 10.2
Posts: 424

Rep: Reputation: 33
GNU sed does know about it:
Code:
 `\+'
      As `*', but matches one or more.  It is a GNU extension.
(from info sed)
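That the doubled form and the 'one or more' form accept the same strings is easy to check outside sed. The sketch below uses Python's re module, where a bare + is the quantifier (in Python, \+ would instead mean a literal plus sign, the opposite of the GNU sed basic-regex convention quoted above). The helper name same_matches is made up for the example.

```python
import re


def same_matches(pattern_a, pattern_b, samples):
    """Return True if both patterns accept exactly the same sample strings."""
    return all(
        (re.fullmatch(pattern_a, s) is None) == (re.fullmatch(pattern_b, s) is None)
        for s in samples
    )


underscore_samples = ["", "_", "____", "__x"]
digit_samples = ["", "7", "123", "12a"]

print(same_matches(r"__*", r"_+", underscore_samples))
print(same_matches(r"[0-9][0-9]*", r"[0-9]+", digit_samples))
```

The doubled form is simply the spelling that works in any sed, with or without the GNU extension.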
 
Old 07-08-2006, 12:58 AM   #27
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Quote:
Originally Posted by spirit receiver
GNU sed does know about it:
Code:
 `\+'
      As `*', but matches one or more.  It is a GNU extension.
(from info sed)
Which goes to show that one should read man and info pages
after an update ;} ... even if the tool is well familiar.


Cheers,
Tink
 
Old 07-08-2006, 12:22 PM   #28
patrokov
Member
 
Registered: Jan 2006
Location: Riviera Beach
Distribution: Slackware -current, ArchLinux
Posts: 59

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by Tinkster
I think that your greatest problem here is that you think
of interactive solutions to repetitive tasks, a very common
problem of windows-victims. They think they have freedom
if they are allowed to do chores (reasonably quickly). I
think I have freedom if I can solve issues like that with little
thought once off and have the computer do the rest ;}
In all fairness though, in Word, it's relatively easy to write/record a macro that does the repetitive tasks with a single keystroke. The real problem in my mind is that I'm used to the Microsoft mindset where I am literally searching for exact text including the nonprinting escape codes and then replacing those codes with new ones. In the *nix world, you're searching for conceptual patterns of strings and not replacing them, but manipulating them. Unfortunately, for me, my first big foray into the *nix way of doing things used programs that are not fully cooked in their implementations. (For example, if you put \n into the replace box in kate with regex turned on, you get "\n" not a new line.)

But this conversation has definitely been enlightening. Thanks to everyone for the help.

And to rod...I always liked my cool-aid double strength, because if you make it according to the recipe, it tastes watered down. ;-)

Last edited by patrokov; 07-08-2006 at 12:26 PM.
 
Old 07-08-2006, 04:01 PM   #29
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Quote:
Originally Posted by patrokov
In all fairness though, in Word, it's relatively easy to write/record a macro that does the repetitive tasks with a single keystroke. The real problem in my mind is that I'm used to the Microsoft mindset where I am literally searching for exact text including the nonprinting escape codes and then replacing those codes with new ones.
I understand that :}

But in the unix way we could take this a notch further, and
write a milter; so if you received files of the same type from
the same people all the time you could have a program modify
the text, and add the modified version to the mail you receive ;}

NO interaction at all.
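The non-interactive idea can be sketched as a plain stream filter: text in, reformatted text out. This is only an illustration in Python using the two regular expressions from earlier in the thread; run_filter and reformat are hypothetical names, and hooking such a filter into mail delivery is not shown here.

```python
import io
import re


def reformat(text: str) -> str:
    """Join answer labels with their text, then strip the question prefixes."""
    text = re.sub(r"^([abcd]\.)\s*\n(.+$)", r"\1 \2", text, flags=re.MULTILINE)
    return re.sub(r"^_+\s[0-9]+\.\s+(.+$)", r"\1\n", text, flags=re.MULTILINE)


def run_filter(src, dst) -> None:
    """Read everything from `src`, write the reformatted text to `dst`."""
    dst.write(reformat(src.read()))


# Demonstration with in-memory streams instead of real files or pipes.
incoming = io.StringIO("____\t2.\tWhich agent?\na.\nFurosemide\nb.\nTriamterene\n")
outgoing = io.StringIO()
run_filter(incoming, outgoing)
print(outgoing.getvalue())
```

Passing sys.stdin and sys.stdout to run_filter would turn this into something usable in a shell pipeline.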


Cheers,
Tink
 
Old 07-08-2006, 04:31 PM   #30
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
sed -e 's/milter/filter/g'

yes?

---
 
  


All times are GMT -5.
