6. Token searching should be an addition and a complement to regular expression searching. It already works flawlessly in Notetab. Why should FOSS software be more limited? Well, the answer resides in the ignorance expressed in the quote below.
Um... no offense, but that's like comparing a toddlers tricycle
to a Kawasaki and saying the Kawa is limited because one has
to learn how to shift gears.
Just smile at the nice moderator with the faulty analogy trying to argue against the straw man...
But seriously, what you're telling me is, "You already have a blowtorch; why do you want a soldering iron?"
Well, it's simpler to operate, less dangerous, and more appropriate for some tasks. But to go back to my original point, the regular expression search and replace in KDE and OpenOffice cannot do what simple little token searching can: replace text across multiple lines. Does no one see this as a problem other than me? Does it not bother anyone that you have to run a sed script just to do a multi-line search and replace? And even then, you have to use the 'N'ext command and build up a multiline pattern space (jschwial).
Perhaps a concrete example will help. Sometimes we get test banks from textbook publishers that look something like this:
Code:
____ 1. Which kidney function is most affected by the administration of diuretics?
a.
Cleansing of extracellular fluid (ECF)
b.
Excretion of metabolic wastes
c.
Maintenance of extracellular (ECF) volume
d.
Control of acid-base balance
____ 2. With the knowledge of where each class of diuretics works in the kidney, which agent would the nurse expect to produce the greatest volume of diuresis?
a.
Hydrochlorothiazide
b.
Furosemide
c.
Spironolactone
d.
Triamterene
In order to import them into our online testing system, I need to strip out the line numbers, remove the extra line breaks after a., b., c., d., and add an extra line break between the question and the first answer. In Notetab, I would use a regular expression to remove the line numbers. Then I would search with tokens for "^pa.^p" and replace it with "^p^pa. " Then search for "^pb.^p" replace "^pb. " Repeat with c and d.
In less than three minutes I would have the entire file formatted for import looking like:
Code:
Which kidney function is most affected by the administration of diuretics?
a. Cleansing of extracellular fluid (ECF)
b. Excretion of metabolic wastes
c. Maintenance of extracellular (ECF) volume
d. Control of acid-base balance
With the knowledge of where each class of diuretics works in the kidney, which agent would the nurse expect to produce the greatest volume of diuresis?
a. Hydrochlorothiazide
b. Furosemide
c. Spironolactone
d. Triamterene
How would you suggest that I replicate this functionality? So far, it has thwarted all my efforts, both by trial and error and by searching for an existing answer.
$ sed -e 's/\(^__*\t*[0-9][0-9]*\.\t*\)\(.*\)/\2\n/g' -e '/[a-d]\./ {
N
s/ *\n/ /}' testing
Which kidney function is most affected by the administration of diuretics?
a. Cleansing of extracellular fluid (ECF)
b. Excretion of metabolic wastes
c. Maintenance of extracellular (ECF) volume
d. Control of acid-base balance
With the knowledge of where each class of diuretics works in the kidney, which agent would the nurse expect to produce the greatest volume of diuresis?
a. Hydrochlorothiazide
b. Furosemide
c. Spironolactone
d. Triamterene
Something like that? I admit, it took me 5 minutes to write that up and
test it (2 minutes of that I spent on trying to figure out why my original
" *" didn't match the whitespace in the first expression until I realised
that you were using TABs).
But it sits in my bash-history (or I could put it in a script or bash-function
if I felt so inclined) and the next hundred times I'd be finished before you
managed to start your first editing adventure in goatpad, which, as far as
I'm concerned, is a tremendous gain of efficiency.
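Putting the one-liner "in a script or bash-function" might look roughly like the sketch below. The name fix_testbank is made up for illustration, and the sed script is a slightly more portable variant of the command above, not the exact one posted: it assumes [[:space:]] in place of \t, anchors the answer letters to the start of the line, and uses G to add the blank line instead of \n in the replacement.

```shell
# Hypothetical helper: reformat a test bank read from stdin or from the
# files named as arguments.
# First block:  on a "____ 1." question line, strip the prefix, then let
#               G append a newline plus the (empty) hold space, which
#               leaves a blank line after the question.
# Second block: on a line holding only "a." .. "d.", pull the next line
#               up (N) and replace the line break with a single space.
fix_testbank() {
    sed -e '/^__*[[:space:]]*[0-9][0-9]*\./ {
        s/^__*[[:space:]]*[0-9][0-9]*\.[[:space:]]*//
        G
    }' -e '/^[a-d]\.[[:space:]]*$/ {
        N
        s/[[:space:]]*\n/ /
    }' "$@"
}
```

It would be called as "fix_testbank testing > testing.txt", where testing is the input file from the command above and the output name is arbitrary.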
And
Quote:
Just smile at the nice moderator with the faulty analogy trying to argue against the straw man...
But seriously, what you're telling me is, "You already have a blowtorch; why do you want a soldering iron?"
Which kidney function is most affected by the administration of diuretics?
a. Cleansing of extracellular fluid (ECF)
b. Excretion of metabolic wastes
c. Maintenance of extracellular (ECF) volume
d. Control of acid-base balance
With the knowledge of where each class of diuretics works in the kidney, which agent would the nurse expect to produce the greatest volume of diuresis?
a. Hydrochlorothiazide
b. Furosemide
c. Spironolactone
d. Triamterene
Okay, I needed two separate steps, so not as elegant as Tinkster's solution, but using regular expressions reduced the repetition of "replace it with "^p^pa. " Then search for "^pb.^p" replace "^pb. " Repeat with c and d." down to a single step, and could have been generalized to include steps 'e', 'f', ... 'z'.
Once I determined exactly what your objective was (a step that was automatic for you), this took all of 2 minutes, including cutting and pasting between browser and editor.
I will admit that I did try this experiment using the OpenOffice word processor, and the regex implementation there is, uh, primitive. Having now done that, I hope that someone does indeed lobby the developers to build in full regex search & replace support. I would hope that they would do that in lieu of any kind of orphan 'token-search' functionality, which is a non-standard idiom in Unix land.
I have used other visual style editors, the names of which escape me, that could also have done this. Certainly vi could also have been used, and being as ubiquitous as it is, might be the definitive tool, in terms of editors. A command line perl script would probably have been my weapon of choice, but that's just me.
I guess my main problem was expecting kate and openoffice's implementations of regular expressions to be more robust than they are. Now, I'll have to take half an hour to grok the respective solutions (unless SundialCVS wants to explain to the gentle readers again) and then several hours looking at text editor alternatives. (If only my parents had gotten me a Unix workstation instead of a Vic-20 as a kid.)
On a side note, in MS Word, if you wanted to delete all instances of a word, such as "car" in a list, how do you set it up so that it inserts a backspace, so that the entire line is deleted (not just the word) and no blank line is left behind? i.e.
1. cat
2. car
3. cog
4. cup
How do I get it so the new list becomes:
1. cat
2. cog
3. cup
?
This is what I'd generally get if I leave the replace column blank:
1. cat
2.
3. cog
4. cup
Or, how do I optimize a page to remove empty lines without text, such as if someone triple spaced between paragraphs and I want to make it single spaced instead?
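This does not answer the question for Word's own dialog, but for a plain-text copy of such a list the same two jobs can be sketched with standard tools. The function names below are invented for illustration. The first one treats the word as a literal string (so "car" would also hit "scar"), and it does not renumber the remaining items.

```shell
# Hypothetical helpers for plain-text files, not for Word itself.

# Print only the lines that do NOT contain the given literal string,
# so the whole line disappears and no empty line is left behind.
drop_lines_with() {
    word=$1
    shift
    awk -v w="$word" 'index($0, w) == 0' "$@"
}

# Collapse every run of empty or whitespace-only lines into one empty line.
squeeze_blank_lines() {
    awk 'NF { blank = 0; print; next }     # line with text: print it
         !blank { print ""; blank = 1 }    # first blank of a run: keep one
        ' "$@"
}
```

For example, "drop_lines_with car list.txt" would drop the "2. car" line from the list above, with list.txt standing in for whatever file holds it.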
Now, I'll have to take half an hour to grok the respective solutions
Here, I'll help, and explain mine...
Code:
search for: ^([abcd]\.)\s*\n(.+$)
  Anchoring at beginning of line                    ^
  Find any lower case a, b, c or d                  [abcd]
  followed by a period                              \.
  and save all of that as component 1               ()
  Continue to find zero or more whitespace chars    \s*
  followed by a newline                             \n
  then save as component 2,                         ()
  everything (non-empty) up to the end-of-line      .+$
replace with: \1 \2
  Whatever was found as component 1                 \1
  space
  Whatever was found as component 2                 \2
That does the joining of lines. Next do the left alignment.
Code:
search for: ^_+\s[0-9]+\.\s+(.+$)
  Anchoring at beginning of line                    ^
  Find 1 or more underscores                        _+
  Followed by a single whitespace (hmmm)            \s
  Followed by 1 or more digits                      [0-9]+
  Followed by one period                            \.
  Followed by one or more whitespace chars          \s+
  Save as component 1...                            ()
  ...the rest of the line                           .+$
replace with: \1\n
  Component 1, newline
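Outside an editor, the two search-and-replace steps explained above could be tried from the shell with perl, which can read the whole file in one go so that \n in the pattern simply matches the line break. This is only a sketch under a couple of assumptions: perl is installed, and the function name join_and_strip is invented here. The $ has been moved outside the capturing group, which does not change what is matched.

```shell
# Hypothetical wrapper around the two substitutions, applied in the same
# order as in the explanation. -0777 makes perl read the whole input as
# one string; /m lets ^ and $ match at every line; /g repeats the
# substitution for every question and answer.
join_and_strip() {
    perl -0777 -pe '
        s/^([abcd]\.)\s*\n(.+)$/$1 $2/mg;   # join "a." with the following line
        s/^_+\s[0-9]+\.\s+(.+)$/$1\n/mg;    # drop "____ 1." and add a blank line
    ' "$@"
}
```

Changing [abcd] to [a-z] would cover answers beyond 'd', as mentioned earlier in the thread.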
Now, I'll have to take half an hour to grok the respective solutions (unless SundialCVS wants to explain to the gentle readers again) and then several hours looking at text editor alternatives.
Code:
's/\(^__*\t*[0-9][0-9]*\.\t*\)\(.*\)/\2\n/g'
\(^__*\t*[0-9][0-9]*\.\t*\)
Search for lines that begin with at least one underscore,
followed by zero or more tabs, followed by a number of one or
more digits, a literal period and zero or more tabs,
\(.*\)
mark the rest of the line up to the newline and put it in memory;
\2\n
replace the whole line with the bit in memory and add a newline
Code:
'/[a-d]\./ {
N
s/ *\n/ /}' testing
If a line has a, b, c or d followed by a literal period, append the
next line to the pattern space (N), then replace the trailing spaces
and the newline with a single space (effectively stripping the
newline after [a-d].).
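To see what N actually does to the pattern space, sed's l command can print it in an unambiguous form. A minimal sketch, with an invented function name and the address anchored to the start of the line for clarity:

```shell
# Hypothetical demo: show the pattern space right after N has appended
# the next input line. -n suppresses the normal output; l prints the
# pattern space with the embedded line break written as \n and the end
# of the pattern space marked by $.
show_pattern_space() {
    sed -n '/^[a-d]\./ {
        N
        l
    }' "$@"
}
```

Feeding it the two lines "a." and "Foo" prints a.\nFoo$, which is the two-line pattern space that the s/ *\n/ / command then works on.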
I think that your greatest problem here is that you think
of interactive solutions to repetitive tasks, a very common
problem of windows-victims. They think they have freedom
if they are allowed to do chores (reasonably quickly). I
think I have freedom if I can solve issues like that with little
thought once off and have the computer do the rest ;}
Quote:
Originally Posted by patrokov
(If only my parents had gotten me a Unix workstation instead of a Vic-20 as a kid.)
Heh - I've only been doing Linux since 97, grew up on programmable
calculators and a C-64 ;P
I notice your tendency to use '__*' and '[0-9][0-9]*' as a way of expressing 'one or more', where I always use '_+', which amounts to the same thing. Is there some desirable side effect that I don't know about, using your method, or is it just a personal tendency? Nice one-liner, BTW.
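One difference that can be stated with some confidence: in sed's default (basic) regular expressions a bare + is an ordinary character, so '__*' is the portable spelling of 'one or more', while '_+' only means that with extended regular expressions (sed -E; GNU sed also accepts -r). Whether that is the reason in this particular case is a guess. The function names below are invented for a quick check:

```shell
# Hypothetical one-line filters to compare the spellings.
bre_star() { sed 's/^__*/X/'; }      # BRE: one or more underscores
bre_plus() { sed 's/^_+/X/'; }       # BRE: one underscore, then a literal "+"
ere_plus() { sed -E 's/^_+/X/'; }    # ERE: one or more underscores
```

On a line such as "____ 1." the first and third replace the underscores with X, while the second leaves the line untouched.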
I think that your greatest problem here is that you think
of interactive solutions to repetitive tasks, a very common
problem of windows-victims. They think they have freedom
if they are allowed to do chores (reasonably quickly). I
think I have freedom if I can solve issues like that with little
thought once off and have the computer do the rest ;}
In all fairness though, in Word, it's relatively easy to write/record a macro that does the repetitive tasks with a single keystroke. The real problem in my mind is that I'm used to the Microsoft mindset where I am literally searching for exact text including the nonprinting escape codes and then replacing those codes with new ones. In the *nix, you're searching for conceptual patterns of strings and not replacing them, but manipulating them. Unfortunately, for me, my first big foray into the *nix way of doing things used programs that are not fully cooked in their implementations. (For example, if you put \n into the replace box in kate with regex turned on, you get "\n" not a new line.)
But this conversation has definitely been enlightening. Thanks to everyone for the help.
And to rod...I always liked my cool-aid double strength, because if you make it according to the recipe, it tastes watered down. ;-)
In all fairness though, in Word, it's relatively easy to write/record a macro that does the repetitive tasks with a single keystroke. The real problem in my mind is that I'm used to the Microsoft mindset where I am literally searching for exact text including the nonprinting escape codes and then replacing those codes with new ones.
I understand that :}
But in the unix way we could take this a notch further, and
write a milter; so if you received files of the same type from
the same people all the time you could have a program modify
the text, and add the modified version to the mail you receive ;}