LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Newbie (https://www.linuxquestions.org/questions/linux-newbie-8/)
-   -   Replicate a field (https://www.linuxquestions.org/questions/linux-newbie-8/replicate-a-field-803957/)

danielbmartin 04-24-2010 11:59 AM

Replicate a field
 
Hello.

I have a file containing text. I want to replicate a specific field.
For example, I might want to append a copy of the second word of each line to the end of that line.

Have:
Once upon a midnight dreary, while I pondered weak and weary,
Over many a quaint and curious volume of forgotten lore,

Want:
Once upon a midnight dreary, while I pondered weak and weary, upon
Over many a quaint and curious volume of forgotten lore, many

Is there a Linux command which will do this?
I seek a basic command, not awk, not Perl, because I haven't learned those things yet.

Daniel B. Martin

pixellany 04-24-2010 12:10 PM

First, let's define some terms: There are no "Linux commands". There are shell commands---BASH being the most common shell---and there are a bazillion utilities, applications, etc.

For text manipulation, common utilities include SED, AWK, and Perl. (Maybe Python also)

The BASH man page will tell you about the commands built into BASH.

Second, I do not recommend posting a question here, and then placing restrictions on what solutions are offered. In fact, since you are talking about fields, I suspect that AWK may be one of the better choices.

I assume you want to do this on a line by line basis. Thus, you cannot simply use one tool to grab a word into a variable, and then make a second pass to add that variable to the end of the line.

Tinkster 04-26-2010 01:19 AM

Indeed ... this one screams "awk" and "perl" at the top of
its lungs...

Code:

awk '{print $0", "$2}' file
You could use a shell script and treat each line like so:
Code:

...
scnd=$( echo "$line" | sed 's/[[:space:]][[:space:]]*/ /g' | cut -d" " -f2 )
echo "${line}, ${scnd}"
...

The sed is in there in case there's a few consecutive spaces or tabs
in the line which would throw "cut" off.
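Fleshed out, the elided loop might look like this (a sketch; input.txt is an assumed filename, and the sample line is written into it only so the block runs standalone):

```shell
#!/bin/bash
# Sample input, purely so the sketch is self-contained
printf '%s\n' 'Once upon a midnight dreary, while I pondered weak and weary,' > input.txt

while IFS= read -r line
do
    # Squeeze runs of spaces/tabs down to a single space so cut's
    # single-character delimiter finds the real second word
    scnd=$( echo "$line" | sed 's/[[:space:]][[:space:]]*/ /g' | cut -d" " -f2 )
    echo "${line}, ${scnd}"
done < input.txt
```

Like the awk one-liner, this appends a literal ", " before the word, so a line already ending in a comma picks up a double comma.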

Personally I find the awk version cleaner and more concise.


Cheers,
Tink

catkin 04-26-2010 01:39 AM

Quote:

Originally Posted by danielbmartin (Post 3946420)
Is there a Linux command which will do this?
I seek a basic command, not awk, not Perl, because I haven't learned those things yet.

pixellany usefully defined terms. Since you didn't exclude bash (bash commands could be held as "basic"), here's a pure bash solution
Code:

#!/bin/bash

while read line
do
    array=( $line )
    echo $line ${array[1]}
done <  input.txt

EDIT: or, more neatly
Code:

#!/bin/bash
while read -a array
do
    echo ${array[*]} ${array[1]}
done <  input.txt


danielbmartin 04-26-2010 06:30 AM

Quote:

Originally Posted by pixellany (Post 3946429)
... I do not recommend posting a question here, and then placing restrictions on what solutions are offered. ...

I respect your expertise and long service to this forum. This is a counterargument which you may find reasonable.

This is the Newbie Forum. I am a newbie, learning Linux on my own. I can't learn all of it at once, so I'm starting with what I mistakenly called Linux commands. Commands such as sed and grep are so powerful that I want to develop competence and confidence with them before moving on to awk or Perl.

If I place no bounds on solutions some members will produce awk or Perl solutions. Then they feel betrayed when I won't use their hard work. That's because I am unwilling to use code that I don't understand.

Daniel B. Martin

pixellany 04-26-2010 08:01 AM

Quote:

Originally Posted by danielbmartin (Post 3947961)
I respect your expertise and long service to this forum. This is a counterargument which you may find reasonable.

This is the Newbie Forum. I am a newbie, learning Linux on my own. I can't learn all of it at once, so I'm starting with what I mistakenly called Linux commands. Commands such as sed and grep are so powerful that I want to develop competence and confidence with them before moving on to awk or Perl.

If I place no bounds on solutions some members will produce awk or Perl solutions. Then they feel betrayed when I won't use their hard work. That's because I am unwilling to use code that I don't understand.

Daniel B. Martin

I totally understand your point of view---and I especially agree with the last sentence.

The only thing I can offer is that the work required to apply the wrong tool often eclipses the work required to learn the right tool. I have personally demonstrated this by coming up with some totally convoluted SED code and then watching the AWK experts swoop in with something far better.

I recommend learning all of the most common tools in the depth required to get your work done. In my case, I know SED and GREP well enough to know what problems will be difficult or even impossible. From this, I know when I need to dig back into AWK and learn a bit more.

MTK358 04-26-2010 08:18 AM

Code:

$ echo "this is test text" | sed -r 's:^([^ \t]+[ \t]+)([^ \t]+)(.*)$:\1\2\3, \2:'
this is test text, is
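Broken down, the expression uses three capture groups and a backreference. Here is the same command with the groups annotated (comments only; behavior unchanged):

```shell
# \1 = first word plus the whitespace after it   ([^ \t]+[ \t]+)
# \2 = the second word                           ([^ \t]+)
# \3 = everything else on the line               (.*)
# The replacement re-emits the whole line (\1\2\3), then appends ", " and \2
echo "this is test text" | sed -r 's:^([^ \t]+[ \t]+)([^ \t]+)(.*)$:\1\2\3, \2:'
```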


pixellany 04-26-2010 08:28 AM

I love it!!!! Another SED fanatic is released into the world.

This eloquently demonstrates my point above:
Quote:

The only thing I can offer is that the work required to apply the wrong tool often eclipses the work required to learn the right tool. I have personally demonstrated this by coming up with some totally convoluted SED code and then watching the AWK experts swoop in with something far better.
But then MTK's SED solution is NOT convoluted at all---it is a very simple and elegant use of backreferences.

grail 04-26-2010 09:10 AM

I think we can do it a little differently, with the same result:
Code:

echo "this is test text" | sed -r 's:[ \t]+([^ \t]+).*:&, \1:'

colucix 04-26-2010 09:28 AM

Yet another different approach...
Code:

paste -d' ' file <(cut -d' ' -f2 file)
but maybe too specific for the example shown in the original post.
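A self-contained run of that one-liner (file is the assumed input name). The "too specific" caveat is that cut -d' ' treats every single space as a delimiter, so this only works when words are separated by exactly one space:

```shell
# Build a two-line sample file so the sketch runs standalone
printf '%s\n' 'Once upon a midnight dreary,' \
              'Over many a quaint volume,' > file

# cut pulls field 2 of each line; paste glues it back on line-by-line
paste -d' ' file <(cut -d' ' -f2 file)
```

This prints each line with its second word appended after a space.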

catkin 04-26-2010 10:14 AM

Quote:

Originally Posted by danielbmartin (Post 3947961)
This is the Newbie Forum. I am a newbie, learning Linux on my own. I can't learn all of it at once, so I'm starting with what I mistakenly called Linux commands. Commands such as sed and grep are so powerful that I want to develop competence and confidence with them before moving on to awk or Perl.

I respect and understand your position; I would like to offer a counter argument.

There are many commands in the toolset, each with pros and cons for solving various problems. I doubt that any of us are totally fluent with them all. It is not necessary, even if possible, to completely master each before moving on to the next. Another approach is to learn simple usage of an increasing number and gradually extend that knowledge as convenient, as need arises.

This problem suits awk particularly well, allowing Tinkster to offer the simple and comparatively comprehensible
Code:

awk '{print $0", "$2}' file
Hoping to tempt you, it breaks down like this:
  1. awk <string> file means run awk with program <string>, taking input from file.
  2. awk processes each line in turn.
  3. An awk program comprises patterns and actions; when the pattern matches the line the action is performed.
  4. In this case no pattern is given; for awk that matches all lines.
  5. The action is contained in { }.
  6. awk puts the whole line in variable $0 and parses the line into $1, $2, $3 ... words according to its word separator.
  7. The default word separator is whitespace: any run of spaces or tabs.
  8. awk's print function prints its arguments to standard output, by default the terminal.
  9. In awk, literal strings are given in double quotes.
  10. awk concatenates adjacent strings.
  11. Thus $0", "$2 is the whole line, followed by comma and space followed by the second word of the line. For every line of file, awk prints that to standard output.
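To see the pattern/action pairing from point 3 in action, here is a hypothetical variant that only appends when a line actually has a second word (NF is awk's built-in count of fields on the current line):

```shell
# Lines with two or more fields get ", <second word>" appended;
# shorter lines are printed untouched
printf '%s\n' 'Once upon a time' 'lonely' \
  | awk 'NF >= 2 {print $0", "$2} NF < 2 {print}'
```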

danielbmartin 04-27-2010 07:03 PM

Quote:

Originally Posted by catkin (Post 3948165)
... This problem suits awk particularly well, allowing Tinkster to offer the simple and comparatively comprehensible
Code:

awk '{print $0", "$2}' file
Hoping to tempt you, it breaks down like this:
  1. awk <string> file means run awk with program <string>, taking input from file.
  2. awk processes each line in turn.
  3. An awk program comprises patterns and actions; when the pattern matches the line the action is performed.
  4. In this case no pattern is given; for awk that matches all lines.
  5. The action is contained in { }.
  6. awk puts the whole line in variable $0 and parses the line into $1, $2, $3 ... words according to its word separator.
  7. The default word separator is whitespace: any run of spaces or tabs.
  8. awk's print function prints its arguments to standard output, by default the terminal.
  9. In awk, literal strings are given in double quotes.
  10. awk concatenates adjacent strings.
  11. Thus $0", "$2 is the whole line, followed by comma and space followed by the second word of the line. For every line of file, awk prints that to standard output.

Thank you for the detailed explanation. It whets my appetite for learning awk.

Some respondents misread the original post. The objective is to append the second word in each line to that line. There was no need for an additional comma. With that clarification, several of the offered code segments could be simplified.
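For instance, dropping the literal ", " from Tinkster's one-liner and letting awk's default output field separator (a single space) do the joining gives exactly the Want lines:

```shell
# print $0, $2 joins its arguments with awk's OFS, a single space by default
printf '%s\n' 'Over many a quaint and curious volume of forgotten lore,' \
  | awk '{print $0, $2}'
```

This prints "Over many a quaint and curious volume of forgotten lore, many", with no extra comma.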

Daniel B. Martin

danielbmartin 04-27-2010 07:08 PM

Quote:

Originally Posted by colucix (Post 3948125)
Yet another different approach...
Code:

paste -d' ' file <(cut -d' ' -f2 file)
but maybe too specific for the example shown in the original post.

Love it! One line of code, and more readable than some of the other suggested code segments. (More readable, at least, to this newbie.)

Technical Excellence may be defined as "completeness of function coupled with economy of means." Your solution qualifies as TE!

For extra credit: show how the output may be directed to a file rather than standard output.

Daniel B. Martin

catkin 04-28-2010 02:53 AM

Quote:

Originally Posted by danielbmartin (Post 3949921)
show how the output may be directed to a file rather than standard output.

Standard output can be directed to a file using the output redirection operator ">" as in
Code:

ls > my_file
Sometimes a command writes to standard error as well. If you want that in the same file then
Code:

command > my_file 2>&1
where "2>&1" means "send standard error (stream 2) to the same place as standard output (stream 1) is going".
In case you want them in different files
Code:

command > my_file.stdout 2> my_file.stderr

MTK358 04-28-2010 06:52 AM

> redirects stdout to a file.

2> redirects stderr to a file.

&> redirects both stdout and stderr to a file.
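Putting the two threads together, the earlier extra-credit question could be answered by redirecting colucix's pipeline (file and output.txt are assumed names):

```shell
# Sample input so the sketch runs standalone
printf '%s\n' 'Once upon a midnight dreary,' > file

# Redirect stdout into output.txt instead of the terminal
paste -d' ' file <(cut -d' ' -f2 file) > output.txt

cat output.txt   # shows: Once upon a midnight dreary, upon
```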

