Old 08-03-2012, 09:13 AM   #1
udiubu
Member
 
Registered: Oct 2011
Posts: 54

Rep: Reputation: Disabled
Inserting text at specific line X column coordinates


Dear all,

I have a txt file with signal change values containing for example 20 lines of different column-length (ranging between 40 and 50):

0.345644 0.453233 0.567872 ...
0.354234 0.452223 0.589872 ...
0.323445 0.451111 0.567822 ...

Each value has the same number of digits.

Each line belongs to a different subject, and because of some coding problems some NaN values are missing. I know where these NaN should be inserted, and the inserting positions change according to subject and column. As I have lots of files, but the missing values are always in the same position, I was wondering whether there is a way to automatize my insertions. I need to do literally what "insert cell" does in Excel: shift the columns and insert the values.

The output should look for example like the one below:

0.345644 0.453233 0.567872 ...
0.354234 NaN 0.589872 ...
0.323445 0.451111 NaN ...

Can cat and sed help me with this?
I wouldn't mind writing a long series of insertions per line, as long as I can make the lines independent of each other, so that inserting a column in one line doesn't affect the rest.

One very space-consuming idea would be to handle each line separately (i.e. copy the line to a third file), add a column to it, and then re-append the lines. Is there anything faster?

Any help would be highly appreciated!

I thank you very much for your help.

Sincerely,

Udiubu
 
Old 08-03-2012, 09:21 AM   #2
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,412
Blog Entries: 1

Rep: Reputation: 951
I'm quite sure you could accomplish whatever you want to accomplish in either sed or awk but without more information it's really hard to say anything more. Just a set of my assumptions:

Quote:
different column-length
Quote:
Each value has the same number of digits.
These two pieces of information seem contradictory. I think you mean a differing number of columns per line with a fixed number of digits per column.


Quote:
I know where these NaN should be inserted, and the inserting positions change according to subject and column.
but we don't know it.

Quote:
I was wondering whether there is a way to automatize my insertions.
I have no idea unless I know the criteria for insertions.
 
Old 08-03-2012, 09:27 AM   #3
udiubu
Member
 
Registered: Oct 2011
Posts: 54

Original Poster
Rep: Reputation: Disabled
All right sycamorex, and sorry for having been misleading.
I indeed mean "differing number of columns per line with a fixed number of digits per column".
I have a long list of coordinates, but we could use just two of them as a test, as in the second array I proposed:

0.345644 0.453233 0.567872 ...
0.354234 NaN 0.589872 ...
0.323445 0.451111 NaN ...

so in this case I would like to put:

a NaN in the second line, column 2
a NaN in the third line, column 3

I hope this helps, and thanks again for your prompt response, Sycamorex.

Best,

Udiubu
 
Old 08-03-2012, 09:33 AM   #4
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,412
Blog Entries: 1

Rep: Reputation: 951
Thanks for the clarification. A few more questions:

1. Is there a pattern like:

2nd line, column 2
3rd line, column 3
4th line, column 4

2. If there's no pattern like that, how would you want to specify the line number and column?
3. It looks like you want to replace the old value with NaN. Is that correct?
 
Old 08-03-2012, 09:37 AM   #5
udiubu
Member
 
Registered: Oct 2011
Posts: 54

Original Poster
Rep: Reputation: Disabled
1. No, there's no such pattern.
2. And indeed I was wrong with the array: I don't want to substitute, but to insert NaN and shift the cells to the right. The previous array is wrong; this one is correct:

0.345644 0.453233 0.567872 ...
0.354234 NaN 0.452223 0.589872 ...
0.323445 0.451111 0.567822 NaN ...

I am terribly sorry for this mistake.

Udiubu
 
Old 08-03-2012, 10:47 AM   #6
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,412
Blog Entries: 1

Rep: Reputation: 951
As I've been learning Python for the last few days, I wrote the following script for you. The first command-line argument is the line number and the second is the column number. For example:

Code:
cat columns.txt 
0.345644 0.453233 0.567872 0.432543
0.354234 0.452223 0.589872 0.233123
0.323445 0.451111 0.567822 0.452345 0.345234
~/data/projects/python/misc % ./columns.py 2 1  # 2nd line, 1st column
0.345644 0.453233 0.567872 0.432543
NaN      0.354234 0.452223 0.589872 0.233123
0.323445 0.451111 0.567822 0.452345 0.345234
~/data/projects/python/misc % ./columns.py 3 2  # 3rd line, 2nd column
0.345644 0.453233 0.567872 0.432543
0.354234 0.452223 0.589872 0.233123
0.323445 NaN      0.451111 0.567822 0.452345 0.345234
Code:
#!/usr/bin/env python3
import sys

# The first argument is the line number, starting with 1.
# The second argument is the column number, starting with 1.

def main():
    if len(sys.argv) != 3:
        sys.exit('{0} takes exactly 2 args!'.format(sys.argv[0]))
    # "NaN" padded to the 8-character width of the data values.
    pattern = "NaN     "
    line_number = int(sys.argv[1]) - 1
    column_number = int(sys.argv[2]) - 1
    # Read every line of the input file, dropping trailing whitespace.
    with open('columns.txt') as infile:
        lines = [line.rstrip() for line in infile]
    # Split the target line into fields and insert the pattern.
    fields = lines[line_number].split(" ")
    fields.insert(column_number, pattern)
    lines[line_number] = ' '.join(fields)
    print('\n'.join(lines))

if __name__ == '__main__':
    main()
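The only validation it does is checking the number of command-line arguments (there must be exactly 2). It will raise an error if the line number you provide is beyond the end of the file. A column number beyond the end of the line does not raise an error: list.insert() simply appends the value at the end of the line.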
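If explicit range checks are wanted, the insert step could be pulled out into a small function. This is only a sketch: the name insert_nan is hypothetical (it is not part of the script above), and it assumes the fields are separated by whitespace and that a column one past the last field is still acceptable (append).

```python
def insert_nan(lines, line_number, column_number, value="NaN"):
    """Return a copy of lines with value inserted at 1-based (line, column)."""
    if not 1 <= line_number <= len(lines):
        raise IndexError("line {0} is out of range".format(line_number))
    fields = lines[line_number - 1].split()
    # Allow len(fields) + 1 so a value can be appended after the last field.
    if not 1 <= column_number <= len(fields) + 1:
        raise IndexError("column {0} is out of range".format(column_number))
    fields.insert(column_number - 1, value)
    result = list(lines)  # leave the caller's list untouched
    result[line_number - 1] = " ".join(fields)
    return result


rows = ["0.345644 0.453233 0.567872", "0.354234 0.452223 0.589872"]
print("\n".join(insert_nan(rows, 2, 2)))
```

Unlike the script, this version rejects both an out-of-range line and an out-of-range column instead of silently appending.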
 
Old 08-05-2012, 10:30 AM   #7
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,390
Blog Entries: 2

Rep: Reputation: 900
This isn't so much a solution as a bit of methodology that I use to solve such problems. I try to identify the key aspects of the problem, and then match those aspects to programming languages and/or tools that I know. In this case, I see that the problem involves text files that are row/column oriented. Immediately, this suggests a tool such as awk or for me, Perl. When I see 'insert', I think of the splice function in Perl. Having selected the tool and a basic operation to perform, I can develop and test on a single line of input data, and once that works, wrap it up in the file-reading and iteration control.
This kind of problem decomposition can be helpful not only when you are trying to solve a problem yourself, but also when trying to describe the parameters of your problem to others, such as on forums like this one.
--- rod.
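The decomposition described above (get the operation right on a single line first, then wrap it in the file-reading loop) might look roughly like this. It is a sketch in Python rather than Perl, so slice assignment stands in for splice; the names insert_field and process are made up for illustration, and an in-memory stream stands in for the real file.

```python
import io


def insert_field(line, column, value="NaN"):
    """Step 1: the basic operation, developed and tested on one line."""
    fields = line.split()
    # Slice assignment with an empty slice inserts without replacing,
    # much like Perl's splice with a length of 0.
    fields[column - 1:column - 1] = [value]
    return " ".join(fields)


def process(stream, row, column, value="NaN"):
    """Step 2: wrap the operation in file reading and iteration control."""
    for number, line in enumerate(stream, start=1):
        line = line.rstrip("\n")
        yield insert_field(line, column, value) if number == row else line


sample = io.StringIO(
    "0.345644 0.453233 0.567872\n"
    "0.354234 0.452223 0.589872\n"
)
for out in process(sample, row=2, column=2):
    print(out)
```

Because insert_field knows nothing about files, it can be checked on a single string before any real data is involved.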
 
Old 08-05-2012, 11:45 AM   #8
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,032

Rep: Reputation: 275
The solution offered by grail (below) is superior. Therefore, I have withdrawn mine.

Daniel B. Martin

Last edited by danielbmartin; 08-05-2012 at 08:54 PM. Reason: Solution withdrawn
 
Old 08-05-2012, 12:32 PM   #9
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,193

Rep: Reputation: 1784
Maybe something simple like:
Code:
awk -v row=2 -v col=3 'NR == row {$col = "NaN " $col} 1' file
The drawback here is that awk does not edit the file in place: you have to redirect the output to a new file and then rename it once completed.
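The write-to-a-new-file-then-rename step can be scripted so it does not have to be done by hand. Below is a minimal sketch in Python, not a translation of the awk command: the function name insert_in_place is hypothetical, fields are assumed to be whitespace-separated, and a column past the end of the line is simply appended.

```python
import os
import tempfile


def insert_in_place(path, row, col, value="NaN"):
    """Insert value before field col of line row, then replace the file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the result to a temporary file in the same directory, so the
    # final rename stays on one filesystem.
    with open(path) as source, tempfile.NamedTemporaryFile(
            "w", dir=directory, delete=False) as temp:
        for number, line in enumerate(source, start=1):
            if number == row:
                fields = line.split()
                fields.insert(col - 1, value)
                line = " ".join(fields) + "\n"
            temp.write(line)
    # Both files are closed here; swap the new file in for the old one.
    os.replace(temp.name, path)
```

Calling insert_in_place("columns.txt", 2, 3) would then rewrite columns.txt itself. Note that the replacement file is created by tempfile, so its permissions may differ from those of the original.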

If we assume you had, say, 100 insertions to make, you could also place each row, column and new value on its own line in a file and read them in to be used to change the data file, like so:
Code:
$ cat columns.txt 
0.345644 0.453233 0.567872 0.432543
0.354234 0.452223 0.589872 0.233123
0.323445 0.451111 0.567822 0.452345 0.345234
$ cat changes.txt
2 3 NaN
3 1 0.123456
$ awk 'FNR == NR{col[$1]=$2;value[$1]=$3;next}FNR in col{$(col[FNR]) = value[FNR]" "$(col[FNR])}1' changes.txt columns.txt
0.345644 0.453233 0.567872 0.432543
0.354234 0.452223 NaN 0.589872 0.233123
0.123456 0.323445 0.451111 0.567822 0.452345 0.345234
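One thing to keep in mind with the command above: the col and value arrays are indexed by row number only, so a second change for the same row would overwrite the first. If several insertions per row are needed, the same idea could be sketched in Python as follows. The function name apply_changes is hypothetical, and the sketch assumes every change line has exactly three whitespace-separated fields (row, column, value) with column numbers referring to the original, unshifted line.

```python
from collections import defaultdict


def apply_changes(data_lines, change_lines):
    """Apply 'row col value' insertions; several per row are allowed."""
    changes = defaultdict(list)
    for change in change_lines:
        row, col, value = change.split()
        changes[int(row)].append((int(col), value))

    result = []
    for number, line in enumerate(data_lines, start=1):
        fields = line.split()
        # Work from the highest column down so that an insertion does not
        # shift the position of the ones still to be made.
        for col, value in sorted(changes.get(number, []), reverse=True):
            fields.insert(col - 1, value)
        result.append(" ".join(fields))
    return result


data = [
    "0.345644 0.453233 0.567872 0.432543",
    "0.354234 0.452223 0.589872 0.233123",
    "0.323445 0.451111 0.567822 0.452345 0.345234",
]
changes = ["2 3 NaN", "3 1 0.123456"]
print("\n".join(apply_changes(data, changes)))
```

With the two changes shown, the rows come out the same as from the awk command, apart from the spelling of NaN taken from the change list.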
 
Old 08-06-2012, 01:14 AM   #10
udiubu
Member
 
Registered: Oct 2011
Posts: 54

Original Poster
Rep: Reputation: Disabled
Grail this works perfectly!

Thanks a lot.

Best,

Udiubu
 
Old 08-06-2012, 02:32 AM   #11
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,193

Rep: Reputation: 1784
@daniel - please don't withdraw an answer as it may assist others in solving not only this problem but to also see alternate ways to go about it. Remember that yours may not be the shortest or fastest in this instance but may be better suited to a more perplexing problem. As you have outlined in previous posts, sometimes mine may be a little too complex or advanced for others to follow, and hence an alternative is always appreciated.
 
Old 08-07-2012, 09:11 AM   #12
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,032

Rep: Reputation: 275
Quote:
Originally Posted by grail View Post
@daniel - please don't withdraw an answer as it may assist others in solving not only this problem but to also see alternate ways to go about it. Remember that yours may not be the shortest or fastest in this instance but may be better suited to a more perplexing problem. ...
Okay, here is a different solution.

Input file 1...
Code:
0.345644 0.453233 0.567872 0.432543
0.354234 0.452223 0.589872 0.233123
0.323445 0.451111 0.567822 0.452345 0.345234
Input file 2...
Code:
2 2
3 4
The desired output file has "NaN" inserted before line 2, field 2, and also before line 3, field 4.

Desired output file...
Code:
0.345644 0.453233 0.567872 0.432543
0.354234 NaN 0.452223 0.589872 0.233123
0.323445 0.451111 0.567822 NaN 0.452345 0.345234
I like to develop code stepwise, writing work files along the way. Examination of these work files verifies that the code is working as intended. They also help others understand the method.

The development-level code is this ...
Code:
sed -r 's|^|s/L|; s| |F|; s|$|/NaN/|' < $InFile2 > $Work1
awk '{for (i = 1; i <= NF; i++)
  printf("%s", "L"NR "F" i " " $i " ")}
   {printf("%s","\n")}' $InFile1 \
|tee $Work2                      \
|sed "-f" $Work1 -               \
|tee $Work3                      \
|sed -r 's/L[0-9]+F[0-9]+ ?//g'  \
> $OutFile
Begin by converting InFile2 into a series of instructions to be performed by sed.
This...
Code:
sed -r 's|^|s/L|; s| |F|; s|$|/NaN/|' < $InFile2 > $Work1
... creates Work1 containing this:
Code:
s/L2F2/NaN/
s/L3F4/NaN/
Note that "2 2" meaning "line 2, field 2" has been converted to a single word "L2F2". Same for all other lines in InFile2.

Now, turn our attention to InFile1.
This...
Code:
awk '{for (i = 1; i <= NF; i++)
  printf("%s", "L"NR "F" i " " $i " ")}
   {printf("%s","\n")}' $InFile1 \
... creates Work2 containing this:
Code:
L1F1 0.345644 L1F2 0.453233 L1F3 0.567872 L1F4 0.432543
L2F1 0.354234 L2F2 0.452223 L2F3 0.589872 L2F4 0.233123
L3F1 0.323445 L3F2 0.451111 L3F3 0.567822 L3F4 0.452345 L3F5 0.345234
No data has been lost. A Line-and-Field designation has been inserted ahead of each data item. These will serve as "targets" for the substitutions done by sed.

This...
Code:
|sed "-f" $Work1 -               \
... performs those substitutions, creating Work3:
Code:
L1F1 0.345644 L1F2 0.453233 L1F3 0.567872 L1F4 0.432543
L2F1 0.354234 NaN 0.452223 L2F3 0.589872 L2F4 0.233123
L3F1 0.323445 L3F2 0.451111 L3F3 0.567822 NaN 0.452345 L3F5 0.345234
Observe that the "targets" L2F2 and L3F4 have been replaced with the character string "NaN". Now the remaining (unused) targets must be removed.


This blows away those unused targets ...
Code:
|sed -r 's/L[0-9]+F[0-9]+ ?//g'  \
... to create the desired output file.

When satisfied that the code is working properly, the tees which create the workfiles may be removed, and the finished product is ...
Code:
sed -r 's|^|s/L|; s| |F|; s|$|/NaN/|' < $InFile2 > $Work1
awk '{for (i = 1; i <= NF; i++)
  printf("%s", "L"NR "F" i " " $i " ")}
   {printf("%s","\n")}' $InFile1 \
|sed "-f" $Work1 -               \
|sed -r 's/L[0-9]+F[0-9]+ ?//g'  \
> $OutFile
Daniel B. Martin
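
For comparison, the same tag, substitute and strip sequence can be sketched in a single Python script. This is an illustration of the technique, not a replacement for the pipeline above: the function names tag_fields and insert_at_tags are hypothetical, and the data values are assumed never to look like an L<number>F<number> tag themselves.

```python
import re


def tag_fields(lines):
    """Prefix every field with its L<line>F<field> tag (like Work2)."""
    return [
        " ".join("L{0}F{1} {2}".format(n, i, field)
                 for i, field in enumerate(line.split(), start=1))
        for n, line in enumerate(lines, start=1)
    ]


def insert_at_tags(lines, positions, value="NaN"):
    """Replace the tags named in positions with value, then strip the rest."""
    tagged = tag_fields(lines)
    for row, col in positions:
        # \b keeps L2F2 from matching inside a longer tag such as L2F20.
        tag = re.compile(r"\bL{0}F{1}\b".format(row, col))
        tagged = [tag.sub(value, line, count=1) for line in tagged]
    # Remove the unused tags together with the space that follows each one.
    return [re.sub(r"L[0-9]+F[0-9]+ ?", "", line) for line in tagged]


data = [
    "0.345644 0.453233 0.567872 0.432543",
    "0.354234 0.452223 0.589872 0.233123",
    "0.323445 0.451111 0.567822 0.452345 0.345234",
]
print("\n".join(insert_at_tags(data, [(2, 2), (3, 4)])))
```

The intermediate list returned by tag_fields plays the role of the Work2 file, so it can be printed for inspection in the same stepwise way.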
 
  


Tags
column, insert, line, sed

