[SOLVED] Inserting text at specific line X column coordinates
Each line belongs to a different subject and, because of some coding problems, some NaN are missing. I know where these NaN should be inserted, and the inserting positions change according to subject and column. As I have lots of files, and the missing values are always in the same positions, I was wondering whether there is a way to automatize my insertions. I need to do literally what "insert cell" does in Excel: shift the columns and insert a value.
The output should look, for example, like the one below:
0.345644 0.453233 0.567872 ...
0.354234 NaN 0.589872 ...
0.323445 0.451111 NaN ...
Can cat and sed help me with this?
I wouldn't mind writing a long series of insert actions per line, as long as I can make the lines independent of each other, so that an insertion in one line wouldn't affect the rest.
One very space-consuming idea would be to make each line independent of the others (i.e. copy each line to a separate file), add a column to it and then append the lines back together. Is there anything faster?
I'm quite sure you could accomplish whatever you want in either sed or awk, but without more information it's really hard to say anything more. Here is a set of my assumptions:
Quote:
different column-length
Quote:
Each value has the same number of digits.
These two pieces of info are contradictory. I think you mean a differing number of columns per line with a fixed number of digits per column.
Quote:
I know where these NaN should be inserted, and the inserting positions change according to subject and column.
but we don't know it.
Quote:
I was wondering whether there is a way to automatize my insertions.
I have no idea unless I know the criteria for insertions.
All right sycamorex, and sorry for having been misleading.
I indeed mean "differing number of columns per line with a fixed number of digits per column".
I have a long list of coordinates, but we could use just two of them as a test, as in the second array I proposed:
0.345644 0.453233 0.567872 ...
0.354234 NaN 0.589872 ...
0.323445 0.451111 NaN ...
so in this case I would like to put:
a NaN in the second line, column 2
a NaN in the third line, column 3
I hope this helps, and thanks again for your prompt response, Sycamorex.
2. If there's no pattern like that, how would you want to specify the line number and column?
3. It looks like you want to replace the old value with NaN. Is that correct?
1. No, there's no such pattern.
2. And indeed I was wrong with the array. I don't want to substitute, but to insert NaN and shift the cells to the right. The previous array is wrong; this is the correct one:
0.345644 0.453233 0.567872 ...
0.354234 NaN 0.452223 0.589872 ...
0.323445 0.451111 0.567822 NaN ...
As I've been learning Python for the last few days, I wrote the following script for you. The first command line argument is the line number, the second argument is the column number. For example:
#!/usr/bin/python
import sys

# The first argument is the line number, starting with 1
# The second argument is the column number, starting with 1

def main():
    if len(sys.argv) != 3:
        print('{0} takes exactly 2 args!'.format(sys.argv[0]))
    else:
        # No trailing space here: ' '.join() below supplies the separators
        pattern = "NaN"
        line_number = int(sys.argv[1]) - 1
        column_number = int(sys.argv[2]) - 1
        # Read every line of the data file, dropping trailing whitespace
        with open('columns.txt') as data_file:
            columns = [column.rstrip() for column in data_file]
        # Split the chosen line into fields and insert the new value
        current_line = columns[line_number].split(" ")
        current_line.insert(column_number, pattern)
        columns[line_number] = ' '.join(current_line)
        new_output = '\n'.join(columns)
        print(new_output)

if __name__ == '__main__':
    main()
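The only validation that it does is checking the number of command line arguments (has to be 2). It will raise an IndexError if the line number you provide is out of range. A column number beyond the end of the line does not raise an error, because list.insert() then simply appends the value to the end of the line.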
This isn't so much a solution as a bit of methodology that I use to solve such problems. I try to identify the key aspects of the problem, and then match those aspects to programming languages and/or tools that I know. In this case, I see that the problem involves text files that are row/column oriented. Immediately, this suggests a tool such as awk or for me, Perl. When I see 'insert', I think of the splice function in Perl. Having selected the tool and a basic operation to perform, I can develop and test on a single line of input data, and once that works, wrap it up in the file-reading and iteration control.
This kind of problem decomposition can be helpful not only when you are trying to solve a problem yourself, but also when trying to describe the parameters of your problem to others, such as on forums like this one.
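To make the "single line first, then wrap it up" approach concrete, here is a minimal sketch in Python rather than Perl, with list.insert() standing in for splice. The function names insert_field and insert_in_lines are made up for this illustration, and the sketch assumes fields are separated by whitespace.

```python
def insert_field(line, column, value="NaN"):
    """Return `line` with `value` inserted before the 1-based field `column`."""
    fields = line.split()
    if not 1 <= column <= len(fields) + 1:
        raise ValueError("column {0} is out of range".format(column))
    # list.insert() plays the role that splice would play in Perl
    fields.insert(column - 1, value)
    return " ".join(fields)


def insert_in_lines(lines, line_number, column, value="NaN"):
    """Apply insert_field to one 1-based line and pass the others through."""
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if number == line_number:
            yield insert_field(line, column, value)
        else:
            yield line


# Step 1: develop and test on a single line of input data.
single = insert_field("0.354234 0.452223 0.589872", 2)
print(single)

# Step 2: wrap it in the iteration over all lines.
sample = ["0.345644 0.453233 0.567872", "0.354234 0.452223 0.589872"]
wrapped = list(insert_in_lines(sample, 2, 2))
print("\n".join(wrapped))
```

Because insert_in_lines accepts any iterable of lines, an open file object can be passed in place of the sample list.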
--- rod.
The negative here would be redirecting this to a new file and then renaming once completed.
If we assume you had, say, 100 insertions to make, you could also place each row, column and new value on its own line in a file and read them in to be used to change the file, like so:
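A minimal Python sketch of that idea follows; it is an illustration and not the poster's own code. It assumes an insertion list with one "line column value" triple per row, 1-based numbering, and column numbers that refer to the original, unmodified line. The names parse_insertions and apply_insertions are hypothetical, and the sample data is the three-column array from earlier in the thread.

```python
from collections import defaultdict


def parse_insertions(spec_lines):
    """Map each 1-based line number to its list of (column, value) pairs."""
    insertions = defaultdict(list)
    for spec in spec_lines:
        parts = spec.split()
        if not parts:
            continue  # skip blank rows in the insertion list
        line_number, column, value = int(parts[0]), int(parts[1]), parts[2]
        insertions[line_number].append((column, value))
    return insertions


def apply_insertions(data_lines, insertions):
    """Return the data lines with every requested value inserted."""
    result = []
    for number, line in enumerate(data_lines, start=1):
        fields = line.split()
        # Columns refer to the ORIGINAL line (an assumption of this sketch),
        # so insert from right to left: an insertion never shifts the
        # position of one that is still to be made further left.
        for column, value in sorted(insertions.get(number, []), reverse=True):
            fields.insert(column - 1, value)
        result.append(" ".join(fields))
    return result


# In real use both lists would come from files, e.g. open('columns.txt').
data = [
    "0.345644 0.453233 0.567872",
    "0.354234 0.452223 0.589872",
    "0.323445 0.451111 0.567822",
]
spec = ["2 2 NaN", "3 4 NaN"]
result = apply_insertions(data, parse_insertions(spec))
print("\n".join(result))
```

Since each line is handled on its own, an insertion in one line cannot disturb any other line.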
@daniel - please don't withdraw an answer as it may assist others in solving not only this problem but to also see alternate ways to go about it. Remember that yours may not be the shortest or fastest in this instance but may be better suited to a more perplexing problem. As you have outlined in previous posts, sometimes mine may be a little too complex or advanced for others to follow and hence an alternative is always appreciated.
The desired output file has "NaN" inserted before line 2, field 2, and also before line 3, field 4.
Desired output file...
Code:
0.345644 0.453233 0.567872 0.432543
0.354234 NaN 0.452223 0.589872 0.233123
0.323445 0.451111 0.567822 NaN 0.452345 0.345234
I like to develop code stepwise, writing work files along the way. Examination of these work files verifies that the code is working as intended. They also help others understand the method.
No data has been lost. A Line-and-Field designation has been inserted ahead of each data item. These will serve as "targets" for substitutions done by sed.
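The tagging idea can be sketched in Python, with re.sub() standing in for sed. This is only an illustration under stated assumptions: the tag format "L<line>F<field>:" is hypothetical, since the actual designation used in the work files is not shown, and the input lines are reconstructed by removing the NaN entries from the desired output above.

```python
import re


def tag_fields(lines):
    """Prefix every field with a hypothetical 'L<line>F<field>:' tag."""
    return [
        " ".join(
            "L{0}F{1}:{2}".format(line_no, field_no, field)
            for field_no, field in enumerate(line.split(), start=1)
        )
        for line_no, line in enumerate(lines, start=1)
    ]


def insert_before(tagged_lines, line_no, field_no, value="NaN"):
    """Insert `value` ahead of the tagged target, as a sed substitution would."""
    target = re.compile(r"(?<!\S)" + re.escape("L{0}F{1}:".format(line_no, field_no)))
    return [
        target.sub(lambda match: value + " " + match.group(0), line)
        for line in tagged_lines
    ]


def strip_tags(tagged_lines):
    """Remove every tag once all substitutions are done."""
    return [re.sub(r"(?<!\S)L\d+F\d+:", "", line) for line in tagged_lines]


lines = [
    "0.345644 0.453233 0.567872 0.432543",
    "0.354234 0.452223 0.589872 0.233123",
    "0.323445 0.451111 0.567822 0.452345 0.345234",
]
tagged = tag_fields(lines)        # the "work file" with a tag on every item
tagged = insert_before(tagged, 2, 2)
tagged = insert_before(tagged, 3, 4)
final = strip_tags(tagged)
print("\n".join(final))
```

Because each tag stays attached to its original data item, the substitutions can be applied in any order without one insertion shifting the target of another.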