Distribution: RH 7.3/8.0/9.0, Debian Stable 3.0, FreeBSD 5.2, Solaris 8/9/10,HP-UX
Posts: 340
Some techniques for text file editing
I normally use C and bash scripting when I need to deal with text files. I have run into a situation where I got stuck, and helping me means helping many others, since this is not a "rare situation" when you need to program this type of application. Enough crap.
Suppose I have a text file with a format like the following (but it can be generic):
(I've replaced the white space with a period '.' in this post, since multiple white spaces are collapsed into one white space, which would lose the point of my query)
The spacing between the fields is deliberately not uniform! Is there a way to maintain the same structure of the text file (in terms of spacing, indentation, and delimitation) while modifying/replacing specific fields using C or bash? I did it in bash by assigning each field to a variable, as in:
var1=field1
var2=field2
... and so on
then:
line1=$var1............$var2....$var3..................$var5
(maintaining the same indentation)
and then redirected it to a text file:
echo $line1 > textfile
this resulted in loss of correct indentation:
output of text file-> field1 field2 field3 field5
Thanks in advance!!
(PS: The problem is coincidentally similar to when you write on this post and many white spaces are boiled to one space!!!)
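A likely cause of the collapsed spacing above is the unquoted `$line1` in the echo command: without quotes, the shell splits the value on whitespace and echo rejoins the pieces with single spaces. Here is a minimal sketch of the difference, assuming bash; the field values and the amount of spacing are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical field values; the spacing below is illustrative only.
var1=field1
var2=field2
var3=field3

# Quote the assignment so the runs of spaces become part of the string.
line1="$var1            $var2    $var3"

# Unquoted expansion: the shell splits on whitespace and echo
# rejoins the words with single spaces, so the layout is lost.
collapsed=$(echo $line1)

# Quoted expansion: the string is passed as one argument, spacing intact.
preserved=$(printf '%s\n' "$line1")

printf '%s\n' "$collapsed"
printf '%s\n' "$preserved"
```

With the quotes in place, redirecting to the file (for example `printf '%s\n' "$line1" > textfile`) should keep the layout as typed.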
Distribution: RH 6.2, Gen2, Knoppix,arch, bodhi, studio, suse, mint
Posts: 3,304
this might be ugly, but the first thing that came to mind
with me was not to carry the spaces with the variables,
but make the spaces separate variables, or really
constants.
var1=field1
var1s=" "
var2=field2
var2s=" "
line1="$var1$var1s$var2$var2s"
i don't know. probably a stupid idea.
i haven't programmed anything in 8 years.
I must say there is nothing better (IMO) than perl for text processing. For some it seems strange to start with, but it is so, so, so much easier than using bash, C, etc. style languages for text processing/scripting. Just get the hang of perl's regular expressions and you are armed with the best weapon for these types of tasks.
Give it a go, you'll thank yourself for it!
I've done a lot of scripting with sh, ksh, tcsh, bash, and once I discovered perl, I nearly never use anything else... (for scripting, that is)...
Distribution: RH 7.3/8.0/9.0, Debian Stable 3.0, FreeBSD 5.2, Solaris 8/9/10,HP-UX
Posts: 340
Original Poster
thanks mr_segfault, I'll take your advice. Do you think that with perl I would be able to edit only particular fields of a file, instead of simulating the thing by recreating the whole text file from scratch (with new fields of course), which is not generic?
If you post your original file inside [KODE] your file text here [/KODE] (replacing the K's with C's; I don't know how to write that without it turning my text into tags), then I'll show you the perl script to do what you're trying to do...
And as whansard said, even awk is good for that, although I prefer perl since it is a little more like structured languages (C type) than awk.. I'm no awk guru
Distribution: RH 7.3/8.0/9.0, Debian Stable 3.0, FreeBSD 5.2, Solaris 8/9/10,HP-UX
Posts: 340
Original Poster
In actual fact I'm fully aware that awk is very powerful. I simply adore it and encourage many newbies to use it (although I'm not an awk guru). Yet I use it for snatching and filtering purposes; I've never used it to "replace" a field inside a text file. Anyway, my file looks like this (again, I'm going to replace the white space with periods, since the LinuxQuestions.org post filter collapses multiple white spaces into one):
i don't know how far you want to go with this thing..
but it looks pretty much like a little database. with perl you could store all your info in a file with each item separated by a "|" and all the info for a given person on one line. then you can reformat with the script into a more readable form.
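The "store it delimited, reformat on output" idea above can be sketched in bash as well. This is only an illustration: the function name `format_record`, the three fields, and the column widths (10, 20, 8) are assumptions, not the layout of the real file.

```bash
#!/usr/bin/env bash
# format_record RECORD
# RECORD is one "|"-separated line, e.g. "Linus|Torvalds|8373".
# Field names and column widths are hypothetical.
format_record() {
    local name surname id
    # Split the record on "|" into three variables.
    IFS='|' read -r name surname id <<< "$1"
    # Left-justify each field in a fixed-width column so the
    # output layout stays the same whatever the field lengths are
    # (as long as each value fits in its column).
    printf '%-10s%-20s%-8s\n' "$name" "$surname" "$id"
}

format_record 'Linus|Torvalds|8373'
```

The spacing then lives in one place (the printf format string) instead of being typed by hand in every line.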
#!/usr/bin/perl
# Replacement values are taken from the command line, in this order.
$newName = $ARGV[0];
$newSurname = $ARGV[1];
$newHeight = $ARGV[2];
$newAge = $ARGV[3];
# Duplicate STDIN for reading ("<&"; ">&" would open it for writing).
open IFILE, "<&STDIN" or die "Unable to open stdin";
while(<IFILE>) # loops reading 1 line from file each time
{
    $line = $_; # $_ is a line read from file, I take a copy of it, probably not needed, you could work on $_ itself.
    # Replace each placeholder token, keeping the white space around it.
    $line =~ s/(\s+)name(\s+)/$1$newName$2/g;
    $line =~ s/(\s+)Surname(\s+)/$1$newSurname$2/g;
    $line =~ s/(\s+)height(\s+)/$1$newHeight$2/g;
    $line =~ s/(\s+)age(\s+)/$1$newAge$2/g;
    print $line;
}
I use the args from the command line as the items that are going to be substituted.
Then the while statement will take a line from the file at a time and put it into $_ and will stop at EOF.
Then I copy the line (not needed but what the heck! : )
Then the substitution lines:
$line =~ s/(\s+)name(\s+)/$1$newName$2/g;
This says: make $line the result of the following operation on that line.
The s means substitute.
The expression is /<what to match>/<what to replace with>/ and the g means global (replace every match, not just the first; again, this is not needed in this example).
Now the bits in between; first, what to match.
(\s+) says match 1 or more (the +) white space characters (the \s) and store the match in $1, then match the token 'name', then again match 1 or more white space characters and store the match in $2.
Now the substitution bit:
The $1 says to put here what was matched by the first capture group (\s+), then put the contents of $newName, then the contents of $2 (from the second (\s+)), and so on.
So this example works only if you have white space separating your text. If you were to use, say, '.' as in your example, you would replace the (\s+) with (\.+) (that is, an escaped . and a + in brackets).
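For anyone who wants to stay in bash, roughly the same substitution can be sketched with sed. The function name `replace_token` is made up for this example, and it assumes the token and the new value contain no characters that are special to sed (such as `/`, `&` or regex metacharacters).

```bash
#!/usr/bin/env bash
# replace_token TOKEN VALUE
# Reads lines on stdin and replaces TOKEN with VALUE wherever TOKEN
# is surrounded by white space, keeping that white space unchanged.
# Assumes TOKEN and VALUE contain nothing special to sed.
replace_token() {
    local token=$1 value=$2
    # \1 and \2 put back the white space captured on each side.
    sed -E "s/([[:space:]]+)${token}([[:space:]]+)/\1${value}\2/g"
}

printf '%s\n' 'surname   name      height' | replace_token name Linus
```

Note that this keeps the surrounding white space exactly as it was, but the line gets longer or shorter whenever the new value has a different length from the token, so everything to the right of it shifts.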
my test input file looked like:
Code:
surname name height
age junk
I hope this is what you were after..
Let me know if i've totally missed the task
The code is tested but hand typed into this post, since I couldn't cut and paste from my vmware linux window (just set it up), so there could be typos. If it doesn't work, look for a simple typo..
There is probably a simpler way to do this, but this is the first method that came to mind
Distribution: RH 7.3/8.0/9.0, Debian Stable 3.0, FreeBSD 5.2, Solaris 8/9/10,HP-UX
Posts: 340
Original Poster
It's not really a database. It's more like a buffer... It's a bit complex to explain how/why I am using it. But for our purposes, I am saving only one entry of:
Then I have another process which reads a new variable, say, id, and puts the new id in place of 8373. Each time this process is run, a new id is read and replaces the older one, while keeping the same structure and other field values. I need to keep the same spaces and everything, as this data is passed to a serial terminal which will get confused if even a single character is shifted by 1.
So the new entry becomes:
Linus....Torvaldis....................123456....9999 (<-- new id)
.......brown.........1.75.............................
Anyway, for now I've kept it simple and used the rudimentary echo technique. It works, but it's not generic. I'll try to find a better solution. Thanks ocularbob. We can close the case
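Since not a single character may shift, one more generic option is to overwrite a field by its column position and pad the new value to the field's width. A minimal sketch, assuming bash; the function name `replace_field`, the sample line, and the offsets are hypothetical, and the real column positions would have to come from the actual file layout.

```bash
#!/usr/bin/env bash
# replace_field LINE START WIDTH VALUE
# Overwrites WIDTH characters of LINE starting at 0-based offset START
# with VALUE, padded with spaces (or truncated) to exactly WIDTH
# characters, so the total line length never changes.
replace_field() {
    local line=$1 start=$2 width=$3 value=$4
    local padded
    # Left-justify VALUE in a column of WIDTH characters.
    printf -v padded '%-*s' "$width" "$value"
    # Cut it down in case VALUE was longer than the field.
    padded=${padded:0:width}
    printf '%s%s%s\n' "${line:0:start}" "$padded" "${line:start+width}"
}

# Hypothetical line; periods stand in for spaces, as in the posts above.
line='Linus....8373....end'
replace_field "$line" 9 4 9999
```

A shorter value, such as `replace_field "$line" 9 4 42`, is padded with spaces, so whatever follows the field stays in the same column.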
i think you now have two parts that might go rather nicely together.
i'm about to go use some of what mr_seg posted in a couple of my own scripts.
there's a lot of learning going on here....