LinuxQuestions.org


atjurhs 01-03-2013 03:42 PM

why can't I operate on a file and replace it
 
hi guys,

i want to do something that's simple but i can't figure out the right syntax, so far everything i've tried (except creating a temp file that i have to later rename) destroys the original file leaving it empty, here's what I'm trying to do.....

awk -F " " '{print $1, $2, $3, $48*(-1), $5}' file.dat > file.dat

so that the output file is a replacement (with the same name) as the input file, like i said, it works fine if i give it a different output file name but i don't want to do that

is there something easy?

Tabby

AwesomeMachine 01-03-2013 03:52 PM

Temp File
 
Generally, you have to use a temp file.

theNbomr 01-03-2013 03:59 PM

If your task can be accomplished with sed or Perl, you can use the '-i' switch to edit the file 'in-place'.
--- rod.

atjurhs 01-03-2013 04:45 PM

:(

i have sed, but i knew the math and print part of this was so easy to do in awk so i wrote the one-liner in awk. maybe there's an easy way to do math and print in sed???

i know it's not exactly right but could i do something like....

awk -F " " '{print $1, $2, $3, $48*(-1), $5}' inputfile.dat > outputfile.dat | mv outputfile.dat inputfile.dat

thanks guys for your help!

Tabby

theNbomr 01-03-2013 06:08 PM

I'm pretty sure math is out of range for sed, but I can offer a Perl one-liner (test without the -i switch, first):

Code:

perl -i -e 'while(<>){ @z=split; print "$z[0] $z[1] $z[2] ", $z[47]*-1, " $z[4]\n"; }' inputfile.dat
--- rod.

atjurhs 01-03-2013 06:49 PM

hi Rod,

i'm REALLY a newbie to script writing and barely waddle thru awk sed and bash script and usually with help.

i know perl is really powerful and really good with math stuff, but it looks so cryptic, idk....

thanks soooooo much for the script! i'll give it a go tomorrow....

Tabby

rknichols 01-03-2013 07:22 PM

Quote:

Originally Posted by theNbomr (Post 4862434)
I'm pretty sure math is out of range for sed,

The sed language has been shown to be Turing-complete, so math is at least theoretically within its range. Whether that use of sed is practical, ..., well it would almost certainly be easier than coding that in Ook!

theNbomr 01-03-2013 07:24 PM

I think if you examine it somewhat, you'll see that it quite closely resembles the Awk script. The key differences are:
  • Perl needs explicit looping constructs ( while(<>){ .... } )
  • Perl needs to explicitly split into fields ( 'split', and the default separator is whitespace, just like Awk )
  • Perl uses zero-based array indexing in contrast to the built-in Awk variables named with non-zero positive integers

Yes, Perl is cryptic to those who have not drunk the magical elixir....

--- rod.

theNbomr 01-03-2013 07:30 PM

Quote:

Originally Posted by rknichols (Post 4862480)
The sed language has been shown to be Turing-complete, so math is at least theoretically within its range. Whether that use of sed is practical, ..., well it would almost certainly be easier than coding that in Ook!

I'd much rather see the sed version than the Ook version. Really, that would be impressive, either by virtue of the usefulness of learning something new, or by the length someone might go to to accomplish it.
--- rod.

AnanthaP 01-03-2013 07:53 PM

Quote:

awk -F " " '{print $1, $2, $3, $48*(-1), $5}' file.dat > file_temp.dat
rm file.dat
mv file_temp.dat file.dat
I think this is what was meant in post #2.

By the way, perl is readily available with ALL distros.

OK

ntubski 01-03-2013 08:12 PM

Quote:

Originally Posted by AnanthaP (Post 4862494)
Quote:

awk -F " " '{print $1, $2, $3, $48*(-1), $5}' file.dat > file_temp.dat
rm file.dat
mv file_temp.dat file.dat
I think this is what was meant in post #2.

The rm is not necessary.
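
A minimal sketch of the temp-file approach, assuming a POSIX shell with `mktemp` available. The working directory, the sample data, and the use of field 4 (instead of field 48 from the original command) are placeholders for illustration. `mv` replaces the target by itself, and `&&` makes sure it only runs if awk succeeded.

```shell
# Throwaway directory and sample data (illustrative only).
workdir=$(mktemp -d)
printf '1 2 3 4 5\n6 7 8 9 10\n' > "$workdir/file.dat"

# Write the result to a temp file next to the original, then rename it over
# the original. mv is skipped if awk exits with an error.
tmp=$(mktemp "$workdir/file.dat.XXXXXX")
awk '{print $1, $2, $3, $4*(-1), $5}' "$workdir/file.dat" > "$tmp" &&
    mv "$tmp" "$workdir/file.dat"

cat "$workdir/file.dat"
```

One thing to keep in mind: the replaced file takes on the permissions of the temp file, which `mktemp` creates readable and writable by the owner only.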

Quote:

Originally Posted by theNbomr (Post 4862481)
I think if you examine it somewhat, you'll see that it quite closely resembles the Awk script. The key differences are:
  • Perl needs explicit looping constructs ( while(<>){ .... } )
  • Perl needs to explicitly split into fields ( 'split', and the default separator is whitespace, just like Awk )

Quote:

perlrun:
...
-a

turns on autosplit mode when used with a -n or -p. An implicit split command to the @F array is done as the first thing inside the implicit while loop produced by the -n or -p.
...
-n

causes Perl to assume the following loop around your program, which makes it iterate over filename arguments somewhat like sed -n or awk:
Quote:

Originally Posted by theNbomr (Post 4862487)
I'd much rather see the sed version than the Ook version. Really, that would be impressive, either by virtue of the usefulness of learning something new, or by the length someone might go to to accomplish it.
--- rod.

Web search turns up dc.sed.

AwesomeMachine 01-03-2013 09:11 PM

You usually don't want to blow away the original file right away, at least not until you test the result of the changes. Coreutils programs have some bizarre behaviors under certain conditions. For instance: if you overwrite the first 20 bytes of a file, using the dd command, the outfile will be 20 bytes, unless you use notrunc. Sed has issues with certain characters used in file names. So, you really want to keep the original file until the result file is tested, and then rm the original, or however you want to do it. I usually use cat.
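
The dd behavior mentioned here can be seen with a small sketch. File names and contents are made up for the demonstration, and 3 bytes are used instead of 20 to keep it readable.

```shell
workdir=$(mktemp -d)
printf 'abcdefghij' > "$workdir/trunc.txt"
cp "$workdir/trunc.txt" "$workdir/keep.txt"

# Default: dd truncates the output file, so only the 3 bytes written remain.
printf 'XYZ' | dd of="$workdir/trunc.txt" bs=1 count=3 2>/dev/null

# conv=notrunc: the first 3 bytes are overwritten, the rest is left in place.
printf 'XYZ' | dd of="$workdir/keep.txt" bs=1 count=3 conv=notrunc 2>/dev/null

cat "$workdir/trunc.txt"; echo
cat "$workdir/keep.txt"; echo
```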

jpollard 01-04-2013 07:44 AM

The problem is that you are reading a data file and writing the update back to the same file.

When the input and output are the same file, then the output will modify the input... causing problems.

Tools like sed use a tmp file internally, and then do the equivalent of "mv tmp originalfilename".

Think about how updates occur. If the original file had:

Code:

a
b
c

And you want to update it by replacing b with bb. If you use the same filename for both input and output, what happens is:
Code:

a
bb

because the second b in your update replaces the newline at the end of the "b" line, and the newline that follows it overwrites the "c". The newline from the former "c" line is still there, so the file now ends with newline,newline...

The only saving grace (for very small files) is that the system buffers (or the runtime library buffers) could hold the entire file in memory... and give you the illusion of a tmp file. That doesn't always work either (updates to a file go to the same system buffer as used in input... though if the input has already been read it isn't a problem).

This is the same problem as having two people edit a file simultaneously... the output will be whoever closes the file last...

There is also the problem of making data shorter (replacing bb with b, for instance). You might get duplicated data... or other funny looking stuff. This is closely related to issues with random access files (usually opened read/write). It works with fixed length records.. but if you extend/shorten a record your file gets corrupted unless you also do something to compensate (like using a temp file).
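
The a/b/c example can be reproduced with a sketch that overwrites bytes in place. Using dd with `seek` and `conv=notrunc` is an assumption chosen here to imitate a program that has the file open read/write; it is not what a `>` redirection does.

```shell
workdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$workdir/abc.txt"

# Overwrite from byte offset 2 (where "b" starts) with the longer line "bb".
# The extra byte spills into the following data: the new newline lands on
# top of "c", and the old newline after "c" is left behind.
printf 'bb\n' | dd of="$workdir/abc.txt" bs=1 seek=2 conv=notrunc 2>/dev/null

# Show the raw bytes so the doubled newline at the end is visible.
od -c "$workdir/abc.txt"
```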

rknichols 01-04-2013 12:26 PM

Quote:

Originally Posted by jpollard (Post 4862815)
The problem is that you are reading a data file and writing the update back to the same file.

When the input and output are the same file, then the output will modify the input... causing problems.

It's worse than that. When you try to run something like
Code:

awk '......' file.dat >file.dat
The shell will open file.dat for output, truncating it to zero length, before awk is even invoked, and all awk will see is the empty input file.
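
This order of events can be checked on a throwaway file. The file name and contents below are placeholders.

```shell
workdir=$(mktemp -d)
printf '1 2 3\n4 5 6\n' > "$workdir/file.dat"

# The shell opens file.dat for writing (truncating it) before awk starts,
# so awk reads an already empty file and writes nothing.
awk '{print $1, $2}' "$workdir/file.dat" > "$workdir/file.dat"

# -s is true only for a file that exists and is not empty.
if [ -s "$workdir/file.dat" ]; then
    echo "file still has data"
else
    echo "file is now empty"
fi
```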

shivaa 01-04-2013 10:59 PM

Quote:

... So that the output file is a replacement (with the same name) as the input file, like i said, it works fine if i give it a different output file name but i don't want to do that

is there something easy?

As you mentioned, the output of an operation can be stored back in the input file by combining two commands: the command itself, followed by mv.

But in order to do it in a one-liner command, you can use command substitution in bash, which is expanded (so the input file is read completely) before the redirection truncates the file:
Code:

printf '%s\n' "$(command ... infile)" > infile
For example, if infile.txt has:
Code:

A A
B
C C
D D
E
F F

Then invoke the following:
Code:

printf '%s\n' "$(awk 'NF>=2 {print $0}' infile.txt)" > infile.txt
And now infile.txt will have:-
Code:

A A
C C
D D
F F

So make a try on this.
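
Any one-liner ending in `> infile.txt` only works if the input has been read, or at least opened, before the shell truncates the file. One sketch that forces that order, assuming a POSIX shell on a Unix filesystem: open the file for reading first, remove its name, then let awk create a new file under the same name. The path variable `f` is a placeholder, and the sample data mirrors the example above.

```shell
workdir=$(mktemp -d)
f="$workdir/infile.txt"
printf 'A A\nB\nC C\nD D\nE\nF F\n' > "$f"

# 1. "< $f" on the group opens the original file for reading on stdin.
# 2. rm removes the name; the open descriptor keeps the data readable.
# 3. "> $f" creates a brand-new file, and awk (reading stdin) fills it.
{ rm "$f" && awk 'NF>=2' > "$f"; } < "$f"

cat "$f"
```

If awk fails halfway through, the original data is already gone, so the temp file plus mv approach is still the safer choice for anything important.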


All times are GMT -5.