LinuxQuestions.org


pcorajr 12-14-2006 01:58 PM

need help with text manipulation
 
Hello all, I'm a newbie to programming and scripting, but so far I have been working with a friend on a bash script to automate the download, decryption, text editing, and upload of certain files.

Basically our script downloads a file from a predetermined FTP site. The file is encrypted using PGP, so the script also decrypts the file and names it. The script has a lot of code that lets it send a warning e-mail to me if it encounters an error (file not being on the FTP site, name of file not matching, file being corrupted, etc.). Anyway, it's been a fun project so far, but now I have hit a wall.

My next step in the project is to do some basic text editing on the file: I need to delete the first and last line of the file. The file looks something like this:

01030e939-3-20339302039393039403942-12-03920399203
jdkfjei349893894838294
98320498023948928490483489
3924803928409238409388
92304803928402384098320498
9203893r023jf039840390894403984
293-0482390840923840983209
32094802398409238409823039
39480293840923840983204980
3204832948023984023809483029480
932482039840239840392849048
230482309-02398402398402398409238
8320p498230948230948230984230984
4954868

Every day I will be downloading a file that will have more or less data, but every time I have to delete the first and last line. I have read about and tried AWK with no luck.

Can anybody point me in the right direction, please?

Byenary 12-14-2006 02:12 PM

sed
 
I think you can do such a thing with the sed command.

pcorajr 12-14-2006 02:20 PM

Thanks much for pointing out sed. I found some stuff online that looks like it is going to help with this:

# delete the first 10 lines of a file
sed '1,10d'

So is this right?

# Delete the first line of a file
sed '1d'

Also, to delete the last line:

sed '$d'

Thanks much.

pcorajr 12-14-2006 03:11 PM

For some reason, when I try to combine sed '1d' and sed '$d' they don't play nice =(

chrism01 12-14-2006 05:22 PM

Well, this suggestion isn't terribly elegant, but if you are using bash, you can use
wc -l
to find how many lines are in the file, then subtract 1 to get newlinecnt and use
head and tail to get all but the first and last lines, e.g. if t.t has 9 lines, then
head -n 8 t.t | tail -n 7
chops off the 1st and last lines.
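
For comparison, the same count-then-slice idea can be sketched in Python, the other language that comes up in this thread. This is only a minimal sketch under stated assumptions: the function name `strip_by_counting` and the file paths are hypothetical, and the file is read twice (once to count, once to copy), much as `wc -l` followed by `head | tail` would do.

```python
from itertools import islice


def strip_by_counting(src_path, dst_path):
    """Copy src_path to dst_path without its first and last lines.

    Hypothetical helper: counts the lines first, then copies the middle.
    """
    # First pass: count the lines, roughly what `wc -l` reports
    # (assuming the file ends with a newline).
    with open(src_path) as src:
        line_count = sum(1 for _ in src)

    # Second pass: keep lines 2 .. line_count-1. islice uses zero-based
    # indices, so that is the half-open range [1, line_count - 1).
    stop = max(line_count - 1, 1)
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(islice(src, 1, stop))

    return line_count
```

With the 9-line t.t from the example, this keeps lines 2 through 8, the same lines the head/tail pipeline prints.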

ntubski 12-14-2006 07:27 PM

Quote:

Originally Posted by pcorajr
For some reason, when I try to combine sed '1d' and sed '$d' they don't play nice =(

How did you combine them? If you did something like
Code:

sed '1d$d'
it won't work because sed will think that is one (malformed) compound command. Try
Code:

sed -e '1d' -e '$d'
works for me.

matthewg42 12-14-2006 07:46 PM

You can either specify multiple commands using multiple -e arguments, or separate commands with a ;
Code:

matthew@chubby:~/tmp$ cat testfile
one Bob a-dealin'
two Bobs a-dealin'
three Bobs a-dealin'
four Bobs a-dealin'
five Bobs a-dealin'
matthew@chubby:~/tmp$ sed -e '1 d' -e '$ d' testfile
two Bobs a-dealin'
three Bobs a-dealin'
four Bobs a-dealin'
matthew@chubby:~/tmp$ sed '1 d ; $ d' testfile
two Bobs a-dealin'
three Bobs a-dealin'
four Bobs a-dealin'


ghostdog74 12-14-2006 08:16 PM

Python alternative:

Code:

# Read every line, then write back all but the first and last.
with open("file") as src, open("output", "w") as dst:
    dst.writelines(src.readlines()[1:-1])


matthewg42 12-14-2006 08:47 PM

Quote:

Originally Posted by ghostdog74
Python alternative:

Code:

# Read every line, then write back all but the first and last.
with open("file") as src, open("output", "w") as dst:
    dst.writelines(src.readlines()[1:-1])


It's OK for short files, but if you have a 10 GiB file with millions of lines... not so hot.

ghostdog74 12-14-2006 10:35 PM

Quote:

Originally Posted by matthewg42
It's OK for short files, but if you have a 10 GiB file with millions of lines... not so hot.

True, but there are ways to do it in Python, e.g. memory mapping, seek(), tell(), etc., or even a simple for loop over the lines...
Correct me if I am wrong, but even sed, awk, or grep over a 10 GiB file would be slow too, right?
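
To make the seek()/tell() idea a bit more concrete, here is one way it might look. Treat it as a rough sketch, not a tuned implementation: the names `strip_by_offsets` and `_last_line_start` and the `chunk_size` parameter are made up for illustration. It finds the byte offset just after the first line and the offset where the last line begins, then copies only the bytes in between, so memory use stays bounded by the chunk size.

```python
import os


def _last_line_start(f, size, chunk_size):
    """Return the byte offset at which the final line of f begins."""
    pos = size
    if size:
        f.seek(size - 1)
        if f.read(1) == b"\n":
            pos = size - 1  # ignore the newline that ends the final line
    # Scan backwards in chunks for the newline before the final line.
    while pos > 0:
        step = min(chunk_size, pos)
        f.seek(pos - step)
        idx = f.read(step).rfind(b"\n")
        if idx != -1:
            return pos - step + idx + 1
        pos -= step
    return 0  # no earlier newline: the file has at most one line


def strip_by_offsets(src_path, dst_path, chunk_size=1 << 20):
    """Copy src_path to dst_path without its first and last lines."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.readline()               # skip the first line
        start = src.tell()           # the part to keep begins here
        size = src.seek(0, os.SEEK_END)
        end = _last_line_start(src, size, chunk_size)

        # Copy the bytes in [start, end) one chunk at a time.
        src.seek(start)
        remaining = end - start
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:
                break
            dst.write(chunk)
            remaining -= len(chunk)
```

The file is opened in binary mode so that the offsets from tell() are plain byte positions. All of the data still has to be read and written once, so this does not avoid the I/O cost of a very large file, only the memory cost.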

matthewg42 12-14-2006 10:46 PM

Quote:

Originally Posted by ghostdog74
True, but there are ways to do it in Python, e.g. memory mapping, seek(), tell(), etc., or even a simple for loop over the lines...
Correct me if I am wrong, but even sed, awk, or grep over a 10 GiB file would be slow too, right?

Well they'll be slow in that they'll need to read and filter a lot of data, but they'll only keep one line in memory at a time (or a few in the case of a sed program using the N command).

However, I think the Python approach above will try to slurp the whole file into memory.

ghostdog74 12-15-2006 01:31 AM

Quote:

Originally Posted by matthewg42
Well they'll be slow in that they'll need to read and filter a lot of data, but they'll only keep one line in memory at a time (or a few in the case of a sed program using the N command).

Sure. With a loop in Python, it can achieve "one line in memory at a time" too.

Quote:

However, I think the Python approach above will try to slurp the whole file into memory.
Yes, you are right. The alternative assumes the file is small enough to fit into the memory of the system on which it's running.
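
A loop of the kind mentioned above might look something like the following. It is a sketch only: `strip_streaming` and `strip_file_streaming` are hypothetical names. The trick is to hold back one line, and only emit it once a later line has been seen, so the final line is never written.

```python
def strip_streaming(lines):
    """Yield every line except the first and the last.

    Only one line is held back at a time, so this works on any iterable
    of lines, including an open file of arbitrary size.
    """
    it = iter(lines)
    next(it, None)            # discard the first line, if there is one
    held = next(it, None)     # hold back the next line
    if held is None:
        return                # zero or one line: nothing to keep
    for line in it:
        yield held            # a later line exists, so held is not the last
        held = line
    # held now contains the last line and is deliberately never yielded


def strip_file_streaming(src_path, dst_path):
    """Apply strip_streaming to a file on disk (paths are placeholders)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(strip_streaming(src))
```

This does in Python roughly what sed '1d;$d' does: a single pass over the input with a constant amount of state.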

pcorajr 12-15-2006 07:33 AM

Quote:

Originally Posted by ntubski
How did you combine them? If you did something like
Code:

sed '1d$d'
it won't work because sed will think that is one (malformed) compound command. Try
Code:

sed -e '1d' -e '$d'
works for me.


Yes, that was the mistake I was making. Thanks much.

matthewg42, thank you for the example it really helped.

Thank you all very much for taking the time to help me out.


All times are GMT -5.