hi
i have fasthost linux server web log files that i need to change to make them acceptable to Funnelweb.
I can change each line by hand but i need to write a script so i can make these changes automatically - but i don't know where to begin - i have never written one before.
the line looks like this:
floppyrecords.co.uk: [01/Nov/2006:00:06:55 +0000] 64.12.186.235 - - "GET /floppyrecords-free-mp3s/tantrum/maisies/8-tantrum-filmsongpart2.mp3 HTTP/1.1" 200 12913 "a url but blocked as my first posting" "asterias/2.0"
i need to change it to this:
64.12.186.235 - - [01/Nov/2006:00:06:55 +0000] "GET /floppyrecords-free-mp3s/tantrum/maisies/8-tantrum-filmsongpart2.mp3 HTTP/1.1" 200 12913 "a url but blocked as my first posting" "asterias/2.0"
ie delete:
floppyrecords.co.uk:
and move
[01/Nov/2006:00:06:55 +0000] or [*] perhaps??
to after:
64.12.186.235 - - or *.*.*.* - - perhaps??
First take a look at how the webserver creates the logfile (which fields are printed). Apache has a lot of options for formatting the log(s) the way you want, although not everything is possible.
If that is not an option, try the following sed one-liner:
sed 's/.*: \(\[.*\]\) \(.* - -\)\(.*\)/\2 \1 \3/' infile
You state that you do not have any experience with scripting (and probably one-liners), so here's a little breakdown of the sed command:
With sed it is possible to change/rearrange the content of a file (in normal use: one line at a time). This would look like this: sed 's/this/THAT/g' infile. This will change all instances of this into THAT. The g at the end makes sure every match on a line is replaced, not just the first hit.
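To see what the trailing g changes, here is a small sketch. The sample text is made up for illustration and is fed to sed through a pipe instead of from a file.

```shell
# Made-up sample line that contains the word "this" twice.
line='this and this'

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/this/THAT/')

# With g: every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/this/THAT/g')

printf '%s\n' "$first_only"    # THAT and this
printf '%s\n' "$all_matches"   # THAT and THAT
```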
Regular expressions (regexp) and backreferencing are also possible:
.*: \(\[.*\]\) \(.* - -\)\(.*\) => this will split the line into 4 parts. Taking your example as infile, it will break that line up into the following pieces:
1) floppyrecords.co.uk: (including the space after the : )
2) [01/Nov/2006:00:06:55 +0000]
3) 64.12.186.235 - -
4) 'the rest of the line'
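Parts 2, 3 and 4 have \( and \) around them; these mark groups that you can 'backreference' (use in the replace part of the statement). The groups are numbered in the order they appear, so \1 is part 2, \2 is part 3 and \3 is part 4. The replace part (\2 \1 \3) therefore discards part 1 and writes out part 3, then part 2, then part 4.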
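The breakdown can be checked on the sample line from the question. This is only a sketch: the helper name show and the variable names are made up here, and each group is wrapped in <> so that leading spaces are visible. It also shows that \3 already starts with a space, so the replacement \2 \1 \3 leaves two spaces before "GET, while \2 \1\3 reproduces the requested line exactly. Whether that extra space matters to Funnelweb is not something this thread establishes.

```shell
# Sample line from the question (the referrer is the poster's placeholder text).
line='floppyrecords.co.uk: [01/Nov/2006:00:06:55 +0000] 64.12.186.235 - - "GET /floppyrecords-free-mp3s/tantrum/maisies/8-tantrum-filmsongpart2.mp3 HTTP/1.1" 200 12913 "a url but blocked as my first posting" "asterias/2.0"'

# The search part of the one-liner, kept in a variable so it can be reused.
pattern='.*: \(\[.*\]\) \(.* - -\)\(.*\)'

# show N: print captured group N on its own, wrapped in <>.
show() {
    printf '%s\n' "$line" | sed "s/$pattern/<\\$1>/"
}

g1=$(show 1)   # <[01/Nov/2006:00:06:55 +0000]>
g2=$(show 2)   # <64.12.186.235 - ->
g3=$(show 3)   # < "GET ... "asterias/2.0">  (note the leading space)

# The replacement as posted: a space from the replacement plus the one in \3.
result=$(printf '%s\n' "$line" | sed "s/$pattern/\2 \1 \3/")

# Variant without the extra space between \1 and \3.
tidy=$(printf '%s\n' "$line" | sed "s/$pattern/\2 \1\3/")

printf '%s\n' "$g1" "$g2" "$g3" "$result" "$tidy"
```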
The given sed statement does not save the output; it only prints the result, so you can check it for correctness first. If all is ok, add the -ibak option. This makes the changes in place and creates a backup of the original file (bak is appended to the name, so infile is backed up as infilebak).
Final command would look like this:
sed -ibak 's/.*: \(\[.*\]\) \(.* - -\)\(.*\)/\2 \1 \3/' infile
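A cautious way to try the in-place edit is on a scratch copy first. The sketch below assumes a sed that supports -i (GNU and BSD sed do; it is not part of POSIX sed). The request path and referrer in the sample line are shortened, made-up stand-ins for a real log entry.

```shell
# Work in a throwaway directory so no real log is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# One shortened, made-up log line stands in for the real log file.
printf '%s\n' 'floppyrecords.co.uk: [01/Nov/2006:00:06:55 +0000] 64.12.186.235 - - "GET /example.mp3 HTTP/1.1" 200 12913 "-" "asterias/2.0"' > infile

# Same command as above: -i edits in place, "bak" is the backup suffix.
sed -ibak 's/.*: \(\[.*\]\) \(.* - -\)\(.*\)/\2 \1 \3/' infile

listing=$(ls)            # infile and infilebak
new=$(cat infile)        # rewritten line, starts with the IP address
old=$(cat infilebak)     # untouched original line

printf '%s\n' "$listing" "$new" "$old"

# Remove the scratch directory again.
cd / && rm -rf "$workdir"
```

To undo the change on a real log, copying infilebak back over infile restores the original.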
hi druuna
thanks for this - i don't have access to the fasthost servers (and i have asked them in the past about this).
So i'll have a go at the script.
do i replace infile with the file name or do i add that after infile (or have i completely misunderstood this, you do mean in a terminal don't you?)
hi druuna
that worked - thank you, a problem that has bugged me for over 5 years!
interestingly (for me only perhaps) the resulting logs didn't work in Funnelweb linux commandline - a segmentation fault. But did work for 2 out of 3 months in the windows version program - but without listing the referrals.
so i will now try and understand your script.
could there be hidden tabs in the logs that have to be accurately copied? how would you see them?
That could be.
You could use the od command to check: head -5 infile | od -t c
For example, I created a file (infile) with 2 lines:
Code:
$ cat infile
space: (space)
tab: (tab)
Using the od command on that infile gives the following:
Code:
$ head -2 infile | od -t c
0000000   s   p   a   c   e   :       (   s   p   a   c   e   )  \n   t
0000020   a   b   :  \t   (   t   a   b   )  \n
0000032
A tab shows up as \t and a space is just that (an empty spot). The \n indicates an end of line.
I also use the head -2 infile part to only show the first 2 lines (could be of use if the infile is rather large)
Depending on where the tab is present, you might have to edit the sed command to include this tab. But this is only needed if the tab character is in the first part of the line (In this part: floppyrecords.co.uk: [01/Nov/2006:00:06:55 +0000] 64.12.186.235 - -). Everything after and including "GET....... is taken care of in the sed statement.
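If od did show tabs in that first part, one possible adjustment is to let the pattern accept any whitespace between the fields. This is a sketch under that assumption: the sample line is a shortened, made-up variant with tabs inserted, and [[:space:]] is the POSIX character class for spaces and tabs. grep -c is used as a quick count of lines that contain a tab.

```shell
# A literal tab character, kept in a variable for readability.
tab=$(printf '\t')

# Made-up variant of the sample line with tabs instead of spaces
# after the host name and after the timestamp.
line="floppyrecords.co.uk:${tab}[01/Nov/2006:00:06:55 +0000]${tab}64.12.186.235 - - \"GET /example.mp3 HTTP/1.1\" 200 12913"

# Count the lines that contain at least one tab (1 here).
tab_lines=$(printf '%s\n' "$line" | grep -c "$tab")

# [[:space:]]* matches any run of spaces or tabs between the fields.
# \3 already begins with a space, so none is added between \1 and \3.
fixed=$(printf '%s\n' "$line" |
    sed 's/.*:[[:space:]]*\(\[.*\]\)[[:space:]]*\(.* - -\)\(.*\)/\2 \1\3/')

printf '%s\n' "$tab_lines" "$fixed"
```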
hi druuna
i ran head -5 infile | od -t c on both before and after sed files but there were no tabs. so it's probably not that.
but thank you very much anyway, it's very good to learn these things.
thanks for all your help.
tom