Modify a text file with awk/sed/perl
Hi,

I have a huge text file from which I have to filter certain command lines that always start with the same characters. Here is a little example:

------------------------------
This is the first line with no information
This is the second important line with the following Code in the middle RC0xxx Command = Important code"
This is another line with no important information
This is another line with no important information
This is a further important line in the middle with RC0xxx Command = Important code"
..
...
....
and so on
------------------------------------

From each of these codes "RC0xxx Command = Important code" I need the following structure:

RC0xxx;Important Code
RC0xxx;Important Code

This output can be written to a separate file. I also have to make sure that there are no duplicates in the new file. It doesn't matter whether it's written in shell/perl/awk/sed or mixed code.

Thanks for your help :-)
climber75

What have you tried so far? Anyway, here is a little awk code:
Code:
/RC0xxx Command = / {

Thanks for your really quick answer!!! I tried it with awk, but I'm not so familiar with it. So I'm pleased that you use it :-)

Maybe I have to post a few lines of the real text file so it becomes clearer what I want.

-------------------------------------------------------------------------------------------------------------------
COMMANDLINE CHANGE FROM APPROVED LIST
patchinstall.exe /g:144 /n /z:s /f /c:90 /p /t:30 /m:"patchauthorize.xml"
changed to
PatchInstall.exe /g:168 /n /z:s /f /c:90 /p /t:30 /m:"PatchAuthorize.xml"
Microsoft Update - Mandatory November - Ran at - 21.11.2006 11:29:00
Program Name = Microsoft Update - Mandatory November
ID = RC020376 Commandline = PatchInstall.exe /g:168 /n /z:s /f /c:90 /p /t:30 /m:"PatchAuthorize.xml"
Ending process 21.11.2006 11:30:01
21.11.2006 13:00:01
NEW ADVERTISEMENT
LAP - Ran at - 13.01.2007 18:00:00
Program Name = LAP
ID = RC020232 Commandline = wscript.exe ResetPassword.vbs Current
NEW ADVERTISEMENT
RC_RealPlayer_10.5 - Ran at - 13.01.2007 18:00:00
Program Name = Remove_RC_RealPlayer_8.0
ID = RC020382 Commandline = wscript.exe Remove_Legacy_Realplayer_SMS.vbs
NEW ADVERTISEMENT
RC_RealPlayer_10.5 - Ran at - 13.01.2007 18:00:00
Program Name = RC_Real_Player_10.5_Upgrade_3
ID = RC020383 Commandline = wscript.exe \\ccanet\approot\installs\apps\RC_RealPlayer_10.5\RC_RealPlayer_10.5_push_2.vbs
NEW ADVERTISEMENT
Microsoft Update - Critical January 2007 - Ran at - 13.01.2007 18:00:00
Program Name = Microsoft Update - Critical January 2007
ID = RC020384 Commandline = PatchInstall.exe /g:312 /n /z:s /f /c:90 /p /t:30 /m:"PatchAuthorize.xml"
NEW ADVERTISEMENT
Microsoft Update - Critical January 2007 - Ran at - 13.01.2007 18:00:00
Program Name = Microsoft Update - Critical January 2007 LoggedOff
ID = RC020385 Commandline = PatchInstall.exe /g:312 /n /z:s /f /c:5 /t:30 /m:"PatchAuthorize.xml"
Ending process 13.01.2007 19:00:02
NEW ADVERTISEMENT
Microsoft Update - Critical January 2007 - Ran at - 13.01.2007 18:00:00
Program Name = Microsoft Update - Critical January 2007 LoggedOff
ID = RC020385 Commandline = PatchInstall.exe /g:312 /n /z:s /f /c:5 /t:30 /m:"PatchAuthorize.xml"
13.01.2007 20:30:00
Ending process 13.01.2007 20:30:00
13.01.2007 22:00:00
...
...
and so on
------------------------------------------------------------------------------------------------------------

Here's how the output should look:

RC020376;PatchInstall.exe /g:168 /n /z:s /f /c:90 /p /t:30 /m:"PatchAuthorize.xml"
RC020232;wscript.exe ResetPassword.vbs Current
RC020382;wscript.exe Remove_Legacy_Realplayer_SMS.vbs
RC020383;wscript.exe \\ccanet\approot\installs\apps\RC_RealPlayer_10.5\RC_RealPlayer_10.5_push_2.vbs
RC020384;PatchInstall.exe /g:312 /n /z:s /f /c:90 /p /t:30 /m:"PatchAuthorize.xml"
RC020385;PatchInstall.exe /g:312 /n /z:s /f /c:5 /t:30 /m:"PatchAuthorize.xml"

All the entries have the following in common: they begin with RC0 and end with CRLF (Carriage Return, Line Feed).

I hope this helps.
Thanks
climber75

Looks pretty easy to parse with any of them if, as appears likely, the wanted data extends to the end of the line.
What have you tried? It's better to get help with specific problems than to expect others to do your work for you.

You can use sed to strip out the desired patterns, something like this:
sed -n 's/.*\(pattern\)/\1/p' filename > newfilename
To this you simply add another sed command (using -e) to replace the "Commandline = " with ";". Alternatively, it can all be done inside one sed "s" command.
Really good sed tutorial here: http://www.grymoire.com/Unix/Sed.html

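For readers more at home in Python, the behaviour of `sed -n 's/…/…/p'` (substitute, then print only the lines where a substitution actually happened) can be sketched roughly as below. This is only an illustration: the function name `sed_n_s_p` and the two sample lines are made up for the example, and the pattern assumes that the ID and the command line sit on the same line of the log.

```python
import re

def sed_n_s_p(lines, pattern, replacement):
    """Mimic sed -n 's/pattern/replacement/p': substitute once per line and
    keep only the lines where the pattern matched."""
    regex = re.compile(pattern)
    result = []
    for line in lines:
        new_line, count = regex.subn(replacement, line, count=1)
        if count:  # like sed's "p" flag: output only if a substitution was made
            result.append(new_line)
    return result

# Hypothetical sample lines, modelled on the log excerpt in this thread.
sample = [
    "Program Name = LAP",
    "ID = RC020232 Commandline = wscript.exe ResetPassword.vbs Current",
]

# A single substitution does both jobs: it drops everything before the ID
# and turns " Commandline = " into ";".
extracted = sed_n_s_p(
    sample, r".*(RC0[0-9]+)\s*Commandline\s*=\s*(.*)", r"\1;\2"
)
print(extracted)
```

Doing both steps in one substitution corresponds to the "all inside one s command" variant mentioned above; the two-step variant would simply call the function twice.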
Normally I use shell scripting, but in this case I thought awk would be helpful, so I bought an awk book and am going through it step by step. I would still prefer awk to solve this problem because of the learning effect. So if you still have time to help me, that would be nice!

Why this script: every RCxxx entry in this text file triggers an e-mail, as long as I put it in the required format. That way we are always informed about automatic installations that happen in the background.

Regards
climber75

A very good awk book is the official guide, here. Anyway, following my previous post, you can try something like
Code:
/RC0..... Commandline =/ {

Perl:
Code:
#!/usr/bin/perl -w
See http://perldoc.perl.org/ if you want more Perl explanations, or ask again here.

a way in bash
Code:
#!/bin/bash
Code:
#!/bin/bash
Code:
#!/bin/bash
Code:
awk -F"=" '(NF) && $(NF-1)~/^ RC0/ {split($(NF-1),b," "); sub(/^ +/,"",$NF); print b[1]";"$NF}' blah.txt

Obligatory perl one-liner offering:
Code:
perl -e 'while(<>){if($_ =~ m/(RC0[0-9]+)\s*Commandline\s*=\s*(.+?)\r?$/){ print "$1;$2\n";}}'
--- rod.

I love these multiple contributions using different languages! :)
PS - Waiting for a sed and/or python solution...

Code:
awk '{
Code:
for n,line in enumerate(open("file")):

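The Python fragment above is cut off after its first line, so here is a minimal self-contained sketch of the same idea rather than a reconstruction of that post. It assumes, as the regex-based replies in this thread do, that the ID and the command line share one line ("ID = RC0… Commandline = …"). The names `ENTRY`, `extract_entries` and `convert` are hypothetical, and the set is there to meet the "no duplicates" requirement from the first post.

```python
import re

# Assumption: ID and command line are on the same line. The trailing \S makes
# the match stop at the last non-blank character, so a CR/LF ending is dropped.
ENTRY = re.compile(r"(RC0[0-9]+)\s*Commandline\s*=\s*(.*\S)")

def extract_entries(lines):
    """Return unique "ID;command line" strings in first-seen order."""
    seen = set()
    entries = []
    for line in lines:
        match = ENTRY.search(line)
        if not match:
            continue  # line carries no ID/command line pair
        entry = f"{match.group(1)};{match.group(2)}"
        if entry not in seen:  # skip duplicates
            seen.add(entry)
            entries.append(entry)
    return entries

def convert(in_path, out_path):
    """Read the log at in_path and write the unique entries to out_path."""
    with open(in_path) as src:
        entries = extract_entries(src)
    with open(out_path, "w") as dst:
        dst.writelines(entry + "\n" for entry in entries)

# Small demonstration with lines modelled on the posted excerpt.
demo = [
    "ID = RC020232 Commandline = wscript.exe ResetPassword.vbs Current\r\n",
    "Ending process 13.01.2007 19:00:02\r\n",
    "ID = RC020232 Commandline = wscript.exe ResetPassword.vbs Current\r\n",
    'ID = RC020385 Commandline = PatchInstall.exe /g:312 /n /z:s /f /c:5 /t:30 /m:"PatchAuthorize.xml"\r\n',
]
for entry in extract_entries(demo):
    print(entry)
```

Keeping a list next to the set preserves the order in which the entries first appear in the log; if the order does not matter, sorting the set would do as well.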
I see a sed contribution is still missing. Try this (note I just stole theNbomr's regex, and told sed to use regex-extended)
Code:
sed -nr 's:.*(RC0[0-9]+)\s*Commandline\s*=\s*(.+$):\1;\2:p' testreg.txt

OK OK, I am SUCH a newbie that I don't get it. I admit it!

So I need to do something similar, but my example is easier, and your answer to my question might help me understand the replies above. Thank you in advance.

Here is the situation: Linux RHEL4.6. I want to disable "CTRL-ALT-DEL" in /etc/inittab.

I want to replace:
ca::ctrlaltdel:/sbin/shutdown -t3 -r now
With:
# Changed 8-5-08 -dfezz1 (disabling ctrl-alt-del at console)
ca:12345:ctrlaltdel:/bin/echo "CTRL-ALT-DEL is disabled"

I have tried the simplest sed I know:
$sed 's/replace_please/REPLACED_THX/g' /tmp/dummy
$sed 's/ca::ctrlaltdel:/sbin/shutdown -t3 -r now/ca:12345:ctrlaltdel:/bin/echo "CTRL-ALT-DEL is disabled"/g' /tmp/dummy

As you can tell from my feeble attempt, it didn't work; spaces and quotes seem to be the main reason. Any help???

PS NO LAUGHING....I HATE TO BE LAUGHED AT :) just joking

Thanks
-dfezz1

My /etc/inittab, for reference:
########################################
[root@myserver Project_Server_Files]# cat /etc/inittab
#
# inittab This file describes how the INIT process should set up
# the system in a certain run-level.
#
# Author: Miquel van Smoorenburg, <miquels@drinkel.nl.mugnet.org>
# Modified for RHS Linux by Marc Ewing and Donnie Barnes
#
# Default runlevel. The runlevels used by RHS are:
# 0 - halt (Do NOT set initdefault to this)
# 1 - Single user mode
# 2 - Multiuser, without NFS (The same as 3, if you do not have networking)
# 3 - Full multiuser mode
# 4 - unused
# 5 - X11
# 6 - reboot (Do NOT set initdefault to this)
#
id:3:initdefault:
# System initialization.
si::sysinit:/etc/rc.d/rc.sysinit
l0:0:wait:/etc/rc.d/rc 0
l1:1:wait:/etc/rc.d/rc 1
l2:2:wait:/etc/rc.d/rc 2
l3:3:wait:/etc/rc.d/rc 3
l4:4:wait:/etc/rc.d/rc 4
l5:5:wait:/etc/rc.d/rc 5
l6:6:wait:/etc/rc.d/rc 6
# Trap CTRL-ALT-DELETE
ca::ctrlaltdel:/sbin/shutdown -t3 -r now
# When our UPS tells us power has failed, assume we have a few minutes
# of power left. Schedule a shutdown for 2 minutes from now.
# This does, of course, assume you have powerd installed and your
# UPS connected and working correctly.
pf::powerfail:/sbin/shutdown -f -h +2 "Power Failure; System Shutting Down"
# If power was restored before the shutdown kicked in, cancel it.
pr:12345:powerokwait:/sbin/shutdown -c "Power Restored; Shutdown Cancelled"
# Run gettys in standard runlevels
1:2345:respawn:/sbin/mingetty tty1
2:2345:respawn:/sbin/mingetty tty2
3:2345:respawn:/sbin/mingetty tty3
4:2345:respawn:/sbin/mingetty tty4
5:2345:respawn:/sbin/mingetty tty5
6:2345:respawn:/sbin/mingetty tty6
# Run xdm in runlevel 5
x:5:respawn:/etc/X11/prefdm -nodaemon
########################################################################

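The attempt above most likely fails for a different reason than spaces or quotes: the paths contain "/", which is also the delimiter of the s/…/…/ command, so sed sees more parts than it expects. sed accepts another delimiter character, and picking one that does not occur in the text (for example "|") avoids the clash. The same replacement can also be sketched in Python with a plain string comparison, which needs no escaping at all. This is a minimal sketch, not a tested admin tool: `OLD_LINE`, `NEW_LINES` and `disable_ctrl_alt_del` are names invented for the example, and it assumes the old line appears exactly as posted.

```python
# The line to look for and its replacement, taken from the post above.
OLD_LINE = "ca::ctrlaltdel:/sbin/shutdown -t3 -r now"
NEW_LINES = [
    "# Changed 8-5-08 -dfezz1 (disabling ctrl-alt-del at console)",
    'ca:12345:ctrlaltdel:/bin/echo "CTRL-ALT-DEL is disabled"',
]

def disable_ctrl_alt_del(text):
    """Replace the ctrlaltdel line in inittab-style text; keep all other lines."""
    out = []
    for line in text.splitlines():
        if line.strip() == OLD_LINE:  # exact match, so "/" and quotes are harmless
            out.extend(NEW_LINES)
        else:
            out.append(line)
    return "\n".join(out) + "\n"

# Demonstration on a two-line excerpt of the inittab shown above.
sample = "# Trap CTRL-ALT-DELETE\nca::ctrlaltdel:/sbin/shutdown -t3 -r now\n"
print(disable_ctrl_alt_del(sample), end="")
```

For the real file it would be prudent to write the result to a copy first and compare it with the original before replacing /etc/inittab.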