LinuxQuestions.org

ted_chou12 04-19-2011 07:42 PM

sed match
 
Hi, I have a new problem with sed matching:
Code:

sed -n '/Content-Disposition/{ /".*"/!N}
                        /Content-Disposition/{ /".*"/!N}
                        /Content-Disposition/{ /".*"/s/\n//g;s/.*"\(.*\)"/\1/p}'

matches my string perfectly like:
Code:

...
Content-Disposition: attachment;
        filename="[isoHunt] fc19192d7eb0f52392ddf96ce10b694eabee63c4.torrent"

But can't seem to match this:
Code:

Content-Disposition: attachment; filename=
        "=?gb2312?B?ob6ukNPy19bEu71Nob+h7yU1QrrDz+u45tRWxOMrtdq2/ry+JTVEJTVCS2lt?=
 =?gb2312?B?aStuaStUb2Rva2UrMm5kK1NlYXNvbiU1RCU1Qs3qveG6z7yvJTVEJTVCODQ4?=
 =?gb2312?B?eDQ4MCU1RCU1Qrex83clNUQudG9ycmVudA==?="

Thanks,
Ted

grail 04-19-2011 08:28 PM

Well, not having great sed juju, I would guess the issue is that you have not set up a loop to cover more than 2 lines.

I would look at this and implement it to replace the first 2 lines of your script with one that loops until it finds what you need.

Let me know if you get stuck.

crts 04-20-2011 01:48 AM

Hi,

grail is right. Your sed cannot match across an arbitrary number of lines, because it reads at most two extra lines. In addition, you will get whitespace issues in the filename if you simply erase the '\n': you also have to erase the blanks at the start of each continuation line. This worked with your sample data:
Code:

sed -rn '/Content-Disposition/ {:a /filename=".*"/! {N;s/\n[[:blank:]]*//g;ba};s/.*"(.*)"/\1/;p}' file
Notice the -n -r switch. It enables extended RegEx so that you do not have to escape the '()' brackets.
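
For readability, the same loop can be written with one command per line. This is only a sketch: it assumes GNU sed (for -r), and the sample file is a made-up, shortened stand-in for the folded header in the first post, not the original data.

```shell
#!/bin/sh
# Made-up stand-in for the folded header from the first post:
# the quoted filename is spread over three continuation lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Subject: test
Content-Disposition: attachment; filename=
        "part-one-
 part-two-
 part-three.torrent"
X-Other: ignored
EOF

# Same logic as the one-liner, one command per line (GNU sed, -r = extended regex).
result=$(sed -rn '
/Content-Disposition/ {
    :a
    # No complete filename="..." yet: append the next line,
    # drop the newline plus the blanks that start the continuation line,
    # then jump back to :a and test again.
    /filename=".*"/! {
        N
        s/\n[[:blank:]]*//g
        ba
    }
    # Both quotes are now in the pattern space: keep only the text between them.
    s/.*"(.*)"/\1/
    p
}' "$sample")

echo "$result"    # part-one-part-two-part-three.torrent
rm -f "$sample"
```

Because the loop keeps calling N until the closing quote shows up, the number of continuation lines no longer matters.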

grail 04-20-2011 02:04 AM

Slight correction for crts' typo, the -r switch is the one to notice ;)

I do have a slight OT question for crts though - are you able to explain the difference between using ba as opposed to ta?
They both seem to branch but when do you use which?

crts 04-20-2011 02:40 AM

Quote:

Originally Posted by grail (Post 4330209)
Slight correction for crts' typo, the -r switch is the one to notice ;)

Right :)
Quote:

I do have a slight OT question for crts though - are you able to explain the difference between using ba as opposed to ta?
They both seem to branch but when do you use which?
The difference is that 'b' is an unconditional branch. The 't' command, however, jumps only if a preceding 's///' made a substitution. So it only makes sense to use 't' after an 's///' command.
The 'T' command (a GNU extension) works similarly, but it branches only when no substitution was made by a preceding 's///'.
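
A small sketch of the three branch commands side by side, on made-up input strings. It assumes GNU sed, since 'T' is not available elsewhere.

```shell
#!/bin/sh
# t: jump only while the previous s/// changed something.
# Collapses every run of spaces to a single space, then falls through.
squeezed=$(printf 'a    b   c\n' | sed '
:again
s/  / /
t again
')
echo "$squeezed"    # a b c

# b: always jumps. Without a label it jumps to the end of the script,
# so comment lines skip the substitution below.
skipped=$(printf '# foo\nfoo\n' | sed '
/^#/b
s/foo/bar/
')
echo "$skipped"     # "# foo", then "bar"

# T (GNU extension): jump only when the previous s/// changed nothing.
# Lines without "foo" skip the p and are not printed.
changed_only=$(printf 'foo\nbaz\n' | sed -n '
s/foo/bar/
T
p
')
echo "$changed_only"    # bar
```

Using 'b again' in the first script instead of 't again' would never terminate, because the jump would be taken even after the last run of spaces had been collapsed.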

This sed works similarly:
Code:

sed -rn '/Content-Disposition/ {:a /filename=".*"/! N;s/\n[[:blank:]]*//g;ta;s/.*"(.*)"/\1/;p}' file
Notice that now, when the condition '/filename=".*"/!' matches, only the next line is read. The 's///' command executes in any case. If it replaces a '\n', the 't' command branches to label ':a'. A replacement can only happen if '/filename=".*"/!' was true and a new line was read as a consequence.

One more thing to know is that the 's/// has made a substitution' condition is reset when a 't' command jumps.
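
The reset can be observed with two 't' commands in a row. A minimal sketch on a made-up input; the label names and the bracketed messages are only there for illustration.

```shell
#!/bin/sh
flag_demo=$(echo 'aXb' | sed -n '
# This substitution succeeds, so the flag is set.
s/X/Y/
# The flag is set: jump to :first, which also clears the flag.
t first
b
:first
# The flag was cleared by the jump above, so this t does not branch.
t second
s/$/ [flag was reset by the first jump]/p
b
:second
s/$/ [flag survived the first jump]/p
')
echo "$flag_demo"    # aYb [flag was reset by the first jump]
```

This is also why the loop in the 't' variant ends: once the jump back to ':a' has been taken, the next 'ta' only branches again if the 's///' removes another '\n'.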

4dirk1 04-20-2011 03:36 AM

Hi Sirs,

I have a similar issue. Instead of replacing a string that spans multiple lines, I need to replace a keyword in an XML file with the contents of a text file containing multiple lines. The XML file has a keyword 'kw01', and I want to replace this string with the contents of a file named fatal_alerts.txt. Is this possible via sed? I badly need this. TIA!

keyword to be replaced:
kw01

fatal_alerts.txt contents:
RAISEDATTIME
--------------------
DESCRIPTION
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
18-APR-2011 06:20:32
Fatal error in Application GATE:
<Error code: 68
Caught by: CTCS_BatchSchedulerBO.HandleException
Raise by: CTCS_FTOServerBTM.ValidateMessageCount()
Message: For Interface: FTO:1315001 in File : 190107079531.txt
File Message Count Mismatch. File reported count: 447 actual message count 797.>

grail 04-20-2011 03:49 AM

thanks crts ... complete answer as always :)

crts 04-20-2011 03:49 AM

Quote:

Originally Posted by 4dirk1 (Post 4330273)

Do not crosspost! It is against LQ rules. I already answered you in this thread.
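
The answer given in the other thread is not reproduced here. As a rough sketch only, and not necessarily what was suggested there, sed's 'r' command can stand in for a keyword line. It assumes that 'kw01' sits on a line of its own, because the whole matching line is dropped. The file report.xml and both file contents below are made-up stand-ins.

```shell
#!/bin/sh
# Hypothetical stand-ins for the XML file and fatal_alerts.txt.
workdir=$(mktemp -d)
printf '<alert>\nkw01\n</alert>\n' > "$workdir/report.xml"
printf 'RAISEDATTIME\n18-APR-2011 06:20:32\n' > "$workdir/fatal_alerts.txt"

# On the keyword line: r queues the file for output at the end of the cycle,
# d then discards the keyword line itself.
# The file name must end its -e chunk, because r reads up to the end of the line.
replaced=$(cd "$workdir" && sed -e '/kw01/{r fatal_alerts.txt' -e 'd' -e '}' report.xml)

echo "$replaced"
rm -rf "$workdir"
```

If the keyword shares its line with other text, this approach removes that text as well, so it would need a different technique.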

