LinuxQuestions.org

vaibhavs17 08-23-2011 01:54 PM

extract specific content from file
 
I have a file that contains about 1000 sample messages like the one below. I need to extract one very specific reference from each message. Could you please provide the command? I have tried grep, but it did not work.

<ExceptionData>2711231382 </ExceptionData>

Could you please help out?

<?xml version="1.0" encoding="UTF-8" ?><ExceptionMessage><ExceptionMessageContext><ExceptionType>MATCHING</ExceptionType><ExceptionMessageFormat>XML</ExceptionMessageFormat><ExceptionMessageDetail><![CDATA[<matchingerror><eqFiTrade><salesInformation><salesmanID><salesmanIDType></salesmanIDType><salesmanIDValue>WIB </salesmanIDValue></salesmanID></salesInformation><bookingParty><entityID>840 </entityID><accountID><accountIDType>N</accountIDType><accountIDValue>83080201 </accountIDValue></accountID></bookingParty><investAgent><investAgentIDvalue></investAgentIDvalue><accountID><accountIDType>N</accountIDType><accountIDValue></accountIDValue></accountID></investAgent><counterParty><counterPartyType>S</counterPartyType><accountID><accountIDType>S</accountIDType><accountIDValue>DB INT AG </accountIDValue></accountID></counterParty><security><identifier><identifierType>0</identifierType><identifierValue>GB0005603997</identifierValue><identifierType>S</identifierType><identifierValue>LGEN.F</identifierValue></identifier><identifier><identifierType>QUICK</identifierType><identifierValue></identifierValue></identifier><securityCcy>GBP</securityCcy><securityCcy>GBP</securityCcy></security><transaction><tradeID><tradeIDType>ESPEAR </tradeIDType><tradeIDValue>2711231382 </tradeIDValue><tradeIDVersion>1</tradeIDVersion></tradeID><tradeDate>20110810</tradeDate><valueDate>20110815</valueDate><buySell>B</buySell><securityPrice><price>0.9938</price><isAvg>N</isAvg></securityPrice><securityNetPrice><price>0</price></securityNetPrice><quantity><quantityType> 
</quantityType><amount>1465</amount></quantity><source><tradeID><tradeIDType>GEMSLN</tradeIDType><tradeIDValue>ESPEAR-4335318873</tradeIDValue></tradeID></source><tradeLinkage><linkageType>BASKET</linkageType><linkID>LAUG10KJW</linkID></tradeLinkage><commission></commission><tax></tax><accruedInterest></accruedInterest><priceType></priceType></transaction><settlement><payCcy>GBP</payCcy><instructions><depotAccount></depotAccount><agentBicCode></agentBicCode><agentBicName></agentBicName><externalRefNo></externalRefNo><cptyPrtcpntID></cptyPrtcpntID><reasonCodes></reasonCodes><centerReference></centerReference></instructions></settlement><valuation><payCcyNetAmount>1455.92</payCcyNetAmount></valuation><claims><eSpearParentTradeRef></eSpearParentTradeRef><underlyingSecurity></underlyingSecurity><agentSourceID></agentSourceID></claims></eqFiTrade><matching><EspearReference>2711231382 </EspearReference><Exposure>0</Exposure><clientType>Contract Match Client</clientType><contractStatus>UnMatched</contractStatus><agentStatus>UnMatched</agentStatus><proposal>Yes</proposal><key>INTRASPEARTRADE2711231382 END</key><contactNumber></contactNumber><cassitNumber>100348</cassitNumber><Domicile>DE</Domicile><contractMatchRef></contractMatchRef></matching></matchingerror>]]></ExceptionMessageDetail></ExceptionMessageContext><Exception><ExceptionID>MM003221</ExceptionID><ExceptionData>2711231382 </ExceptionData><ExceptionAction>DELETE</ExceptionAction><ExceptionPriority>HIGH</ExceptionPriority></Exception></ExceptionMessage>

netnix99 08-23-2011 02:07 PM

vaibhavs17,

You can run more on the file, then pipe the output to grep:


more filename | grep "ExceptionData>2711231382 </ExceptionData>"

When you do this, it will show you the lines that contain what you are looking for. You have to use the quotes so that the shell does not interpret the special characters (<, > and the space) in the text you are searching for.
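
As a rough illustration of the quoting point, here is a minimal sketch. It assumes a hypothetical one-line file called sample.xml, created only for this example. grep reads the file directly, and -F is added so the pattern is treated as a fixed string rather than a regular expression.

```shell
# Create a small hypothetical sample file (sample.xml is an assumed name).
tmpdir=$(mktemp -d)
printf '%s\n' '<Exception><ExceptionData>2711231382 </ExceptionData></Exception>' > "$tmpdir/sample.xml"

# The double quotes make the shell pass the whole pattern to grep as one argument.
# Without them, > and < would be treated as redirections and the space would split the pattern.
# -F: fixed-string match, -c: print the number of matching lines.
count=$(grep -c -F "ExceptionData>2711231382 </ExceptionData>" "$tmpdir/sample.xml")
echo "$count"

rm -rf "$tmpdir"
```

Note that this only finds one known value; it does not pull a different value out of every message.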

vaibhavs17 08-23-2011 02:16 PM

My dear friend, I am looking for 1000s of messages.

Quote:

Originally Posted by netnix99 (Post 4451340)
vaibhavs17,

You can run more on the file, then pipe the output to grep:


more filename | grep "ExceptionData>2711231382 </ExceptionData>"

When you do this, it will show you the lines that contain what you are looking for. You have to use the quotes so that the shell does not interpret the special characters (<, > and the space) in the text you are searching for.


Nylex 08-23-2011 02:44 PM

What do you mean when you say grep didn't work? Post the command you used to extract the line(s) you wanted. Then, someone may be able to help you.

TB0ne 08-23-2011 02:46 PM

Quote:

Originally Posted by vaibhavs17 (Post 4451347)
My dear friend, I am looking for 1000s of messages.

Ok...then you should read the man page on grep, and look up some examples of regular expressions.

Based on your single example above, you're trying to match a 10-digit number that has a space after it. So:
Code:

cat <your file name> | grep -o "ExceptionData>[[:digit:]]\{10\} </ExceptionData>" > <name of your output file>
This will write JUST the ExceptionData tags and their data, one match per line, to an output file.
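
To see what that pattern produces, here is a small sketch under stated assumptions: the file name messages.xml and the second message are made up for illustration, and -o is a GNU/BSD grep option rather than a POSIX one. The second grep is an optional extra step, not part of the command above, that keeps only the digits.

```shell
# Hypothetical sample data: two messages, one per line (messages.xml is an assumed name,
# and the second record is invented for the example).
tmpdir=$(mktemp -d)
cat > "$tmpdir/messages.xml" <<'EOF'
<Exception><ExceptionID>MM003221</ExceptionID><ExceptionData>2711231382 </ExceptionData></Exception>
<Exception><ExceptionID>MM003222</ExceptionID><ExceptionData>2711231383 </ExceptionData></Exception>
EOF

# -o prints only the part of each line that matched, one match per output line.
tags=$(grep -o "ExceptionData>[[:digit:]]\{10\} </ExceptionData>" "$tmpdir/messages.xml")
echo "$tags"

# Optional second step: keep only the 10-digit number from each match.
numbers=$(echo "$tags" | grep -o "[[:digit:]]\{10\}")
echo "$numbers"

rm -rf "$tmpdir"
```

If the references in the real file are not always exactly 10 digits followed by a single space, the interval \{10\} and the literal space would need adjusting.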

netnix99 08-23-2011 02:54 PM

Quote:

Originally Posted by vaibhavs17 (Post 4451347)
My dear friend, I am looking for 1000s of messages.

Are you looking for 1000s of messages in this single file, or are you looking for this information (ExceptionData>2711231382 </ExceptionData>) in 1000s of files?

Sorry for the misunderstanding...

TB0ne 08-23-2011 03:15 PM

Quote:

Originally Posted by netnix99 (Post 4451392)
Are you looking for 1000s of messages in this single file, or are you looking for this information (ExceptionData>2711231382 </ExceptionData>) in 1000s of files?
Sorry for the misunderstanding...

Yeah, I'm not clear on it either, but the command above will work anyway. Instead of "cat <your file name>", you can do "cat *", and it will go through ALL the files in the current directory, look for that pattern, and write the matches to a single output file.

trey85stang 08-23-2011 04:58 PM

Quote:

Originally Posted by TB0ne (Post 4451379)
Ok...then you should read the man page on grep, and look up some examples of regular expressions.

Based on your single example above, you're trying to match a 10-digit number that has a space after it. So:
Code:

cat <your file name> | grep -o "ExceptionData>[[:digit:]]\{10\} </ExceptionData>" > <name of your output file>
This will write JUST the ExceptionData tags and their data, one match per line, to an output file.

cat is not needed; just put the file name after the grep pattern.

TB0ne 08-24-2011 09:48 AM

Quote:

Originally Posted by trey85stang (Post 4451477)
cat is not needed; just put the file name after the grep pattern.

Yes, if you want to do just one file, but the OP mentioned thousands, and may need the results in a different output file.

trey85stang 08-24-2011 02:41 PM

Quote:

Originally Posted by TB0ne (Post 4452153)
Yes, if you want to do just one file, but the OP mentioned thousands, and may need the results in a different output file.

bash takes care of that for grep just as it does for cat; just use an * for the file name.
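
A minimal sketch of the glob form, assuming two hypothetical files a.xml and b.xml in a scratch directory. One difference from the cat version: when grep is given more than one file, it prefixes each match with the file name, so -h is added here to suppress that.

```shell
# Two hypothetical input files in a scratch directory (names and contents are assumed).
tmpdir=$(mktemp -d)
printf '%s\n' '<ExceptionData>2711231382 </ExceptionData>' > "$tmpdir/a.xml"
printf '%s\n' '<ExceptionData>2711231383 </ExceptionData>' > "$tmpdir/b.xml"

# The shell expands the glob before grep starts, so grep receives both file names.
# -h suppresses the "filename:" prefix that grep adds when it searches several files.
out=$(grep -h -o "ExceptionData>[[:digit:]]\{10\} </ExceptionData>" "$tmpdir"/*.xml)
echo "$out"

rm -rf "$tmpdir"
```

When redirecting to an output file instead of a variable, it is probably safest to write it outside the directory being globbed, so a later run does not read the earlier results as input.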


All times are GMT -5.