So I've been tasked with parsing a log file and converting it into a clean log with one line per entry. The log file outputs information in the following format:
Code:
1 ticket(s) written to the database."^M
"Information","53","04/11/2011 11:08:19 AM","DT Archiver","ARCHIVER3","None","N/A","List of ticket(s):^M
- XXX#999999^M
The problem is that the number of lines varies from ticket to ticket: some tickets are 3 lines, some are 6. Is there some way to have sed match across those lines and strip out the newline characters for each ticket, so that each one becomes a single-line entry? Or should I move on to a different tool?
How do you know how long a ticket is? Does it run from one "X ticket(s)" line to the next one in the file? With sed, the best I can think of is this horrific thing:
Code:
sed 's/^[[:digit:]]\+ ticket/@&/' YOUR_FILE | sed -e :a -e '$!N;s/\n\([^@]\)/ \1/;ta' -e 'P;D' | sed 's/^@//'
This finds every line that begins "_number_ ticket" and changes it to "@_number_ ticket", then goes through the result appending any line that doesn't begin with '@' to the previous line, and finally removes the '@' markers. The \(...\) group puts back the first character of each appended line, which the newline match would otherwise swallow. It assumes that no line in the original file begins with '@', and is in any case a horrendously ugly solution.
I would use a different programming tool, such as awk.
I have to agree with the others that a solution cannot really be presented as you have not provided enough detail.
To form a solution, we (and you) need to know what marks the start and end of a ticket, and if either of these varies, we need to know all the variations.
I'll try to add some detail here. The beginning of a ticket is indicated by the line 1 ticket(s) written to the database."^M (the trailing quote and ^M are part of the line). Every line following this is part of the ticket. The last line of the ticket looks like "- XXXX#814591^M".
I wasn't really expecting a simple solution, and given the helpful responses I've already gotten, I think I'm going to have to move towards awk or another tool.
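Taking the description above literally (a ticket starts at an "N ticket(s) written" line, and everything up to the next such line belongs to it), an awk approach might be sketched roughly as follows. The sample data is a trimmed, hypothetical version of the log, and the variable names are only for illustration; it has not been tried against the real file.

```shell
# Hypothetical sample: one 3-line ticket and one 4-line ticket, CRLF line endings.
tmp=$(mktemp)
printf '%s\r\n' \
  '1 ticket(s) written to the database."' \
  '"Information","53","List of ticket(s):' \
  '- XXX#999997' \
  '2 ticket(s) written to the database."' \
  '"Information","53","List of ticket(s):' \
  '- XXX#999998' \
  '- XXX#999999' > "$tmp"

# Strip the trailing carriage return from every line, start a new output line
# at each "N ticket(s) written" line, and join all other lines onto the
# current ticket with a single space.
joined=$(awk '
  { sub(/\r$/, "") }
  /^[0-9]+ ticket\(s\) written/ { if (buf != "") print buf; buf = $0; next }
  { buf = (buf == "" ? $0 : buf " " $0) }
  END { if (buf != "") print buf }
' "$tmp")
rm -f "$tmp"

printf '%s\n' "$joined"
```

Because the split is keyed on the start line rather than on a fixed line count, a 3-line ticket and a 6-line ticket are handled the same way.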
Well, I still would have liked some more examples, but assuming that multiple tickets might look like this:
Code:
1 ticket(s) written to the database."^M$
"Information","53","04/11/2011 11:08:19 AM","DT Archiver","ARCHIVER3","None","N/A","List of ticket(s):^M$
- XXX#999997^M$
2 ticket(s) written to the database."^M$
"Information","53","04/11/2011 11:08:19 AM","DT Archiver","ARCHIVER3","None","N/A","List of ticket(s):^M$
- XXX#999998^M$
3 ticket(s) written to the database."^M$
"Information","53","04/11/2011 11:08:19 AM","DT Archiver","ARCHIVER3","None","N/A","List of ticket(s):^M$
- XXX#999999^M$
Then maybe this is what you are looking for:
Code:
$ awk 'BEGIN{RS = "[\r\n]+"}ORS=/^-/?"\n":" "' file
1 ticket(s) written to the database." "Information","53","04/11/2011 11:08:19 AM","DT Archiver","ARCHIVER3","None","N/A","List of ticket(s): - XXX#999997
2 ticket(s) written to the database." "Information","53","04/11/2011 11:08:19 AM","DT Archiver","ARCHIVER3","None","N/A","List of ticket(s): - XXX#999998
3 ticket(s) written to the database." "Information","53","04/11/2011 11:08:19 AM","DT Archiver","ARCHIVER3","None","N/A","List of ticket(s): - XXX#999999
If you still need Windows-style line endings in the output, simply put "\r" before the "\n" in the ORS value.
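As a rough sketch of that variant, the only change is the record terminator written after a line starting with "-". The sample input below is a shortened, hypothetical stand-in for the real log, and note that a regular expression in RS is an extension found in gawk and mawk rather than something POSIX awk guarantees.

```shell
# Hypothetical sample input with CRLF line endings, as in the thread.
tmp=$(mktemp)
printf '%s\r\n' \
  '1 ticket(s) written to the database."' \
  '"Information","53","List of ticket(s):' \
  '- XXX#999997' \
  '2 ticket(s) written to the database."' \
  '"Information","53","List of ticket(s):' \
  '- XXX#999998' > "$tmp"

# Split records on any run of CR/LF; a record starting with "-" ends the
# output line with CRLF, every other record is followed by a space.
out=$(awk 'BEGIN { RS = "[\r\n]+" } { ORS = /^-/ ? "\r\n" : " "; print }' "$tmp")
rm -f "$tmp"

# Make the carriage returns visible to confirm the line endings.
printf '%s\n' "$out" | od -c
```

In the od output each joined ticket line should end in \r \n.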