Remove Word wrap from a file in Unix
Hi friends,
I am trying to remove word wrap from a file in Unix. The contents of the file are as below (just as an example):

Entries:
ENTRY TIME SUMMARY
entry-000 2014-10-13 15:49:06 oracle_agent, ep01client01.comfin.ge.com, User
          logged in via CLI
entry-001 2014-10-13 15:49:07 oracle_agent, ep01client01.comfin.ge.com, User
          logged out of CLI
entry-002 2014-10-13 15:49:08 oracle_agent, ep01client01.comfin.ge.com, User
          logged out of CLI
entry-003 2014-10-13 16:06 oracle_agent, ep01client01.comfin.ge.com, User
          logged in via CLI
entry-004 2014-10-13 16:06:02 oracle_agent, ep01client01.comfin.ge.com, User
          logged out of CLI
entry-005 2014-10-13 16:09:05 oracle_agent, ep01client01.comfin.ge.com, User
          logged in via CLI
entry-006 2014-10-13 16:09:07 oracle_agent, ep01client01.comfin.ge.com, User
          logged in via CLI
entry-007 2014-10-13 16:09:08 oracle_agent, ep01client01.comfin.ge.com, User
          logged out of CLI

As you may note, the string 'logged out of CLI' goes on the next line, since it exceeds the screen width of 80 chars. I want to have all of the contents for a line starting with entry-xxx on the same line and not have it jump over to a new line.

Please advise what would be the best way to do it. Thanks in advance.

Best
Dev
What program are you using to view the file?
Hello Frank,
Thanks for your reply. The file is a normal text file which is generated on another server. I want to remove the word wrap and filter a few other things out of the file. Basically, I want to generate a converted file with no word wrap (so that all those 'logged in via CLI' or 'logged out of CLI' lines are inline with the previous line) and then filter out a few other contents of the file which are not useful to me. I use vi to open the file in Linux.

Kind Regards & Thanks
Dev
Hi,
I think frankbell's point is that word wrap like this is normally an artifact of the program you are using to view the file, not of the file itself. He is looking for some sort of confirmation that the file really does contain those newlines.

Evo2.
Hi Evo,
Yes, the file does contain newlines. Attaching a sample log file.

Kind Regards & Thanks
Debasish
I think evo2 explained what I was thinking better than I did. I also should have asked on what OS the file was created.
I see in the screenshot references to Oracle. Was that Oracle on Windows or Linux? Windows and *nix handle new-lines differently. This link explains it nicely: http://www.cs.toronto.edu/~krueger/c...e-endings.html
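One quick way to tell which convention a file uses is to count the two kinds of line endings in its raw bytes. The following is a minimal Python sketch, not something posted in the thread; the function name `count_line_endings` and the sample bytes are made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows-style (CRLF) and Unix-style (bare LF) line endings."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract those to get bare LFs.
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf}


# Hypothetical sample: two Windows-style endings and one Unix-style ending.
SAMPLE_BYTES = b"first\r\nsecond\nthird\r\n"
print(count_line_endings(SAMPLE_BYTES))
```

For a real file, read it in binary mode (`open(path, "rb").read()`) so that Python does not translate the line endings before they are counted.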
That isn't word wrap... That is the formatting, and the second line is deliberately indented.
You can try some perl: Code:
#!/usr/bin/perl
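The idea described here (a line that starts with whitespace is a continuation of the record before it) can also be sketched in Python. This is an illustrative stand-in rather than jpollard's Perl; the function name `join_indented` is made up, and the sample lines assume the wrap falls after "User".

```python
import re


def join_indented(lines):
    """Append every indented, non-empty line to the line before it."""
    joined = []
    for line in lines:
        line = line.rstrip("\r\n")
        if joined and re.match(r"\s+\S", line):
            # Indented continuation: glue it onto the current record.
            joined[-1] += " " + line.strip()
        else:
            # Anything else starts a new output line.
            joined.append(line)
    return joined


# Assumed sample input, modeled on the log excerpt in the first post.
SAMPLE = [
    "entry-000 2014-10-13 15:49:06 oracle_agent, ep01client01.comfin.ge.com, User\n",
    "          logged in via CLI\n",
    "entry-001 2014-10-13 15:49:07 oracle_agent, ep01client01.comfin.ge.com, User\n",
    "          logged out of CLI\n",
]

for record in join_indented(SAMPLE):
    print(record)
```

This only works while the continuation lines keep their leading whitespace; once the indentation is stripped, nothing marks a line as a continuation.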
Thanks, jpollard. That works really well.

Cheers
Dev
Hi jpollard,
I made a small change to my initial code (using diff and awk), which resulted in a file formatted as below:

entry-110832 2014-10-21 21:21:05 zfsauditlogger, ed01client01.comfin.ge.com, User
logged in via CLI
entry-110833 2014-10-21 21:21:05 zfsauditlogger, ed01client01.comfin.ge.com, User
logged out of CLI
entry-110834 2014-10-21 21:21:29 zfsauditlogger, ed01client01.comfin.ge.com, User
logged out of CLI
entry-110835 2014-10-21 21:22:43 zfsauditlogger, ed01client01.comfin.ge.com, User
logged out of CLI
entry-110836 2014-10-21 21:24:58 zfsauditlogger, ed01client01.comfin.ge.com, User
logged in via CLI

I need some advice on changes to the Perl code below so that it removes the line wrap from the above formatted text.

...........
#!/usr/bin/perl
# Join indented continuation lines onto the preceding record.
while (<>) {
    chomp;                  # strip the trailing newline (safer than chop)
    if (/^\s+(.*)/) {
        print " ", $1;      # indented line: append it to the current record
    } else {
        print "\n", $_;     # anything else starts a new output line
    }
}
...........

Thanks in advance for your help
Debasish
That last one is harder because there is no simple way to identify the line continuation.
The following relies on an assumption: the code assumes that the start of the record is always "entry-" followed by exactly 6 digits. If the second line is in the same format as the first, it will be joined onto the first. Code:
#!/usr/bin/perl
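The stated assumption (a record always starts with "entry-" followed by exactly 6 digits) can be illustrated with a short Python sketch. It is a stand-in for illustration, not jpollard's Perl; the names `RECORD_START` and `join_unmarked` and the sample lines are hypothetical.

```python
import re

# A record starts with "entry-" followed by exactly six digits.
RECORD_START = re.compile(r"entry-\d{6}(?!\d)")


def join_unmarked(lines):
    """Join every line that does not start a record onto the previous line."""
    joined = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if joined and not RECORD_START.match(line):
            joined[-1] += " " + line  # continuation of the current record
        else:
            joined.append(line)       # start of a new record
    return joined


# Assumed sample input: continuation lines with no indentation.
SAMPLE = [
    "entry-110832 2014-10-21 21:21:05 zfsauditlogger, ed01client01.comfin.ge.com, User\n",
    "logged in via CLI\n",
    "entry-110833 2014-10-21 21:21:05 zfsauditlogger, ed01client01.comfin.ge.com, User\n",
    "logged out of CLI\n",
]

for record in join_unmarked(SAMPLE):
    print(record)
```

Because each line is tested on its own, a record that has no continuation line is left as it is instead of swallowing the record after it.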
Hey jpollard,
I did some testing with that and it worked quite well. The only thing that seems to be different is the lines below in the output:

entry-110896 2014-10-21 23:45:16 oracle_agent, 3.154.219.140, timed out
entry-110897 2014-10-21 23:46:31 oracle_agent, 3.154.219.140,entry-110898 2014-10-21 23:46:33 oracle_agent, 3.154.219.140,entry-110899 2014-10-21 23:46:52 oracle_agent, 3.154.219.140,entry-110900 2014-10-21 23:49:34 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110901 2014-10-21 23:49:39 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110902 2014-10-21 23:50:05 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110903 2014-10-21 23:50:08 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110904 2014-10-21 23:50:22 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110905 2014-10-21 23:50:24 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110906 2014-10-22 00:02:17 oracle_agent, 3.154.219.140, timed out
entry-110907 2014-10-22 00:02:17 oracle_agent, 3.154.219.140, timed out
entry-110908 2014-10-22 00:06:17 oracle_agent, 3.154.219.140, timed out
entry-110909 2014-10-22 00:06:31 oracle_agent, 3.154.219.140,entry-110910 2014-10-22 00:09:34 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110911 2014-10-22 00:09:39 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110912 2014-10-22 00:10:04 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110913 2014-10-22 00:10:08 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110914 2014-10-22 00:10:22 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110915 2014-10-22 00:10:24 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110916 2014-10-22 00:29:34 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110917 2014-10-22 00:29:39 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110918 2014-10-22 00:30:04 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110919 2014-10-22 00:30:08 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110920 2014-10-22 00:30:22 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110921 2014-10-22 00:30:24 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110922 2014-10-22 00:45:17 oracle_agent, 3.154.219.140, timed out
entry-110923 2014-10-22 00:46:31 oracle_agent, 3.154.219.140,entry-110924 2014-10-22 00:46:33 oracle_agent, 3.154.219.140,entry-110925 2014-10-22 00:46:52 oracle_agent, 3.154.219.140,entry-110926 2014-10-22 00:49:34 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110927 2014-10-22 00:49:39 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110928 2014-10-22 00:50:04 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110929 2014-10-22 00:50:08 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI

Some of the entry-xxx lines (e.g. entry-110924, entry-110925, etc.) are moved up onto the previous line, specifically where the previous line looks something like this:

entry-110923 2014-10-22 00:46:31 oracle_agent, 3.154.219.140,

(note that there is nothing after the comma at the end), as opposed to the general pattern below:

entry-110921 2014-10-22 00:30:24 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110922 2014-10-22 00:45:17 oracle_agent, 3.154.219.140, timed out

Please advise.

Kind Regards & Thanks
Debasish
Yup - that would be due to the exceptions to the format.
Try this one. It adds a bit more by detecting when a newline may be needed before the "entry-" Code:
#!/usr/bin/perl
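One way to picture the newline detection described here is to ignore the existing line breaks entirely and cut the text wherever a record start appears. The Python sketch below is an illustration under that assumption, not jpollard's Perl; the names `RECORD_START` and `split_records` and the sample text are hypothetical.

```python
import re

# A record starts with "entry-" followed by exactly six digits.
RECORD_START = re.compile(r"entry-\d{6}(?!\d)")


def split_records(text):
    """Return one string per record, wherever the line breaks originally fell."""
    flat = " ".join(text.split())  # collapse newlines and runs of spaces
    starts = [m.start() for m in RECORD_START.finditer(flat)]
    if not starts:
        return [flat] if flat else []
    records = []
    head = flat[:starts[0]].strip()
    if head:
        records.append(head)       # keep any text before the first record
    for begin, end in zip(starts, starts[1:] + [len(flat)]):
        records.append(flat[begin:end].strip())
    return records


# Assumed sample: two records with an empty summary, then a wrapped record.
SAMPLE_TEXT = (
    "entry-110897 2014-10-21 23:46:31 oracle_agent, 3.154.219.140,\n"
    "entry-110898 2014-10-21 23:46:33 oracle_agent, 3.154.219.140,\n"
    "entry-110900 2014-10-21 23:49:34 oracle_agent, ep01client02.comfin.ge.com,\n"
    "logged in via CLI\n"
)

for record in split_records(SAMPLE_TEXT):
    print(record)
```

The trade-off is that any "entry-" plus six digits appearing inside a summary would also be treated as a record start, so this only suits logs where that cannot happen.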
Thanks jpollard. That works great.
The Linux community rocks!! :)

Dev