LinuxQuestions.org
Old 10-15-2014, 07:13 PM   #1
debumail186
LQ Newbie
 
Registered: Oct 2014
Posts: 9

Rep: Reputation: Disabled
Remove Word wrap from a file in Unix


Hi friends,

I am trying to remove word wrap from a file in Unix.

The contents of the file are as below (just as an example):

Entries:

ENTRY TIME SUMMARY
entry-000 2014-10-13 15:49:06 oracle_agent, ep01client01.comfin.ge.com, User
logged in via CLI
entry-001 2014-10-13 15:49:07 oracle_agent, ep01client01.comfin.ge.com, User
logged out of CLI
entry-002 2014-10-13 15:49:08 oracle_agent, ep01client01.comfin.ge.com, User
logged out of CLI
entry-003 2014-10-13 16:06 oracle_agent, ep01client01.comfin.ge.com, User
logged in via CLI
entry-004 2014-10-13 16:06:02 oracle_agent, ep01client01.comfin.ge.com, User
logged out of CLI
entry-005 2014-10-13 16:09:05 oracle_agent, ep01client01.comfin.ge.com, User
logged in via CLI
entry-006 2014-10-13 16:09:07 oracle_agent, ep01client01.comfin.ge.com, User
logged in via CLI
entry-007 2014-10-13 16:09:08 oracle_agent, ep01client01.comfin.ge.com, User
logged out of CLI
As you may note, the string 'logged out of CLI' goes on the next line, since it exceeds the screen width of 80 chars.

I want all of the contents of a record starting with entry-xxx to be on the same line, rather than spilling over onto a new line.

Please advise what would be the best way to do it.

Thanks in advance..

Best
Dev
 
Old 10-15-2014, 08:58 PM   #2
frankbell
LQ Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Debian, Mageia, and whatever VMs I happen to be playing with
Posts: 12,779
Blog Entries: 17

Rep: Reputation: 3313
What program are you using to view the file?
 
Old 10-15-2014, 09:14 PM   #3
debumail186
LQ Newbie
 
Registered: Oct 2014
Posts: 9

Original Poster
Rep: Reputation: Disabled
Hello Frank,

Thanks for your reply.

The file is a normal text file which is generated on another server. I want to remove the word wrap and filter a few other things out of the file.

Basically, I want to generate a converted file with no word wrap (so that all those 'logged in via CLI' or 'logged out of CLI' lines are in line with the previous line) and then filter out a few other contents which are not useful to me.

I use vi to open the file in Linux.

Kind Regards & Thanks
Dev
 
Old 10-15-2014, 09:24 PM   #4
evo2
LQ Guru
 
Registered: Jan 2009
Location: Japan
Distribution: Mostly Debian and Scientific Linux
Posts: 5,753

Rep: Reputation: 1288
Hi,

I think frankbell's point is that word wrap like this is normally an artifact of the program you are using to view the file, not of the file itself. He is looking for some sort of confirmation that the file really does contain those newlines.
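
One way to get that confirmation is to print each line with its line ending made visible. A minimal Python sketch follows; the function name `show_raw_lines` and the sample text are made up for illustration and only mimic the shape of the log in post #1.

```python
def show_raw_lines(text, limit=4):
    """Return repr() of the first `limit` lines so line endings stay visible."""
    return [repr(line) for line in text.splitlines(keepends=True)[:limit]]


# Hypothetical sample shaped like the log in post #1.
sample = (
    "entry-000 2014-10-13 15:49:06 oracle_agent, ep01client01.comfin.ge.com, User\n"
    "   logged in via CLI\n"
)

for raw in show_raw_lines(sample):
    print(raw)
```

If the "wrapped" text shows up as its own quoted line ending in `\n`, the newline is really in the file and is not added by the viewer. On the command line, `od -c file` gives a similar byte-level view.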

Evo2.
 
Old 10-15-2014, 09:34 PM   #5
debumail186
LQ Newbie
 
Registered: Oct 2014
Posts: 9

Original Poster
Rep: Reputation: Disabled
Hi Evo,

Yes, the file does contain newlines.

Attaching a sample log file.

Kind Regards & Thanks
Debasish
Attached Files
File Type: txt sample.txt (12.1 KB, 18 views)
 
Old 10-15-2014, 09:48 PM   #6
frankbell
LQ Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Debian, Mageia, and whatever VMs I happen to be playing with
Posts: 12,779
Blog Entries: 17

Rep: Reputation: 3313
I think evo2 explained what I was thinking better than I did. I also should have asked on what OS the file was created.

I see in the screenshot references to Oracle. Was that Oracle on Windows or Linux?

Windows and *nix handle new-lines differently. This link explains it nicely: http://www.cs.toronto.edu/~krueger/c...e-endings.html
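
To check which convention a file uses, the line endings can simply be counted. This is a rough Python sketch, not something from the linked page; `count_line_endings` and `to_unix` are hypothetical helper names.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF), Unix (LF) and bare CR line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF not preceded by CR
    cr = data.count(b"\r") - crlf   # CR not followed by LF
    return {"CRLF": crlf, "LF": lf, "CR": cr}


def to_unix(data: bytes) -> bytes:
    """Convert Windows line endings to Unix ones; other bytes are untouched."""
    return data.replace(b"\r\n", b"\n")


print(count_line_endings(b"first\r\nsecond\r\nthird\n"))
```

For a real file, read it in binary mode (`open(path, "rb").read()`) so Python does not translate the line endings before they are counted.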
 
Old 10-15-2014, 11:01 PM   #7
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,711

Rep: Reputation: 1279
That isn't word wrap... That is the formatting, and the second line is deliberately indented.

You can try some perl:

Code:
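#!/usr/bin/perl

while (<>) {
   chomp;                 # strip the trailing newline (chop would drop any last character)
   if (/^\s+(.*)/) {
        print " ",$1;     # indented line: append it to the current record
   } else {
        print "\n",$_;    # any other line starts a new output line
   }
}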
Note: this will leave the last line without a newline terminator unless the input ends with an empty line. It also prints an empty line before the first record.
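
For comparison, here is the same idea as a small Python sketch. It is my own illustration, not a translation that has been run against the attached sample; `join_indented` is a made-up name, and the sample lines are shortened stand-ins for the real log. Collecting the records in a list sidesteps the leading and trailing newline quirks.

```python
import re


def join_indented(lines):
    """Join every line that starts with whitespace onto the line before it."""
    out = []
    for line in lines:
        line = line.rstrip("\r\n")
        m = re.match(r"\s+(.*)", line)
        if m and out:
            out[-1] += " " + m.group(1)   # continuation of the previous record
        else:
            out.append(line)              # start of a new record
    return out


# Shortened, hypothetical sample lines.
sample = [
    "entry-000 2014-10-13 15:49:06 oracle_agent, host, User\n",
    "   logged in via CLI\n",
    "entry-001 2014-10-13 15:49:07 oracle_agent, host, User\n",
    "   logged out of CLI\n",
]
print("\n".join(join_indented(sample)))
```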
 
Old 10-20-2014, 07:47 PM   #8
debumail186
LQ Newbie
 
Registered: Oct 2014
Posts: 9

Original Poster
Rep: Reputation: Disabled
Thanks jpollard, that works really well.

Cheers
Dev
 
Old 10-21-2014, 06:22 PM   #9
debumail186
LQ Newbie
 
Registered: Oct 2014
Posts: 9

Original Poster
Rep: Reputation: Disabled
Hi jpollard,

I made a small change to my initial code (using diff and awk), which resulted in a file formatted as below:


entry-110832 2014-10-21 21:21:05 zfsauditlogger, ed01client01.comfin.ge.com,
User logged in via CLI
entry-110833 2014-10-21 21:21:05 zfsauditlogger, ed01client01.comfin.ge.com,
User logged out of CLI
entry-110834 2014-10-21 21:21:29 zfsauditlogger, ed01client01.comfin.ge.com,
User logged out of CLI
entry-110835 2014-10-21 21:22:43 zfsauditlogger, ed01client01.comfin.ge.com,
User logged out of CLI
entry-110836 2014-10-21 21:24:58 zfsauditlogger, ed01client01.comfin.ge.com,
User logged in via CLI

I need some advice on changes to the Perl code below to remove the line wrap from the text formatted as above.
...........
#!/usr/bin/perl

while (<>) {
   chop;
   if (/^\s+(.*)/) {
        print " ",$1;
   } else {
        print "\n",$_;
   }
}
...........

Thanks in advance for your help
Debasish
 
Old 10-21-2014, 08:56 PM   #10
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,711

Rep: Reputation: 1279
That last one is harder because there is no simple way to identify the continuation line.

The following rests on an assumption: the code assumes that a record always starts with "entry-" followed by six digits (the pattern does not check what comes after them). Any line that does not start that way is treated as a continuation. If the following line is itself in the same format as the first, it will also end up joined onto the first.

Code:
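#!/usr/bin/perl

while (<>) {
   chomp;                    # strip the trailing newline
   if (!/^entry-\d{6}/) {
        print " ",$_,"\n";   # continuation: append it and end the record
   } else {
        print $_;            # record start: no newline yet
   }
}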
The reason this is an assumption is that the second line (the one being merged) may not contain any unique string to match on. That is why the indented format worked: indentation is always blank. So the key has to be the first line, and the next line is assumed to be a continuation.
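
To see what that pattern does and does not accept, it can be tried on a few lines in isolation. A quick Python sketch (the name `is_record_start` is made up here; the regular expression is the one from the Perl code):

```python
import re

# Same pattern as the Perl code: "entry-" plus six digits at the start of the line.
RECORD_START = re.compile(r"^entry-\d{6}")


def is_record_start(line):
    """True if the line looks like the first line of a record."""
    return RECORD_START.match(line) is not None


for line in [
    "entry-110832 2014-10-21 21:21:05 zfsauditlogger, ed01client01.comfin.ge.com,",
    "User logged in via CLI",
    "entry-000 2014-10-13 15:49:06 oracle_agent, ep01client01.comfin.ge.com, User",
]:
    print(is_record_start(line), line)
```

Note that three-digit ids such as entry-000 from the first post are not recognised as record starts, while ids longer than six digits still are, since nothing is checked after the sixth digit.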

Last edited by jpollard; 10-21-2014 at 08:58 PM.
 
Old 10-21-2014, 09:18 PM   #11
debumail186
LQ Newbie
 
Registered: Oct 2014
Posts: 9

Original Poster
Rep: Reputation: Disabled
Hey jpollard,

I did some research on that.

It worked quite well. The only thing which seems to be different is the lines below in the output:

entry-110896 2014-10-21 23:45:16 oracle_agent, 3.154.219.140, timed out
entry-110897 2014-10-21 23:46:31 oracle_agent, 3.154.219.140,entry-110898 2014-10-21 23:46:33 oracle_agent, 3.154.219.140,entry-110899 2014-10-21 23:46:52 oracle_agent, 3.154.219.140,entry-110900 2014-10-21 23:49:34 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110901 2014-10-21 23:49:39 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110902 2014-10-21 23:50:05 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110903 2014-10-21 23:50:08 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110904 2014-10-21 23:50:22 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110905 2014-10-21 23:50:24 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110906 2014-10-22 00:02:17 oracle_agent, 3.154.219.140, timed out
entry-110907 2014-10-22 00:02:17 oracle_agent, 3.154.219.140, timed out
entry-110908 2014-10-22 00:06:17 oracle_agent, 3.154.219.140, timed out
entry-110909 2014-10-22 00:06:31 oracle_agent, 3.154.219.140,entry-110910 2014-10-22 00:09:34 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110911 2014-10-22 00:09:39 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110912 2014-10-22 00:10:04 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110913 2014-10-22 00:10:08 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110914 2014-10-22 00:10:22 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110915 2014-10-22 00:10:24 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110916 2014-10-22 00:29:34 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110917 2014-10-22 00:29:39 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110918 2014-10-22 00:30:04 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110919 2014-10-22 00:30:08 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110920 2014-10-22 00:30:22 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110921 2014-10-22 00:30:24 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110922 2014-10-22 00:45:17 oracle_agent, 3.154.219.140, timed out
entry-110923 2014-10-22 00:46:31 oracle_agent, 3.154.219.140,entry-110924 2014-10-22 00:46:33 oracle_agent, 3.154.219.140,entry-110925 2014-10-22 00:46:52 oracle_agent, 3.154.219.140,entry-110926 2014-10-22 00:49:34 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110927 2014-10-22 00:49:39 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110928 2014-10-22 00:50:04 oracle_agent, ep01client02.comfin.ge.com, logged in via CLI
entry-110929 2014-10-22 00:50:08 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI

Some of the entry-xxx lines (e.g. entry-110924, entry-110925, etc.) are moved up onto the previous line, specifically where the previous line looks like the one below:

entry-110923 2014-10-22 00:46:31 oracle_agent, 3.154.219.140, (note that there is nothing after the comma (,) at the end)

as opposed to the general pattern below:

entry-110921 2014-10-22 00:30:24 oracle_agent, ep01client02.comfin.ge.com, logged out of CLI
entry-110922 2014-10-22 00:45:17 oracle_agent, 3.154.219.140, timed out

Please advise.

Kind Regards & Thanks
Debasish
 
Old 10-21-2014, 10:00 PM   #12
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,711

Rep: Reputation: 1279
Yup - that would be due to the exceptions to format.

Try this one. It adds a bit more by detecting when a newline may be needed before the "entry-" line.

Code:
#!/usr/bin/perl

$nl = 0;        # assume newline not needed yet
while (<>) {
   chomp;       # strip the trailing newline (chop would drop any last character)
   if (!/^entry-\d{6}/) {
        print " ",$_,"\n";
        $nl = 0;        # newline not needed
   } else {
        print "\n" if ($nl); # newline is needed
        $nl = 1;        # need a newline in the future
        print $_;
   }
}
print "\n" if ($nl);    # newline is needed
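
The same logic can also be written by collecting records in a list, which makes the newline bookkeeping unnecessary. The Python sketch below is only an illustration of that approach, not a tested replacement for the script above. `join_records` is a made-up name, the pattern is loosened to `\d+` (my assumption, so that short ids like entry-000 also count), and the sample lines are hypothetical, shaped like the problem case in post #11.

```python
import re

# Assumption: a record starts with "entry-" followed by one or more digits.
RECORD_START = re.compile(r"entry-\d+")


def join_records(lines, start=RECORD_START):
    """Append every non-record line to the most recent record."""
    records = []
    for line in lines:
        line = line.rstrip("\r\n")
        if start.match(line) or not records:
            records.append(line)                  # new record (or very first line)
        elif line.strip():
            records[-1] += " " + line.strip()     # continuation; blank lines are skipped
    return records


# Hypothetical input: two records without a continuation line in the middle.
sample = [
    "entry-110922 2014-10-22 00:45:17 oracle_agent, 3.154.219.140,\n",
    "timed out\n",
    "entry-110923 2014-10-22 00:46:31 oracle_agent, 3.154.219.140,\n",
    "entry-110924 2014-10-22 00:46:33 oracle_agent, 3.154.219.140,\n",
    "entry-110926 2014-10-22 00:49:34 oracle_agent, ep01client02.comfin.ge.com,\n",
    "logged in via CLI\n",
]
print("\n".join(join_records(sample)))
```

Unlike the Perl version, this sketch also attaches more than one continuation line to a record, which may or may not be what you want for your logs.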
 
Old 11-13-2014, 07:52 PM   #13
debumail186
LQ Newbie
 
Registered: Oct 2014
Posts: 9

Original Poster
Rep: Reputation: Disabled
Thanks jpollard. That works great.

The Linux community rocks!!

Dev
 
  

