Old 08-06-2011, 05:27 PM   #1
K-Veikko
LQ Newbie
 
Registered: Jul 2005
Posts: 11

Rep: Reputation: 0
Stylish text


I have been playing around with sed to create human-readable text files. Sometimes I encounter very long lines in text files.

How can I split a long line of text into several lines? Just to make it easier on the eye.
  1. Count 320 characters from line beginning.
  2. Find next .[space]
  3. Replace [dot][space] with [dot][\n\n]
  4. Go to #1 and continue counting from newly created [\n\n].

Preferably a solution that works in a pipe.
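
For reference, the four steps above can be read literally as the sketch below. It is only an illustration under assumptions: it uses awk rather than sed, the function name split_after is made up, and the width is passed as an argument so a small number can stand in for 320 when testing.

```shell
# Hypothetical helper: split_after N
# Reads stdin and, on each line, breaks after the first ". " found
# once at least N characters have been counted, then starts counting again.
split_after() {
  awk -v n="$1" '{
    line = $0; out = ""
    while (length(line) > n) {
      rest = substr(line, n + 1)        # step 1: skip the first n characters
      i = index(rest, ". ")             # step 2: find the next dot + space
      if (i == 0) break                 # nothing left to split on
      out = out substr(line, 1, n + i) "\n\n"   # step 3: dot, then blank line
      line = substr(rest, i + 2)        # step 4: continue after the break
    }
    print out line
  }'
}

# Usage in a pipe, with 3 standing in for 320:
printf 'aaaa. bbbb. cccc. dddd.\n' | split_after 3
```

With a real file, something like `cat file | split_after 320` should fit in a pipe.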

Last edited by K-Veikko; 08-06-2011 at 05:29 PM.
 
Old 08-06-2011, 05:45 PM   #2
ta0kira
Senior Member
 
Registered: Sep 2004
Distribution: FreeBSD 9.1, Kubuntu 12.10
Posts: 3,078

Rep: Reputation: Disabled
Maybe something like sed -r 's/(.{1,320}) (.)/\1\n\n\2/g', but that's for 1-320 characters per line.
Kevin Barry

edit: This might be what you're looking for: sed -r 's/(.{320}[^ ]*) (.)/\1\n\n\2/g'.

Last edited by ta0kira; 08-06-2011 at 05:48 PM.
 
Old 08-06-2011, 05:46 PM   #3
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Hi,

try this:
Code:
sed -r 's/(.{320}[^.]*\.)[[:blank:]]/\1\n\n/g' file

Last edited by crts; 08-06-2011 at 05:52 PM. Reason: replaced space with [[:blank:]]
 
Old 08-06-2011, 05:55 PM   #4
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Quote:
Originally Posted by ta0kira View Post
Maybe something like sed -r 's/(.{1,320}) (.)/\1\n\n\2/g', but that's for 1-320 characters per line.
Kevin Barry

edit: This might be what you're looking for: sed -r 's/(.{320}[^ ]*) (.)/\1\n\n\2/g'.
Hi,

I think the OP wants to match a literal dot at the end followed by a space.

@OP:
Try also these alternatives:
Code:
sed -r 's/(.{320}[^.]*\.)[[:blank:]]+/\1\n\n/g' file
sed -r 's/(.{320}[^.]*\.)[[:blank:]]*/\1\n\n/g' file
I think you want the latter. It will split at the next dot even if it is not followed by a space.
The first one will split only if the dot is followed by one or more spaces or tabs. Both will remove all spaces and tabs following the dot.
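
To see the difference between the two, here is a small sketch. It assumes GNU sed (which the -r flag and the \n in the replacement already imply), uses 10 instead of 320 so the effect shows on a short line, and the names sample, split_plus and split_star are made up for the demonstration.

```shell
# Sample line: note there is no space after the first dot.
sample='First sentence here.Second one. Third one is here. Fourth.'

# Same expressions as above, with 10 standing in for 320.
split_plus() { sed -r 's/(.{10}[^.]*\.)[[:blank:]]+/\1\n\n/g'; }
split_star() { sed -r 's/(.{10}[^.]*\.)[[:blank:]]*/\1\n\n/g'; }

printf '%s\n' "$sample" | split_plus
echo '-----'
printf '%s\n' "$sample" | split_star
```

The + variant should leave "here.Second" joined, because that dot has no blank after it, while the * variant breaks there as well.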

Last edited by crts; 08-06-2011 at 06:00 PM.
 
Old 08-25-2011, 05:23 PM   #5
K-Veikko
LQ Newbie
 
Registered: Jul 2005
Posts: 11

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by crts View Post
Hi, I think the OP wants to match a literal dot at the end followed by a space.
This helped me.

Thank you very much for the quick answers. – Took quite a while to finally make my account active again. Sorry for the delay.
 
Old 09-06-2011, 08:03 AM   #6
K-Veikko
LQ Newbie
 
Registered: Jul 2005
Posts: 11

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by K-Veikko View Post
Code:
sed -r 's/(.{320}[^.]*\.)[[:blank:]]+/\1\n\n/g'
However, while striving for perfection I hit my limits again.

How can I make this line break happen if the script finds any of the listed characters, or any combination of them (really any [^A-Za-z]), followed by a space? Especially:

.[space] ".[space] ."[space]
?[space] "?[space] ?"[space]
![space] "![space] !"[space]
"[space]

Last edited by K-Veikko; 09-06-2011 at 08:25 AM.
 
Old 09-06-2011, 09:00 AM   #7
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Quote:
Originally Posted by K-Veikko View Post
However, while striving for perfection I hit my limits again.

How can I make this line break happen if the script finds any of the listed characters, or any combination of them (really any [^A-Za-z]), followed by a space? Especially:

.[space] ".[space] ."[space]
?[space] "?[space] ?"[space]
![space] "![space] !"[space]
"[space]
Do you still mean if it finds any of the characters after the first 320 characters?
Code:
sed -r 's/(.{320}[^[:punct:]]*[[:punct:]]+)[[:blank:]]+/\1\n\n/g'
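
A quick way to try it, as a sketch only: this assumes GNU sed, uses 20 in place of 320, and the names sample and split_punct are made up for the test.

```shell
sample='He said "Stop!" Then she asked "Why?" Nobody answered. The end.'

# Same expression as above, with 20 standing in for 320.
split_punct() { sed -r 's/(.{20}[^[:punct:]]*[[:punct:]]+)[[:blank:]]+/\1\n\n/g'; }

printf '%s\n' "$sample" | split_punct
```

On this sample the only break should come after "Why?" and its closing quote: that is the first run of punctuation followed by a blank once 20 characters have been consumed, and what remains after it has no such spot left.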
 
Old 09-06-2011, 12:37 PM   #8
K-Veikko
LQ Newbie
 
Registered: Jul 2005
Posts: 11

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by crts View Post
Do you still mean if it finds any of the characters after the first 320 characters?
Code:
sed -r 's/(.{320}[^[:punct:]]*[[:punct:]]+)[[:blank:]]+/\1\n\n/g'
Any single character OR any combination of two (or more) characters FOLLOWED by space.

- I have not yet tested your solution.
 
Old 09-06-2011, 01:51 PM   #9
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Quote:
Originally Posted by K-Veikko View Post
Any single character OR any combination of two (or more) characters FOLLOWED by space.

- I have not yet tested your solution.
In this case you might just want to drop the leading .{320}:
Code:
sed -r 's/([^[:punct:]]*[[:punct:]]+)[[:blank:]]+/\1\n\n/g'
The above will replace the spaces or tabs that follow one or more punctuation characters.
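
A small sketch to check the behaviour, assuming GNU sed; the name split_any and the sample lines are made up for the test. Keep in mind that [[:punct:]] also covers commas, semicolons and the like, so those get a break too.

```shell
# Same expression as above, wrapped in a function for use in a pipe.
split_any() { sed -r 's/([^[:punct:]]*[[:punct:]]+)[[:blank:]]+/\1\n\n/g'; }

# Breaks after every run of punctuation that is followed by a blank.
printf '%s\n' 'He said "Stop!" Then she asked "Why?" Nobody answered. The end.' | split_any
echo '-----'
# Caveat: a comma counts as punctuation as well.
printf '%s\n' 'One, two.' | split_any
```

If breaking after commas is not wanted, the bracket expression could be narrowed to the characters listed in the question, for example [.?!"] instead of [[:punct:]].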
 
  

