LinuxQuestions.org
Old 06-04-2012, 02:45 PM   #1
dio
LQ Newbie
 
Registered: Jun 2012
Posts: 2

Rep: Reputation: Disabled
How do I generate a new text file for each line of text in a document?


Hi all,
I have a text file of several thousand lines, and each line needs to be output to a separate text file.

The split command doesn't work: with this many lines it fails because it runs out of suffixes. This is what I have tried so far:

while read LINE; do echo $LINE>$LINE.txt; done <all_texts_new.txt

This works perfectly up to the ">$LINE.txt" part; that is where it seems to fail.

Any ideas? Thanks a lot
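
For reference, the suffix failure described above is what you would expect from split's default two-letter suffix, which can name only 26^2 = 676 output files. A minimal sketch of widening the suffix with the standard -a option follows; the scratch directory, the sample lines, and the "s" prefix are assumptions made for illustration only.

```shell
# Scratch directory and sample input (stand-ins for the real data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'first line' 'second line' 'third line' > all_texts_new.txt

# -l 1 writes one input line per output file.
# -a 4 uses four-letter suffixes, enough for 26^4 = 456976 files.
split -l 1 -a 4 all_texts_new.txt s

ls
```

The output files are named saaaa, saaab, and so on, so they carry no .txt extension and would need renaming if one is wanted.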
 
Old 06-04-2012, 03:39 PM   #2
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,186

Rep: Reputation: 346
Try while IFS= read -r LINE; do echo "${LINE}" >"${LINE}.txt"; done <all_texts_new.txt (Without the quotes, bash word-splits the expansion of LINE, so a line containing blanks turns the redirection target into several words and bash rejects it with an "ambiguous redirect" error.)

There are other, more sophisticated, ways to do this sort of thing. And the .txt, while it may be useful as a "tag" for you, is not normally required. (Linux tools generally identify file types by content, such as an internal "magic number," and, although extensions can often help users, they aren't needed by most programs.)

By the way, you're going to end up with a directory containing files whose contents are redundant with the file names. You could, instead, create empty files without losing any information: while IFS= read -r LINE; do touch "${LINE}"; done <all_texts_new.txt
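
To make the effect of the quoting concrete, here is a minimal sketch of the loop with every expansion quoted. The scratch directory and the two sample lines are made up for the example, and it assumes that no line contains a "/" or exceeds the file name length limit.

```shell
# Scratch directory and sample input (stand-ins for the real data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'first line' 'second line' > all_texts_new.txt

# IFS= and -r keep leading blanks and backslashes intact;
# the quotes keep each line together as a single file name.
while IFS= read -r line; do
    printf '%s\n' "$line" > "${line}.txt"
done < all_texts_new.txt

ls
```

Each file name contains a blank, which only works because the redirection target is quoted.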
 
Old 06-04-2012, 03:46 PM   #3
dio
LQ Newbie
 
Registered: Jun 2012
Posts: 2

Original Poster
Rep: Reputation: Disabled
Thank you for your suggestion, PTrenholme! Unfortunately, this did not work in my case, probably because the lines are too long to function as file names?

Anyways, found the solution just a couple of minutes ago:

awk '{print >> "s" sprintf("%03d",++c) ".txt"}' all_texts_new.txt

Phew, glad this is done, it was a loong evening!
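
A slightly more defensive variant of the same idea, as a sketch only: the file name is built in a variable first, which avoids relying on how a particular awk parses an unparenthesized concatenation after ">>", and each file is closed as soon as it is written so that awk implementations with a low open-file limit do not run out of descriptors. The scratch directory and sample lines are assumptions for illustration.

```shell
# Scratch directory and sample input (stand-ins for the real data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'alpha' 'beta' 'gamma' > all_texts_new.txt

awk '{
    f = sprintf("s%03d.txt", ++c)   # s001.txt, s002.txt, ...
    print > f                       # write this line to its own file
    close(f)                        # release the file descriptor right away
}' all_texts_new.txt

ls
```

With more than 999 lines the %03d field simply grows to four digits, so no names collide, although the names no longer sort evenly.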
 
Old 06-04-2012, 04:05 PM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,246

Rep: Reputation: 2684
What if the line is blank? Do you want an empty file for that line?
 
Old 06-04-2012, 08:25 PM   #5
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,186

Rep: Reputation: 346
Quote:
Originally Posted by dio
Thank you for your suggestion, PTrenholme! Unfortunately, this did not work in my case, probably because the lines are too long to function as file names?

Anyways, found the solution just a couple of minutes ago:

awk '{print >> "s" sprintf("%03d",++c) ".txt"}' all_texts_new.txt

Phew, glad this is done, it was a loong evening!
IIRC, the file name length limit is 255 bytes on most Linux filesystems, and it does not depend on whether you're using a 32-bit or 64-bit OS.
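
If in doubt, the limit can be queried rather than remembered. The following is a rough sketch using getconf NAME_MAX for the current directory; the scratch directory and the sample input (one short line and one line of 300 characters) are invented for the example, and awk's length() counts characters, so the check is only exact for single-byte text.

```shell
# Scratch directory and sample input (stand-ins for the real data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
{
    printf '%s\n' 'short line'
    awk 'BEGIN { while (i++ < 300) printf "x"; print "" }'
} > all_texts_new.txt

# Longest file name the filesystem holding "." accepts.
limit=$(getconf NAME_MAX .)

# Report lines that would not fit once ".txt" (4 characters) is appended.
awk -v max="$limit" 'length($0) + 4 > max {
    print "line " NR " is too long for a file name"
}' all_texts_new.txt
```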

Anyhow, if you just wanted to name the files sequentially, you should have said so in your first post. The AWK code to do what you first said you wanted would be something like

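gawk '{f = $0 ".txt"; print > f; close(f)}' all_texts_new.txt

(Note that AWK opens the output file itself, without going through a shell, so the name needs no extra quoting; any quote characters added to the string would become part of the file name.)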

And the bash code to do what you ended up doing would be:

c=0;while read LINE; do c=$((++c));echo $LINE>$(printf "s%03d" ${c});done <all_texts_new.txt

(But the AWK would be much faster . . .)

Last edited by PTrenholme; 06-05-2012 at 01:17 PM. Reason: Typo in the bash solution.
 
Old 06-05-2012, 03:09 AM   #6
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,246

Rep: Reputation: 2684
Well, I will assume you do not want blank files, but also that the line number an entry is on is important, so this could work:
Code:
awk 'NF{print > sprintf("s%03d.txt",NR)}' file
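
As a small illustration of what numbering by NR means: NR counts every input line, including the blank ones that NF filters out, so the file numbers keep gaps where the blank lines were. The sketch below uses a made-up three-line input with a blank line in the middle and adds a close() call, which is not part of the one-liner above.

```shell
# Scratch directory and sample input (stand-ins for the real data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'one\n\nthree\n' > file

# NF is 0 on a blank line, so the action is skipped there,
# but NR still advances and the next file is numbered 003.
awk 'NF { f = sprintf("s%03d.txt", NR); print > f; close(f) }' file

ls
```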
 
Old 06-05-2012, 03:31 PM   #7
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1957
Please use [code][/code] tags around your code and data, to preserve formatting and to improve readability. Please do not use quote tags, colors, or other fancy formatting.


Code:
c=$((++c))
The "++" increment operator updates the variable in place, so there's no need to use the "=".

In bash or ksh, you can use the ((..)) arithmetic command or let. In other POSIX shells, prefix $((..)) with the null command ":" (equivalent to true); note that POSIX does not require "++" to be supported there, so c=$((c+1)) is the safest form.

Code:
(( ++c ))
: $(( ++c ))
let ++c
In this case however, we want to pass the value directly to printf, so we use $((..)).
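
One pitfall is worth spelling out when the increment and the printf are combined: anything inside $(..) runs in a subshell, so a counter incremented there is not changed in the calling shell. A minimal sketch using only POSIX arithmetic (the variable names lost and name are made up for the example):

```shell
c=0
c=$((c + 1))        # plain arithmetic expansion; c is now 1
: "$((c += 1))"     # assignment inside the expansion, result discarded; c is now 2

# The increment below happens inside a command substitution, i.e. in a
# subshell: printf sees 3, but c in this shell is still 2 afterwards.
lost=$(printf '%d' "$((c += 1))")

# Incrementing first and only formatting inside $(..) avoids the problem.
name=$(printf 's%03d.txt' "$c")

printf '%s %s %s\n' "$c" "$lost" "$name"
```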

Also, don't forget to quote your expansions, to avoid word splitting.

So, to flesh out the loop (assuming bash):

Code:
c=0
while read -r line || [[ -n $line ]]; do

	[[ -z $line ]] && continue	#skips empty lines

	printf -v fname 's%03d.txt' "$((++c))"	#printf -v keeps ++c in the current shell; inside $(..) it would be lost
	echo "$line" > "$fname"

done <all_texts_new.txt
The [[ -n $line ]] test catches cases where there's no final newline in the input text.
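
The same pattern can be tried out with the POSIX [ ] test in place of [[ ]]. The sketch below is an illustration under assumed data: a scratch directory and a three-line input with a blank line in the middle and no newline after the last line.

```shell
# Scratch directory and sample input (stand-ins for the real data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'first\n\nlast' > all_texts_new.txt   # note: no final newline

c=0
# read fails on the unterminated last line but still fills "line";
# the -n test lets the loop body run for it one more time.
while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue               # skip empty lines
    c=$((c + 1))                             # increment in the current shell
    printf '%s\n' "$line" > "$(printf 's%03d.txt' "$c")"
done < all_texts_new.txt

ls
```

Without the "|| [ -n "$line" ]" part, the loop would stop before writing the file for the last line.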

Finally, since environment variables are generally all upper-case, it's good practice to keep your own user variables in lower-case or mixed-case to help differentiate them.
 
  


All times are GMT -5. The time now is 10:35 AM.
