LinuxQuestions.org
Old 08-07-2014, 03:37 PM   #1
Netopia2
LQ Newbie
 
Registered: Aug 2014
Posts: 6

Rep: Reputation: Disabled
PREpending one text file to a number of other text files


Greetings!

I have a situation that sort of has me going in circles. Each week we get a fairly large text file from a client. We've just won some more work from them, and that new work comes in as a MUCH larger text file, and that's where my problems begin.

We use a proprietary piece of database software that is specific to the Postal industry. This software doesn't handle the import of large files very well. Strangely, you can batch import a huge number of smaller files that total the same size or larger than the original large file and it works fine.

I've used 'split' to break this large file into a dozen smaller files with unbroken lines, with the output being:

file-.txt
file-1.txt
file-2.txt
etc

The database program can import these just fine, but now only the first file (file-.txt) has a header record... and sadly... the same software requires a header for any file imported. So, I've been relegated to manually copying and pasting the header record from the first file into all the rest.

What I need to do is copy the first line from the first file and then PREpend it to the rest of the files. Being somewhat of a Linux text manipulation noob, I'm lost as to where to even begin to look.

Thoughts? Ideas?

Also, if I've posted this (1st ever) post in an incorrect subforum, please let me know. I did my best to pick a subforum that I thought appropriate.

Thanks in advance for even having read this far!

Joe
 
Old 08-07-2014, 03:43 PM   #2
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
You can add text into the top of a file with:

Code:
sed '1s/^/Some Text\n/' somefile.txt
Make sure the above works; then you will probably want to apply it to all the *.txt files and add -i to edit them in place.

Is this what you mean?
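
To see what the command above does, here is a minimal sketch. It assumes GNU sed (which accepts `\n` in the replacement and supports `-i`); the scratch directory and file contents are made up for illustration.

```shell
# Scratch file with two data lines (hypothetical content).
demo_dir="$(mktemp -d)"
printf 'record one\nrecord two\n' > "$demo_dir/somefile.txt"

# Without -i, sed only prints the result; the file itself is untouched.
sed '1s/^/Some Text\n/' "$demo_dir/somefile.txt"

# With -i, GNU sed rewrites the file in place.
sed -i '1s/^/Some Text\n/' "$demo_dir/somefile.txt"
first_line="$(head -n1 "$demo_dir/somefile.txt")"
line_count="$(wc -l < "$demo_dir/somefile.txt")"
```

Running the non-destructive form first is a cheap way to confirm the output before letting sed touch the real files.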

Last edited by szboardstretcher; 08-07-2014 at 03:45 PM.
 
Old 08-07-2014, 04:44 PM   #3
Netopia2
LQ Newbie
 
Registered: Aug 2014
Posts: 6

Original Poster
Rep: Reputation: Disabled
szboardstretcher, not exactly. I need to extract the first record of the file that doesn't have a number after it (the one ending with a hyphen) and then have that text be prepended as the first line of text in the other text files (the ones that end with a number).

I'll go look up 'sed' and see how it works. Perhaps I can figure it out... or, perhaps not!

Joe
 
Old 08-07-2014, 05:02 PM   #4
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Code:
Some_text="$(head -n1 origfile.txt)"
split_function
for File in file-[0-9]*.txt;do
   sed -i '1s/^/'$Some_Text'\n/' "${File}"
done
Or
Code:
sed -i '1s/^/'$Some_Text'\n/' file-[0-9]*.txt

Last edited by Firerat; 08-07-2014 at 05:05 PM.
 
Old 08-07-2014, 07:44 PM   #5
AnanthaP
Member
 
Registered: Jul 2004
Location: Chennai, India
Posts: 952

Rep: Reputation: 217
Here is some awk pseudocode.

If first record of any file (FNR==1): store the file name so the output can be redirected to it.
If first record of the first file (NR==1): store $0 for replication in the other files.
If first record of a subsequent file (NR>1 and FNR==1, which implies a new file): write the stored $0.
For every record: write the record itself.
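
The pseudocode above could be sketched roughly as follows. This is one possible reading, not a tested production script: the sample files are invented, and the `.new` output suffix is an assumption made so that awk never writes to a file it is still reading.

```shell
# Sample input: the first chunk carries the header, the others do not.
demo_dir="$(mktemp -d)"
printf 'HEADER\na1\na2\n' > "$demo_dir/file-.txt"
printf 'b1\nb2\n' > "$demo_dir/file-1.txt"
printf 'c1\n' > "$demo_dir/file-2.txt"

(
    cd "$demo_dir" || exit 1
    # file-.txt must be listed first so that its first record is the header.
    awk '
        NR == 1  { header = $0 }             # first record of the first file
        FNR == 1 { out = FILENAME ".new"     # a new input file starts here
                   if (NR > 1) print header > out }
                 { print > out }             # copy every record through
    ' file-.txt file-[0-9]*.txt
)
```

Each input file ends up with a sibling such as `file-1.txt.new` that starts with the header; renaming those over the originals is left as a separate, deliberate step.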

OK
If it is a text file and the postal package needs the header in every file (as it should), I suggest that you look carefully at the header, as it might contain information specific to that file (such as the file name, the number of records, and so on). You may have to do more work on the header for each sub-file.

You should get the developer to enhance the software.

You should try to get the client to give it in the format you need. Give him some blah/bs about properly being able to identify the file etc. The client may have some pull with the developer.

OK

Last edited by AnanthaP; 08-08-2014 at 09:09 AM.
 
Old 08-08-2014, 06:40 AM   #6
Netopia2
LQ Newbie
 
Registered: Aug 2014
Posts: 6

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Firerat View Post
Code:
Some_text="$(head -n1 origfile.txt)"
split_function
for File in file-[0-9]*.txt;do
   sed -i '1s/^/'$Some_Text'\n/' "${File}"
done
Or
Code:
sed -i '1s/^/'$Some_Text'\n/' file-[0-9]*.txt
Firerat, thank you for taking the time to help out. I want to be sure I understand your code:

Some_text="$(head -n1 origfile.txt)"

This sets a variable called Some-text that is defined as "head -n1 origfile.txt"
From what I read about sed last night, it looks like this will pull in the first line from the main file.
I don't know what the significance of "head" is.

split_function

Here I should insert the split that I already have working.

for File in file-[0-9]*.txt;do

This creates a for loop for all split files, but excludes the original file as it does not have a number after the hyphen (nice!)

sed -i '1s/^/'$Some_Text'\n/' file-[0-9]*.txt

Hmmm... this one is the meat.
It calls sed.
The -i tells it to insert what's about to be defined.
Hmmm... I believe, it says to, on line 1, replace the beginning of the line with the variable defined earlier, plus a line feed.
Do this to all files named file-(digit followed by anything).txt


If that's it, I understand what it's doing, but WOW... I think I would have pulled my hair out trying to figure out how to write that!

Is my understanding correct... or close? I hate learning to simply turn on a light switch, as opposed to understanding what happens when the switch is turned on. Please let me know if I have some holes in my understanding of this.

Thanks,

Joe
 
Old 08-08-2014, 06:49 AM   #7
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
head -n1 simply returns the first line

we put that in a Variable as a string, then later use that variable in szboardstretcher's sed
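
A small sketch of those two steps, using an invented sample file in a scratch directory:

```shell
# Hypothetical input file whose first line is the header record.
demo_dir="$(mktemp -d)"
printf 'NAME|ADDRESS|ZIP\nJoe|1 Main St|48201\n' > "$demo_dir/origfile.txt"

# head -n1 prints only the first line; $( ... ) captures that output
# (minus the trailing newline) as the value of the variable.
Some_Text="$(head -n1 "$demo_dir/origfile.txt")"
printf 'Header is: %s\n' "$Some_Text"
```

Note that the `-i` option of sed means "edit the file in place"; it has nothing to do with inserting text.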

NOTE: there is an issue in my post. I assigned Some_text but then used $Some_Text; shell variable names are case-sensitive, so the two must match.

I normally use vim and complete variable names with Ctrl+P (^p), so I get fewer typos.



some bash reading material

http://www.tldp.org/LDP/Bash-Beginners-Guide/html/
http://www.tldp.org/LDP/abs/html/
http://mywiki.wooledge.org/BashGuide
http://www.gnu.org/software/bash/manual/bashref.html

The tldp stuff is great, however there are some nasty bad habits in it,
The mywiki.wooledge does a very good job of 'fixing' those habits


NOTE: Bash is not the only solution. AnanthaP has offered some awk logic, and you could also use just sed, Ruby, Perl, Python, and more.

But Bash is often simple, and quick enough to get the job done.
 
Old 08-08-2014, 06:54 AM   #8
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by Netopia2 View Post
sed -i '1s/^/'$Some_Text'\n/' file-[0-9]*.txt

Hmmm... this one is the meat.
That was an afterthought. I figured you don't really need the for loop; sed will in effect do its own loop over those files.

A for loop would be useful if, for instance, you wanted to glob lots of files and then perform some test to conditionally run sed.
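
As an illustration of such a conditional loop, the sketch below skips any file whose first line already matches the header. The header value and file contents are invented, and GNU sed is assumed for `-i` and `\n`.

```shell
demo_dir="$(mktemp -d)"
header='NAME|ZIP'                                        # hypothetical header record
printf '%s\nrow1\n' "$header" > "$demo_dir/file-1.txt"   # already has the header
printf 'row2\nrow3\n' > "$demo_dir/file-2.txt"           # header missing

for File in "$demo_dir"/file-[0-9]*.txt; do
    # Skip files whose first line is already the header.
    if [ "$(head -n1 "$File")" = "$header" ]; then
        continue
    fi
    sed -i '1s/^/'"$header"'\n/' "$File"
done
```

A test like this makes the script safe to run twice on the same directory, which a bare `sed -i` over the glob is not.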
 
Old 08-11-2014, 11:03 AM   #9
Netopia2
LQ Newbie
 
Registered: Aug 2014
Posts: 6

Original Poster
Rep: Reputation: Disabled
Firerat,

Something is amiss, but I don't know what. I put this little script together based on your code:


Code:
Some_Text="$(head -n1 Reactivation_*.txt)"
cd PostSort
sed -i '1s/^/'$Some_Text'\n/' catalyst[0-9]*.txt
But then I get this error:

Code:
sed: -e expression #1, char 25: unterminated `s' command
I believe that char 25 is the single quote at the end of $Some_Text , so I'm not sure why it's saying that it is unterminated. Thoughts?

Joe
 
Old 08-11-2014, 11:06 AM   #10
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
You need to escape the inside quotes, or get rid of them, depending.
 
Old 08-11-2014, 12:29 PM   #11
Netopia2
LQ Newbie
 
Registered: Aug 2014
Posts: 6

Original Poster
Rep: Reputation: Disabled
I've now tried escaping the inside quotes. Results in the exact same error.

I've both removed the quotes and tried double quotes. In both those cases the resultant file starts with either $Some_Text or "$Some_Text", respectively.

Any other thoughts?

Joe
 
Old 08-11-2014, 12:44 PM   #12
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,781

Rep: Reputation: 2081
You can't escape single quotes inside a single-quoted string. Unquoted variables containing spaces expand into multiple arguments; use double quotes to avoid this.
Code:
Some_Text="$(head -n1 Reactivation_*.txt)"
cd PostSort
sed -i '1s/^/'"$Some_Text"'\n/' catalyst[0-9]*.txt
The sed script here is three adjacent quoted spans: '1s/^/' in single quotes, "$Some_Text" in double quotes, and '\n/' in single quotes.
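
The effect of the missing double quotes can be checked without running sed at all. The sketch below uses a made-up header containing spaces and counts how many arguments the shell would pass; it assumes a POSIX-style shell such as bash or dash.

```shell
Some_Text='FIRST LAST ZIP'   # hypothetical header containing spaces

# Unquoted: the shell splits the expansion on spaces, so sed would receive
# several separate arguments instead of one script.
set -- '1s/^/'$Some_Text'\n/'
unquoted_count=$#

# Double-quoted: the expansion stays inside a single argument.
set -- '1s/^/'"$Some_Text"'\n/'
quoted_count=$#

printf 'unquoted: %s argument(s), quoted: %s argument(s)\n' \
    "$unquoted_count" "$quoted_count"
```

In the unquoted case the first argument is only `1s/^/FIRST`, which is an incomplete `s` command, consistent with the "unterminated `s' command" error reported earlier.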

Using just double quotes should work in this case:

Code:
sed -i "1s/^/$Some_Text\n/" catalyst[0-9]*.txt
Quote:
I've both removed the quotes and tried double quotes.
You didn't remove all the single quotes, I think.
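
One caveat with both sed forms: if the header ever contains `/`, `&` or a backslash, sed will interpret those characters as part of the `s` command. A possible alternative, sketched here with invented file contents, builds each file from the header plus the original data and never passes the header through sed.

```shell
demo_dir="$(mktemp -d)"
# Hypothetical header with characters that are special to sed's s command.
printf 'A/B&C\\D\nrow0\n' > "$demo_dir/Reactivation_demo.txt"
printf 'row1\n' > "$demo_dir/catalyst1.txt"
printf 'row2\n' > "$demo_dir/catalyst2.txt"

Some_Text="$(head -n1 "$demo_dir/Reactivation_demo.txt")"

for File in "$demo_dir"/catalyst[0-9]*.txt; do
    tmp="$(mktemp)"
    # Write the header, then the original contents, then replace the file.
    { printf '%s\n' "$Some_Text"; cat "$File"; } > "$tmp" && mv "$tmp" "$File"
done
```

The trade-off is a temporary file per input, and the replaced file takes on the temporary file's permissions, so this is a sketch to adapt rather than a drop-in replacement.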
 
Old 08-12-2014, 09:34 AM   #13
Netopia2
LQ Newbie
 
Registered: Aug 2014
Posts: 6

Original Poster
Rep: Reputation: Disabled
Thanks to everyone for all the help. I was able to get it working after making the quoting '"xxxx"'.

AND... thanks to the point in the right direction, I was also able to add a line of sed to change the Unix-style LFs to Windows CRLFs. I'm getting my toes into the water of sed... even if only in the extreme kiddie end of the pool!

Joe
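
The thread does not show the exact line used for the line-ending conversion. One common GNU sed form, given here only as an assumption about what such a line might look like, appends a carriage return to the end of every line:

```shell
demo_dir="$(mktemp -d)"
printf 'one\ntwo\n' > "$demo_dir/unix.txt"     # 8 bytes, LF line endings

# Add a carriage return before each newline (GNU sed understands \r in the
# replacement). Output goes to a new file so the original stays intact.
sed 's/$/\r/' "$demo_dir/unix.txt" > "$demo_dir/windows.txt"
converted_size="$(wc -c < "$demo_dir/windows.txt")"
```

Running the conversion twice on the same file would add a second carriage return to each line, so it should be applied exactly once.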
 
Old 08-13-2014, 07:17 AM   #14
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
worth a bookmark
http://sed.sourceforge.net/sed1line.txt
 
  

