LinuxQuestions.org
LinuxQuestions.org > Forums > Linux Forums > Linux - General
Old 05-14-2011, 06:54 PM   #1
scittlez227
LQ Newbie
 
Registered: May 2011
Posts: 3

Rep: Reputation: 9
Merging Two Files from Two Different Directories


Hi,

I'm totally new to Linux and this website. I was wondering if anyone had, or could help me create, a shell script that would merge two files from two different directories and then put the new merged file in a third, different directory.

The merged file would need to eliminate duplicates and sort the contents. Any suggestions? Thanks.
 
Old 05-14-2011, 07:13 PM   #2
T3RM1NVT0R
Senior Member
 
Registered: Dec 2010
Location: Internet
Distribution: Linux Mint, SLES, CentOS, Red Hat
Posts: 2,385

Rep: Reputation: 477

Hi there,

Try this:

cat file1 file2 | sort | uniq > file3

Where:

file1 and file2 are the files you want to merge. sort sorts the combined lines, and | uniq keeps only one copy of each line, which removes anything duplicated across the two files. Finally, > redirects the output to file3, which is the file created by the merge.

If you want to do it for files from different directories you just need to do this:

cat /etc/file1 /root/file2 | sort | uniq > /tmp/file3
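
To make the pipeline above concrete, here is a minimal sketch that wraps it in a function and runs it on throwaway files. The function name merge_sorted_unique and the scratch directories are made up for illustration; only cat, sort and uniq come from the post.

```bash
#!/bin/bash
# merge_sorted_unique FILE1 FILE2 OUTFILE   (hypothetical helper name)
# Concatenate two files, sort the lines, drop duplicate lines,
# and write the result to a third file.
merge_sorted_unique() {
    cat -- "$1" "$2" | sort | uniq > "$3"
}

# Demo in scratch directories (these paths are illustrative only).
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2" "$work/dir3"
printf 'pear\napple\n'   > "$work/dir1/file1"
printf 'apple\nbanana\n' > "$work/dir2/file2"

merge_sorted_unique "$work/dir1/file1" "$work/dir2/file2" "$work/dir3/file3"
cat "$work/dir3/file3"   # prints apple, banana, pear, one per line
```

Note that this works on whole lines: two lines count as duplicates only if they are identical.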

Last edited by T3RM1NVT0R; 05-14-2011 at 07:16 PM.
 
Old 05-14-2011, 07:15 PM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
I have a suggestion. Provide some details. We aren't mind-readers here.

What kind of files? What kind of contents? What should the merged file look like? What is the purpose or context behind merging them? What's the directory structure, or will some method be needed to locate the files first? Is this a one-time thing, or something that you will need to automate? Etc. etc.

At the very least you should provide some sample input, and a sample of the desired output.


Edit: @T3RM1NVT0R. "sort -u" alone does the same thing as "sort | uniq".
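
A quick way to check this on a small input is sketched below. It also relies on the fact that sort accepts several file names itself, so the cat can go as well. The helper name merge_u is made up for illustration, and the comparison is only claimed for plain whole-line sorting like this.

```bash
#!/bin/bash
# Compare the two spellings on the same sample input.
input=$(printf 'b\na\nb\nc\na\n')

with_uniq=$(printf '%s\n' "$input" | sort | uniq)
with_u=$(printf '%s\n' "$input" | sort -u)

if [ "$with_uniq" = "$with_u" ]; then
    echo "same output"
fi

# sort reads multiple files on its own, so neither cat nor uniq is needed.
merge_u() {    # hypothetical helper name: merge_u FILE1 FILE2 OUTFILE
    sort -u -- "$1" "$2" > "$3"
}
```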

Last edited by David the H.; 05-14-2011 at 07:19 PM.
 
2 members found this post helpful.
Old 05-14-2011, 07:43 PM   #4
scittlez227
LQ Newbie
 
Registered: May 2011
Posts: 3

Original Poster
Rep: Reputation: 9
Thanks guys.

David the H. - This is a homework problem one of my friends has. He asked me to look into a way to solve this problem his professor gave them. Any and all solutions would be greatly appreciated, thanks. He needs a shell script that would satisfy this:

We have two sets of files; let's say they are in directory a and directory b. Let's say the file names are a1 a2 a3 a4 ... and b1 b2 b3 b4 ... (I thought having the same name might be confusing, so maybe having different names is better.) So, you will need to line up a1 with b1, a2 with b2 ... to create c1, c2 ... File contents are different but they need to have overlapping parts. They also have spaces in them (more than one char space).

What you'll need to do is to create a 3rd set of files in directory c where files from directory a and b are merged. You will need to compare each line from file1 of directory a with file1 of directory b and write it into file1 of directory c. Meanwhile, you'll need to merge the lines, eliminate the blanks which are more than a single char space, eliminate the duplicates, and sort this content from left to right in the output file. This should work for multiple lines and multiple files (so you need to automate this). Let's consider that each line has a limited length, i.e. at most 80 chars for input files and 160 for the output files.

For example:

file1 of a: file1 of b:

ffffff eee ccc r 12 ddd fff k ccc bbbb zzz nnn eeeee aaaaaaaaa 3

file1 of c:

12 3 aaaaaaaaa bbbb ccc ddd eee eeeee fff ffffff k nnn r zzz
 
1 member found this post helpful.
Old 05-15-2011, 02:55 AM   #5
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Well, nobody is going to provide all the answers to homework questions, but we can help guide you through areas where you're having difficulty.

Your first order of business is to break the script down into steps and figure out how to do each one individually. Then you can assemble the working commands into a script that controls the flow.

So start by determining how to merge two files and work from there. A simple cat+sort won't work here, since you have to actually merge lines. You'll probably need to use awk instead, which is a field-based tool, or something else with advanced editing ability like perl (but have a look at other available commands first).
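
As a rough illustration of the field-based approach, the sketch below treats every word on a line as an awk field, drops repeats, sorts what is left and prints it with single spaces. The function name merge_words is invented here, and this is only one possible way to do it, not a finished answer to the assignment.

```bash
#!/bin/bash
# merge_words: for every input line, print its unique words in sorted
# order, separated by single spaces. (Hypothetical helper; plain awk.)
merge_words() {
    awk '{
        split("", seen); n = 0            # reset per-line state
        for (i = 1; i <= NF; i++) {
            w = $i ""                     # force string, not numeric, comparison
            if (!(w in seen)) { seen[w] = 1; words[++n] = w }
        }
        # insertion sort, since POSIX awk has no built-in sort function
        for (i = 2; i <= n; i++) {
            v = words[i]
            for (j = i - 1; j >= 1 && words[j] > v; j--) words[j + 1] = words[j]
            words[j + 1] = v
        }
        line = ""
        for (i = 1; i <= n; i++) line = line (i > 1 ? " " : "") words[i]
        print line
    }'
}

# Line N of one file can be lined up with line N of the other using paste,
# for example (paths taken from the thread):
#   paste -d ' ' /dira/a1 /dirb/b1 | merge_words > /dirc/c1
```

awk's default field splitting already collapses runs of blanks, which is why no separate step is needed for the extra spaces.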

The actual filenames shouldn't be a big problem; the script just needs to know where to find each one. I personally would look into creating two lists of files and storing them in arrays with matching index numbers. Then you should be able to easily process everything with a single loop over those index numbers.
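
The two-array idea could look something like the sketch below. The function name pair_files and the a*/b* glob patterns are assumptions made for illustration; it only prints the pairs and leaves the actual merging out.

```bash
#!/bin/bash
# pair_files DIR_A DIR_B   (hypothetical helper name)
# Build one array per directory and walk both with a single index loop,
# printing "a-file b-file" for each matching index.
pair_files() {
    local a_files=("$1"/a*)
    local b_files=("$2"/b*)
    local i
    for i in "${!a_files[@]}"; do
        printf '%s %s\n' "${a_files[i]}" "${b_files[i]}"
    done
}

# Example call, using the directory names from the thread:
#   pair_files /dira /dirb
```

Globs expand in lexical order, so a10 sorts before a2. The pairs still line up as long as both directories use the same numbering, but the index is not the file number.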
 
Old 05-20-2011, 07:13 AM   #6
scittlez227
LQ Newbie
 
Registered: May 2011
Posts: 3

Original Poster
Rep: Reputation: 9
This is what I think will work, as long as the newly combined file has an even number of whitespace-separated words (tsort reads its input in pairs) and everything is on one line. Let me know what you think, and whether there is another piece of code that would solve the even-number problem. Thanks again for all your input.

#!/bin/bash
# All files must contain an even number of whitespace-separated words,
# because tsort reads its input in pairs; otherwise this script will error out.
y=1
echo "Input the # of file(s) in each directory"
read -r x
while [ "$y" -le "$x" ]
do
    filea="/dira/a$y"
    fileb="/dirb/b$y"
    filetemp="/temp/temp$y"
    filec="/dirc/c$y"

    paste "$filea" "$fileb" > "$filetemp"
    tsort "$filetemp" | sort -d | awk '{ str1 = str1 $0 " " } END { print str1 }' > "$filec"
    cat "$filec"

    y=$((y + 1))    # move on to the next pair; without this the loop never ends
done
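
One possible way around the even-count restriction is to skip tsort and split the text into one word per line instead. This is a sketch under that assumption, with an invented function name (merge_tokens). Like the script above, it puts everything from both files on a single output line.

```bash
#!/bin/bash
# merge_tokens FILE_A FILE_B   (hypothetical helper name)
# Print all words from both files, unique and sorted, on one line.
# Works whether the total number of words is odd or even.
merge_tokens() {
    cat -- "$1" "$2" |
        tr -s '[:space:]' '\n' |   # one word per line, squeezing repeated blanks
        sed '/^$/d' |              # drop the empty line left by leading whitespace
        sort -u |                  # sort and remove duplicates
        paste -sd ' ' -            # join the lines back with single spaces
}

# Inside the loop above it could replace the paste/tsort lines, e.g.:
#   merge_tokens "/dira/a$y" "/dirb/b$y" > "/dirc/c$y"
```

It also makes the temp file unnecessary, since nothing has to be written out before sorting.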
 
Old 05-20-2011, 08:31 AM   #7
onebuck
Moderator
 
Registered: Jan 2005
Location: Central Florida 20 minutes from Disney World
Distribution: Slackware®
Posts: 13,925
Blog Entries: 44

Rep: Reputation: 3159
Hi,

Welcome to LQ!

This is a good example of how to properly get help with homework questions here at LQ. As per the LQ Rules:
Quote:
Do not expect LQ members to do your homework - you will learn much more by doing it yourself.
Kudos to you for your honesty and the manner in which you have related the queries.

I do suggest that you wrap your data or long lists in vbcode tags ([code] or [quote]) when posting. This will make your posts cleaner and therefore easier to read.

FYI: I suggest that you look at 'How to Ask Questions the Smart Way' so in the future your queries provide information that will aid us in diagnosis of the problem or query.




Just a few links to aid you in gaining some understanding. I would start at 4, 5 & 6, while the other links will enhance your experience:



1 Linux Documentation Project
2 Rute Tutorial & Exposition
3 Linux Command Guide
4 Bash Beginners Guide
5 Bash Reference Manual
6 Advanced Bash-Scripting Guide
7 Linux Newbie Admin Guide
8 LinuxSelfHelp
9 Ultimate Linux Newbie Guide

The above links and others can be found at 'Slackware-Links'. More than just Slackware® links!
 
  

