Programming
This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
file 1)
a data
b data
c location://wrong-path-to-file/filename.mp3
d data
e
f data
g data
h location://wrong-path-to-file/filename.mp3
i data
etc.
file 2)
a no data
b no data
c location://right-path-to-file/filename.mp3
d no data
e
f no data
g no data
h location://right-path-to-file/filename.mp3
i no data
etc.
I want to merge the two files, so that:
new file)
a data
b data
c location://right-path-to-file/filename.mp3
d data
etc.
The files are not the same in layout, so I cannot just select the right lines.
What I'm thinking about is something like this:
if line contains "location" in file 1
then select filename.mp3 using sed/cut /...
search for filename.mp3 in file 2 (assuming the filename is unique)
replace the line in file 1 with the line containing filename.mp3 from file 2
I think I'm along the same lines as Grail. If we make some assumptions, and use the first field plus the filename as a unique key, in awk this would be:
Code:
awk '#
    BEGIN {
        # Accept any newline convention, removing leading and trailing whitespace.
        RS = "[\t\v\f ]*(\r\n|\n\r|\r|\n)[\t\v\f ]*"
        # Fields are separated by one or more consecutive whitespace characters.
        FS = "[\t\v\f ]+"
        # For output, use Unix newline convention.
        ORS = "\n"
        # For output, use space as a field separator.
        # (This only affects when the input record is modified).
        OFS = " "
        # Ordinal for file being accessed (1, 2, ...)
        FILE = 0
    }
    # Increment FILE for each new file.
    (FNR == 1) { FILE++ }
    # For the first file, only location:// lines are remembered.
    (FILE == 1 && $2 ~ /^location:/) {
        # As the key, take the filename part of the location,
        key = $2
        sub(/^.*\//, "", key)
        # but prepend the first field and a space to it.
        key = $1 " " key
        # Do we already have this key used?
        if (key in location)
            if (location[key] != $2)
                printf("Warning: %s redefined from %s to %s.\n", key, location[key], $2) > "/dev/stderr"
        # Save the entire location under the key.
        location[key] = $2
    }
    # A location field in the second file?
    (FILE == 2 && $2 ~ /^location:/) {
        # Construct the key the same way as above.
        key = $2
        sub(/^.*\//, "", key)
        key = $1 " " key
        # If there is a stored location corresponding to key, replace the second field.
        if (key in location)
            $2 = location[key]
        # Note: To retain the original field separator, you can use
        #     sub(/location:[^\t\v\f ]*/, location[key])
    }
    # Output all records for the second file.
    # The "default rule" is { print $0 }; no need to write it.
    (FILE == 2)
    ' locations-but-no-data data-but-wrong-locations > new-file
The awk command is much, much longer than really needed, but I wanted to make it as readable and robust as possible. The first line begins with a comment because some awk variants do not like an empty first line, and I wanted to have it nicely indented.
Running it with your example files, new-file will contain
Code:
a data
b data
c location://right-path-to-file/filename.mp3
d data
e
f data
g data
h location://right-path-to-file/filename.mp3
i data
Questions?
Last edited by Nominal Animal; 05-31-2012 at 12:33 AM.
2) As you have pointed out, what if the filename is not unique? I think I can check that quickly by using find to list all MP3 files recursively in the parent directory and then running sort and uniq on the result. For now, I'm pretty sure they are all unique.
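The uniqueness check described above could look roughly like this. It is only a sketch: the function name duplicate_mp3_names is made up for illustration, the directory to scan is passed as an argument, and it assumes file names contain no newlines and end in lowercase .mp3.

```shell
# Hypothetical helper: print every MP3 base name that occurs more than
# once anywhere below the directory given as $1.
# No output means all file names are unique.
duplicate_mp3_names() {
    find "$1" -type f -name '*.mp3' |
        sed 's|.*/||' |   # strip the directory part, keep the file name
        sort |
        uniq -d           # print only names that appear more than once
}
```

Calling it as duplicate_mp3_names /path/to/parent-directory (the path is a placeholder) should print nothing if the "filename is unique" assumption holds.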
Are the a, b, c, etc actually in the file or just there to denote each line?
The a, b, c, etc only refer to the lines. They are NOT in the actual file.
What if the filename you find is not on the same line in the second file as the first file?
As I said before, the files are not the same in layout, nor in order. Therefore I cannot simply swap line X in file 1 with line X in file 2.
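Given those two answers (no a, b, c labels in the real files, and no common line order), the key can only be the file name itself. Below is a minimal sketch of that variant, not the command posted earlier in the thread. The names fix_locations, location_token and file_name are made up for illustration, and it assumes that file names are unique and that each location:// entry contains no spaces or tabs.

```shell
# Usage: fix_locations RIGHT_LOCATIONS_FILE DATA_FILE
# Prints DATA_FILE with each location:// entry replaced by the entry from
# RIGHT_LOCATIONS_FILE that ends in the same file name.
fix_locations() {
    awk '
        # Return the first location:// token on the line, or "" if none.
        # As a side effect, match() leaves RSTART and RLENGTH set.
        function location_token(line) {
            if (match(line, /location:[^ \t]*/))
                return substr(line, RSTART, RLENGTH)
            return ""
        }
        # Strip everything up to the last slash, leaving the file name.
        function file_name(path) {
            sub(/^.*\//, "", path)
            return path
        }
        # Count the files as they are opened (1, 2).
        FNR == 1 { file++ }
        # First file: remember each location under its file name.
        file == 1 {
            loc = location_token($0)
            if (loc != "")
                right[file_name(loc)] = loc
            next
        }
        # Second file: swap in the remembered location, if there is one,
        # and print every line.
        {
            loc = location_token($0)
            if (loc != "") {
                name = file_name(loc)
                if (name in right)
                    $0 = substr($0, 1, RSTART - 1) right[name] substr($0, RSTART + RLENGTH)
            }
            print
        }
    ' "$1" "$2"
}
```

With the file names used earlier in the thread, this would be run as fix_locations locations-but-no-data data-but-wrong-locations > new-file. Lines whose file name has no match in the first file are printed unchanged, so nothing is lost if an entry is missing.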