bash: compare entries in file against a directory and another file - then print based on pattern
Hello everyone,
I am fairly new to bash scripting and I need some advice. Here's my situation: I have a File 1 with about 130 entries (a single column of 6-digit numbers). I want to compare this file (and an amended version of its entries) against a directory and its subfolders. Then I want to take the resulting list and compare it against a CSV file, first on column 1 and then on a pattern match in column 3. Please see the steps below; I've included an explanation for each, as it might make the desired output and my goal easier to understand. Code:
File 1 1. Compare this with a directory to check whether there are subfolders matching the entries from File 1. I also want it to print a message saying whether each subfolder exists or not. Code:
Desired Output Code:
Desired Output I'm flexible on this step if there is a better way to get entries that have both the original numbered file and the _2 numbered file. So for this example the output would be Code:
Desired Output Code:
FileCSV Code:
Desired Output If anyone can walk me through this and provide explanations, it would really help me learn. Thanks in advance.
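Step 1 of the question (report whether a subfolder exists for each entry of File 1) could be sketched roughly as follows. This is only an illustration: the function name `check_dirs` and its two arguments are hypothetical, and it assumes the subfolder name is exactly the entry from File 1, which the thread does not confirm.

```shell
# Hypothetical helper (not from the thread): for each entry in a list file,
# report whether a directory with that exact name exists anywhere below
# base_dir. Assumes one entry per line and directory names equal to entries.
check_dirs() {
    local list_file=$1 base_dir=$2 id
    while IFS= read -r id || [ -n "$id" ]; do
        # skip blank lines
        [ -z "$id" ] && continue
        # find prints every matching directory; grep -q . succeeds if any was printed
        if find "$base_dir" -type d -name "$id" | grep -q .; then
            printf '%s exists\n' "$id"
        else
            printf '%s does not exist\n' "$id"
        fi
    done < "$list_file"
}
```

It would be called as `check_dirs file1 /path/to/directory`, where both arguments are placeholders for your own paths.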
Maybe just use File 1 as a list of matching patterns and extract the matching lines in the CSV file that have X in the 3rd column.
Then check whether there is a subdirectory whose name matches the 1st column of the extracted lines. A pattern from File 1 should match either xxxxxx-100 or xxxxxx-100_2 in the CSV file.
Quote:
Also, is there a way to print lines based on a partial string match, so that if I give the string 100234-100 it will match and print both 100234-100 and 100234-100_2?
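One possible way to do the partial match asked about here is a literal prefix test on the first CSV field. The sketch below is an assumption-laden illustration, not code from the thread: `print_prefix_matches` is a hypothetical name, and the CSV is assumed to be comma-separated with the identifier in column 1.

```shell
# Hypothetical helper: print every CSV line whose first field starts with
# the given string. index() does a literal (non-regex) comparison, so "-"
# and "_" need no escaping. Note that awk -v interprets backslash escapes,
# so this assumes the prefix contains no backslashes.
print_prefix_matches() {
    local prefix=$1 csv_file=$2
    awk -F, -v p="$prefix" 'index($1, p) == 1' "$csv_file"
}
```

Calling `print_prefix_matches 100234-100 file.csv` would print the rows for both 100234-100 and 100234-100_2. Be aware that a prefix match would also accept a longer identifier such as 100234-1000 if one existed.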
Assuming field 1 of the CSV file contains a 6-digit number followed by -100, optionally followed by _2, and field 3 contains X's:
Code:
#build pattern list file
If I had to choose, I'd write a Perl script and store the patterns in a hash, which should be faster.
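The hash idea does not strictly require Perl; awk's associative arrays can play the same role. The following is a minimal sketch under the assumptions stated in this thread (6-digit numbers in File 1, the same 6 digits at the start of CSV field 1, X in field 3). The function name `match_with_hash` is made up for illustration.

```shell
# Hypothetical helper: load the entries of list_file into an awk array, then
# print CSV rows whose field 3 contains X and whose field 1 begins with one
# of those 6-digit entries. Assumes list_file is not empty (otherwise the
# NR == FNR test would wrongly treat the CSV as the list).
match_with_hash() {
    local list_file=$1 csv_file=$2
    awk -F, '
        # first file: remember each non-empty entry as an array key
        NR == FNR { if ($0 != "") ids[$0] = 1; next }
        # second file: print rows that pass both tests
        $3 ~ /X/ && (substr($1, 1, 6) in ids)
    ' "$list_file" "$csv_file"
}
```

Each file is read once and every lookup is a single array access, so no regular expression has to be tried per entry of File 1.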
Quote:
Also, if I don't want to test whether the directory exists and just want to check whether those numbers are in the CSV file, I would take out the "awk -F 'system("test -d" bit, correct? I would think there'd be an easier way to do this... alas, I'm still learning all of this.
Depending on your setup, yes, there are faster ways.
You're correct about removing the last awk to skip the directory test. The major part of the solution is a well-understood problem. I admit it's not very efficient as is. If I had a precise idea of the directory structure, the subdirectory names, the number of directories, real example lines from the CSV file, the exact goal, etc., I would do things differently. Code:
# add a ^ character in front of each line of file1 Code:
# use comma as field separator, Code:
# test if field 3 contains X Code:
# test if output lines from previous command Code:
# test if a directory named with first field value exists,
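Put together, the commented steps above might look something like the sketch below. The code originally posted is not preserved here, so this is a reconstruction under assumptions rather than the poster's exact commands: the function name `match_csv` and the `base_dir` argument are hypothetical, and the directory test uses the shell's `[ -d ]` instead of calling `test -d` from awk.

```shell
# Hypothetical reconstruction of the pipeline described in the comments above.
match_csv() {
    local list_file=$1 csv_file=$2 base_dir=$3 patterns name rest
    patterns=$(mktemp) || return 1
    # add a ^ character in front of each non-empty line of the list file,
    # so each entry can only match at the start of a CSV line
    grep . "$list_file" | sed 's/^/^/' > "$patterns"
    # use comma as field separator and keep lines whose field 3 contains X,
    # then keep only the lines matching one of the anchored patterns
    awk -F, '$3 ~ /X/' "$csv_file" |
        grep -f "$patterns" |
        while IFS=, read -r name rest; do
            # print the first field if a directory with that name exists
            if [ -d "$base_dir/$name" ]; then
                printf '%s\n' "$name"
            fi
        done
    rm -f "$patterns"
}
```

To skip the directory test, as discussed earlier in the thread, the `while` loop would simply be dropped so that the output of `grep -f` is printed directly.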