Help with a bash script
Awesome forum!
I am trying to figure out a way to pull http links out of text files and then output the results in a log. The text files are in folders like this inside a source directory:
Code:
/source
./folder1
...folder1.txt
./folder2
...folder1.txt
./folder3
...folder1.txt
So basically I would like to get the output to look something like this:
Code:
folder1
http://example.com
folder2
http://example.com
folder3
http://example.com
I just can't wrap my head around how to do this. Thanks in advance.
Sounds like a possible homework question, so just some basic advice for the moment.
Break it down into the steps you need to perform. Figure out how to do each one individually, then you can combine them at the end.

First you need to compile a list of filenames. Take a look at the find command.

Next, you need to figure out how to extract the links you need from each file. This depends on the exact format of the text, but the usual tools are grep, sed, or awk.

Finally, create a loop to process each file in the list and output the desired format to your log file.
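If it helps to see the overall shape (leaving the extraction step for you to work out), the loop part usually looks something like this. Just a sketch; ~/source and links.log are placeholders for your actual source directory and log file:
Code:
#!/bin/bash
# Sketch only: list the txt files, then handle each one in a loop.
find ~/source -type f -name '*.txt' | while IFS= read -r file; do
    echo "$file"
    # ...extract the links from "$file" here with grep/sed/awk...
done >> links.log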
David, thanks for the help.
This would give the folder names:
Code:
ls ~/folder
This would give the paths of the text files:
Code:
find ~/folder -name *txt
This would extract the http links out of the txt files:
Code:
grep "http://" /folder/name.txt | sed 's/^.*http:/http:/' | sed 's/\s.*$//' | sort
The thing that stumps me is how to do the loops. The output I am looking for would be...
Code:
folder
http://www.link.com
folder1
http://www.link.com
etc
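I am guessing the loop needs to be shaped something like this, but I can't work out how to fill in the rest (just me thinking out loud, the echo is a placeholder):
Code:
for i in ~/folder/*; do
    echo "$i"
    # run the grep/sed line on the txt file inside "$i" somehow?
done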
Hey wassup
Quote:
ls ~/folder
This will make no distinction between files and directories, so assuming you only want directories it will fail.
Quote:
sort
It is a waste to have it in there. Also, maybe you could show us a before and after of the line you want, as the seds seem quite over the top as well.
Hello. Try this one.
Code:
cd ~/folder; find -type f -iname '*txt' -exec grep -o "http://[^[:blank:]\"']\+" {} \;
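That prints only the links, though. If you also want each folder name above its links, as in the first post, something along these lines should work (untested sketch; it assumes one level of folders under ~/folder and a log file called links.log):
Code:
cd ~/folder
for dir in */; do
    # Folder name without the trailing slash
    echo "${dir%/}"
    # All http links from every txt file inside this folder
    find "$dir" -type f -iname '*txt' \
        -exec grep -oh "http://[^[:blank:]\"']\+" {} +
done > links.log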