tough one: how do you find patterns/sequences in file names?
say you have a dir with files:
file.1.foo
file.2.foo
file.3.foo
bar.100.gah
bar.101.gah
bar.102.gah
someFile1
otherThing2
...
How would you go about finding that there are two sequences there and two files that are not part of any sequence? i.e.:
seq: file.#.foo 1-3
seq: bar.#.gah 100-102
sngl: someFile1
sngl: otherThing2
I have a couple of really complex examples, but it doesn't seem like it should take over 300 lines of code to do this. Does anyone know of a good way to find this info? Some super cool regex or something? Language doesn't matter much as long as it's nice & tidy. If I had my druthers, the answer would be done in Python, but I can translate if need be. |
Have you read the man page for grep?
man grep
If I understand your question correctly, then I believe that grep is what you are looking for. Regards |
What about the case where you have something like this:
Code:
file.1.middle.3.end
Also, how would you handle something like this (where a number in a range is missing)?
Code:
file.1.end |
I really hate Perl, and I really suck with Perl.
That being said, here's my solution.
Code:
use strict;
Code:
$ cat list |
Just one way out of many, with GNU awk:
Code:
ls -1 | awk 'BEGIN{FS="."} |
Wow, thanks guys! I wasn't expecting to get answers on this one.
In answer to matthewg42's questions: all the file1's are one group, and the file2 & file3 are singles. If a number is missing, you can consider it two sequences. If the report is on the smart side, it would report something like:
foo.#.end 1-3,5-6
but
foo.#.end 1-3
foo.#.end 5-6
is also acceptable. '#' is arbitrary; it could be anything. '#' makes sense. '%04d' makes a lot of sense. But yeah... these give me great starting points. Thanks! |
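For anyone who wants a Python starting point, here is a minimal sketch of the rules described above; it is not a translation of the Perl or awk answers. It treats every run of digits in a name as a possible counter, groups names by the pattern left when that run is swapped for '#', and splits each group into ranges wherever a number is missing. The function names (to_ranges, format_ranges, find_sequences) are made up for illustration. It also assumes zero padding can be ignored (so 01 and 1 count as the same number) and that no file name already contains the placeholder character.

```python
import re
from collections import defaultdict


def to_ranges(numbers):
    """Collapse a sorted list of ints into (start, end) runs of consecutive values."""
    ranges = []
    for n in numbers:
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], n)  # extend the current run
        else:
            ranges.append((n, n))  # a gap starts a new run
    return ranges


def format_ranges(ranges):
    """Render [(1, 3), (5, 6)] as '1-3,5-6'."""
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def find_sequences(names, placeholder="#"):
    """Return (sequences, singles).

    sequences maps a pattern such as 'file.#.foo' to a list of (start, end)
    ranges; singles lists the names that did not join any sequence.
    """
    # pattern -> {number: name}; a name with several digit runs is offered
    # to several patterns, one per run.
    candidates = defaultdict(dict)
    for name in names:
        for m in re.finditer(r"\d+", name):
            pattern = name[:m.start()] + placeholder + name[m.end():]
            candidates[pattern][int(m.group())] = name

    sequences, used = {}, set()
    # Largest groups claim their files first; ties are broken by pattern name.
    ordered = sorted(candidates.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for pattern, members in ordered:
        free = sorted(n for n, name in members.items() if name not in used)
        if len(free) < 2:
            continue  # a lone file is not a sequence
        used.update(members[n] for n in free)
        sequences[pattern] = to_ranges(free)

    singles = [name for name in names if name not in used]
    return sequences, singles


if __name__ == "__main__":
    listing = ["file.1.foo", "file.2.foo", "file.3.foo",
               "bar.100.gah", "bar.101.gah", "bar.102.gah",
               "someFile1", "otherThing2"]
    seqs, singles = find_sequences(listing)
    for pattern, ranges in seqs.items():
        print("seq:", pattern, format_ranges(ranges))
    for name in singles:
        print("sngl:", name)
```

Letting the largest group claim its files first is only one way to settle names with more than one number in them, such as file.1.middle.3.end; a different tie-break may suit real data better. Two files whose numbers are not adjacent (say 1 and 5) still share one pattern here and are reported as two one-element ranges.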