LinuxQuestions.org

smartgig 08-02-2012 05:48 AM

Convert .../($var1|$var2)/... regex to string
 
Hello,

In one part of my shell program I need to translate many lines of the following pattern:
Code:

/(folder1|...|folderN)/(sub1|...|subN)/.../(file1|...|fileN)
into strings:
Code:

/folder1/sub1/file1
/folder1/sub1/...
/folder1/sub1/fileN
...
/folderN/subN/fileN

Please help me write a shell script that does this.

sycamorex 08-02-2012 05:59 AM

Hi and welcome to LQ.

Can you post an actual example of your naming convention?

Do the names of the directories/files differ only in the appended number?

smartgig 08-02-2012 06:04 AM

Here are some of them:
Code:

/(ar|az|dua|en|fa|fr)/(index|index2).php
/(ar|az|dua|en|fa|fr)/administrator/(index|index2).php
/(ar|fa)/modules/(mod_tabmods|mod_tabs|mod_base|mod_secure)/scripts/(importer|uploader).php

I want to run some commands on these files, so each file path needs to be written out in full.

sycamorex 08-02-2012 06:06 AM

Now you've confused me. That's not exactly what you posted in the original post.
Can you post the desired output of the sample above?

smartgig 08-02-2012 06:15 AM

Thank you for your answer!
Where's the confusion?

Desired output:
Code:

/ar/index.php
/ar/index2.php
/az/index.php
/az/index2.php
/dua/index.php
/dua/index2.php
/en/index.php
/en/index2.php
/fa/index.php
/fa/index2.php
/fr/index.php
/fr/index2.php
/ar/administrator/index.php
/ar/administrator/index2.php
/az/administrator/index.php
/az/administrator/index2.php
/dua/administrator/index.php
/dua/administrator/index2.php
/en/administrator/index.php
/en/administrator/index2.php
/fa/administrator/index.php
/fa/administrator/index2.php
/fr/administrator/index.php
/fr/administrator/index2.php
/ar/modules/mod_tabmods/scripts/importer.php
/ar/modules/mod_tabmods/scripts/uploader.php
/ar/modules/mod_tabs/scripts/importer.php
/ar/modules/mod_tabs/scripts/uploader.php
/ar/modules/mod_base/scripts/importer.php
/ar/modules/mod_base/scripts/uploader.php
/ar/modules/mod_secure/scripts/importer.php
/ar/modules/mod_secure/scripts/uploader.php
/fa/modules/mod_tabmods/scripts/importer.php
/fa/modules/mod_tabmods/scripts/uploader.php
/fa/modules/mod_tabs/scripts/importer.php
/fa/modules/mod_tabs/scripts/uploader.php
/fa/modules/mod_base/scripts/importer.php
/fa/modules/mod_base/scripts/uploader.php
/fa/modules/mod_secure/scripts/importer.php
/fa/modules/mod_secure/scripts/uploader.php


David the H. 08-02-2012 07:25 AM

Well, this is a challenging one!

A proper solution would likely involve parsing the lines and storing each part separately into arrays, which would be re-built into the desired output, or something like that. It would take some time to figure out.
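
For reference, here is a minimal sketch of what that parse-and-rebuild approach might look like, written in Python rather than shell. The function name expand is made up for illustration, and the sketch assumes the groups are never nested and that every "(", ")" and "|" in a line belongs to a group.

```python
import itertools
import re

def expand(pattern):
    """Return every path described by a pattern such as '/(a|b)/(c|d).php'."""
    # re.split with a capturing group keeps the "(...)" groups in the result,
    # so literal text and groups alternate in `tokens`.
    tokens = re.split(r'(\([^()]*\))', pattern)
    choices = []
    for token in tokens:
        if token.startswith('(') and token.endswith(')'):
            choices.append(token[1:-1].split('|'))  # group: several options
        else:
            choices.append([token])                 # literal text: one option
    # The Cartesian product picks one option per token; join rebuilds the path.
    return [''.join(combo) for combo in itertools.product(*choices)]

for path in expand('/(ar|fa)/administrator/(index|index2).php'):
    print(path)
```

Because the literal pieces (including ".php") are kept as tokens of their own, nothing needs to know about the file extension.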

I do have a quick hack solution, but I don't particularly recommend it.

Code:

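text='/(ar|az|dua|en|fa|fr)/administrator/(index|index2).php'
# Rewrite "(a|b)" as "{a,b}" so the line becomes a brace expansion.
text="$( echo "$text" | tr '()|' '{},' )"
# eval re-parses the string so the braces expand; printf prints one path per line.
eval printf '%s\\n' "$text"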

This translates the line into a valid brace expansion. Unfortunately, the final printing step has to use eval to expand it, and eval is a security risk: it runs its argument as shell code, so a line containing something like $(...) or a semicolon would be executed as a command. I generally suggest avoiding it unless there's no other choice. It may be safe enough here though, as long as the input text is known and trustworthy.

It also won't work correctly if any part of the original line contains whitespace, or if there happen to be '()|' characters that aren't part of the expansion pattern.
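
If the lines come from somewhere less trustworthy, one option is to check each line before handing it to anything like eval. The following is only a sketch: the name is_safe_pattern and the whitelist of allowed characters are assumptions, and the whitelist would need adjusting to the real file names.

```python
import re

# Assumed whitelist: plain path characters, or a "(a|b|...)" group whose
# alternatives use the same characters (minus the slash).
_NAME = r'[A-Za-z0-9_.-]+'
_SAFE = re.compile(r'(?:[A-Za-z0-9_./-]|\(' + _NAME + r'(?:\|' + _NAME + r')*\))+')

def is_safe_pattern(text):
    """Return True if text has only plain path characters and well-formed groups."""
    return _SAFE.fullmatch(text) is not None

print(is_safe_pattern('/(ar|fa)/administrator/(index|index2).php'))
print(is_safe_pattern('/(ar|fa)/my file.php'))
```

The first call accepts the sample line; the second rejects it because of the space. Stray or unbalanced "(" and ")" characters, "$" and ";" are rejected the same way, since none of them are in the whitelist.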

(Incidentally, I tried to use parameter substitutions instead of tr, but there seems to be a bug in my bash that keeps the substitution to "}" from being handled correctly.)

sycamorex 08-02-2012 09:59 AM

My Python is not great, but this seems to work fine:
Code:

~/data/tmp % cat text.txt
/(ar|az|dua|en|fa|fr)/(index|index2).php
/(ar|az|dua|en|fa|fr)/administrator/(index|index2).php
/(ar|fa)/modules/(mod_tabmods|mod_tabs|mod_base|mod_secure)/scripts/(importer|uploader).php
~/data/tmp % ./pyscri.py
/ar/index.php
/ar/index2.php
/az/index.php
/az/index2.php
/dua/index.php
/dua/index2.php
/en/index.php
/en/index2.php
/fa/index.php
/fa/index2.php
/fr/index.php
/fr/index2.php
/ar/administrator/index.php
/ar/administrator/index2.php
/az/administrator/index.php
/az/administrator/index2.php
/dua/administrator/index.php
/dua/administrator/index2.php
/en/administrator/index.php
/en/administrator/index2.php
/fa/administrator/index.php
/fa/administrator/index2.php
/fr/administrator/index.php
/fr/administrator/index2.php
/ar/modules/mod_tabmods/scripts/importer.php
/ar/modules/mod_tabmods/scripts/uploader.php
/ar/modules/mod_tabs/scripts/importer.php
/ar/modules/mod_tabs/scripts/uploader.php
/ar/modules/mod_base/scripts/importer.php
/ar/modules/mod_base/scripts/uploader.php
/ar/modules/mod_secure/scripts/importer.php
/ar/modules/mod_secure/scripts/uploader.php
/fa/modules/mod_tabmods/scripts/importer.php
/fa/modules/mod_tabmods/scripts/uploader.php
/fa/modules/mod_tabs/scripts/importer.php
/fa/modules/mod_tabs/scripts/uploader.php
/fa/modules/mod_base/scripts/importer.php
/fa/modules/mod_base/scripts/uploader.php
/fa/modules/mod_secure/scripts/importer.php
/fa/modules/mod_secure/scripts/uploader.php

pyscri.py
Code:

#!/usr/bin/env python
import itertools
import re

f_ext = ".php"

# Read the patterns, skipping blank lines.
with open('text.txt') as f:
    lines = [line.rstrip() for line in f if line.strip()]

for line in lines:
    # Split on "/" and drop the empty field before the leading slash.
    parts = line.split("/")[1:]
    # Turn each "(a|b)" component into its list of alternatives.
    choices = [re.sub(r'[()]', '', part).split("|") for part in parts]
    # Print every combination, one alternative per component.
    for element in itertools.product(*choices):
        final_line = '/' + '/'.join(element)
        # "(index|index2).php" splits into "index" and "index2.php",
        # so re-append the extension where it was lost.
        if not final_line.endswith(f_ext):
            final_line += f_ext
        print(final_line)

HTH


All times are GMT -5.