LinuxQuestions.org

sylvaticus 12-11-2007 05:47 AM

[regExp] Retrieve content embedded in quotation marks
 
Hello. I need to retrieve a list of variables from a line of code in a programming language where the variable names are enclosed in quotation marks.

For example:
Code:

MYVAR("AA","BB","CC")*1.12345*BVAR("DD","CC","EE") * 2
should give me
AA BB CC DD CC EE

I am using Python, and so far I have come up with the following regexp:

Code:

variables = re.findall("\".+\"", myLine[i])
However, I got
Code:

"AA","BB","CC")*1.12345*BVAR("DD","CC","EE"
How can I tell the regexp to return the innermost (shortest) matches instead?

ghostdog74 12-11-2007 06:00 AM

There's no need to use a regexp; you can make use of Python's basic string manipulation methods,
e.g. (tested only on that string):
Code:

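s = """MYVAR("AA","BB","CC")*1.12345*BVAR("DD","CC","EE") * 2"""
while True:
    try:
        ind = s.index('"')   # position of the opening quote
        s = s[ind+1:]        # drop everything up to and including it
        end = s.index('"')   # position of the closing quote
        print(s[:end])       # text between the two quotes
        s = s[end+1:]        # continue after the closing quote
    except ValueError:       # index() raises ValueError when no quote is left
        break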

output:
Code:

# ./test.py
AA
BB
CC
DD
CC
EE

However, if you are set on using a regexp:

Code:

import re
pat = re.compile(r'"(.*?)"')  # non-greedy: stop at the first closing quote
s = """MYVAR("AA","BB","CC")*1.12345*BVAR("DD","CC","EE") * 2"""
print(pat.findall(s))

output:
Code:

# ./test.py
['AA', 'BB', 'CC', 'DD', 'CC', 'EE']
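
As a variation on the regexp above (a sketch, not part of the tested code), a negated character class gives the same result without a lazy quantifier: `[^"]*` can never run past a closing quote. The names `line` and `names` are only illustrative, and the sketch assumes the quoted text never contains an escaped quote.

```python
import re

line = 'MYVAR("AA","BB","CC")*1.12345*BVAR("DD","CC","EE") * 2'

# [^"]* matches any run of characters that are not a double quote,
# so each match ends at the nearest closing quote by construction.
names = re.findall(r'"([^"]*)"', line)
print(names)
```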


sylvaticus 12-11-2007 06:28 AM

Thank you very much. I was heading in the direction of your first example (using "split()", as I am a Python newbie) before reading your reply.
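
For comparison, the split() idea might look roughly like the sketch below. It assumes the quotes in the line are balanced and never escaped; the names `line`, `pieces` and `variables` are only illustrative.

```python
line = 'MYVAR("AA","BB","CC")*1.12345*BVAR("DD","CC","EE") * 2'

# Splitting on the quote character alternates between text outside
# quotes (even indices) and text inside quotes (odd indices).
pieces = line.split('"')
variables = pieces[1::2]
print(variables)
```

With unbalanced quotes the odd-index rule would silently pick up the wrong pieces, so this is only suitable for well-formed input.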

From your reply I learned that *? makes the * operator match in "minimal fashion", i.e. as few characters as possible (from http://docs.python.org/lib/re-syntax.html ), which was my problem.
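
A small side-by-side sketch of the difference, using the pattern from my first post and the same pattern with `?` added (the variable names are only illustrative):

```python
import re

line = 'MYVAR("AA","BB","CC")*1.12345*BVAR("DD","CC","EE") * 2'

# Greedy: .+ grabs as much as it can, so the match runs from the
# first quote on the line to the last one.
greedy = re.findall(r'".+"', line)

# Non-greedy: .+? grabs as little as it can, so each match stops
# at the nearest closing quote.
lazy = re.findall(r'".+?"', line)

print(greedy)
print(lazy)
```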

I also suppose that the quotation marks are not in the output because they are outside the () group...
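
That guess can be checked with a short sketch: on a match object, group(0) is the whole match including the quotes, while group(1) is only what the parentheses captured. When a pattern contains exactly one group, re.findall returns the group contents rather than the whole match. The names `m`, `whole` and `inner` are only illustrative.

```python
import re

line = 'MYVAR("AA","BB","CC")*1.12345*BVAR("DD","CC","EE") * 2'

m = re.search(r'"(.*?)"', line)  # first quoted item on the line
whole = m.group(0)               # entire match, quotes included
inner = m.group(1)               # only the text captured by ( )

print(whole)
print(inner)
```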

Thank you very much...

Antonello


All times are GMT -5.