LinuxQuestions.org

SED interval specification (https://www.linuxquestions.org/questions/programming-9/sed-interval-specification-764317/)

wakatana 10-25-2009 01:00 PM

Guys, I have another problem; I think sed can do this.

Input (all on one line):
Code:

SITA<br>_________<br><span class="foto">FOTO: ilustračné foto SITA_AP<br><br><span class="datum">Sobota 24. októbra&nbsp;2009</span><br clear="left">
Desired output:
Code:

2009 Sobota 24. októbra
I tried this:
Code:

cat mikus.html | sed -n 's/"datum">\([a-zA-Z][a-zA-Z]* [0-9][0-9]*\. [a-zA-Z][a-zA-Z]*\)&[a-zA-Z][a-zA-Z]*;\([0-9][0-9][0-9][0-9]\)/\2 \1/gp'

I also tried extended regexps:

cat mikus.html | sed -nr 's/"datum">([a-zA-Z][a-zA-Z]* [0-9][0-9]*\. [a-zA-Z][a-zA-Z]*)&[a-zA-Z][a-zA-Z]*;([0-9][0-9][0-9][0-9])/\2 \1/gp'

but both return the same output:
Code:

SITA<br>_________<br><span class="foto">FOTO: ilustračné foto SITA_AP<br><br><span class=2009 Sobota 24. októbra</span><br clear="left">

I tried the same regexp (without the capture groups, of course) in grep and it seems to work:
Code:

grep -o '"datum">[a-zA-Z][a-zA-Z]* [0-9][0-9]*\. [a-zA-Z&][A-Za-z&]*;[0-9][0-9][0-9][0-9]' mikus.html
"datum">Sobota 24. októbra&nbsp;2009

What am I doing wrong with sed? Thank you all.

ghostdog74 10-25-2009 07:07 PM

awk
Code:

# awk -vRS='</span>' '{gsub(/.*>|&nbsp;/,"")}1' file
Sobota 24. októbra2009
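
For comparison, here is a possible sed version that gives the desired order (year first). This is only a sketch under a few assumptions: the sed output in the question suggests the original pattern did match, and that the whole line was printed because s///p replaces only the matched part and leaves the rest of the line alone. Adding .* on both sides makes the match cover the entire line. The variable names line and result are made up for the example, and [^&]* is swapped in for [a-zA-Z]* so that letters like "ó" match in any locale.

```shell
# Sample line from the question, stored in a hypothetical variable.
line='SITA<br>_________<br><span class="foto">FOTO: ilustračné foto SITA_AP<br><br><span class="datum">Sobota 24. októbra&nbsp;2009</span><br clear="left">'

# The leading and trailing .* make the pattern span the whole line, so the
# replacement keeps only the two captured groups.
# \1 = day name, day number and month (everything up to the "&")
# \2 = the four-digit year after the literal "&nbsp;"
result=$(printf '%s\n' "$line" |
  sed -n 's/.*"datum">\([^&]*\)&nbsp;\([0-9][0-9][0-9][0-9]\).*/\2 \1/p')

printf '%s\n' "$result"
```

With the sample line this should print "2009 Sobota 24. októbra". To run it on the file from the question, the same sed command can be given mikus.html as an argument instead of reading from printf.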



All times are GMT -5.