Saving part of an HTML document
Hi all,
I am writing a few bash scripts that, at a certain point, need to extract parts of HTML documents. Consider the excerpt below: I need to select the text beginning at <!--TITULO--> and ending at <!--TITULO-->, and also the part beginning at <!--TEXTO--> and ending at <!--TEXTO-->. The first one is easy because it's on one line, so grep does it. The second is the problem because it spans a bunch of lines. The system is Debian etch, bash is 3.1dfsg-8. Any help would be appreciated, thanks!

</tr>
<tr>
<td class="tit18b">
<!--TITULO-->Foguete VSB-30 deve ser lançado hoje em Alcantara<!--TITULO-->
</td>
</tr>
<tr>
<td class="texto11" height="20">
<!--TEXTO-->
<P>
<P>Agencia JB<P>
<P><P>
<P>MARANHAO - O Veiculo de Sondagem Booster (SBV-30), no Centro de Lancamento de Alcantara, no Maranhao, deve ser lancado nesta segunda-feira. As condicoes meteorologicas sao favoraveis para o lancamento do foguete que deve ocorrer as 10h30, de acordo com a assessoria de imprensa da Agencia Espacial Brasileira (AEB).<P>O Veiculo de Sondagem Booster (VSB-30) levara nove experimentos cientificos, a maioria de universidades brasileiras. O voo terá duracao total de 20 minutos e o foguete chegara a cerca de 280 quilometros do solo.<P>
<!--TEXTO-->
</td>
</tr>
<tr>
<td>
Code:
awk '/<!--TITULO-->/,/<!--TITULO-->/{
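The one-liner above is cut off in this copy of the thread, so here is a minimal sketch of one way to do the extraction. It is not necessarily what the original reply did. One thing to keep in mind: in awk, a range pattern whose start and end both match the same record covers only that record, so a plain range with identical start and end markers does not span the multi-line <!--TEXTO--> block. The sketch toggles a flag on each marker instead. The function name extract_between is made up for illustration, and it assumes the marker is a literal string with no backslashes in it.

```shell
#!/bin/bash
# extract_between MARKER FILE   (hypothetical helper, not from the thread)
# Prints the text found between successive occurrences of the literal
# string MARKER, whether both occurrences share one line or sit on
# different lines. Whitespace-only fragments are skipped.
extract_between() {
    awk -v m="$1" '
    # Print a fragment only if it contains something besides blanks.
    function emit(s) { if (s ~ /[^ \t]/) print s }
    {
        line = $0
        # Walk through every marker on this line, toggling the flag.
        while ((i = index(line, m)) > 0) {
            if (inside) emit(substr(line, 1, i - 1))  # text before a closing marker
            inside = !inside
            line = substr(line, i + length(m))        # continue after the marker
        }
        # Whatever is left on the line belongs to an open block.
        if (inside) emit(line)
    }' "$2"
}
```

Assuming the page was saved as page.html (a placeholder name), this would be called as extract_between '<!--TITULO-->' page.html for the title and extract_between '<!--TEXTO-->' page.html for the body. index() is used rather than a regex so the marker is matched literally.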
Nice, it worked. I'll learn some awk...
Thanks, friend!