LinuxQuestions.org
Old 07-17-2007, 03:38 AM   #1
bruno buys
Saving part of an HTML document


Hi all,
I am writing a few bash scripts that need to extract parts of HTML documents. Consider the excerpt below: I need to select the text between the two <!--TITULO--> markers, and also the part between the two <!--TEXTO--> markers. The first one is easy because it is on a single line, so grep does it. The second is the problem because it spans several lines.
The system is Debian etch, bash is 3.1dfsg-8. Any help would be appreciated, thanks!




</tr>
<tr>
<td class="tit18b">
<!--TITULO-->Foguete VSB-30 deve ser lançado hoje em Alcantara<!--TITULO-->
</td>
</tr>
<tr>
<td class="texto11" height="20">
<!--TEXTO-->
<P> <P>Agencia JB<P> <P><P> <P>MARANHAO - O Veiculo de Sondagem Booster (SBV-30), no Centro de Lancamento de Alcantara, no Maranhao, deve ser lancado nesta segunda-feira. As condicoes meteorologicas sao favoraveis para o lancamento do foguete que deve ocorrer as 10h30, de acordo com a assessoria de imprensa da Agencia Espacial Brasileira (AEB).<P>O Veiculo de Sondagem Booster (VSB-30) levara nove experimentos cientificos, a maioria de universidades brasileiras. O voo terá duracao total de 20 minutos e o foguete chegara a cerca de 280 quilometros do solo.<P>
<!--TEXTO-->
</td>
</tr>
<tr>
<td>
 
Old 07-17-2007, 03:57 AM   #2
ghostdog74
Code:
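awk '
    # Title: both markers sit on the same line, so strip them and print the rest.
    /<!--TITULO-->/ {
        gsub(/<!--TITULO-->/, "")
        print
    }
    # Inside the TEXTO block: stop at the closing marker, otherwise print the line.
    flag {
        if (/<!--TEXTO-->/) { flag = 0; next }
        print
    }
    # Opening TEXTO marker: start printing from the next line on.
    /<!--TEXTO-->/ { flag = 1; next }
' "file"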
 
Old 07-17-2007, 12:15 PM   #3
bruno buys
nice, it worked. I'll learn some awk...

thanks friend!
 
  

