I've written a mini scraper that extracts cheat codes for my PSP. After stripping all of the tags with sed, the fragments below still show up in the output. How can I remove this unwanted output?
Unwanted Output:
Code:
<!--
google_ad_client = "ca-pub-4347670546564685";
/* 336d */
google_ad_slot = "7700751062";
google_ad_width = 336;
google_ad_height = 280;
//-->
<script type="text/javascript"
src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
Scraper:
Code:
#!/bin/bash
wget -q -O - 'goo.gl/vfYA94' | \
sed -En '/<strong>([1-9]|[12][0-9]|3[01])/,/<\/blockquote>/p' | \
sed -e 's/<[^>]*>//g'
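A likely cause, judging only from the leftover text: sed processes one line at a time, so `s/<[^>]*>//g` can only match a tag whose `<` and `>` are on the same line. The `<!-- ... //-->` comment and the `<script ... src="...">` tag both span several lines, so they are never matched. One possible fix is to read the whole input into sed's pattern space before stripping, which lets `[^>]*` run across newlines. The sketch below assumes the page is small enough to hold in memory and that no `>` appears inside a comment before its closing `-->` (true for the fragment shown above). `strip_tags` is a made-up helper name, not part of the original scraper.

```shell
#!/bin/bash
# Sketch only: strip_tags is a hypothetical helper, not part of the original scraper.
# It joins every input line into sed's pattern space and then removes anything
# that looks like <...>, including tags and comments that span several lines.
strip_tags() {
    # :a / $!{N;ba} appends lines until the last one has been read;
    # only then does the substitution run, on the whole text at once.
    sed -e ':a' -e '$!{N;ba' -e '}' \
        -e 's/<[^>]*>//g'
}

# Possible use with the original pipeline (same URL and range as above):
# wget -q -O - 'goo.gl/vfYA94' |
#     sed -En '/<strong>([1-9]|[12][0-9]|3[01])/,/<\/blockquote>/p' |
#     strip_tags
```

Script text that sits between `<script>` and `</script>` without being wrapped in an HTML comment would still get through; if that shows up, deleting the script blocks before stripping the tags would be the next thing to try.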