LinuxQuestions.org
06-03-2008, 02:56 PM   #1
lourencojunior
LQ Newbie
 
Registered: May 2008
Posts: 8

Rep: Reputation: 0
Download audio from answers.com


Hello everyone!

I would like to download audio files from http://answers.com/ automatically.

Website usage:
Code:
  http://answers.com/choir
  http://answers.com/possible
  http://answers.com/border
I am using the following commands:
Code:
wget http://answers.com/choir -O choir.html # save a local copy
grep wav choir.html # extract the line that contains the wav file link
At this point, I get the following output (a single line):
Code:
<h1>choir</h1>&nbsp;&nbsp;(<span style="color:blue;" class="pointer" onclick="pw = window.open('http://content.answers.com/main/content/pronkey-answers.html', 'PronunciationKey', 'height=650,width=520,resizable,scrollbars');if(pw){pw.focus();}" onmouseout="status='';return true;" onmouseover="status='Click for pronunciation key';return true;"><span class="pron">kwīr</span></span>) <span style="cursor:pointer" onmouseover="status='Click to hear pronunciation';return true;" onmouseout="status='';return true;" onclick="playIt('http://content.answers.com/main/content/ahd4/pron/C0317300.wav')"><img border="0" align="middle" src="http://content.answers.com/main/content/img/pron.gif" alt="pronunciation" /></span><br />
My question:
- How can I extract `http://content.answers.com/main/cont...n/C0317300.wav' from that line with a regular expression, so that I can then download the wav file with wget?

- Is there another way to do this?

In the end, I would like to get a pronunciation file with a command like:
Code:
pron choir

What I have coded so far:
Code:
#! /bin/bash

wget "http://answers.com/$1" -O "$1.html" # save a local copy
grep wav "$1.html" # extract the line that contains the wav file link

## do something here to store the audio URL in WAV

wget "$WAV" -O "$1.wav"
rm -f "$1.html"
Any help is welcome!

Thanks in advance.
Regards,
Lourenco.

Last edited by lourencojunior; 06-03-2008 at 03:22 PM.
 
06-04-2008, 08:12 AM   #2
Agrouf
Senior Member
 
Registered: Sep 2005
Location: France
Distribution: LFS
Posts: 1,591

Rep: Reputation: 79
sed "s/.*(\(.*\)).*/\1/g"
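
For anyone reading along, here is a minimal sketch of what that expression does. In sed's basic regular expressions a bare `(` is a literal parenthesis while `\(...\)` captures, and both `.*` are greedy, so the capture holds whatever sits between the last `(` and the last `)` on the line. The sample line is a shortened, made-up stand-in for the HTML in the first post, and the `extract_url` function name is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: keep the text between the last "(" and the last ")"
# of each input line, then strip the single quotes around it.
extract_url() {
	sed "s/.*(\(.*\)).*/\1/g" | sed "s/'//g"
}

# Shortened stand-in for the line that "grep wav" returns.
LINE="<span onclick=\"playIt('http://content.answers.com/main/content/ahd4/pron/C0317300.wav')\"><img alt=\"pronunciation\" /></span>"

URL=$(printf '%s\n' "$LINE" | extract_url)
echo "$URL"
```

This relies on the wav link being the last parenthesized item on the line, which holds for the sample line quoted in the first post.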
 
06-04-2008, 01:27 PM   #3
lourencojunior
LQ Newbie
 
Registered: May 2008
Posts: 8

Original Poster
Rep: Reputation: 0

Thank you Agrouf!

That regular expression solves my problem. Here is the solution.

To get a local copy of the wav file:
Code:
#!/bin/bash

# TODO:
# - Check whether $1.wav already exists. If so, play it; otherwise download it with wget.
# - Improve the first sed regular expression so that it also removes the ' characters.

# getting a local copy of the page
wget "http://answers.com/$1" -O "$1.html"
# extracting the wav audio URL
SOUND=$(grep wav "$1.html" | sed "s/.*(\(.*\)).*/\1/g" | sed "s/'//g")
# saving a local copy
wget "$SOUND" -O "$1.wav"
# removing the html
rm -f "$1.html"
# playing with a player
mplayer "$1.wav"

I've also found another method (many thanks to a friend, Marcelo) that does not need a local copy of the wav file:
Code:
#!/bin/bash

# getting a local copy of the page
wget "http://answers.com/$1" -O "$1.html"
# preparing mplayer input
SOUND=$(grep -o "playIt.*http.*wav" "$1.html" | sed "s/playIt('//")
# removing the html
rm -f "$1.html"
# playing with a player
mplayer "$SOUND"
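
A small sketch of how the `grep -o` step in this method behaves, run against a made-up two-line stand-in for the downloaded page instead of a real download. `-o` prints only the part of the line that matches the pattern, and the sed call then removes the leading `playIt('`. The temporary file and its contents are assumptions for illustration.

```shell
#!/bin/bash

# Made-up, shortened stand-in for the downloaded page.
PAGE=$(mktemp)
cat > "$PAGE" <<'EOF'
<h1>choir</h1>
<span onclick="playIt('http://content.answers.com/main/content/ahd4/pron/C0317300.wav')"><img alt="pronunciation" /></span>
EOF

# -o prints only the matched text; sed then drops the leading "playIt('".
SOUND=$(grep -o "playIt.*http.*wav" "$PAGE" | sed "s/playIt('//")
rm -f "$PAGE"

echo "$SOUND"
```

Because the pattern ends in `wav`, the match stops at the last "wav" on the line, so the closing quote and parenthesis never reach the result.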

Best regards,
Lourenco

Last edited by lourencojunior; 06-04-2008 at 01:28 PM. Reason: mispelling
 
06-05-2008, 11:17 AM   #4
lourencojunior
LQ Newbie
 
Registered: May 2008
Posts: 8

Original Poster
Rep: Reputation: 0

Improved version:

Code:
#!/bin/bash

# Checking input params
if (( $# != 1 )); then
	NAME=$(basename "$0")
	echo "Usage:"
	echo "  $NAME <word>"
	echo "Where:"
	echo "  <word> = an English word to get the pronunciation audio file for"
	echo "Return values:"
	echo "  0: Success"
	echo "  1: Word not found"
	echo "  2: Invalid params"
	exit 2
fi

# Where cache files will be stored
CACHE_DIR=~ljr/tmp/answers.com/
# File name
FILENAME="$CACHE_DIR$1.wav"

# If there's no cache, download it
if ! test -f "$FILENAME"; then

	# Getting audio file URL
	SOUND=$(lynx -source "http://www.answers.com/$1" | sed -n "/.*playIt.*('\(.*\.wav\).*/{s//\1/p;q;}")

	# Checking if it was found
	if test -z "$SOUND"; then
		echo "$1: word not found"
		exit 1
	fi

	# Downloading for caching
	wget "$SOUND" -O "$FILENAME"

fi

# Speak to me, Dorothy ;-)
play "$FILENAME"

# Signaling success
exit 0


# END OF FILE
Best regards,
Lourenco.
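
A note on the sed call in the improved script, as a minimal sketch rather than a definitive reading: with `-n` nothing is printed by default, the address selects the first line containing a `playIt('....wav` link, the empty pattern in `s//\1/` reuses that same address expression, `p` prints the captured URL, and `q` stops so later links are ignored. The `sample_page` function and its second link are made up for illustration and stand in for the `lynx -source` output.

```shell
#!/bin/bash

# Hypothetical stand-in for "lynx -source": a page with two pronunciation links.
sample_page() {
	cat <<'EOF'
<h1>choir</h1>
<span onclick="playIt('http://content.answers.com/main/content/ahd4/pron/C0317300.wav')"></span>
<span onclick="playIt('http://example.invalid/second.wav')"></span>
EOF
}

# Same sed expression as in the script: print only the first wav URL, then quit.
SOUND=$(sample_page | sed -n "/.*playIt.*('\(.*\.wav\).*/{s//\1/p;q;}")

echo "$SOUND"
```

If no line matches, sed prints nothing and `SOUND` stays empty, which is the case the script's `test -z "$SOUND"` check handles.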
 
  




