download radio broadcasts in mp3 format

Posted 01-26-2019 at 07:11 AM by Michael Uplawski
Updated 03-09-2019 at 01:44 AM by Michael Uplawski (remark on general interest)

Edit: I just mention that the procedure used in the script below is generally applicable in any situation where you want to fetch a piece of the “Web” while avoiding a full Web browser, wherever the browser can be replaced by curl or wget.
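To make the general idea concrete (the URL and the selector here are only placeholders, not taken from any Radio France page): fetch the page with curl instead of a browser, then let a small parser pull out the one piece of information you actually want.
Code:
#!/bin/bash
# Sketch of the general pattern: the browser is replaced by curl and a parser.
# URL and selector are placeholders; adapt them to the page you want to scrape.
URL="https://www.example.org/some-page"
# download the raw HTML, no browser involved
html=$(curl -s "$URL")
# extract just the page title with the nokogiri command-line tool
echo "$html" | nokogiri -e 'puts $_.at_css("title/text()")'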

No, this is not spectacular.

When I try to listen to a specific broadcast on France Culture (www.franceculture.fr), because I missed it in the morning, I am confronted with a page that wants to open more than 25 connections to sites external to the Radio France servers. This is probably due to the choice of the Web developers there to use Google libraries (that is what “everybody” does).

Among those sites are (of course) doubleclick.net, ads.twitter.com and other stuff which has nothing to do with my radio broadcast.

Reading the source code of the page to find the URLs to download is cumbersome. But it has the advantage of remaining a rather reliable procedure, as the Radio France stations do not often make significant changes to their Web sites.

And as I am a (“learned”) computer scientist, anything that works reliably in the same way, every time, is something I do not keep doing by hand.

The script follows. Its messages are in French, but if you care for France Culture, this will not shock you. What may shock you are the calls to nokogiri and torify; I cannot know what you make of those. Maybe do not use this, or adapt it to your needs...

But nokogiri is really great.
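In case you have never met it: nokogiri here is the command-line tool that ships with the Ruby gem of the same name; the Ruby one-liner passed with -e is evaluated against the parsed HTML document, which is available as $_. A minimal sketch of the idea (the URL and the selector are invented for the example):
Code:
# Fetch a page without a browser and print the value of one attribute;
# $_ is the document that nokogiri parsed from standard input.
curl -s https://www.example.org/ | nokogiri -e 'puts $_.at_css("a.download/@href")'
Now the script itself: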
Code:
#!/bin/bash
# This script downloads radio-broadcasts in mp3-format from
# the sites of Radio-France.
# The only argument to the script is the URL to a player-page,
# i.e. the page for 1 broadcast, showing a play-button on top.
#
# ©2019-2019 Michael Uplawski <michael.uplawski@uplawski.eu>
# Use this script at your own risk, modify it as you please.
# But maybe leave the copyright-notice intact. Thank You.

SC=`basename "$0"`

if [ $# -ne 1 ]
then
  clear
  echo -e "ERREUR ! Il faut l'URL d'une page avec un audio-player"
  echo -e "Exemple :\n\t"$SC" https://www.franceculture.fr/emissions/la-fabrique-mediatique/defiance-envers-les-medias-quelles-solutions-22"
  exit 1
fi

# --------- SOME DEFINITIONS ----------
# The command to extract an mp3-file from a page
EXTR_CULT='puts $_.at_css("div.heading-zone-wrapper>div.heading-zone-player-button>button.replay-button/@data-asset-source")'

EXTR_INTER='puts $_.at_css("div.cover-emission-actions-buttons-wrapper>button.replay-button/@data-url")'

EXTR=""

if [[ $1 == *"franceinter"* ]]
then
  EXTR=$EXTR_INTER
elif [[ $1 == *"franceculture"* ]]
then
  EXTR=$EXTR_CULT
else
  echo -e "ERREUR ! Téléchargements sont possibles seulement des sites de"
  echo -e "France-Culture ou France-Inter !"
  exit 2
fi
# extract the URL of the mp3
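# (torify routes the request through the Tor network; it needs a running tor service)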
mp3=`torify curl -s "$1" | nokogiri -e "$EXTR"`
# extract the title of the broadcast
title=`torify curl -s "$1" | nokogiri -e 'puts $_.at_css("title/text()")'`
title=`echo "$title"|tr -s "[:space:][:punct:]" _` 

# Output-file
OFL="$title".mp3
echo "$OFL"

# --------> ACTION <---------
# download the mp3
torify wget -c "$mp3" --output-document="$OFL"
# <-------- END ACTION --------->
#EOF
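For completeness, a usage sketch, assuming you saved the script as dl_radiofrance.sh (the file name is my invention; the URL is the one from the help message above):
Code:
chmod +x dl_radiofrance.sh
./dl_radiofrance.sh "https://www.franceculture.fr/emissions/la-fabrique-mediatique/defiance-envers-les-medias-quelles-solutions-22"
# The broadcast ends up as <title_of_the_page>.mp3 in the current directory.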

Comments

  1. Nokogiri is a nice addition to web scraping tools like pup and webscraper.io.
    I don't know how to use any of the three, but I hope I get the opportunity to use
    them and play with them someday.

    I came to know pup when Microsoft took over GitHub and removed youtube-dl for
    copyright infringement (maybe for fear of a lawsuit from some of its many
    adversaries? but that's another topic). As a reaction, people started writing
    replacement software, and one of them was a simple bash script of under
    50 lines of code that would download media from youtube using "pup" and "jq", two
    tools I had never heard of before.

    Then, searching for web scraping tools, I also discovered webscraper.io, which takes
    on the task of identifying the paths for you, with a fair degree of precision, all
    with a visual tool like the developer bar of your favorite browser that lets you
    point and click on the elements of the page to inspect their HTML and CSS properties,
    along with their paths and other useful information.

    Thanks for the nice share!
    Posted 07-28-2022 at 07:41 AM by ychaouche
 

  


