LinuxQuestions.org
General This forum is for non-technical general discussion which can include both Linux and non-Linux topics. Have fun!

Old 09-20-2020, 05:20 PM   #1
jr_bob_dobbs
Member
 
Registered: Mar 2009
Distribution: Slackware,Linux From Scratch
Posts: 462
Blog Entries: 74

Rep: Reputation: 90
youtube video url series backtracking


Say you find a link to a YouTube video on a forum. It's a link to part five of I don't know how many parts. Until recently, I'd pop the URL into links (the only safe, non-popup, non-clutter way to look at modern web sites), find the URLs for part 1 and all the others, use youtube-dl on part one, and then decide whether I wanted to see the other parts.

Well, they (YouTube) recently changed something so that no longer works. More anti-text-mode programming on their part. Finally I had to give up and look at the page for part five in a full javoids-on browser. Wow, what a bunch of popups; it feels like a whack-a-mole game or something, but less fun. Right. Anyway, after all that work on my part ... no links for first or previous. Good show. Golf clap. Let's reduce the actual functionality of our web site. Good job.

/sarcasm

Anyway, does anyone know of any way of getting to part one of a series of videos, given only one link to part five?

Thank you.

p.s.
I know this should not be posted here because it is "technical", but where else here could I post it, since it is not OS-dependent and not a Linux question?
 
Old 09-21-2020, 01:50 PM   #2
teckk
Senior Member
 
Registered: Oct 2004
Distribution: FreeBSD Arch
Posts: 3,016

Rep: Reputation: 836
I turned to python to deal with utube years ago. You can search that using whatever criteria you wish. And I don't use their api either.

Here is an example that will search it, without javascripts running.
This simple one uses urllib.
Code:
#!/usr/bin/python

from urllib import request, parse
from json import loads

base_url = 'https://www.youtube.com'
log = 'utsearch.log'

#Colors
RED = '\033[31m'
BLUE = '\033[34m'
CYAN = '\033[36m'
GREEN = '\33[32m'
BLACK  = '\33[30m'
WHITE  = '\33[37m'
YELLOW = '\033[33m'
END = '\033[0;0m'
BOLD = '\033[1m'
ITALIC   = '\33[3m'
UNDERLINE = '\033[4m'


class YoutubeSearch:
    def __init__(self, search_terms: str, max_results=None):
        self.search_terms = search_terms
        self.max_results = max_results
        self.videos = self.search()

    def search(self):
        encoded_search = parse.quote(self.search_terms)

        #Search Utube by results
        urla = f'{base_url}/results?search_query={encoded_search}'
        #Search utube by date
        urlb = (f'{base_url}/results?search_query={encoded_search}'
                '&search_sort=video_date_uploaded')
        #Search utube by views
        urlc = (f'{base_url}/results?search_query={encoded_search}'
                '&search_sort=video_view_count')
        #Select search type
        url = urlb

        page = request.urlopen(request.Request(url))
        response = page.read().decode()    
        results = self.parse_html(response)
        
        if self.max_results is not None and len(results) > self.max_results:
            return results[: self.max_results]
        return results

    def parse_html(self, response):
        results = []
        start = (response.index('window["ytInitialData"]')
                + len('window["ytInitialData"]') + 3)
            
        end = response.index("};", start) + 1
        json_str = response[start:end]
        data = loads(json_str)

        videos = data["contents"]["twoColumnSearchResultsRenderer"][
        "primaryContents"]["sectionListRenderer"]["contents"][0][
        "itemSectionRenderer"]["contents"]

        #Get items from page, make a dictionary.
        for video in videos:
            res = {}
            if "videoRenderer" in video.keys():
                video_data = video.get("videoRenderer", {})
                res["Video Id:"] = video_data.get("videoId", None)
                res["Image:"] = [thumb.get(
                    "url", None) for thumb in video_data.get(
                    "thumbnail", {}).get("thumbnails", [{}]) ]
                res["Title:"] = video_data.get(
                    "title", {}).get("runs", [{}])[0].get(
                    "text", None)
                res["Description:"] = video_data.get(
                    "descriptionSnippet", {}).get("runs", [{}])[0].get(
                    "text", None)
                res["Channel:"] = video_data.get(
                    "longBylineText", {}).get("runs", [{}])[0].get(
                    "text", None)
                res["Duration:"] = video_data.get(
                    "lengthText", {}).get("simpleText", 0)
                res["Views:"] = video_data.get(
                    "viewCountText", {}).get("simpleText", 0) 
                res["Url:"] = video_data.get(
                    "navigationEndpoint", {}).get(
                    "commandMetadata", {}).get(
                    "webCommandMetadata", {}).get("url", None)
                results.append(res)
        return results

    def to_dict(self):
        return self.videos
 
if __name__ == '__main__':
    
    uts = input('Enter search terms for youtube :')
    #Set number of video results here.
    max_results = 100
    results = YoutubeSearch(uts, max_results).to_dict()
    
    #Print results to terminal and write to log
    with open(log, 'a') as f:
        print('%s\n'*5 % (CYAN, BOLD, 'Start Search', '='*70, END))
        f.write('%s\n'*3 % ('', 'Start Search', '='*70))
        
        for result in results:
            result['Image:'] = result['Image:'][0]
            #Make complete url 
            result['Url:'] = (base_url + result['Url:'])

            print(YELLOW, '-'*70, END)
            f.write('%s\n' % ('-'*70))
            
            for key, val in result.items():
                #Colorize output
                print(GREEN, BOLD, key, END, val)
                f.write("%s  %s\n" % (key,val))
I've got a dozen scripts that I have made over time that search it in different ways. Anyway... that's my answer. You can also import youtube-dl and use its functionality.
 
Old 10-01-2020, 04:45 PM   #3
jr_bob_dobbs
Member
 
Registered: Mar 2009
Distribution: Slackware,Linux From Scratch
Posts: 462

Original Poster
Blog Entries: 74

Rep: Reputation: 90
@teckk thank you, that was interesting. I've had to write custom things to get around stuff as well. The most programming in my case was for fetching cartoons, and that was in C with a little bit of bash.

I've already got youtube-dl, but I can't tell if it can ascertain next and/or previous videos from one video's url.
 
Old 10-02-2020, 08:49 AM   #4
teckk
Senior Member
 
Registered: Oct 2004
Distribution: FreeBSD Arch
Posts: 3,016

Rep: Reputation: 836
If it is from one uploader, then search for videos from that uploader. All of their videos will show. You are more likely to get all the episodes lined up in your search, or at least on the same page. You can search videos by date uploaded too. If it's from the same uploader, it's likely that they uploaded them all at once. Even if they did not, they may be on the same search page.

You can search youtube with youtube-dl. Read man youtube-dl when you have a free hour. That man page is getting larger.
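
One way to act on that for the part-five problem, sketched in Python: take the title and uploader of the video you do have, strip the part number, and turn the rest into a date-sorted `ytsearchdate` query. This is only a sketch under assumptions. The names `PART_RE`, `strip_part_marker`, `series_query` and `find_series` are made up for illustration, and the regular expression is a guess at how series titles are usually numbered; none of it is a youtube-dl feature. `find_series` needs network access and the `youtube_dl` module, and is untested against the current site.

```python
import re

# Guess at common part markers: "Part 5", "pt. 5", "Episode 5 of 7", "#5", "(5/7)".
# This pattern is an assumption about typical titles, not part of youtube-dl.
PART_RE = re.compile(
    r'[\[(]?\s*(?:'
    r'\b(?:part|pt|episode|ep)\.?\s*#?\d+(?:\s*(?:/|of)\s*\d+)?'
    r'|#\d+'
    r'|\b\d+\s*/\s*\d+\b'
    r')\s*[\])]?',
    re.IGNORECASE,
)


def strip_part_marker(title):
    """Return the title with any 'part N' style marker removed."""
    cleaned = PART_RE.sub(' ', title)
    # Collapse leftover whitespace, then trim dangling separators.
    cleaned = re.sub(r'\s+', ' ', cleaned)
    return cleaned.strip(' -:|,')


def series_query(title, uploader='', count=20):
    """Build a date-sorted youtube-dl search string for the rest of a series."""
    terms = ' '.join(filter(None, [uploader.strip(), strip_part_marker(title)]))
    return f'ytsearchdate{count}:{terms}'


def find_series(url, count=20):
    """Look up one video, then search for its siblings (network required)."""
    # Imported here so the helpers above work without youtube_dl installed.
    from youtube_dl import YoutubeDL
    with YoutubeDL({'quiet': True}) as ydl:
        info = ydl.extract_info(url, download=False)
        query = series_query(info.get('title') or '',
                             info.get('uploader') or '', count)
        found = ydl.extract_info(query, download=False)
    return [(e.get('title'), e.get('webpage_url'))
            for e in found.get('entries', [])]


# Example (needs network):
# for title, page in find_series('https://m.youtube.com/watch?v=kqtD5dpn9C8'):
#     print(title, page)
```

Sorting by upload date should put the parts close together in the results, though it will not help much if the uploader named the parts inconsistently.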
 
Old 10-02-2020, 02:50 PM   #5
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 15,616
Blog Entries: 9

Rep: Reputation: 4515
Thanks teckk for that python youtube searcher! It works...
It's the one thing I need to avoid opening the YT web site at all. Recently it started popping up some "Consent Required" thing, very annoying.
 
Old 10-02-2020, 04:37 PM   #6
teckk
Senior Member
 
Registered: Oct 2004
Distribution: FreeBSD Arch
Posts: 3,016

Rep: Reputation: 836
Welcome, I've made several that do different searches, using different modules.

I know that you like youtube-dl. You can import it into python and use it. It is python after all.

This is really basic.
Code:
>>> from youtube_dl import YoutubeDL
>>> yturl = 'https://m.youtube.com/watch?v=kqtD5dpn9C8'
>>> yt = YoutubeDL()
>>> ulist = []
>>> info = yt.extract_info(yturl, download=False)
>>> ulist.append(info['title'])
>>> print(ulist)
['Python Tutorial - Python for Beginners [2020]']
>>> ulist.append(info['description'])
>>> print(ulist)
['Python Tutorial - Python for Beginners [2020]', 'Python Tutorial - Python for Beginners (2020 EDITION) - Learn Python quickly & easily (in 1 hour)! \n🙏 Enjoyed this video? Please vote for me as the Top Programming Guru: https://bit.ly/2G7tf2s\n👍 Subscribe for more Python tutorials like this: https://goo.gl/6PYaGF\n🔥 Want to learn more? Watch my complete Python course: https://youtu.be/_uQrJ0TkZlc\n\n 
--<snip>--

>>> ulist = []
>>> for i in info['formats']:
...     ulist.append(i['format_id'])
>>> print(ulist)
['249', '250', '140', '251', '160', '133', '278', '242', '134', '135', '243', '136', '244', '247', '137', '248', '18', '22']
That works quite well actually. I can't think of anything that works with youtube better than youtube-dl.
 
Old 10-02-2020, 04:57 PM   #7
teckk
Senior Member
 
Registered: Oct 2004
Distribution: FreeBSD Arch
Posts: 3,016

Rep: Reputation: 836
This will spit out more info than you want: title, description, actual URLs of the videos, and formats.

Code:
from youtube_dl import YoutubeDL

url = 'https://m.youtube.com/watch?v=kqtD5dpn9C8'

#Get Utube video urls
def getUtube():
    ulist = []
    yt = YoutubeDL()
    info = yt.extract_info(url, download=False)
    ulist.append(info['title'])
    ulist.append('')
    ulist.append(info['description'])
    ulist.append('')
    for i in info['formats']:
        ulist.append(i['format_id'])
        ulist.append(i['url'])
        ulist.append('')
    #Join and print once, after all formats are collected
    utList  = '\n'.join(ulist)
    print(utList)
        
getUtube()
And they change it from time to time. You'll have to download a source page and see what they have changed.
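
When the page does change, the piece of the urllib search script that tends to break is the slice between the `window["ytInitialData"]` marker and the first `};`. A slightly more forgiving sketch is below: it tries a list of markers and lets `json.JSONDecoder.raw_decode` find where the object ends. The second marker (`var ytInitialData`) and the name `extract_initial_data` are assumptions, not something confirmed in this thread; check a freshly downloaded page source to see what the marker currently is.

```python
from json import JSONDecoder

# Markers to try, in order. The first is the one used by the urllib script
# earlier in this thread; the second is an assumed newer variant.
MARKERS = ('window["ytInitialData"]', 'var ytInitialData')


def extract_initial_data(html, markers=MARKERS):
    """Return the first JSON object found after a known marker, else None."""
    decoder = JSONDecoder()
    for marker in markers:
        pos = html.find(marker)
        if pos == -1:
            continue
        # Skip ahead to the opening brace of the JSON object.
        start = html.find('{', pos + len(marker))
        if start == -1:
            continue
        try:
            # raw_decode parses one JSON value and ignores whatever follows it.
            data, _end = decoder.raw_decode(html, start)
        except ValueError:
            continue
        return data
    return None


# A made-up page fragment; note the "};" inside a string value.
sample = '<script>window["ytInitialData"] = {"contents": {"a": "x};y"}};</script>'
print(extract_initial_data(sample))
```

Letting the JSON decoder find the end of the object avoids cutting the data short when `};` happens to appear inside a string, which a plain `index("};")` would trip over.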
 
Old 10-03-2020, 02:41 AM   #8
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 15,616
Blog Entries: 9

Rep: Reputation: 4515
Quote:
Originally Posted by teckk View Post
That works quite well actually. I can't think of anything that works with youtube better than youtube-dl.
But it's always based on the video URL isn't it? One cannot enter search phrases?
(I did have a look at the man page but found no option to do that)
 
Old 10-03-2020, 06:43 AM   #9
boughtonp
Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 711

Rep: Reputation: 446

I just remembered this: https://gitlab.com/uoou/ytp

Downside is that it requires an API key (and thus a Google account) for the searching, though maybe that bit could be swapped for a suitably modified version of teckk's code.


Last edited by boughtonp; 10-03-2020 at 06:49 AM.
 
Old 10-03-2020, 08:56 AM   #10
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 15,616
Blog Entries: 9

Rep: Reputation: 4515
@teckk: is there a license on your code?
I have modified it and would like to share it.

Quote:
Originally Posted by boughtonp View Post
Downside is that it requires an API key
And that's the downside with so many of these tools.
 
Old 10-03-2020, 03:58 PM   #11
teckk
Senior Member
 
Registered: Oct 2004
Distribution: FreeBSD Arch
Posts: 3,016

Rep: Reputation: 836
Quote:
@teckk: is there a license on your code?
No, just parts of scripts that I made for myself to do something. I do look at other scripts I see posted online for ideas. There are lots of python snippets online. Such as stackoverflow. There are also python examples on youtube. Everyone else labors and gives their stuff away. So I do too.

The Arch AUR and Arch repos have lots of Python, and then I use what I have installed. I think that urllib is a little handier than requests. I use pyqt5 and qtwebengine instead of selenium and firefox. Anyway, no, I just make a script every now and then for something I need.

I don't know if a youtube page will even load in a browser now unless you have scripts turned on. Dillo, w3m, palemoon with scripts off won't display them.

There are Python scripts that use Google's API to search YouTube. That kind of defeats the point though.
 
Old 10-03-2020, 04:12 PM   #12
boughtonp
Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 711

Rep: Reputation: 446
Quote:
Originally Posted by teckk View Post
Quote:
is there a license on your code?
No [...] Everyone else labors and gives their stuff away. So I do too.
If you don't explicitly assign a license, you're not giving it away.

https://choosealicense.com/no-permission/


Last edited by boughtonp; 10-03-2020 at 04:15 PM.
 
Old 10-03-2020, 04:17 PM   #13
teckk
Senior Member
 
Registered: Oct 2004
Distribution: FreeBSD Arch
Posts: 3,016

Rep: Reputation: 836
Quote:
But it's always based on the video URL isn't it? One cannot enter search phrases?
You can search with youtube-dl. First 10 hits for python
Code:
youtube-dl -g ytsearch10:python
You can also search by date
Code:
ytsearchdate:keyword, ytsearchdate10:keyword, ytsearchdateall:keyword

youtube-dl -g ytsearchdate3:pyqt5
Multi search term.
Code:
youtube-dl -g "ytsearch3:python scrape with urllib"
Look at man youtube-dl to spit out what info you are wanting

You can control that better by importing youtube_dl
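
For example, something along these lines should run the same kind of `ytsearch` query from Python and hand back titles and page URLs to work with. It is a sketch and has not been tested against the current site. `build_query` and `search` are names made up here; `YoutubeDL`, `extract_info` and the `ytsearch`/`ytsearchdate` prefixes are the ones already used in this thread.

```python
def build_query(terms, count=10, by_date=False):
    """Return a youtube-dl search string such as 'ytsearch10:python'."""
    if count < 1:
        raise ValueError('count must be at least 1')
    prefix = 'ytsearchdate' if by_date else 'ytsearch'
    return f'{prefix}{count}:{terms}'


def search(terms, count=10, by_date=False):
    """Run the search and return (title, page URL) pairs. Needs network."""
    # Imported here so build_query works without youtube_dl installed.
    from youtube_dl import YoutubeDL
    with YoutubeDL({'quiet': True}) as ydl:
        info = ydl.extract_info(build_query(terms, count, by_date),
                                download=False)
    # A search result is a playlist-like dict; the hits are under 'entries'.
    return [(e.get('title'), e.get('webpage_url'))
            for e in info.get('entries', [])]


# Example (needs network):
# for title, page in search('python scrape with urllib', count=3):
#     print(title, page)
```

Because each hit is a full info dictionary, the same loop can pull out `description`, `formats` and so on instead of just the title and URL.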
 
Old 10-03-2020, 04:18 PM   #14
teckk
Senior Member
 
Registered: Oct 2004
Distribution: FreeBSD Arch
Posts: 3,016

Rep: Reputation: 836
Quote:
If you don't explicitly assign a license, you're not giving it away.
Oh ok, I'll have to read that. Never really noticed that.
 
Old 10-04-2020, 01:27 AM   #15
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 15,616
Blog Entries: 9

Rep: Reputation: 4515
Quote:
Originally Posted by teckk View Post
No, just parts of scripts that I made for myself to do something. I do look at other scripts I see posted online for ideas. There are lots of python snippets online. Such as stackoverflow. There are also python examples on youtube. Everyone else labors and gives their stuff away. So I do too.
OK, I'll take that as permission to redistribute under some sort of FOSS license.
Let me know what you decide on, I'll put a note in the code. For now I slapped a GPL3 on it.
This sort of stuff might seem minuscule and pointless, but I prefer to stay on top of it.

Thanks, anyhow.
All this finally got me started on python!

I didn't change the parsing mechanism, I concentrated on usability. I changed the output formatting, and it takes search terms from the clipboard & copies a chosen URL to the clipboard. That way I can immediately launch the video with another script.
The input mechanism uses readline, that's particularly cool I think: copy-pasting, line editing etc.
Here it is.

Quote:
Originally Posted by teckk View Post
You can search with youtube-dl. First 10 hits for python
Code:
youtube-dl -g ytsearch10:python
So you can!
Quote:
Look at man youtube-dl to spit out what info you are wanting
Youtube-related "search" or "ytsearch" is not mentioned in the man page; it's one of the extractors, actually:
Code:
$> youtube-dl --list-extractors|grep -i search
CiscoLiveSearch
mailru:music:search
screen.yahoo:search
soundcloud:search
video.google:search
youtube:search
youtube:search:date
youtube:search_url
Quote:
You can control that better by importing youtube_dl
You mean in python? You bet I'll be playing with this!
 
  

