LinuxQuestions.org - youtube video url series backtracking

Say you find a link to a youtube video on a forum. It's a link to part five of I don't know how many parts. Until recently, I'd pop the URL into links (the only safe non-popup non-clutter way to look at modern/current web sites) and find the URL to part 1 and all the others. Then I'd use youtube-dl on part one, and assess and decide if I wanted to see the other parts.

Well, they (youtube) recently changed something so that doesn't work. More anti-text-mode programming on their part. Finally I had to give up and look at the page for part five in a full javoids-on browser. Wow, what a bunch of popups; it feels like a whack-a-mole game or something, but less fun. Right. Anyway, so after all that work on my part ... no links for first or previous. Good show. Golf clap. Let's reduce the actual functionality of our web site. Good job.

/sarcasm

Anyway, does anyone know of any way of getting to part one of a series of videos, given only one link to part five?

Thank you.

p.s.
I know this should not be posted here because it is "technical", but where else here could I post it, since it is not OS-dependent and not a Linux question?

I turned to python to deal with utube years ago. You can search it using whatever criteria you wish. And I don't use their api either.

Here is an example that will search it, without javascripts running.
This simple one uses urllib.

Code:

#!/usr/bin/env python3
#Needs Python 3.6+ (f-strings)

from urllib import request, error, parse
from json import loads, dumps

base_url = 'https://www.youtube.com'
log = 'utsearch.log'

#Colors
RED = '\033[31m'
BLUE = '\033[34m'
CYAN = '\033[36m'
GREEN = '\33[32m'
BLACK  = '\33[30m'
WHITE  = '\33[37m'
YELLOW = '\033[33m'
END = '\033[0;0m'
BOLD = '\033[1m'
ITALIC  = '\33[3m'
UNDERLINE = '\033[4m'


class YoutubeSearch:
    def __init__(self, search_terms: str, max_results=None):
        self.search_terms = search_terms
        self.max_results = max_results
        self.videos = self.search()

    def search(self):
        encoded_search = parse.quote(self.search_terms)

        #Search Utube by results
        urla = f'{base_url}/results?search_query={encoded_search}'
        #Search utube by date
        urlb = (f'{base_url}/results?search_query={encoded_search}'
                '&search_sort=video_date_uploaded')
        #Search utube by views
        urlc = (f'{base_url}/results?search_query={encoded_search}'
                '&search_sort=video_view_count')
        #Select search type
        url = urlb

        page = request.urlopen(request.Request(url))
        response = page.read().decode()
        results = self.parse_html(response)

        if self.max_results is not None and len(results) > self.max_results:
            return results[: self.max_results]
        return results

    def parse_html(self, response):
        results = []
        #The json starts 3 characters (' = ') after the marker
        start = (response.index('window["ytInitialData"]')
                + len('window["ytInitialData"]') + 3)
        #...and ends at the first "};" after that
        end = response.index("};", start) + 1
        json_str = response[start:end]
        data = loads(json_str)

        videos = data["contents"]["twoColumnSearchResultsRenderer"][
            "primaryContents"]["sectionListRenderer"]["contents"][0][
            "itemSectionRenderer"]["contents"]

        #Get items from page, make a dictionary.
        #The fallback for "runs" is [{}] so that [0].get() still works
        #when the key is missing.
        for video in videos:
            res = {}
            if "videoRenderer" in video.keys():
                video_data = video.get("videoRenderer", {})
                res["Video Id:"] = video_data.get("videoId", None)
                res["Image:"] = [thumb.get(
                    "url", None) for thumb in video_data.get(
                    "thumbnail", {}).get("thumbnails", [{}]) ]
                res["Title:"] = video_data.get(
                    "title", {}).get("runs", [{}])[0].get(
                    "text", None)
                res["Description:"] = video_data.get(
                    "descriptionSnippet", {}).get("runs", [{}])[0].get(
                    "text", None)
                res["Channel:"] = video_data.get(
                    "longBylineText", {}).get("runs", [{}])[0].get(
                    "text", None)
                res["Duration:"] = video_data.get(
                    "lengthText", {}).get("simpleText", 0)
                res["Views:"] = video_data.get(
                    "viewCountText", {}).get("simpleText", 0)
                res["Url:"] = video_data.get(
                    "navigationEndpoint", {}).get(
                    "commandMetadata", {}).get(
                    "webCommandMetadata", {}).get("url", None)
                results.append(res)
        return results

    def to_dict(self):
        return self.videos


if __name__ == '__main__':

    uts = input('Enter search terms for youtube :')
    #Set number of video results here.
    max_results = 100
    results = YoutubeSearch(uts, max_results).to_dict()

    #Print results to terminal and write to log
    with open(log, 'a') as f:
        print('%s\n'*5 % (CYAN, BOLD, 'Start Search', '='*70, END))
        f.write('%s\n'*3 % ('', 'Start Search', '='*70))

        for result in results:
            #Keep the first thumbnail only (None if there are none)
            result['Image:'] = result['Image:'][0] if result['Image:'] else None
            #Make complete url (a missing url leaves just the base url)
            result['Url:'] = base_url + (result['Url:'] or '')

            print(YELLOW, '-'*70, END)
            f.write('%s\n' % ('-'*70))

            for key, val in result.items():
                #Colorize output
                print(GREEN, BOLD, key, END, val)
                f.write("%s  %s\n" % (key,val))

I've got a dozen scripts that I have made over time that search it in different ways. Anyway... that's my answer. You can also import youtube-dl and use its functionality.
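
The fragile part of the script above is the slice between the marker and the first "};". Here is a minimal sketch of a more forgiving extraction, assuming the page still embeds the data as a JSON object somewhere after a marker containing ytInitialData. The helper name extract_initial_data, the list of markers and the sample page fragment are all made up for illustration; none of it is documented by YouTube.

```python
import json

# Candidate markers are assumptions; check a saved page source to see
# which one (if any) the site currently uses.
MARKERS = ('window["ytInitialData"]', 'var ytInitialData', 'ytInitialData')


def extract_initial_data(html, markers=MARKERS):
    """Return the first JSON object found after any of the markers."""
    decoder = json.JSONDecoder()
    for marker in markers:
        pos = html.find(marker)
        if pos == -1:
            continue
        brace = html.find('{', pos + len(marker))
        if brace == -1:
            continue
        try:
            # raw_decode stops at the end of the first complete JSON value,
            # so a "};" inside a string cannot cut the data short.
            data, _end = decoder.raw_decode(html, brace)
        except json.JSONDecodeError:
            continue
        return data
    raise ValueError('no ytInitialData-style JSON found in page')


# Hypothetical page fragment, only for demonstration.
sample = ('<script>window["ytInitialData"] = '
          '{"contents": {"note": "a };b"}};</script>')
print(extract_initial_data(sample)['contents']['note'])
```

In the script above, the returned dictionary would take the place of data in parse_html. The nested key path that follows may still need adjusting whenever the page layout changes.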

@teckk thank you, that was interesting. I've had to write custom things to get around stuff as well. The most programming in my case was for fetching cartoons, and that was in C with a little bit of bash.

I've already got youtube-dl, but I can't tell if it can ascertain next and/or previous videos from one video's url.

If it is from one uploader, then search for videos from that uploader. All of their videos will show. More likely than not you will get all 3 episodes lined up in your search, or at least on the same page. You can search videos by date uploaded too. If it's from the same uploader it's likely that they uploaded them all at once. Even if they did not, they may be on the same search page.
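
One way to act on that, as a rough sketch: once you have a list of titles from the uploader (from a search script or from youtube-dl), pull out the part numbers and sort on them. The regex and the names part_number and order_series are guesses at how such titles tend to look, and the sample titles are invented. Real titles vary a lot, so treat it as a starting point.

```python
import re

# Assumed title patterns: "part 5", "Pt. 5", "episode 5", "ep 5", "#5", "5/12".
PART_RE = re.compile(
    r'(?:\b(?:part|pt|episode|ep)\b\.?\s*#?|#)(\d+)|\b(\d+)\s*/\s*\d+\b',
    re.IGNORECASE,
)


def part_number(title):
    """Return the part number found in a title, or None."""
    match = PART_RE.search(title)
    if not match:
        return None
    return int(match.group(1) or match.group(2))


def order_series(titles):
    """Sort titles that carry a part number; drop the rest."""
    numbered = [(part_number(t), t) for t in titles]
    return [t for n, t in sorted(p for p in numbered if p[0] is not None)]


# Made-up titles, only for demonstration.
titles = [
    'Foo documentary part 5',
    'Foo documentary Part 1',
    'Unrelated clip',
    'Foo documentary pt. 3',
    'Foo documentary (2/5)',
]
for title in order_series(titles):
    print(title)
```

The first title printed is the best candidate for part one. Anything without a recognisable part number is silently dropped, so it is worth eyeballing the full list as well.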

You can search youtube with youtube-dl. Read man youtube-dl when you have a free hour. That man page is getting larger.

Thanks teckk for that python youtube searcher! It works...
It's the one thing I need to avoid opening the YT web site at all. Recently it started popping up some "Consent Required" thing, very annoying.

Welcome, I've made several that do different searches, using different modules.

I know that you like youtube-dl. You can import it into python and use it. It is python after all.

This is really basic.

Code:

>>> from youtube_dl import YoutubeDL
>>> yturl = 'https://m.youtube.com/watch?v=kqtD5dpn9C8'
>>> yt = YoutubeDL()
>>> ulist = []
>>> info = yt.extract_info(yturl, download=False)
>>> ulist.append(info['title'])
>>> print(ulist)
['Python Tutorial - Python for Beginners [2020]']
>>> ulist.append(info['description'])
>>> print(ulist)

['Python Tutorial - Python for Beginners [2020]', 'Python Tutorial - Python for Beginners (2020 EDITION) - Learn Python quickly & easily (in 1 hour)! \n🙏 Enjoyed this video? Please vote for me as the Top Programming Guru: https://bit.ly/2G7tf2s\n👍 Subscribe for more Python tutorials like this: https://goo.gl/6PYaGF\n🔥 Want to learn more? Watch my complete Python course: https://youtu.be/_uQrJ0TkZlc\n\n 

--<snip>--



>>> ulist = []
>>> for i in info['formats']:
...     ulist.append(i['format_id'])
...
>>> print(ulist)
['249', '250', '140', '251', '160', '133', '278', '242', '134', '135', '243', '136', '244', '247', '137', '248', '18', '22']

That works quite well actually. I can't think of anything that works with youtube better than youtube-dl.
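
Those format IDs are easier to choose from if you look at the other fields youtube-dl puts in each entry of info['formats']. Here is a small sketch, assuming the usual keys ('format_id', 'height', 'vcodec', 'acodec') are present. The function name best_muxed is invented for the example, and the sample list is illustrative rather than real output.

```python
def best_muxed(formats, max_height=720):
    """Pick the tallest format that has both audio and video in one file.

    youtube-dl marks a missing stream with the string 'none'; a missing
    key is treated the same way here, which is an assumption.
    """
    candidates = [
        f for f in formats
        if f.get('vcodec', 'none') != 'none'
        and f.get('acodec', 'none') != 'none'
        and (f.get('height') or 0) <= max_height
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.get('height') or 0)


# Illustrative entries shaped like info['formats']; values are made up.
sample_formats = [
    {'format_id': '140', 'vcodec': 'none', 'acodec': 'mp4a.40.2', 'height': None},
    {'format_id': '137', 'vcodec': 'avc1.640028', 'acodec': 'none', 'height': 1080},
    {'format_id': '18', 'vcodec': 'avc1.42001E', 'acodec': 'mp4a.40.2', 'height': 360},
    {'format_id': '22', 'vcodec': 'avc1.64001F', 'acodec': 'mp4a.40.2', 'height': 720},
]
print(best_muxed(sample_formats)['format_id'])
```

With real data you would pass info['formats'] instead of sample_formats and read the 'url' key of whatever comes back.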

This will spit you out more info than you want. Title, description, actual url of videos, formats.

Code:

from youtube_dl import YoutubeDL

url = 'https://m.youtube.com/watch?v=kqtD5dpn9C8'

#Get Utube video urls
def getUtube():
    ulist = []
    yt = YoutubeDL()
    info = yt.extract_info(url, download=False)
    ulist.append(info['title'])
    ulist.append('')
    ulist.append(info['description'])
    ulist.append('')
    for i in info['formats']:
        ulist.append(i['format_id'])
        ulist.append(i['url'])
        ulist.append('')
    #Join and print once, after the loop has collected every format
    utList = '\n'.join(ulist)
    print(utList)

getUtube()

And they change it from time to time. You'll have to download a source page and see what they have changed.

Quote:

Originally Posted by teckk (Post 6172074)

That works quite well actually. I can't think of anything that works with youtube better than youtube-dl.

But it's always based on the video URL isn't it? One cannot enter search phrases?
(I did have a look at the man page but found no option to do that.)

I just remembered this: https://gitlab.com/uoou/ytp

Downside is that it requires an API key (and thus a Google account) for the searching, though maybe that bit could be swapped for a suitably modified version of teckk's code.

@teckk: is there a license on your code?
I have modified it and would like to share it.

Quote:

Originally Posted by boughtonp (Post 6172211)

Downside is that it requires an API key

And that's the downside with so many of these tools.

Quote:

@teckk: is there a license on your code?

No, just parts of scripts that I made for myself to do something. I do look at other scripts I see posted online for ideas. There are lots of python snippets online, such as on stackoverflow. There are also python examples on youtube. Everyone else labors and gives their stuff away. So I do too.

The arch AUR and arch repo have lots of python. And then I use what I have installed. I think that urllib is a little more handy than requests. I use pyqt5 and qtwebengine instead of selenium and firefox. Anyway no, I make a script every now and then for something needed.

I don't know if a youtube page will even load in a browser now unless you have scripts turned on. Dillo, w3m, palemoon with scripts off won't display them.

There are python scripts that use Google's API to search youtube. That kind of defeats the point though.

Quote:

Originally Posted by teckk (Post 6172322)

Quote:

is there a license on your code?

No [...] Everyone else labors and gives their stuff away. So I do too.

If you don't explicitly assign a license, you're not giving it away.

https://choosealicense.com/no-permission/

Quote:

But it's always based on the video URL isn't it? One cannot enter search phrases?

You can search with youtube-dl. First 10 hits for python

Code:

youtube-dl -g ytsearch10:python

You can also search by date

Code:

ytsearchdate:keyword, ytsearchdate10:keyword, ytsearchdateall:keyword

youtube-dl -g ytsearchdate3:pyqt5

Multi search term.

Code:

youtube-dl -g "ytsearch3:python scrape with urllib"

Look at man youtube-dl to spit out what info you are wanting

You can control that better by importing youtube_dl
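
As a sketch of what that might look like: the same ytsearch prefixes can be passed to extract_info in place of a video URL. The helper names search_query and search are made up here. The 'quiet' option and the 'entries', 'title' and 'webpage_url' keys are what youtube-dl uses as far as I know, but check them against your installed version. Note that this resolves every hit in full, so it is slow for large counts.

```python
def search_query(terms, count=10, by_date=False):
    """Build a youtube-dl search pseudo-URL such as 'ytsearch10:python'."""
    prefix = 'ytsearchdate' if by_date else 'ytsearch'
    return f'{prefix}{count}:{terms}'


def search(terms, count=10, by_date=False):
    """Return (title, url) pairs for the hits. Needs network access."""
    # Imported here so search_query works without youtube-dl installed.
    from youtube_dl import YoutubeDL

    with YoutubeDL({'quiet': True}) as ydl:
        info = ydl.extract_info(search_query(terms, count, by_date),
                                download=False)
    hits = []
    for entry in info.get('entries') or []:
        if entry:  # skip anything that failed to resolve
            hits.append((entry.get('title'), entry.get('webpage_url')))
    return hits


# Only the query builder runs here; call search() yourself when online.
print(search_query('python scrape with urllib', 3))
print(search_query('pyqt5', 3, by_date=True))
```

Calling search('pyqt5', 3, by_date=True) should then give roughly what the ytsearchdate3:pyqt5 command line does, as a list you can filter or sort in python.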

Quote:

If you don't explicitly assign a license, you're not giving it away.

Oh ok, I'll have to read that. Never really noticed that.

Quote:

Originally Posted by teckk (Post 6172322)

OK, I'll take that as permission to redistribute under some sort of FOSS license.
Let me know what you decide on, I'll put a note in the code. For now I slapped a GPL3 on it.
This sort of stuff might seem minuscule and pointless, but I prefer to stay on top of it.

Thanks, anyhow.
All this finally got me started on python!

I didn't change the parsing mechanism, I concentrated on usability. I changed the output formatting, and it takes search terms from the clipboard & copies a chosen URL to the clipboard. That way I can immediately launch the video with another script.
The input mechanism uses readline, that's particularly cool I think: copy-pasting, line editing etc.
Here it is.

Quote:

Originally Posted by teckk (Post 6172325)

You can search with youtube-dl. First 10 hits for python

Code:

youtube-dl -g ytsearch10:python

So you can! :eek:

Quote:

Look at man youtube-dl to spit out what info you are wanting

Youtube-related "search" or "ytsearch" is not mentioned in the man page; it's one of the extractors, actually:

Code:

$> youtube-dl --list-extractors|grep -i search
CiscoLiveSearch
mailru:music:search
screen.yahoo:search
soundcloud:search
video.google:search
youtube:search
youtube:search:date
youtube:search_url

Quote:

You can control that better by importing youtube_dl

You mean in python? You bet I'll be playing with this!