LinuxQuestions.org


jr_bob_dobbs 09-20-2020 05:20 PM

youtube video url series backtracking
 
Say you find a link to a youtube video on a forum. It's a link to part five of I don't know how many parts. Until recently, I'd pop the URL into links (the only safe non-popup non-clutter way to look at modern/current web sites), find the URLs for part 1 and all the others, then use youtube-dl on part one and decide whether I wanted to see the other parts.

Well, they (youtube) recently changed something so that doesn't work. More anti-text-mode programming on their part. Finally I had to give up and look at the page for part five in a full javoids-on browser. Wow, what a bunch of popups; it feels like a whack-a-mole game or something, but less fun. Right. Anyway, after all that work on my part ... no links for first or previous. Good show. Golf clap. Let's reduce the actual functionality of our web site. Good job.

/sarcasm

Anyway, does anyone know of any way of getting to part one of a series of videos, given only one link to part five?

Thank you.

p.s.
I know this should not be posted here because it is "technical", but where else here could I post it, since this is not OS-dependent and not a Linux question?

teckk 09-21-2020 01:50 PM

I turned to python to deal with utube years ago. You can search that using whatever criteria you wish. And I don't use their api either.

Here is an example that will search it, without javascripts running.
This simple one uses urllib.
Code:

#!/usr/bin/python

from urllib import request, parse
from json import loads

base_url = 'https://www.youtube.com'
log = 'utsearch.log'

#Colors
RED = '\033[31m'
BLUE = '\033[34m'
CYAN = '\033[36m'
GREEN = '\33[32m'
BLACK  = '\33[30m'
WHITE  = '\33[37m'
YELLOW = '\033[33m'
END = '\033[0;0m'
BOLD = '\033[1m'
ITALIC  = '\33[3m'
UNDERLINE = '\033[4m'


class YoutubeSearch:
    def __init__(self, search_terms: str, max_results=None):
        self.search_terms = search_terms
        self.max_results = max_results
        self.videos = self.search()

    def search(self):
        encoded_search = parse.quote(self.search_terms)

        #Search Utube by results
        urla = f'{base_url}/results?search_query={encoded_search}'
        #Search utube by date
        urlb = (f'{base_url}/results?search_query={encoded_search}'
                '&search_sort=video_date_uploaded')
        #Search utube by views
        urlc = (f'{base_url}/results?search_query={encoded_search}'
                '&search_sort=video_view_count')
        #Select search type
        url = urlb

        page = request.urlopen(request.Request(url))
        response = page.read().decode()   
        results = self.parse_html(response)
       
        if self.max_results is not None and len(results) > self.max_results:
            return results[: self.max_results]
        return results

    def parse_html(self, response):
        results = []
        start = (response.index('window["ytInitialData"]')
                + len('window["ytInitialData"]') + 3)
           
        end = response.index("};", start) + 1
        json_str = response[start:end]
        data = loads(json_str)

        videos = data["contents"]["twoColumnSearchResultsRenderer"][
        "primaryContents"]["sectionListRenderer"]["contents"][0][
        "itemSectionRenderer"]["contents"]

        #Get items from page, make a dictionary.
        for video in videos:
            res = {}
            if "videoRenderer" in video.keys():
                video_data = video.get("videoRenderer", {})
                res["Video Id:"] = video_data.get("videoId", None)
                res["Image:"] = [thumb.get(
                    "url", None) for thumb in video_data.get(
                    "thumbnail", {}).get("thumbnails", [{}]) ]
                #Default for "runs" must be a list of dicts, or [0].get fails
                res["Title:"] = video_data.get(
                    "title", {}).get("runs", [{}])[0].get(
                    "text", None)
                res["Description:"] = video_data.get(
                    "descriptionSnippet", {}).get("runs", [{}])[0].get(
                    "text", None)
                res["Channel:"] = video_data.get(
                    "longBylineText", {}).get("runs", [{}])[0].get(
                    "text", None)
                res["Duration:"] = video_data.get(
                    "lengthText", {}).get("simpleText", 0)
                res["Views:"] = video_data.get(
                    "viewCountText", {}).get("simpleText", 0)
                res["Url:"] = video_data.get(
                    "navigationEndpoint", {}).get(
                    "commandMetadata", {}).get(
                    "webCommandMetadata", {}).get("url", None)
                results.append(res)
        return results

    def to_dict(self):
        return self.videos
 
if __name__ == '__main__':
   
    uts = input('Enter search terms for youtube :')
    #Set number of video results here.
    max_results = 100
    results = YoutubeSearch(uts, max_results).to_dict()
   
    #Print results to terminal and write to log
    with open(log, 'a') as f:
        print('%s\n'*5 % (CYAN, BOLD, 'Start Search', '='*70, END))
        f.write('%s\n'*3 % ('', 'Start Search', '='*70))
       
        for result in results:
            result['Image:'] = result['Image:'][0]
            #Make complete url
            result['Url:'] = (base_url + result['Url:'])

            print(YELLOW, '-'*70, END)
            f.write('%s\n' % ('-'*70))
           
            for key, val in result.items():
                #Colorize output
                print(GREEN, BOLD, key, END, val)
                f.write("%s  %s\n" % (key,val))

I've got a dozen scripts that I have made over time that search it in different ways. Anyway... that's my answer. You can also import youtube-dl and use its functionality.
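
The parse_html step in the script above finds the end of the JSON by searching for "};", which can stop too early if that sequence shows up inside a string. A minimal sketch of a sturdier way, assuming the page still assigns one JSON object somewhere after a known marker: let json.JSONDecoder.raw_decode work out where the object ends. The helper name extract_json_after and the sample page fragment are made up for illustration; they are not part of the script above or of any real youtube page.

```python
from json import JSONDecoder


def extract_json_after(html, marker):
    """Return the first JSON object that follows `marker` in `html`.

    Hypothetical helper, not part of the original script.
    """
    pos = html.index(marker) + len(marker)   # ValueError if the marker is absent
    start = html.index('{', pos)             # skip the ' = ' (or similar) after the marker
    # raw_decode parses exactly one JSON value starting at `start` and
    # reports where it stopped, so the end does not have to be guessed.
    data, _end = JSONDecoder().raw_decode(html, start)
    return data


# Invented page fragment: the "};" inside the title string would fool
# a plain search for the end of the object.
sample = ('<script>window["ytInitialData"] = '
          '{"title": "a};b", "items": [1, 2]};</script>')
data = extract_json_after(sample, 'window["ytInitialData"]')
print(data["title"], data["items"])
```

If the marker itself changes, index() raises ValueError, which is at least a clear hint to go download a source page and look at what moved.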

jr_bob_dobbs 10-01-2020 04:45 PM

@teckk thank you, that was interesting. I've had to write custom things to get around stuff as well. The most programming in my case was for fetching cartoons, and that was in C with a little bit of bash.

I've already got youtube-dl, but I can't tell if it can ascertain next and/or previous videos from one video's url.

teckk 10-02-2020 08:49 AM

If it is from one uploader, then search for videos from that uploader. All of their videos will show. It's more likely that you will get all 3 episodes lined up in your search, or at least on the same page. You can search videos by date uploaded too. If it's from the same uploader, it's likely that they uploaded them all at once. Even if they did not, they may be on the same search page.

You can search youtube with youtube-dl. Read man youtube-dl when you have a free hour. That man page is getting larger.
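
A rough sketch of the "search and line them up" idea, assuming the uploader put a part number in each title (plenty don't). It strips the part marker from the title you already have, so the remainder can be used as search terms, and it sorts a list of result titles by part number. The names PART_RE, part_number, series_query and sort_parts are made up for this example, and the title patterns it recognises are a guess at common styles.

```python
import re

# Assumed title styles: "part 5", "Pt. 5", "episode 5", "ep5", "#5"
PART_RE = re.compile(r'(?:\b(?:part|pt|episode|ep)\.?\s*|#)(\d+)', re.IGNORECASE)


def part_number(title):
    """Return the part number found in a title, or None if there is none."""
    m = PART_RE.search(title)
    return int(m.group(1)) if m else None


def series_query(title):
    """Strip the part marker and loose punctuation, leaving search terms."""
    stripped = PART_RE.sub(' ', title)
    return ' '.join(re.sub(r'[\[\]()|:-]', ' ', stripped).split())


def sort_parts(titles):
    """Order titles by part number; titles without one go last."""
    return sorted(titles,
                  key=lambda t: (part_number(t) is None, part_number(t) or 0))


# Invented titles, only to show the behaviour
titles = ['Lathe part 5', 'Lathe Part 1', 'Lathe teaser', 'Lathe pt. 3']
print(series_query('Restoring a Lathe - Part 5'))
print(sort_parts(titles))
```

The string from series_query() can go into any search, and sort_parts() can be run over the titles that come back.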

ondoho 10-02-2020 02:50 PM

Thanks teckk for that python youtube searcher! It works...
It's the one thing I need to avoid opening the YT web site at all. Recently it started popping up some "Consent Required" thing, very annoying.

teckk 10-02-2020 04:37 PM

Welcome, I've made several that do different searches, using different modules.

I know that you like youtube-dl. You can import it into python and use it. It is python after all.

This is really basic.
Code:

>>> from youtube_dl import YoutubeDL
>>> yturl = 'https://m.youtube.com/watch?v=kqtD5dpn9C8'
>>> yt = YoutubeDL()
>>> ulist = []
>>> info = yt.extract_info(yturl, download=False)
>>> ulist.append(info['title'])
>>> print(ulist)
['Python Tutorial - Python for Beginners [2020]']
>>> ulist.append(info['description'])
>>> print(ulist)
['Python Tutorial - Python for Beginners [2020]', 'Python Tutorial - Python for Beginners (2020 EDITION) - Learn Python quickly & easily (in 1 hour)! \n🙏 Enjoyed this video? Please vote for me as the Top Programming Guru: https://bit.ly/2G7tf2s\n👍 Subscribe for more Python tutorials like this: https://goo.gl/6PYaGF\n🔥 Want to learn more? Watch my complete Python course: https://youtu.be/_uQrJ0TkZlc\n\n
--<snip>--

>>> ulist = []
>>> for i in info['formats']:
...    ulist.append(i['format_id'])
>>> print(ulist)
['249', '250', '140', '251', '160', '133', '278', '242', '134', '135', '243', '136', '244', '247', '137', '248', '18', '22']

That works quite well actually. I can't think of anything that works with youtube better than youtube-dl.

teckk 10-02-2020 04:57 PM

This will spit you out more info than you want. Title, description, actual url of videos, formats.

Code:

from youtube_dl import YoutubeDL

url = 'https://m.youtube.com/watch?v=kqtD5dpn9C8'

#Get Utube video urls
def getUtube():
    ulist = []
    yt = YoutubeDL()
    info = yt.extract_info(url, download=False)
    ulist.append(info['title'])
    ulist.append('')
    ulist.append(info['description'])
    ulist.append('')
    for i in info['formats']:
        ulist.append(i['format_id'])
        ulist.append(i['url'])
        ulist.append('')
    #Join and print once, after the loop has collected every format
    utList  = '\n'.join(ulist)
    print(utList)
       
getUtube()

And they change it from time to time. You'll have to download a source page and see what they have changed.
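
If that is more output than you want, the format list can be narrowed down in Python before printing. A small sketch, assuming each entry in info['formats'] carries the 'height', 'vcodec' and 'acodec' keys that youtube-dl normally fills in (some extractors leave them out, hence the .get calls). pick_format is a made-up helper, and the sample list below is invented test data, not real output.

```python
def pick_format(formats, max_height=720):
    """Return the tallest format that has both audio and video, or None."""
    combined = [
        f for f in formats
        if f.get('vcodec') not in (None, 'none')     # has a video stream
        and f.get('acodec') not in (None, 'none')    # has an audio stream
        and (f.get('height') or 0) <= max_height
    ]
    return max(combined, key=lambda f: f.get('height') or 0, default=None)


# Invented entries shaped like items of info['formats']
sample_formats = [
    {'format_id': '140', 'vcodec': 'none', 'acodec': 'mp4a.40.2', 'height': None},
    {'format_id': '137', 'vcodec': 'avc1.640028', 'acodec': 'none', 'height': 1080},
    {'format_id': '18', 'vcodec': 'avc1.42001E', 'acodec': 'mp4a.40.2', 'height': 360},
    {'format_id': '22', 'vcodec': 'avc1.64001F', 'acodec': 'mp4a.40.2', 'height': 720},
]
best = pick_format(sample_formats)
print(best['format_id'])
```

With real data you would call pick_format(info['formats']) and then use the 'url' of whatever comes back.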

ondoho 10-03-2020 02:41 AM

Quote:

Originally Posted by teckk (Post 6172074)
That works quite well actually. I can't think of anything that works with youtube better than youtube-dl.

But it's always based on the video URL isn't it? One cannot enter search phrases?
(I did have a look at the man page but found no option to do that)

boughtonp 10-03-2020 06:43 AM


 
I just remembered this: https://gitlab.com/uoou/ytp

Downside is that it requires an API key (and thus a Google account) for the searching, though maybe that bit could be swapped for a suitably modified version of teckk's code.


ondoho 10-03-2020 08:56 AM

@teckk: is there a license on your code?
I have modified it and would like to share it.

Quote:

Originally Posted by boughtonp (Post 6172211)
Downside is that it requires an API key

And that's the downside with so many of these tools.

teckk 10-03-2020 03:58 PM

Quote:

@teckk: is there a license on your code?
No, just parts of scripts that I made for myself to do something. I do look at other scripts I see posted online for ideas. There are lots of python snippets online. Such as stackoverflow. There are also python examples on youtube. Everyone else labors and gives their stuff away. So I do too.

The arch AUR and arch repo have lots of python. And then I use what I have installed. I think that urllib is a little more handy than requests. I use pyqt5 and qtwebengine instead of selenium and firefox. Anyway no, I make a script every now and then for something needed.

I don't know if a youtube page will even load in a browser now unless you have scripts turned on. Dillo, w3m, palemoon with scripts off won't display them.

There are python scripts that use Google's API to search youtube. That kind of defeats the point though.

boughtonp 10-03-2020 04:12 PM

Quote:

Originally Posted by teckk (Post 6172322)
Quote:

is there a license on your code?
No [...] Everyone else labors and gives their stuff away. So I do too.

If you don't explicitly assign a license, you're not giving it away.

https://choosealicense.com/no-permission/


teckk 10-03-2020 04:17 PM

Quote:

But it's always based on the video URL isn't it? One cannot enter search phrases?
You can search with youtube-dl. First 10 hits for python
Code:

youtube-dl -g ytsearch10:python
You can also search by date
Code:

ytsearchdate:keyword, ytsearchdate10:keyword, ytsearchdateall:keyword

youtube-dl -g ytsearchdate3:pyqt5

Multi search term.
Code:

youtube-dl -g "ytsearch3:python scrape with urllib"
Look at man youtube-dl to spit out what info you are wanting

You can control that better by importing youtube_dl
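
A minimal sketch of the same searches from Python, assuming youtube_dl is installed and that a search comes back as a dict with an 'entries' list, which is how it normally behaves. build_query and search are made-up helper names. The import sits inside search() so that the query builder can be tried without the module or a network connection.

```python
def build_query(terms, count=10, by_date=False):
    """Build a youtube-dl search pseudo-URL such as 'ytsearch10:python'."""
    prefix = 'ytsearchdate' if by_date else 'ytsearch'
    return f'{prefix}{count}:{terms}'


def search(terms, count=10, by_date=False):
    """Return (title, page URL) pairs. Needs youtube_dl and network access."""
    from youtube_dl import YoutubeDL   # imported here so build_query works without it
    with YoutubeDL({'quiet': True}) as ydl:
        info = ydl.extract_info(build_query(terms, count, by_date),
                                download=False)
    # Entries can be None when an individual video fails to extract
    return [(e.get('title'), e.get('webpage_url'))
            for e in (info.get('entries') or []) if e]


# Only the query strings are built here; nothing is fetched
print(build_query('python scrape with urllib', 3))
print(build_query('pyqt5', 3, by_date=True))
```

Calling search('pyqt5', 3, by_date=True) would then do the actual lookup and hand back titles and page URLs instead of the raw stream URLs that -g prints.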

teckk 10-03-2020 04:18 PM

Quote:

If you don't explicitly assign a license, you're not giving it away.
Oh ok, I'll have to read that. Never really noticed that.

ondoho 10-04-2020 01:27 AM

Quote:

Originally Posted by teckk (Post 6172322)
No, just parts of scripts that I made for myself to do something. I do look at other scripts I see posted online for ideas. There are lots of python snippets online. Such as stackoverflow. There are also python examples on youtube. Everyone else labors and gives their stuff away. So I do too.

OK, I'll take that as permission to redistribute under some sort of FOSS license.
Let me know what you decide on, I'll put a note in the code. For now I slapped a GPL3 on it.
This sort of stuff might seem minuscule and pointless, but I prefer to stay on top of it.

Thanks, anyhow.
All this finally got me started on python!

I didn't change the parsing mechanism, I concentrated on usability. I changed the output formatting, and it takes search terms from the clipboard & copies a chosen URL to the clipboard. That way I can immediately launch the video with another script.
The input mechanism uses readline, that's particularly cool I think: copy-pasting, line editing etc.
Here it is.

Quote:

Originally Posted by teckk (Post 6172325)
You can search with youtube-dl. First 10 hits for python
Code:

youtube-dl -g ytsearch10:python

So you can! :eek:
Quote:

Look at man youtube-dl to spit out what info you are wanting
Youtube-related "search" or "ytsearch" is not mentioned in the man page; it's one of the extractors, actually:
Code:

$> youtube-dl --list-extractors|grep -i search
CiscoLiveSearch
mailru:music:search
screen.yahoo:search
soundcloud:search
video.google:search
youtube:search
youtube:search:date
youtube:search_url

Quote:

You can control that better by importing youtube_dl
You mean in python? You bet I'll be playing with this!


All times are GMT -5.