why do YouTube video codes have _ and - ?
does anyone know why YouTube uses _ and - in the video codes of their URLs?
|
Do you mean like this?
https://m.youtube.com/watch?v=abc123de-fg
https://m.youtube.com/watch?v=ABcDE_12f45
It's an 11-character identifier. Apparently _ and - are allowed characters. |
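It's most likely a variant of Base64 using URL-friendly characters, as defined in RFC 4648: section 4 gives the standard alphabet, and section 5 gives the URL- and filename-safe alphabet, which replaces + and / with - and _.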
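If that guess is right, an ID can be decoded like any unpadded URL-safe Base64 string. A minimal Python sketch, assuming the 11 characters carry one 64-bit big-endian integer (decode_video_id is a made-up helper name, and none of this is confirmed by YouTube):

```python
import base64


def decode_video_id(video_id: str) -> int:
    """Decode an ID as unpadded URL-safe Base64 and return it as an integer.

    Assumption: the ID is plain RFC 4648 section 5 Base64 with the '='
    padding stripped, holding a big-endian unsigned integer.
    """
    # Base64 decoders want a length that is a multiple of 4,
    # so put back the padding that was stripped from the URL.
    padded = video_id + "=" * (-len(video_id) % 4)
    raw = base64.urlsafe_b64decode(padded)
    return int.from_bytes(raw, "big")


if __name__ == "__main__":
    # Hand-built IDs, not real videos.
    print(decode_video_id("AAAAAAAAAAE"))
    print(hex(decode_video_id("__________8")))
```

Under these assumptions 11 characters always decode to exactly 8 bytes, which is what you would expect if the underlying value is a 64-bit number.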
(I don't know why they use Base64 though.) |
Quote:
or maybe they are encoding a 66-bit number. |
I don't think YT video IDs are encoded or checksums or anything.
Considering how many videos there have ever been on youtube, cumulatively... They are probably simple sequential IDs with all alphanumeric characters + (under)score. |
i would not expect them to be checksums. but that would not be impossible. it would still have to be an effective ID since that is all that varies in the URL to know which video is requested. the remaining question is whether the data is 64 bits or 66 bits. if it is 64 bits and i designed it, there would only be 57 alphanumerics.
|
It also points out that, due to the nature of base64, a video ID will always end in one of these characters [048AEIMQUYcgkosw] whilst a channel ID is more constrained and will end in one of only four characters [AQgw]. (An ID that doesn't end in any of those characters would be evidence against base64 being used, if you can find one?)
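Those two character sets fall straight out of the padding arithmetic. A small Python sketch, assuming a video ID holds 64 data bits and a channel ID holds 128 data bits (the 128 is an inference from the four-character set, and possible_last_chars is a made-up helper name):

```python
import string

# URL-safe Base64 alphabet from RFC 4648 section 5.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"


def possible_last_chars(data_bits: int) -> str:
    """Characters that can end an unpadded Base64 string of data_bits bits."""
    # The last 6-bit group is filled up with zero bits, so its value
    # is always a multiple of 2**pad_bits.
    pad_bits = -data_bits % 6
    step = 1 << pad_bits
    return "".join(ALPHABET[i] for i in range(0, 64, step))


if __name__ == "__main__":
    print(possible_last_chars(64))   # assumed video ID payload size
    print(possible_last_chars(128))  # assumed channel ID payload size
```

With 64 bits there are 2 padding bits, so the final character is one of 16; with 128 bits there are 4 padding bits, so it is one of 4.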
|
i doubt they are sequential IDs. i have seen videos posted the same day with radically different IDs, and other videos posted long ago with an ID between them. they could be picked randomly with a, hopefully strong, random number generator. just how strong it needs to be could be a debate topic.
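as a sketch of the random-pick idea only (how YouTube really generates IDs is not known here), this Python snippet draws a 64-bit value from the OS random source and writes it as 11 URL-safe Base64 characters. random_video_id is a made-up name:

```python
import base64
import secrets


def random_video_id() -> str:
    """Return a random 11-character ID (hypothetical scheme, not YouTube's)."""
    value = secrets.randbits(64)           # cryptographically strong 64-bit value
    raw = value.to_bytes(8, "big")         # 8 bytes, big-endian
    encoded = base64.urlsafe_b64encode(raw).decode("ascii")
    return encoded.rstrip("=")             # 12 characters with '=', 11 without


if __name__ == "__main__":
    for _ in range(3):
        print(random_video_id())
```

a real service would still have to check each new ID against the ones already issued, since random picks can collide.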
i do suspect they are 64-bit integers and not 66-bit because that is so many that 66 would have little advantage over 64. i'd then go with 128-bit. 11 characters allows encoding 64 bits in a base as small as 57 (2**64 < 57**11). that base allows an alphabet with 7 fewer characters. i'd get rid of the 2 special characters, first. then, i'd get rid of 5 characters that could look like another to humans, such as the letter O. my alphabet57: Code:
ABCDEFGHJKLMNPQRSTUVWXYZabcdefghjkmnpqrstuvwxyz0123456789 |
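to check the arithmetic and show the alphabet in use, here is a Python sketch. encode57 and decode57 are made-up names for this post, not anything YouTube is known to use:

```python
ALPHABET57 = "ABCDEFGHJKLMNPQRSTUVWXYZabcdefghjkmnpqrstuvwxyz0123456789"
BASE = len(ALPHABET57)  # 57
WIDTH = 11              # fixed ID length


def encode57(value: int) -> str:
    """Write a 64-bit unsigned integer as 11 base-57 digits."""
    if not 0 <= value < 2**64:
        raise ValueError("value must fit in 64 bits")
    digits = []
    for _ in range(WIDTH):
        value, rem = divmod(value, BASE)
        digits.append(ALPHABET57[rem])
    return "".join(reversed(digits))  # most significant digit first


def decode57(text: str) -> int:
    """Inverse of encode57."""
    value = 0
    for ch in text:
        value = value * BASE + ALPHABET57.index(ch)
    return value


if __name__ == "__main__":
    # 57 is the smallest base where 11 digits cover all 64-bit values.
    print(56**11 < 2**64 < 57**11)
    print(encode57(2**64 - 1))
```

the first print confirms that base 56 would fall just short of 64 bits in 11 characters while base 57 is enough.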
can we find even one user who has posted over a hundred million videos even if just auto-posting junk?
|