LinuxQuestions.org


l0f4r0 02-19-2019 12:28 PM

Firefox adult/drug/gambling/alcohol site filter: md5 encryption/deciphering & base64 encoding/decoding
 
Following that article, I'm trying to understand how one can go from https://dxr.mozilla.org/mozilla-cent...ilterAdult.jsm to https://github.com/matthewruttley/co...ter/sites.json.

It seems base domains are first MD5-hashed and then base64-encoded, but I cannot verify that with any of the provided examples...

Let's take "bet365.com":
Code:

$ echo -n "bet365.com" | md5sum
aa90cd229c5a87c1548aa5e6e4845a52
$ echo -n "aa90cd229c5a87c1548aa5e6e4845a52" | base64
YWE5MGNkMjI5YzVhODdjMTU0OGFhNWU2ZTQ4NDVhNTI=

...but the last result string doesn't match anything in https://dxr.mozilla.org/mozilla-cent...ilterAdult.jsm!

I've certainly missed something because my last base64 string:
  • ends with 1 "=" instead of 2,
  • contains 44 characters instead of 24...

...but I don't know what/where.
Any hint please?
Thanks :)

273 02-19-2019 01:41 PM

They may well contain a salt? I'm certainly wishing I had an alternative to Firefox if they are really so pathetic as to censor with a "naughty list" -- I didn't realise Firefox was programmed by babies.

Woolie Wool 02-19-2019 03:03 PM

Have you considered that some Firefox users are parents? Parents take care of these little creatures called "children", who are incompetent, at least in law, to make decisions for themselves, thus parents have the option of a content filter. If you do not use it, it does not affect you.

fido_dogstoyevsky 02-19-2019 04:43 PM

Quote:

Originally Posted by Woolie Wool (Post 5964161)
Have you considered that some Firefox users are parents? Parents take care of these little creatures called "children", who are incompetent, at least in law, to make decisions for themselves, thus parents have the option of a content filter. If you do not use it, it does not affect you.

Actually one of the required competencies for a parent is the ability to supervise their children and not rely on some third party. The filter should be unnecessary.

Woolie Wool 02-19-2019 04:45 PM

You want to supervise your child 24/7 like she's a kid in the goddamn Panopticon or something? Why does it matter to you? It's not your kid. Nobody's making you turn the filter on.

scasey 02-19-2019 04:54 PM

Personally, the first thing I do with Firefox is turn off ALL of that new tab stuff. I configure it to give me the search page for DDG.

So, I can't say if it's even possible to turn off the referenced filtering. Discussion of whether or not the filter is appropriate seems, to me, to be off-topic.

That said, l0f4r0, I don't think I understand your question, either.
Is bet365.com being filtered from the new tab?
Do you want it to be and it's not?

Please clarify.

fido_dogstoyevsky 02-19-2019 04:59 PM

Quote:

Originally Posted by Woolie Wool (Post 5964202)
You want to supervise your child 24/7 like she's a kid in the goddamn Panopticon or something? Why does it matter to you? It's not your kid. Nobody's making you turn the filter on.

Actually, it was my kids.

If you're not willing or able to supervise 24/7 you shouldn't consider being a parent.

Edit: "supervise" means watching for problems, not necessarily "forbidding".

l0f4r0 02-20-2019 07:02 AM

Quote:

Originally Posted by 273 (Post 5964127)
They may well contain a salt?

Humm very clever, didn't think about that :)
It's a possibility indeed.

Quote:

Originally Posted by scasey (Post 5964208)
That said, l0f4r0, I don't think I understand your question, either.
Is bet365.com being filtered from the new tab?
Do you want it to be and it's not?
Please clarify.

No no, actually my thread is not about the filter's relevance or its content (so please, everyone, try to stay on topic).
I'm just trying to verify technically, by myself, how the plain-text base domain entries were hashed and encoded so that they appear the way they do in the filter source code.
bet365.com was just an example to go through the whole process (I hoped to fall on my feet but apparently not...)

ntubski 02-20-2019 07:07 AM

Quote:

Originally Posted by Woolie Wool (Post 5964161)
Have you considered that some Firefox users are parents? Parents take care of these little creatures called "children", who are incompetent, at least in law, to make decisions for themselves, thus parents have the option of a content filter. If you do not use it, it does not affect you.

It's not really a content filter, it just blocks sites from showing on the "new tab screen". You (or your children) can still visit as much as you want...


Quote:

Originally Posted by l0f4r0 (Post 5964105)
It seems base domains are first MD5-hashed and then base64-encoded, but I cannot verify that with any of the provided examples...

Let's take "bet365.com":
Code:

$ echo -n "bet365.com" | md5sum
aa90cd229c5a87c1548aa5e6e4845a52
$ echo -n "aa90cd229c5a87c1548aa5e6e4845a52" | base64
YWE5MGNkMjI5YzVhODdjMTU0OGFhNWU2ZTQ4NDVhNTI=

...but the last result string doesn't match anything in https://dxr.mozilla.org/mozilla-cent...ilterAdult.jsm!

First note that md5sum has some extra output apart from the hash itself:
Code:

$ echo -n "bet365.com" | md5sum
aa90cd229c5a87c1548aa5e6e4845a52  -

We can get rid of that with tr, but the base64 result still doesn't look right:

Code:

$ echo -n "bet365.com" | md5sum | tr -dc 'a-f0-9'
aa90cd229c5a87c1548aa5e6e4845a52$ echo -n "bet365.com" | md5sum | tr -dc 'a-f0-9' | base64
YWE5MGNkMjI5YzVhODdjMTU0OGFhNWU2ZTQ4NDVhNTI=

The problem is that md5sum prints the hash in base16, so we're getting the base64 of the base16 ASCII encoding. We can use xxd to convert to binary:

Code:

$ echo -n "bet365.com" | md5sum | tr -dc 'a-f0-9' | xxd -r -p | base64
qpDNIpxah8FUiqXm5IRaUg==

Now we have something that looks like it should be in the list, but you won't find it! That's because the list is updated over time, but you can find it in the original: https://hg.mozilla.org/mozilla-centr...b096219c#l1.45
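
If you want to check more than one domain, the pipeline above can be wrapped in a small shell function. This is only a sketch: the name domain_to_b64 is made up for illustration, and it assumes md5sum, xxd and base64 are installed, as in the commands above.

```shell
# domain_to_b64 is a hypothetical helper name, not something from Firefox.
# It prints base64(md5(domain)), computed over the 16 raw digest bytes.
domain_to_b64() {
    printf '%s' "$1" |      # the domain, without a trailing newline
        md5sum |            # 32 hex digits followed by "  -"
        cut -d ' ' -f 1 |   # keep only the hex digits
        xxd -r -p |         # hex text -> 16 raw bytes
        base64              # raw bytes -> 24 base64 characters
}

domain_to_b64 "bet365.com"
```

For bet365.com this should print the same qpDNIpxah8FUiqXm5IRaUg== string as the last command above.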

l0f4r0 02-20-2019 11:29 AM

^ Awesome, very well done ntubski :)
I didn't realize that md5sum prints the hash as hex/base16 text. To me it was just ASCII in the [a-f0-9] range...
Question: why not encode base16 into base64 directly? Would there be any drawback?

273 02-20-2019 01:37 PM

Quote:

Originally Posted by Woolie Wool (Post 5964161)
Have you considered that some Firefox users are parents? Parents take care of these little creatures called "children", who are incompetent, at least in law, to make decisions for themselves, thus parents have the option of a content filter. If you do not use it, it does not affect you.

You are responsible for your progeny and nobody else. It does affect me -- for example I had to tell my ISP "I want to see porn" just to get a slightly less filtered internet. What is wrong with you people? Parent your children properly! Stop making everyone else a victim of your inability to parent!

ntubski 02-20-2019 04:16 PM

Quote:

Originally Posted by l0f4r0 (Post 5964557)
Question: why not encode base16 into base64 directly? Would there be any drawback?

Not sure what you mean. If you meant the command pipeline I posted above, it's just that I'm not aware of any base16->base64 program. If you meant the Firefox code, then it already encodes directly to base64 (without going to base16 at all).
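
For what it's worth, on the command line the hex detour can also be skipped if the hashing tool can emit the raw digest. A minimal sketch, assuming the openssl command-line tool is installed (it is not used anywhere else in this thread); the function name is hypothetical:

```shell
# Hypothetical helper: openssl can write the raw 16-byte MD5 digest
# (-binary) instead of hex text, so base64 can consume it directly.
domain_to_b64_openssl() {
    printf '%s' "$1" | openssl dgst -md5 -binary | base64
}

domain_to_b64_openssl "bet365.com"
```

This should give the same result as the md5sum | xxd -r -p | base64 pipeline, since both feed the same 16 bytes to base64.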

PS Can't you guys take your off-topic "Censorship!" vs "Think of the children!" arguments to another thread?

ondoho 02-21-2019 05:12 AM

Quote:

Originally Posted by l0f4r0 (Post 5964105)

what a perfect way of showing a computer-literate kid where the naughty stuff is!
Quote:

Originally Posted by ntubski (Post 5964430)
It's not really a content filter, it just blocks sites from showing on the "new tab screen". You (or your children) can still visit as much as you want...

oh i see. probably makes sense, but way too much work to keep a list like that updated...
also it sort of implies that FF users go to "bad" sites and want to hide that from other users...
______________________
Quote:

Originally Posted by fido_dogstoyevsky (Post 5964200)
Actually one of the required competencies for a parent is the ability to supervise their children and not rely on some third party.

Quote:

Originally Posted by fido_dogstoyevsky (Post 5964212)
If you're not willing or able to supervise 24/7 you shouldn't consider being a parent.
Edit: "supervise" means watching for problems, not necessarily "forbidding".

QFT!
definitely true for the first few years; it gets easier after that.

l0f4r0 02-24-2019 12:35 PM

Quote:

Originally Posted by ondoho (Post 5964887)
what a perfect way of showing a computer-literate kid where the naughty stuff is!

Not sure if it's a criticism towards me but, just in case, do you really think computer-literate kids need that kind of listing to locate naughty stuff nowadays? ^^

Quote:

Originally Posted by ntubski (Post 5964672)
Not sure what you mean. If you meant the command pipeline I posted above, it's just that I'm not aware of any base16->base64 program. If you meant the Firefox code, then it already encodes directly to base64 (without going to base16 at all).

I was just referring to your following sentence:
Quote:

The problem is that md5sum prints the hash in base16, so we're getting the base64 of the base16 ASCII encoding.
That's why you suggested converting base16 to binary via xxd and then to base64.
My question is: would there be any drawback to converting from base16 to base64 directly? I don't mean a program which would do that without explicitly going through a temporary binary form; I really mean the base64 of the base16 string. In other words I mean:
Code:

echo -n "bet365.com" | md5sum | tr -dc 'a-f0-9' | base64
versus
Code:

echo -n "bet365.com" | md5sum | tr -dc 'a-f0-9' | xxd -r -p | base64
Quote:

Originally Posted by ntubski (Post 5964672)
PS Can't you guys take your off-topic "Censorship!" vs "Think of the children!" arguments to another thread?

+1 definitely. As I said previously this is not the goal here. This thread is only technical (md5/base64). Thank you for your understanding.

ntubski 02-24-2019 08:49 PM

Quote:

Originally Posted by l0f4r0 (Post 5966350)
My question is: would there be any drawback to converting from base16 to base64 directly? I don't mean a program which would do that without explicitly going through a temporary binary form; I really mean the base64 of the base16 string. In other words I mean:
Code:

echo -n "bet365.com" | md5sum | tr -dc 'a-f0-9' | base64
versus
Code:

echo -n "bet365.com" | md5sum | tr -dc 'a-f0-9' | xxd -r -p | base64

Yes, the drawback is that the former gives the wrong answer :)
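
Decoding the two strings shows why. A small sketch using only base64, od and tr from coreutils (the variable names are just for illustration, and base64 -d assumes the GNU version): the first string decodes to the 32 ASCII characters of the hex dump, while the second decodes to the 16 raw bytes of the hash itself.

```shell
# The two candidate strings produced by the commands quoted above.
from_hex_text='YWE5MGNkMjI5YzVhODdjMTU0OGFhNWU2ZTQ4NDVhNTI='
from_raw_bytes='qpDNIpxah8FUiqXm5IRaUg=='

# Decoding the first one gives back plain text: the hex digits themselves.
printf '%s\n' "$from_hex_text" | base64 -d; echo

# Decoding the second one gives 16 raw bytes; od renders them as hex
# so they can be compared with the md5sum output.
printf '%s\n' "$from_raw_bytes" | base64 -d | od -An -tx1 | tr -d ' \n'; echo
```

Both lines print aa90cd229c5a87c1548aa5e6e4845a52, but only in the second case are those the actual bytes that were encoded.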

l0f4r0 02-25-2019 03:25 PM

^ Yes of course :D
But I think it would be technically OK if Firefox used the same algorithm on its side. In the end, it's just a choice, isn't it?

ntubski 02-26-2019 07:33 AM

Quote:

Originally Posted by l0f4r0 (Post 5966839)
But I think it would be technically OK if Firefox used the same algorithm on its side. In the end, it's just a choice, isn't it?

Yes, but it would be a pretty silly choice for Firefox to convert to base16 and then take the base64 of the ASCII of that. It takes extra space and would be slightly slower for no benefit.
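
To put a rough number on the extra space, here is a sketch that just counts the characters of the two encodings from earlier in the thread (the variable names are illustrative):

```shell
# base64 of the 32 hex characters printed by md5sum
hex_then_b64='YWE5MGNkMjI5YzVhODdjMTU0OGFhNWU2ZTQ4NDVhNTI='
# base64 of the 16 raw digest bytes
raw_then_b64='qpDNIpxah8FUiqXm5IRaUg=='

# ${#var} expands to the length of the string in characters.
echo "hex then base64: ${#hex_then_b64} characters"
echo "raw then base64: ${#raw_then_b64} characters"
```

That is 44 versus 24 characters per entry, so the list would be almost twice as long.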

l0f4r0 03-10-2019 04:48 PM

Thank you ntubski.
You have answered all my questions :)

Sammy885 06-08-2020 03:17 PM

It is a great option to have.

