Today I unlocked a major achievement: I got audio/video calls (DTLS-SRTP) working with someone who lives behind a nationwide firewall that blocks many messaging services.
- Server: Raspberry Pi running Prosody on Debian GNU/Linux (and lighttpd for renewing Let's Encrypt certificates)
- Clients: Conversations.im on Android/Linux
Until today, we "only" had working end-to-end OMEMO encryption in both private and group chat.
It turns out that audio/video calls are routed directly between the clients over UDP, after they have exchanged DTLS certificate fingerprints and candidate addresses over the secure XMPP connection. In this setup the server never sees the audio/video packet stream: XMPP carries only the call setup (Jingle signalling), while the media itself flows between the clients outside the XMPP connection.
If either party is behind a NAT router that does not pass the UDP traffic, the audio/video call simply does not work. As soon as I disabled my phone's WLAN connection to the NAT router, it switched to the mobile data connection with a public IP address, and the calls just worked.
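A quick way to tell which situation a device is in is to check whether the address it would offer to the peer is globally routable. The following is only a rough Python sketch: the function name `is_directly_reachable` and the sample addresses are made up for illustration and are not part of Conversations or Prosody, and a global address alone does not guarantee that a firewall lets the UDP packets through.

```python
import ipaddress


def is_directly_reachable(addr: str) -> bool:
    """Heuristic (hypothetical helper): True if addr is globally routable.

    Private (RFC 1918), carrier-grade NAT (100.64.0.0/10), loopback and
    link-local addresses are not global, so a peer on the internet
    cannot send UDP packets to them directly.
    """
    return ipaddress.ip_address(addr).is_global


# Illustrative candidate addresses, not taken from a real call.
candidates = [
    "192.168.1.23",  # typical address behind a home NAT router
    "100.64.0.1",    # carrier-grade NAT range
    "8.8.8.8",       # a well-known public address
]

for candidate in candidates:
    kind = "public" if is_directly_reachable(candidate) else "behind NAT / not routable"
    print(f"{candidate}: {kind}")
```

In my case the WLAN address would fall into the first category and the mobile data address into the last one.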
Why do we really need proprietary things that are not even end-to-end encrypted, like Zoom, Slack, or Teams? And why aren't more companies paying the likes of conversations.im for a hosted service? Is it just because the corporate PHBs never got fired for buying the IBM of the day?