vanitasvitae's blog


Archive for the ‘xmpp’ Category

More XEPs for Smack

Wednesday, February 7th, 2018

I spent the last weekend from Thursday to Sunday in Brussels at the XSF-Summit (here is a very nice post about it by JCBrand) and the FOSDEM. It was really nice to meet all the faces belonging to the JIDs you otherwise only see in the MUCs or on GitHub in real life.

There was a lot of discussion about how to make XMPP more accessible to the masses and one point that came up was to pay more attention to XMPP libraries, as they are often somewhat of a gateway for new developers who discovered the XMPP protocol. A good library with good documentation can help those new developers immensely to get started with XMPP.

During my stay and while on the train, I found some time to work on Smack again and so I added support for 3 more XEPs:

Ge0rG gave a talk about what’s currently wrong with the XMPP protocol.  One suggested improvement was to rely more on Stable and Unique Stanza IDs to improve message identification in various use-cases, so I quickly implemented XEP-0359.

XEP-0372: References is one dependency of XEP-0385: Stateless Inline Media Sharing, which I’m planning to implement next, so the boring lengthy train ride was spent adding support for XEP-0372.

A very nice XEP to implement was XEP-0392: Consistent Color Generation, which is used to generate consistent colors for usernames across different clients. I really like the accessibility aspect of that XEP, as it provides methods to generate colors easily distinguishable by users with color vision deficiency.

I hope my contributions will draw one or two developers who seak to implement a chat client themselves to the awesome XMPP protocol :)

Happy Hacking!

Smack: Some busy nights

Tuesday, January 16th, 2018

Hello everyone!

This weekend I stayed up late almost every evening. Thus I decided that I wanted to code something, but I wasn’t sure what, so I took a look at the list of published XEPs to maybe find something that is easy to implement, but missing from Smack.

I found that XEP-0394: Message Markup was missing from Smacks list of supported extensions, so I began to code. The next day I finished my work and created Smack#194. One or two nights later I again stayed up late and decided to take another look for an unimplemented XEP. I settled on XEP-0382: Spoiler Messages  this time, which was really easy to implement (apart from the one little attribute, which for whatever reason I struggled to parse until I found a solution). The result of that night is Smack#195.

So if you find yourself laying awake one night with no chance to sleep, just look out for an easy to do task on your favourite free software project. I’m sure this will help you sleep better once the task is done.

Happy Hacking!
Vanitasvitae

Reworking smack-omemo

Wednesday, January 10th, 2018

A bit over a year ago I started working on smack-omemo as part of my bachelor thesis. Looking back at the past year, I can say there could have hardly been a better topic for my thesis. Working with Smack brought me deep into the XMPP world, got me in contact with a lot of cool people and taught me a lot. Especially the past Google Summer of Code improved my skills substantially. During said event, I took a break from working on smack-omemo, while focussing on a Jingle implementation instead. After the 3 months were over, I dedicated my time to smack-omemo again and realized, that there were some points that needed improvements.

One major issue was, that my “OmemoStore” class, which is responsible for storing keys, sessions, etc. was not having access to the users data before the user logged in. The reason for that is, that my implementation allows multiple OMEMO instances to be running on the same connection. That requires the OmemoStore to store keys for multiple instances (devices), which I distinguished based on the Jid and deviceId of the user. The problem here is, that the Jid is unknown before the user logged in (they might use a burner jid for example, or use an authentication system with username and password which differ from the jid).

While this is an edgecase, it lead to issues. I implemented a workaround for that problem (using the username instead of BareJid in case the connection is not authenticated), which caused numerous problems.

I thought about replacing the Jid as an identifier with something else, but nothing was suitable, so I started a major rework of the implementation as a whole. One important aspect I wanted to preserve is that smack-omemo should still be somewhat usable even when the connection is not authenticated (ie. the user should still be able to scan qr codes and make trust decisions).

The result of my work (so far) is a diff of “+6,300 −5,361″, and a modified API (sorry to all those who already use smack-omemo :O). One major change is, that the OmemoStore no longer stores trust decisions. Instead those decisions are now made by the client itself, who must implement a OmemoTrustCallback. That way trust decisions can be made while the OmemoManager is offline. Everything else what remained in the OmemoStore is only needed when the connection is authenticated and messages are received.

Furthermore I got rid of the OmemoSession class. Session handling is done in libsignal already, so why would I want to have a session related class as well (especially since libsignal doesn’t give you any feedback about what happens with the session, so you have to keep sessions in sync manually)? I recommend everyone who wants to implement OMEMO themselves not to create a “OmemoSession” class and instead rely on libsignals session management.

OMEMO sessions are somewhat brittle. You can never know, whether a recipient received your message, or if it failed to decrypt for some reason. There is no signalling to provide feedback about the sessions state. Because of the fact that even message encryption can go wrong, the old API was very ugly. Originally I first checked, whether there are devices which still need a trust decision to be made and threw an exception if that was the case. Then I tried to build sessions for devices without session and threw an exception when session negotiation failed. Then I tried to encrypt the message for all recipients and threw an exception if something went wrong… Oh and the exception I threw when sessions could not be negotiated contained a list of all devices with intact sessions, so the user could retry to encrypt the message, only for all devices which had a session.

Ugly!!!

The new API is much cleaner. I still throw an exception when there are undecided devices, but otherwise I always return an OmemoMessage object. That object has a map of OmemoDevices for which message encryption failed, alongside the respective exceptions, so the client can check if and what went wrong.

Also sessions are now “completed” whenever a preKeyMessage arrives.
Prior to this change it could happen, that two senders chose the same PreKey from a bundle in order to create a session. That could cause on of both session to break which lead to message loss. Now whenever smack-omemo receives a preKeyMessage, it instantly responds with an empty message to make the session stable.
This was proposed by Philipp Hörist.

Other changes include a new OmemoStore implementation, the CachingOmemoStore, which can either wrap other OmemoStores to provide a caching layer, or can be used standalone as an ephemeral store for testing purposes.

Also the integration tests were improved and are much simpler and more readable now.

All in all the code got much cleaner now and I hope that at some point it will be audited to find all the bugs I oversaw :D (everyone who wants to take a look for themselves, the code can currently be found at Smacks Repository. I’m always thankful for any types of feedback)

I hope this changes will make it to Smack 4.2.3, even though here are still some things I have to do, but all in all I’m already pretty satisfied with how smack-omemo turned out so far.

Happy Hacking!

Final GSoC Blog Post – Results

Friday, August 25th, 2017

This is my final GSoC update post. My name is Paul Schaub and I participated in the Google Summer of Code for the XMPP Standards Foundation. My project was about implementing encrypted Jingle File Transfer for the client library Smack.

Google Summer of Code was a great experience! This is a sentence you probably read in almost every single students GSoC resumé. Let’s take a look at how I experienced GSoC.

GSoC for me was…

  • tiring. Writing tests is something almost everybody hates. I do especially. In the beginning I imposed on myself to do test-driven development, but I quickly began to focus on real coding instead. Nevertheless, often I sat down for hours and wrote tests for classes and methods that were designed to allow easy tests, so I have an okayish test coverage, but it is far from perfect.
  • nerve-wracking. Dealing with bugs of sub-protocols I never worked with before drove me crazy sometimes. Especially that SOCKS5 Transport bug has a special place on my hit list. All in all I think the Jingle protocol has some flaws which make it harder to implement than it should be and those flaws (more precise – the decisions I derived from the Jingle XEP design) often really bugged me.
  • highly demotivational. Can you imagine how devastating it can be, when you spend days and days getting your implementation working with itself, only to see it miserably fail when you test it the first time with another implementation? It didn’t help that I tested my coded not only against one, but two other applications.
  • desocializing. I had to dedicate a huge part of my day to coding. As a result I often had no time left for friends and family.
  • devastating. In the end I wrote far more code than I should have. Most of it was discarded along the way and got replaced. It really annoyed me having to start from zero over and over again. Also giving up on goals like using NIO for asynchronous event handling pricked my pride.
  • to the highest degree depressing. Especially near the end I sometimes sat down and starred at my screen for some time without really knowing what to do. Those days I ended up making small meaningless changes like fixing typos in my documentation or refactoring. Going to bed later on those days made me feel bad and kinda guilty.

But on the other hand, GSoC also…

  • taught me, that coding is not only fun. Annoying parts like tests are (sometimes) important and in the end it is very satisfying to see that I managed to tame my code to a degree where all test cases run successful.
  • gave me deep insights into areas and protocols that were completely new to me. Also I think you can only fully gain understanding of how things work if you tinker with it for more than just one day. Finding and fixing bugs are a good exercise for this purpose.
  • was super motivational. Contrary (or let’s say additional) to what I wrote above, it is highly motivational to see your code finally work hand in hand with other implementations. That’s the magic of decentralized protocols – knowing that on the other end is a completely different device, running a completely different operating system with a language you might have never seen before, but still in some magical way, both implementations harmonize with each other like two musicians from different countries do when playing together.
  • was very social by itself. Meeting online with members of the community was a real pleasure and I hope to be able to participate in conversations and meetings in the future too. Also there were one or two (or three…) evenings that I spent gaming with my friends, but *shht* don’t tell anyone ;P.
  • taught me important lessons. While I usually aim too high with my goals, I learned that sometimes you have to take a step back to set more reasonable goals. Nevertheless ambitious goals can’t hurt and I heard that you grow together with your challenges.
  • was satisfying as f***. Sure, sometimes I was feeling bad because I could have done more on that day, but in the end I was a little surprised seeing the whole picture and how much I really accomplished.

The Result

An overview about what I achieved during the Google Summer of Code can be found on my project page.

This week I did some more changes to my code and wrote more tests *gah*. I’m really proud that my JET proposal found it’s way into the XSFs inbox. I’m really excited, what the future may bring for my specification :)

I want to take the chance to thank everyone in the XMPP community for welcoming me and allowing me to become a part of it. I also want to thank Florian Schmaus for mentoring me and giving me the idea to participate in GSoC in the first place.

As a last impulse -  why do we need Google (thank you too btw :D ) to initiate programs like GSoC? Shouldn’t there be more public, government financiered programs? I certainly think so, even though Google did a very good job.

So in the end GSoC was really a great experience. Sometimes it was hard and challenging, but I’m sure it would have been boring without a certain degree of difficulty. I’d absolutely do it again (either as a student or maybe a mentor?) and I can only encourage students interested in free software to apply next year :)

Happy Hacking!

GSoC Week 11.5: Success!

Wednesday, August 16th, 2017

Newsflash: My Jingle File Transfer is compatible with Gajim!

The Gajim developers recently fixed a bug in their Jingle Socks5 transport implementation and now I can send and receive files without any problems between Gajim and my test client. Funny side note: The bug they fixed was very familiar to me *cough* :P

Happy Hacking!

GSoC Week 11: Practical Use

Monday, August 14th, 2017

You know what makes code of a software library even better? Client code that makes it do stuff! I present to you xmpp_sync!

Xmpp_sync is a small command line tool which allows you to sync file from one devices to one or more other devices via XMPP. It works a little bit like you might now it from eg. owncloud or nextcloud. Just drop the files into one folder and they automagically appear on your other devices. At the moment it works only unidirectional, so files get synchronized in one direction, but not in the other.

The program has to modes; master mode and slave mode. In general, a client started in master mode will send files to all clients started in slave mode. So lets say we want to mirror contents from one directory to another. We start the client on our master machine and give it a path to the directory we want to monitor. On the other machines we start the client in slave mode, and then add them to the master client. Whenever we now drop a file into the directory, it will automatically be sent to all registered slaves via Jingle File Transfer. Files do also get send when they get modified by the user. I registered a FileWatcher in order to get notified of such events. For this purpose I got in touch with java NIO again.

Currently the transmission is made unencrypted (as described in XEP-0234), but I plan to also utilize my Jingle Encrypted Transports (JET) code/spec to send the files OMEMO encrypted in the future. My plan for the long run is to further improve JET, so that it might get implemented by other clients.

Besides that I found the configuration error in my ejabberd configuration which prevented my Socks5 proxy from working. The server was listening at 127.0.0.1 by default, so the port was not reachable from the outside world. Now I can finally test on my own server :D

I also tested my code against Gajims implementation and found some more mistakes I made, which are now fixed. The Jingle InBandBytestream Transport is sort of working, but there are some more smaller things I need to change.

Thats all for the week.

Happy Hacking :)

GSoC Week 10: Finding that damn little bug

Monday, August 7th, 2017

Finally!! I found a bug which I was after for the last week. Now I finally got that little bas****.

The bug happened in my code for the Jingle SOCKS5 Bytestream Transport (XEP-0260). SOCKS5 proxies are used whenever the two endpoints can’t reach one another directly due to firewalls etc. In such a case, another entity (eg. the XMPP server) can jump in to act as a proxy between both endpoints. For that reason, the initiator (Alice) first collects available proxies, and sends them over to the responder (Bob). The responder does the same and sends its candidates back to the initiator. Both then try to connect to the candidates (in this case proxies) they got sent from their peer. In order for the proxy to know, who wants to talk to whom, both include a destination address, which is calculated as SHA1(sid, providerJid, targetJid), where the provider is the party which sent the candidates to the target.

The alert reader will know, that there are two different destination addresses in the game by now. The first one being SHA1(sid, Alice, Bob) and the second one SHA1(sid, Bob, Alice). The issue is that somewhere in the logs I ended up with 3 different destination addresses. How the hell did that happen. For the answer lets look at an example stanza:

<iq from=’romeo@montague.lit/orchard’
id=’xn28s7gk’
to=’juliet@capulet.lit/balcony’
type=’set’>
<jingle xmlns=’urn:xmpp:jingle:1′
action=’session-initiate’
initiator=’romeo@montague.lit/orchard’
sid=’a73sjjvkla37jfea’>
<content creator=’initiator’ name=’ex’>
<description xmlns=’urn:xmpp:example’/>
<transport xmlns=’urn:xmpp:jingle:transports:s5b:1′
dstaddr=’972b7bf47291ca609517f67f86b5081086052dad’
mode=’tcp’
sid=’vj3hs98y’>
<candidate cid=’hft54dqy’
host=’192.168.4.1′
jid=’romeo@montague.lit/orchard’
port=’5086′
priority=’8257636′
type=’direct’/>
</transport>
</content>
</jingle>
</iq>

Here we have a session initiation with a Jingle SOCKS5 Bytestream transport. The transport exists of one candidate. Now where was my error?

You might have noted, that there are two attributes with the name ‘sid’ in the stanza. The first one is the so called session id, the id of the session. This should not be of interest for our case. The second one however is the stream id. Thats the sid that gets crunched in the SHA1 algorithm to produce the destination address.

Well, yeah… In one tiny method used to update my transport with the candidates of the responder, I used session.getSid() instead of transport.getSid()… That was the bug, that cost me a week.

Now it wasn’t too bad. While I searched for the error, I read through the XEPs again and again, discovering some more issues which I discussed with other developers. Also I began testing my implementation against Gajim and I’m happy to tell you that the InBand Bytestream code is already working sort of. Sometimes a few bytes get lost, but we live in times of Big Data, so thats not too bad, am I right :P ?

In the last 3 weeks I plan to stabilize the API some more. Currently you can only receive data into files, but I plan to add another method which gives back a bytestream instead.

Also I need more tests. Things like that nasty sid bug can be prevented and found using junit tests, so I’ll definitely stock up on that front.

Thats all for now :) Happy Hacking!

GSoC Week 9: Bringing it back to life.

Wednesday, August 2nd, 2017

The 9th week of GSoC is there. I’m surprised of how fast time went by, but I guess that happens when you enjoy what you do :)

I’m happy to report that the third iteration of my Jingle code is working again. There are still many bugs and Socks5Transport is still missing, but all in all I’m really happy with how it turned out. Next I’ll make the implementation more solid and add more features like transport replacing etc.

Apart from normal Jingle File Transfer I also started working on my JET protocol. JET is short for Jingle Encrypted Transfers which is my approach to combining Jingle sessions with end-to-end encryption. My focus lays on modularity and easy extensibility. Roughly JET works as follows:

Lets assume, Alice wants to send an encrypted file to Bob. Luckily Alice and Bob already do have a secure OMEMO session. Alice now sends a JET File transfer request to Bob, which includes a security element which contains an OMEMO key transport message. Bob can decrypt the key transport message with his OMEMO session to retrieve an AES key. That key will be used to encrypt/decrypt the file Alice sends to Bob as soon as the jingle session negotiation is finished.

This protocol should in theory work with any end-to-end encryption method, for example also with OpenPGP. Also JET is in theory not limited to file transfer, but could also be used to secure other types of Jingle sessions, eg. Audio/Video calls. Since the development is in a very early state, it would be nice to get some feedback from more experienced developers and members of the XMPP community. A rendered version of the JET specification can be found here.

I’m very happy that encrypted File transfer already works in my implementation. I created an integration test for that, which transports the encryption key using OMEMO. Apropos tests: I created a basic JingleTransport test, which tests transport methods. Currently SOCKS5 is still failing, but I’m very close to a solution.

During the week I opened another PR against the XEPs repo, which adds a missing attribute to a XML schema in the Jingle File Transfer XEP.

GSoC Week 8: Reworking

Wednesday, July 26th, 2017

The 8th. week of GSoC is there. The second evaluation phase has begun.

This week I was not as productive as the weeks before. This is due to me having my last (hopefully :D ) bachelors exam. But that is now done, so I can finally give 100% to coding :) .

I made some more progress on my Jingle code rework. Most of the transports code is now finished. I started to rework the descriptions code which includes the base classes as well as the JingleFileTransfer code. I’m very happy with the new design, since it is way more modular and less interlocked than the last iteration. Below you can find an UML-like diagram of the current structure.

UML diagram of the jingle code

UML-like diagram of the current jingle implementation

During the rework I stumbled across a slight ambiguity in the Jingle XEP(s), which made me wonder: There are multiple jingle actions, which denote the purpose of a Jingle stanza (eg. transport-replace to replace the used transport message, session-initiate to initiate a session (duh) and so forth). Now there is the session-info Jingle action, which is used to announce session specific events. This is used for example in RTP sessions to let the peers client ring during a call, or to send the checksum of a file in a file transfer session. My problem with this is, that such use-cases are in my opinion highly description related and instead the description-info action should be used. The description part of a Jingle session is the part that represents the actual purpose of the session (eg. file transfer, video calls etc.).

The session itself is description agnostic, since it only bundles together a set of contents. One content is composed of one description, one transport and optionally one security element. In a content you should be able to combine different description, transport and security components in an arbitrary way. Thats the whole purpose of the Jingle protocol. In my opinion it does no make much sense to denote description related informational stanzas with the session-info action.

My proposal to make more use of the description-info element is also consistent with other uses of *-info actions. The transport-info action for example is used to denote transport related stanzas, while the security-info action is used for security related information.

But why do I even care?

Lets get back to my implementation for that :) . As you can see in the diagram above, I split the different Jingle components into different classes like JingleTransport, JingleSecurity, JingleDescription and so on. Now I’d like to pass component-related jingles down to the respective classes (a transport-info usually only contains information valuable to the transport-component). I’d like to do the same for the JingleDescription. At the moment I have no real “recipient” for the session-info action. It might contain session related infos, but might also be interesting for the description. As a consequence I have to make exceptions for those actions, which make the code more bloated and less logical.

Another point is, that such session-info elements (due to the fact that they target a single content in most cases) often contain a “name” attribute that matches the name of the targeted content. I’d propose to not only replace session-info with description-info, but also specify, that the description-info MUST have one or more content child elements that denote the targeted contents. That would make parsing much easier, since the parser always can expect content elements.

That’s all for now :)

Happy Hacking!

GSoC: Week 7

Tuesday, July 18th, 2017

This is my update post for the 7th week of GSoC. The next evaluation phase is slowly approaching and it is still a lot of work to do.

This week I started to work on integrating encryption in my Jingle File Transfer code. I’m very pleased, that only very minor changes are required to my OMEMO code. The OmemoManager just has to implement a single interface with two methods and that’s it. The interface should be pretty forward to implement in other encryption methods as well.

Unfortunately the same is not true for the code I wrote during the GSoC. There are many things I haven’t thought about, which require major changes, so it looks like I’ll have to rethink the concept once again. My goal is to make the implementation so flexible, that description (eg. video chat, file transfer…), transport (eg. Socks5, IBB…) and security (XTLS, my Jet spec etc.) can be mixed in arbitrary ways without adding in glue code for new specifications. Flow told me, this is going to get complicated, but I want to try anyway :D . For “safety” reasons, I’ll keep a seperate working branch in case the next iteration does not turn out as I want.

Yesterday Flow found an error in smack-omemo, which caused bundles not getting fetched. The mistake I made was, that a synchronous CarbonListener was registered in the packet-reader thread. This caused the packet-reader thread to timeout on certain messages, even though the stanzas arrived. It is nice to have this out of the way and it was a good lesson about the pitfalls of parallel programming.

While reading the Jingle File Transfer XEP I also found some missing XML schemes and proposed to replace the usage of xs:nonNegativeInteger and xs:positiveNumber with xs:unsignedLong to simplify/unify the process of implementing the XEP.

Thats basically it for this week. Unfortunately I have an upcoming exam at university next week, which means even less free time for me, but I’ll manage that. In the worst case I always have a second try on the exam :)

Happy Hacking!