stargrave's blog

FSFE Fellowship Blogs weblog

GoVPN: secure censorship resistant VPN daemon history and implementation decisions

March 13th, 2016

This article is about the GoVPN free software daemon: why it was born, what tasks it aims to solve, and a technical overview.

Birth and aimed targets.

There are plenty of protocols and implementations for securing transported data. If you just want to connect two computers or two networks, you can use TLS, SSH, IPsec, MPPE, OpenVPN, tinc and many others. All of them can provide confidentiality and authenticity of the transmitted data, plus authentication of both sides.

But I, as an ordinary user, found the lack of strong password authentication very inconvenient. Without strong password-based authentication I always have to carry high-entropy private keys with me. Being human, however, I am able to memorize long passphrases that have enough entropy to authenticate myself and establish a secure channel.

Probably the best known strong password authentication protocol is Secure Remote Password (SRP). Except for various JavaScript-based implementations, I know only the lsh SSHv2 daemon supporting SRP and GnuTLS supporting TLS-SRP. Replacing OpenSSH with lsh is troublesome, and TLS-SRP must be supported not only by the underlying library but also by the application using it. So in practice SRP can hardly be used in most cases.

My first target: strong password authentication and state-of-the-art robust cryptography.

The next problem is protocol and code complexity. It is the strongest enemy of security, and especially of all cryptography-related solutions. TLS is badly designed (remember at least MAC-then-Encrypt) and the most popular implementation, OpenSSL, is loathed by nearly everyone. OpenSSH gained -etm MAC modes not so long ago. IPsec is a good protocol, but its configuration is not so easy. OpenVPN is a working and relatively simple solution, but it is not aware of modern fast encryption and authentication algorithms. And the codebase of all those projects is big enough that nobody really looks at it: people just trust it and hope that no serious bugs will be found anymore. OpenSSL demonstrates that even a huge open-source community is not enough for finding critical bugs.

My second target: KISS, a small codebase and simple, very simple, reviewable code and protocol. No unnecessary complexity, no explicit compatibility with previous solutions.

The next issue I care about: why are all those existing protocols so easy to tell apart and filter at the DPI level by stateful firewalls? Basically I do not have much against censorship itself, because it is necessary anyway, but DPI solutions are, as a rule, so crude and clumsy that they do great harm to innocent servers and users, destroying the Internet as such and leaving only huge Faceboogle corporations alive. I like all-or-nothing solutions: either I have a working, routed, paid data transmission channel through the ISP, or I have nothing, because no, having access only to Facebook, YouTube, Gmail and VKontakte is useless to me.

My third target: a more or less censorship-resistant protocol, where nobody can distinguish it from, for example, cat /dev/urandom | nc remote.

And of course, the zero target: make it free software, without any doubt, so everyone can benefit from its existence.

Daemon overview.

GoVPN does not use any brand new technologies or protocols, nor cryptographic solutions that are not well studied. I do not violate the rule "do not create and implement crypto by yourself". Well, more or less: all critical low-level algorithms, except for a few simple ones, come from implementations written by true crypto gurus. All cryptography must be proven by time.

I decided to use the Go programming language. It is mature enough for that kind of task and very easy to read and support. Simplicity, reviewability and maintainability can easily be achieved with it.

From the VPN daemon's point of view, here is its current state:

  • Works with layer 2 TAP virtual network interfaces.
  • A single server can work with multiple clients, each with its own configuration and optional up/down hooks.
  • Works over either UDP, TCP, or HTTP proxies with CONNECT method. IPv4/IPv6 supported.
  • The client is a single executable binary with a few command line options. The server is a single executable binary with a single YAML configuration file.
  • Built-in rehandshaking and heartbeating.

Client authentication tokens.

All clients are identified by a 128-bit random number. It is not transmitted explicitly in the clear, so others can not distinguish one client's session from another. Mutual client-server authentication is performed using a so-called pre-shared verifier. The client's identity, the verifier and a memorable passphrase are everything you need. An example client id with verifier is:

$argon2d$m=4096,t=128,p=1$4lG67PhgB0qCh7xB+3a+eA$NjUo1kV/L19wP2+htdJA4qIVNlS72riT3E8wfse4jJM

Transport protocol.

Let's dive deeper into the protocol. Basically it consists of two parts: the transport protocol and the handshake protocol.

The transport protocol is very straightforward from a modern cryptographic point of view. Basically it is similar (but not identical) to Bernstein's NaCl construction:

TAG || ENCRYPTED || NONCE

The tag is a Poly1305 authenticator over the whole data packet. The nonce is an incrementing counter (odd values are the server's, even ones are the client's). Encryption is done over the padded payload with the Salsa20 symmetric encryption algorithm.

The nonce is not secret information, so it could be sent in the clear. But then it would be easily detected and censored: one can see that this is some kind of nonce-prefixed encrypted traffic. So I decided to obfuscate it using the XTEA PRP (pseudo-random permutation) function. It is very simple to implement and fast enough for short (8-byte) payloads. It does not add any security, but it randomizes the data, making DPI censorship a hard task. The nonce encryption key is derived from the session key after the handshake stage.
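
Here is a minimal sketch of that obfuscation step, assuming the golang.org/x/crypto/xtea package; the key derivation and function names are illustrative, not GoVPN's actual code.

package sketch

import (
    "encoding/binary"

    "golang.org/x/crypto/xtea"
)

// obfuscateNonce turns the incrementing 64-bit counter into 8 bytes that look
// random on the wire. nonceKey (16 bytes) is assumed to be derived from the
// session key after the handshake.
func obfuscateNonce(nonceKey []byte, counter uint64) ([8]byte, error) {
    var plain, obfuscated [8]byte
    binary.BigEndian.PutUint64(plain[:], counter)
    prp, err := xtea.NewCipher(nonceKey) // XTEA takes a 128-bit key
    if err != nil {
        return obfuscated, err
    }
    prp.Encrypt(obfuscated[:], plain[:])
    return obfuscated, nil
}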

The nonce is used for replay attack detection and prevention. We memorize previous nonces and check whether they appear again. In TCP mode all messages have guaranteed delivery order, so any desynchronization leads to immediate disconnection. In UDP mode messages can be delivered with varying delay, so we keep a small bucket of recently seen nonces.
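
A rough sketch of the UDP-mode bucket idea, purely illustrative (the size and eviction policy here are my assumptions, not GoVPN's):

package sketch

const bucketSize = 256 // assumed capacity

type nonceBucket struct {
    seen map[uint64]struct{}
    fifo []uint64
}

func newNonceBucket() *nonceBucket {
    return &nonceBucket{seen: make(map[uint64]struct{}, bucketSize)}
}

// accept reports whether the nonce is fresh; false means a replayed packet.
func (b *nonceBucket) accept(nonce uint64) bool {
    if _, ok := b.seen[nonce]; ok {
        return false
    }
    b.seen[nonce] = struct{}{}
    b.fifo = append(b.fifo, nonce)
    if len(b.fifo) > bucketSize {
        delete(b.seen, b.fifo[0]) // forget the oldest nonce
        b.fifo = b.fifo[1:]
    }
    return true
}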

Most protocols do not hide the lengths of the underlying messages. The data can stay confidential, but its size and time of appearance can tell a lot about the traffic inside the VPN. For example, you can relatively easily tell that DHCP is passing through the tunnel. Moreover, you can watch the impact of data transmission inside the tunnel on the external system's behaviour. This is a metainformation leak.

Noise can be used to hide the message length. GoVPN pads the payload before encryption by appending 0x80 and the necessary number of zeros. After encryption it all looks like pseudo-random noise anyway. Heartbeat packets have zero payload length and consist only of padding. All packets have the same (maximal) size. Of course this consumes traffic, so it can be rather expensive.

PAYLOAD || 0x80 || 00 || ...
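
A small sketch of this padding scheme (function names are mine; the fixed packet size is whatever the daemon is configured with):

package sketch

// pad appends 0x80 and zeros so every packet has the same fixed size.
func pad(payload []byte, packetSize int) []byte {
    if len(payload) >= packetSize {
        return nil // payload must leave room for at least the 0x80 byte
    }
    out := make([]byte, packetSize)
    copy(out, payload)
    out[len(payload)] = 0x80
    return out
}

// unpad strips the padding: zeros from the end, then the 0x80 marker.
func unpad(packet []byte) []byte {
    for i := len(packet) - 1; i >= 0; i-- {
        switch packet[i] {
        case 0x00:
            continue
        case 0x80:
            return packet[:i]
        default:
            return nil // malformed padding
        }
    }
    return nil
}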

The authentication tag looks like noise that never repeats across sessions (the probability of repetition is negligible), the nonce encrypted with an ephemeral session key repeats with negligible probability too, and the encrypted payload also looks like noise. The adversary does not see any structure.

GoVPN can also hide message timestamps: the time of their appearance. The idea is pretty simple and similar to the noise padding: constant packet rate traffic. Your tunnel will have a fixed transmission speed. Large amounts of data will be transmitted slowly, while the absence of real payload will be hidden with zero-sized (but padded) packets. One can not distinguish an "empty" channel from a loaded one.

Why is the nonce located at the end of the packet? Because in TCP mode, unlike UDP, messages do not arrive already separated from one another: we get a stream of pseudo-random bytes. But TCP guarantees order of delivery, so we can predict the next nonce value, and since we know the nonce PRP encryption key, we can also predict its obfuscated on-wire value. So we just wait for that expected value to determine the border of the transmitted message. We can not add any clearly visible structure, because it would be visible to a DPI system too and thus could be censored.
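
The boundary detection idea can be sketched like this (buffer handling is simplified and names are mine; it only shows the search for the expected obfuscated nonce):

package sketch

import (
    "bytes"
    "encoding/binary"

    "golang.org/x/crypto/xtea"
)

// findMessageEnd returns the offset just past the expected encrypted nonce in
// the accumulated TCP stream, or -1 if that nonce has not arrived yet.
func findMessageEnd(stream []byte, prp *xtea.Cipher, expectedCounter uint64) int {
    var plain, expected [8]byte
    binary.BigEndian.PutUint64(plain[:], expectedCounter)
    prp.Encrypt(expected[:], plain[:]) // what the next nonce looks like on the wire
    i := bytes.Index(stream, expected[:])
    if i < 0 {
        return -1
    }
    return i + 8
}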

The Salsa20 encryption key is generated anew for each session during the handshake procedure. It is ephemeral, so compromising your passphrase can not reveal past encryption and authentication keys. This property is called perfect forward secrecy (PFS). Poly1305 uses one-time authentication keys derived from the Salsa20 keystream, similarly to NaCl. Unlike many block-cipher based modes and implementations, Salsa20+Poly1305 does not consume entropy for any kind of initialization vectors.
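
A rough NaCl-secretbox-like sketch of that sealing step, assuming the golang.org/x/crypto salsa20 and poly1305 packages: the first 32 bytes of the Salsa20 keystream become the one-time Poly1305 key and the rest encrypts the padded payload. This illustrates the idea only, not GoVPN's exact packet layout code.

package sketch

import (
    "golang.org/x/crypto/poly1305"
    "golang.org/x/crypto/salsa20"
)

// seal returns TAG || ENCRYPTED for a given 8-byte nonce and padded payload.
func seal(key *[32]byte, nonce []byte, padded []byte) []byte {
    // Encrypting zeros yields the keystream; its first 32 bytes serve as the
    // one-time Poly1305 key, as in NaCl.
    buf := make([]byte, 32+len(padded))
    copy(buf[32:], padded)
    salsa20.XORKeyStream(buf, buf, nonce, key)

    var polyKey [32]byte
    copy(polyKey[:], buf[:32])
    var tag [16]byte
    poly1305.Sum(&tag, buf[32:], &polyKey)

    return append(tag[:], buf[32:]...)
}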

Handshake protocol.

The most complex part is the handshake procedure.

First of all, you need the Diffie-Hellman protocol. It is simple, well studied and the de-facto standard for establishing ephemeral session keys. Our choice is curve25519. It could be as trivial as this:

┌─┐          ┌─┐
│C│          │S│
└┬┘          └┬┘
 │  CDHPub    │
 │───────────>│
 │            │
 │  SDHPub    │
 │<───────────│
 │            │

Peers send their curve25519 public keys and perform a computation that results in an identical value on both sides. That value is not random data ready to be used as a key, but an elliptic curve point. We can hash it, for example, to turn it into a uniform pseudo-random string: the session key.

SessionKey = H(curve25519(ourPrivate, remotePublic))
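
In Go this derivation could look roughly like the following, assuming golang.org/x/crypto/curve25519 and BLAKE2b as H(); the helper names are mine:

package sketch

import (
    "crypto/rand"

    "golang.org/x/crypto/blake2b"
    "golang.org/x/crypto/curve25519"
)

// newKeypair generates an ephemeral curve25519 keypair.
func newKeypair() (private, public []byte, err error) {
    private = make([]byte, 32)
    if _, err = rand.Read(private); err != nil {
        return nil, nil, err
    }
    public, err = curve25519.X25519(private, curve25519.Basepoint)
    return private, public, err
}

// sessionKey hashes the shared elliptic curve point into a uniform 256-bit key.
func sessionKey(ourPrivate, remotePublic []byte) ([32]byte, error) {
    shared, err := curve25519.X25519(ourPrivate, remotePublic)
    if err != nil {
        return [32]byte{}, err
    }
    return blake2b.Sum256(shared), nil
}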

This scheme of course can not be used as is, because it lacks peer authentication. We can use the encrypted key exchange (EKE) technique: encrypt the Diffie-Hellman packets with a pre-shared symmetric secret. That way we provide indirect authentication: if a peer does not know the shared symmetric secret, it won't decrypt the public key correctly and won't derive the same session key. For symmetric encryption we could use Salsa20:

┌─┐                     ┌─┐
│C│                     │S│
└┬┘                     └┬┘
 │enc(SharedKey, CDHPub) │
 │──────────────────────>│
 │                       │
 │enc(SharedKey, SDHPub) │
 │<──────────────────────│
 │                       │

Salsa20 is a stream cipher, so reusing the same encryption parameters twice is fatal. Our shared secret is a constant value, so we have to provide a random nonce R each time. It is not secret information, so we can send it in the clear. The server's response packet can increment it to derive another usable nonce value:

┌─┐                           ┌─┐
│C│                           │S│
└┬┘                           └┬┘
 │R, enc(SharedKey, R, CDHPub) │
 │────────────────────────────>│
 │                             │
 │enc(SharedKey, R+1, SDHPub)  │
 │<────────────────────────────│
 │                             │

We can not use low-entropy passwords for SharedKey in the scheme above. One can intercept our packets and brute-force (dictionary attack) the password, checking on each attempt whether the deciphered message contains an elliptic curve point. The problem here is that the adversary is able to tell whether he decrypted the message successfully.

Thank goodness for the Elligator encoding algorithm! This encoding can map some elliptic curve points to uniform strings and back. Not all points can be converted, only about half on average, so we may have to generate ephemeral curve25519 keypairs more than once during a single handshake. By applying this encoding we remove the adversary's ability to distinguish a successful decryption from a failed one: any plaintext will look like a uniform pseudo-random string. This kind of solution is commonly called password authenticated key agreement (PAKE).

┌─┐                              ┌─┐
│C│                              │S│
└┬┘                              └┬┘
 │R, enc(Password, R, El(CDHPub)) │
 │───────────────────────────────>│
 │                                │
 │enc(Password, R+1, El(SDHPub))  │
 │<───────────────────────────────│
 │                                │
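
The retry idea can be sketched as follows. There is no Elligator in Go's standard libraries, so elligatorEncode below is a hypothetical placeholder used only to show the loop; it must return a uniform 32-byte string and report when a point is not representable.

package sketch

import (
    "crypto/rand"

    "golang.org/x/crypto/curve25519"
)

// elligatorEncode is hypothetical: a stand-in for a real Elligator implementation.
func elligatorEncode(public []byte) (uniform [32]byte, ok bool) {
    panic("placeholder for a real Elligator encoder")
}

// newRepresentableKeypair regenerates ephemeral keypairs until the public key
// can be encoded as a uniform string (roughly every second attempt succeeds).
func newRepresentableKeypair() ([]byte, [32]byte, error) {
    for {
        private := make([]byte, 32)
        if _, err := rand.Read(private); err != nil {
            return nil, [32]byte{}, err
        }
        public, err := curve25519.X25519(private, curve25519.Basepoint)
        if err != nil {
            return nil, [32]byte{}, err
        }
        if u, ok := elligatorEncode(public); ok {
            return private, u, nil
        }
    }
}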

But we still do not authenticate peers explicitly. Of course, if our passwords are not equal, then the derived session keys will differ and transport layer authentication will fail immediately, but nobody guarantees that the transport layer will transmit packets immediately after the handshake is completed.

For that task the server just sends a random number encrypted with the session key; the client decrypts it and sends it back, and the server compares the values. So client authentication will look like this (RS is the server's random number, which the client must echo back):

┌─┐                                            ┌─┐
│C│                                            │S│
└┬┘                                            └┬┘
 │       R, enc(Password, R, El(CDHPub))        │
 │─────────────────────────────────────────────>│
 │                                              │
 │enc(Password, R+1, El(SDHPub)), enc(K, R, RS) │
 │<─────────────────────────────────────────────│
 │                                              │
 │                                              ────┐
 │                                                  │ compare(RS)
 │                                              <───┘
 │                                              │

And to perform mutual authentication we do the same in the opposite direction (RC is the client's random number, which the server echoes back):

┌─┐                                            ┌─┐
│C│                                            │S│
└┬┘                                            └┬┘
 │       R, enc(Password, R, El(CDHPub))        │
 │─────────────────────────────────────────────>│
 │                                              │
 │enc(Password, R+1, El(SDHPub)), enc(K, R, RS) │
 │<─────────────────────────────────────────────│
 │                                              │
 │                                              ────┐
 │                                                  │ compare(RS)
 │                                              <───┘
 │                                              │
 │               enc(K, R+2, RC)                │
 │<─────────────────────────────────────────────│
 │                                              │
 ────┐                                          │
     │ compare(RC)                              │
 <───┘                                          │

It is questionable whether this is needed, but some protocols provide explicit pre-master keys as master key sources. Diffie-Hellman derived keys may not contain enough entropy for long-term usage. So we additionally transmit pre-master secrets (the terminology is taken from TLS) from both sides: 256-bit random strings. The resulting master session key that will be used in the transport protocol is just the XOR of the two pre-master keys. If one communicating party does not behave honestly and does not generate ephemeral keys every time, XORing its permanent keys with the random ones of the honest party still gives you perfect forward secrecy. SC and SS are the pre-master keys of the client and server sides.

┌─┐                                               ┌─┐
│C│                                               │S│
└┬┘                                               └┬┘
 │        R, enc(Password, R, El(CDHPub))          │
 │────────────────────────────────────────────────>│
 │                                                 │
 │enc(Password, R+1, El(SDHPub)), enc(K, R, RS+SS) │
 │<────────────────────────────────────────────────│
 │                                                 │
 │                                                 ────┐
 │                                                     │ compare(RS)
 │                                                 <───┘
 │                                                 │
 │                enc(K, R+2, RC)                  │
 │<────────────────────────────────────────────────│
 │                                                 │
 ────┐                                             │
     │ compare(RC)                                 │
 <───┘                                             │
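
Combining the two pre-master secrets is then just a byte-wise XOR, something like this sketch (names are illustrative):

package sketch

// masterKey combines the client's and server's pre-master secrets; as long as
// at least one of them is fresh random data, the result is fresh too.
func masterKey(clientPreMaster, serverPreMaster [32]byte) (master [32]byte) {
    for i := range master {
        master[i] = clientPreMaster[i] ^ serverPreMaster[i]
    }
    return master
}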

Augmented EKE.

Are we satisfied now? Not yet! Our password is known to both the client and the server. If the latter is compromised, the adversary gets our secret. There are so-called augmented encrypted key exchange protocols: the actual secret is kept only on the client's side, while the server keeps a so-called verifier, something that can confirm the client's knowledge of the secret.

That kind of proof can be achieved using asymmetric digital signatures. So we use the passphrase as an entropy source for creating a digital signature keypair. Its public key is exactly the kind of verifier that will be stored on the server's side. For convenience we use a hash of that public key as the key for symmetric encryption in the EKE protocol.

To prove knowledge of the secret key we have to make a signature with it. We just sign our ephemeral handshake symmetric key. H() is the hash function (the BLAKE2b algorithm), DSAPub is the public key derived from the user's passphrase (the ed25519 algorithm).

┌─┐                                                ┌─┐
│C│                                                │S│
└┬┘                                                └┬┘
 │        R, enc(H(DSAPub), R, El(CDHPub))          │
 │─────────────────────────────────────────────────>│
 │                                                  │
 │enc(H(DSAPub), R+1, El(SDHPub)), enc(K, R, RS+SS) │
 │<─────────────────────────────────────────────────│
 │                                                  │
 │                                                  ────┐
 │                                                      │ compare(RS)
 │                                                  <───┘
 │                                                  │
 │                                                  ────┐
 │                                                      │ Verify(DSAPub, Sign(DSAPriv, K), K)
 │                                                  <───┘
 │                                                  │
 │                 enc(K, R+2, RC)                  │
 │<─────────────────────────────────────────────────│
 │                                                  │
 ────┐                                              │
     │ compare(RC)                                  │
 <───┘                                              │
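
The proof step itself maps directly onto Go's crypto/ed25519 package; a minimal sketch, with K standing for the ephemeral handshake key:

package sketch

import "crypto/ed25519"

// proveKnowledge is run by the client: sign the ephemeral handshake key K
// with the passphrase-derived private key.
func proveKnowledge(priv ed25519.PrivateKey, k []byte) []byte {
    return ed25519.Sign(priv, k)
}

// checkProof is run by the server: verify the signature against the stored
// verifier (the client's public key).
func checkProof(verifier ed25519.PublicKey, k, sig []byte) bool {
    return ed25519.Verify(verifier, k, sig)
}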

I want to note again: R, El(…) and all transmitted ciphertexts look like random strings to a third party; they never repeat and do not have any visible structure. So DPI can hardly determine whether these are GoVPN handshake messages.

Elligator encoding of curve25519 public keys gives us zero-knowledge strong password authentication that is immune to offline dictionary attacks. Even if our password is "1234", you can not check offline whether that guess is true while having all the intercepted ciphertexts.

The server does not know our cleartext secret passphrase; it knows only its derivative in the form of a public key. But that can still be dictionary attacked: if the server's verifiers are compromised, you can quickly check whether a public key (verifier) corresponds to, for example, the "1234" password.

We can not fully protect ourselves from this kind of attack: strong passphrases are still important. But at least we can harden the dictionary attack by strengthening those passwords. It is a well known practice: PBKDF2, bcrypt, scrypt and similar technologies. As a rule they contain some very slow function (to decrease the attack rate) and a "salt" for increasing the entropy and randomizing equal passwords.

We use the Password Hashing Competition winner: the Argon2 algorithm. The client's identity is used as the salt. The ed25519 keypair is generated from the strengthened password derivative. It is computed only once, during session initialization on the client side.

PrivateKey    Verifier -----> Server storage
    ^         ^
    |        /
    |       /
    |      /
ed25519Generate(strongpass)
                     ^
                     |
                     |
                  Argon2(Password, salt=ClientId)
                                           ^
                                           |
                                           |
                                        ClientId = random(128bit)
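
A sketch of that derivation in Go follows. Note two assumptions: the verifier example above uses Argon2d, while golang.org/x/crypto/argon2 exposes only Argon2i/Argon2id, so argon2.Key (Argon2i) stands in here; and the cost parameters are just the ones from the example string.

package sketch

import (
    "crypto/ed25519"

    "golang.org/x/crypto/argon2"
)

// deriveVerifier strengthens the passphrase with the client identity as salt
// and turns the result into an ed25519 keypair; the public key is the verifier
// stored on the server.
func deriveVerifier(passphrase, clientID []byte) (ed25519.PrivateKey, ed25519.PublicKey) {
    seed := argon2.Key(passphrase, clientID, 128, 4*1024, 1, ed25519.SeedSize)
    priv := ed25519.NewKeyFromSeed(seed)
    return priv, priv.Public().(ed25519.PublicKey)
}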

DPI resistant handshake packets.

And again, there is still another problem: we have not yet transmitted our client's identity. The server does not know which verifier must be used for handshake processing. If we transmit it in the clear, then a third party will see the same repeated string during each handshake. It does not harm confidentiality or security, but it is a leak of deanonymizing metainformation.

Moreover, the handshake packets have fixed sizes and a fixed pattern: 48 bytes from client to server, an 80-byte response, 120 bytes again, a 16-byte response. So handshake behaviour still differs from the transport one.

Each handshake packet is padded similarly to transport messages:

HANDSHAKE MSG = [R] || enc(PAYLOAD || 0x80 || 0x00 || ...)

After encryption we get pseudo-random noise of maximal size, indistinguishable from other packets.

Also, each handshake packet has a so-called IDtag appended. This tag is the XTEA encryption of the first 8 bytes of the message, using the client's identity as the key. When the server gets a handshake message, it takes all known client identities, tries to decrypt the last 8 bytes with each of them and compares the result with the first 8 bytes of the message. Of course this search time grows linearly with the number of clients, but XTEA is pretty fast and the search is needed only while processing handshake messages.

      HANDSHAKE MSG = [R] || enc(PAYLOAD || 0x80 || 0x00 || ...) ||
XTEA(ClientId, 8bytes([R] || enc(PAYLOAD || ...)))

This feature is also good for saving the server's resources: it won't try to participate in a handshake with unknown clients. An adversary can send any random data and will receive nothing in response.

But an adversary can intercept the client's first handshake message and replay it. It is valid from the server's point of view, so the server will respond to it. You can not finish that handshake session, but at least you learn that a GoVPN server is sitting on that port and that it knows that client's identity.

To mitigate this kind of attack we can use synchronized clocks. Well, dependency on time is an awful thing: it complicates matters very much, so this is only an option. To randomize client identities we just take the current time, round it to a specified granularity, for example ten seconds, and XOR it with the client's identity, so every ten seconds the encryption key for the IDtag changes.

               HANDSHAKE MSG = [R] || enc(PAYLOAD || 0x80 || 0x00 ...) ||
XTEA(TIME XOR ClientId, 8bytes([R] || enc(PAYLOAD || ...)))
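
A sketch of the IDtag computation, again with golang.org/x/crypto/xtea; the ten-second rounding and the exact bytes into which the time is mixed are illustrative assumptions:

package sketch

import (
    "encoding/binary"
    "time"

    "golang.org/x/crypto/xtea"
)

// idTag encrypts the first 8 bytes of a handshake message under the client
// identity XORed with the rounded current time.
func idTag(clientID [16]byte, msg []byte) ([8]byte, error) {
    var tag [8]byte
    rounded := uint64(time.Now().Unix()) / 10 * 10 // ten-second granularity
    key := clientID                                // copy; clientID stays intact
    binary.BigEndian.PutUint64(key[8:],
        binary.BigEndian.Uint64(key[8:])^rounded)
    prp, err := xtea.NewCipher(key[:])
    if err != nil {
        return tag, err
    }
    prp.Encrypt(tag[:], msg[:8])
    return tag, nil
}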

At last we are quite satisfied with the protocol. Of course you must use a strong passphrase and a high quality entropy source for generating ephemeral keys and random numbers.

Additional remarks.

Not all operating systems provide a good PRNG out of the box. GoVPN is able to use entropy sources other than /dev/urandom through the Entropy Gathering Daemon compatible protocol.

GoVPN is a layer-2-only VPN daemon. It knows nothing about layer-3 IP addresses, routes or anything close to that subject. It uses layer-2 TAP interfaces, and you have to configure and control manually how your clients deal with routing and addressing. There is support for convenient up and down scripts executed after session initialization or termination.

I thought about making some kind of stunnel replacement out of it, for example tunnelling a single TCP connection or an externally executed command's stdin/stdout. But all of this is a much more complicated task compared to the VPN, so I decided that you should use specialized tools for it. Anyway, you can use GoVPN to create small IPv6 link-local-only networks where all your socat, stunnel, SSH and whatever else works.

Encryptionless mode.

GoVPN also includes a so-called encryptionless mode of operation. Whether it is necessary is questionable and mainly a theoretical concern.

Assume that you operate under a jurisdiction where the use of encryption functions is illegal. This mode (actually the XTEA PRP encryption of the nonce is still performed) uses only authentication functions. Unfortunately it is much more resource and traffic hungry.

This mode is based on Ronald L. Rivest's relatively old work on "chaffing and winnowing". Additionally it uses another well known all-or-nothing transformation (AONT): Optimal Asymmetric Encryption Padding (OAEP). Actually OAEP is slightly changed: the length field is replaced with hash-based checksumming taken from SAEP+.

The chaffing-and-winnowing idea is pretty simple in our context: instead of sending only the single bit of data you need, you always send both possible values, 0 and 1. But you also provide authentication information for each of them, so the receiver can distinguish the bit that is really meant (the wheat) from the junk (the chaff).

For each input byte (8 bits) you send 16 MACs, one pair per bit: the first MAC in a pair stands for bit value 0, the second for bit value 1. Exactly one MAC in each pair is valid.


   VALID    INVLD    INVLD    VALID    INVLD    VALID    INVLD    VALID
   MAC00 || MAC01 || MAC02 || MAC03 || MAC04 || MAC05 || MAC06 || MAC07 ||

   INVLD    VALID    VALID    INVLD    VALID    INVLD    VALID    INVLD
|| MAC08 || MAC09 || MAC10 || MAC11 || MAC12 || MAC13 || MAC14 || MAC15

In this example the valid MACs select the bits 0, 1, 1, 1, 1, 0, 0, 0, giving the byte 01111000.

GoVPN uses Poly1305 as the MAC. So for transmitting a single byte we spend 256 bytes of real traffic: 16 128-bit MACs. Each Poly1305 invocation requires a one-time authentication key. We take those keys from the XSalsa20 output stream. XSalsa20 differs from Salsa20 in that it uses longer, 192-bit nonces.

MAC00Key, MAC01Key, ... = XSalsa20(
    encryptionKey=SessionKey,
    nonce=PacketNum || 0x00 ... || ByteNum,
    plaintext=0x00 ...
)

As the session key is unique for each session and packet numbers do not repeat, we guarantee that the one-time authentication keys won't repeat either.
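
A sketch of the per-byte encoding, assuming the golang.org/x/crypto salsa20 and poly1305 packages; what exactly gets authenticated and how the 24-byte nonce is built are simplified here:

package sketch

import (
    "crypto/rand"

    "golang.org/x/crypto/poly1305"
    "golang.org/x/crypto/salsa20"
)

// chaffByte encodes one byte as 16 MACs: in each pair the slot matching the
// real bit value carries a valid Poly1305 MAC, the other slot is random chaff.
func chaffByte(sessionKey *[32]byte, nonce24, authed []byte, b byte) ([16][16]byte, error) {
    // 16 one-time 32-byte Poly1305 keys taken from the XSalsa20 keystream.
    keystream := make([]byte, 16*32)
    salsa20.XORKeyStream(keystream, keystream, nonce24, sessionKey)

    var out [16][16]byte
    for bit := 0; bit < 8; bit++ {
        value := (b >> uint(7-bit)) & 1 // most significant bit first
        for v := byte(0); v < 2; v++ {
            slot := 2*bit + int(v)
            if v == value {
                var key [32]byte
                copy(key[:], keystream[slot*32:(slot+1)*32])
                poly1305.Sum(&out[slot], authed, &key)
            } else if _, err := rand.Read(out[slot][:]); err != nil {
                return out, err
            }
        }
    }
    return out, nil
}

The receiver winnows by recomputing the same one-time keys and keeping, from each pair, the bit whose MAC verifies.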

Sending 256 times more traffic is really very expensive, and AONT can help us here. Its idea is simple: either you have all the bits of the package and can retrieve the message, or you recover nothing from it. The main difference between AONT and encryption is that it is keyless: it is just a transformation.

AONT takes a message M and some random number r. The AONT package consists of two parts, P1 and P2:

PKG = P1 || P2
 P1 = expand(r) XOR (M || H(r || M))
 P2 = H(P1) XOR r

+-----------------------+-----------+
|         M             | H(r || M) |
+-----------------------+-----------+
          |                  ^
          |                   \
          .                    \
         XOR <-- expand(r)  XOR
          |                         \
          |                          \
          .                           .
+-----------------------------------+----+
|        P1                         | P2 |
+-----------------------------------+----+

If any bit of either P1 or P2 is tampered with, you will detect it. We use BLAKE2b as the hash function H() and Salsa20 as the expander for the random number: r is used as the Salsa20 key.
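
The packaging formulas map to Go roughly like this, with blake2b.Sum256 as H() and Salsa20 keyed by r as expand(); the zero nonce and length conventions are my assumptions:

package sketch

import (
    "golang.org/x/crypto/blake2b"
    "golang.org/x/crypto/salsa20"
)

// aont returns PKG = P1 || P2 for message m and the 32-byte random value r.
func aont(m []byte, r [32]byte) []byte {
    // P1 = expand(r) XOR (M || H(r || M))
    hrm := blake2b.Sum256(append(append([]byte{}, r[:]...), m...))
    p1 := append(append([]byte{}, m...), hrm[:]...)
    zeroNonce := make([]byte, 8)
    salsa20.XORKeyStream(p1, p1, zeroNonce, &r) // expand(r) is the Salsa20 keystream

    // P2 = H(P1) XOR r
    hp1 := blake2b.Sum256(p1)
    p2 := make([]byte, 32)
    for i := range p2 {
        p2[i] = hp1[i] ^ r[i]
    }
    return append(p1, p2...)
}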

Only 16 bytes (a 128-bit security margin) of this AONT package are chaffed-and-winnowed during transmission. We use a 256-bit random number during AONT packaging. So each transmitted packet requires 16 * 256 + 32 = 4128 bytes of overhead. Compared to a 1500-byte MTU this is far less than the 256-fold expansion of plain chaffing-and-winnowing.

Conclusions.

  • We have got a strong password authenticated augmented key agreement protocol with zero-knowledge mutual peer authentication.
  • Authentication tokens are hardened against offline dictionary attacks even if the server's database/hard drive is compromised.
  • Replay attack protection, perfect forward secrecy.
  • DPI resistance: all transport and handshake messages look like random data without any repeating structure. Message lengths and timestamps can be hidden with noise.
  • Relatively small codebase:
    • 6 screens of transport protocol;
    • 7 screens of handshake protocol;
    • 2 screens of verifier related code;
    • 2 screens of chaffing-and-winnowing related code;
    • 1 screen of AONT related code;
    • 3+3 screens (UDP and TCP) of server related main code;
    • 2+2 screens (UDP and TCP) of client related main code.
  • Sufficient throughput: my Intel i5 notebook CPU under Go 1.5 gives 786 Mbps of UDP packet throughput.

Russian translation of Samsung TV spies on viewer

February 20th, 2015

Just made a Russian translation of Bruce Schneier's Samsung Television Spies on Viewers.

Blowfish securing ivi.ru

February 2nd, 2015

Wrote an article (in Russian) about how we use the Blowfish encryption algorithm for various tasks related to security and performance: proof-of-work, hashing and MAC functions.

Boycott Docker

February 1st, 2015

I have created the boycottdocker.org campaign website. Together with other software developers we feel huge pain from that inconvenient tool that destroys networking principles. It must look really good from the point of view of big cloud computing corporations who run thousands and millions of containers. But I wrote all of that from the perspective of an ordinary, not so big company with several hundred servers. Docker, Kubernetes and Mesos do not help at all. Many things are impossible to do if they do not fit within their strict architectural borders: you have to spend more time fighting with them. In most cases it is much easier to write your own shell scripts that automate the job exactly as you wish, with maximal impact on productivity, without any cumbersome cloud-specific technologies.

Moscow’s Crypto InstallFest

August 10th, 2014

There was a 3-in-1 event in Moscow a week ago: a PGP keysigning party, a cryptoparty and an installfest, called Crypto InstallFest, organized by Russia's Pirate Party. Several dozen people came. There were various speeches and lectures, a video link and chat with Runa Sandvik (she is involved in TorProject.org), and workshops at the end. Some were helped with installing Ubuntu, others with PGP (GnuPG of course). There were also many discussions about cryptocurrencies and Bitcoin (which I can not call a cryptocurrency). There are some discussions and photographs in social networks: Vkontakte, Facebook.

Minimalist simple high performance secure VPN daemon

July 30th, 2014

Several days ago I decided to make an alternative to OpenVPN: GoVPN. OpenVPN uses the rather slow HMAC for message authentication and has no zero-knowledge password authenticated key exchange. It is pretty simple, but with not so high a security margin and performance.

I have already written a working (but possibly still buggy) daemon in the Go programming language. It uses some of the fastest crypto algorithms available today and achieves a zero-knowledge, mutually authenticated, pre-shared key based key exchange. All derived keys are per-session, so even if the PSK is compromised, there is no way to decrypt captured traffic (the perfect forward secrecy property).

It does no interface, IP-address or routing management: that is the task of the underlying OS facilities. Currently it can work with only a single client, but I am planning to fix that so it can be used with many clients simultaneously. Moreover, Secure Remote Password could be a better choice, allowing humans to use memorable passwords instead of 256-bit keys.

I think the main comparative advantage is the small code size, which can be easily analyzed, audited and fixed. From a technical point of view it uses Salsa20, Poly1305, Curve25519 and DH-EKE with a PSK.

Minimalistic simple IRC server written in Go

May 11th, 2014

The Go programming language seems very interesting to me. I programmed for several years in Perl and Lua, and lately in Python. Go is like C, but with the convenient features I wished for. I had never coded in it before (except two half-screen sized functions), and this weekend was the first time I decided to write something useful.

As the XMPP organization killed the global Jabber network (by requiring TLSed inter-server connections), I looked for a good global chatting solution. (Un?)fortunately only the IRC protocol seems to be simple enough, and there are clients for every available platform, I presume. Big global IRC networks are not protocol compatible between themselves. Existing IRC daemons are not so easy and quick to set up. Anyway, even in the XMPP world there are islands of separated servers, so even if my server can not communicate with other ones, so be it.

I used the miniircd IRC daemon: it is written in Python, does not have configuration files and satisfies all my IRC needs. But as I decided to try Go, I rewrote it. A single executable binary and a pretty fast working daemon with the ability to save logs and channel states. https://github.com/stargrave/goircd

I still have not covered all of its commands with unit tests (although I began with the TDD development principle, the wish to chat with my friends was stronger), so there are probably bugs. And as I do not have any experience with the language, bugs are bound to exist. Currently it works pretty well for private/personal use. You can easily create TLS connections with the crywrap utility. And of course it is free software!

Moscow cryptoparty (2013-11-03)

November 6th, 2013

There was a rather big cryptoparty in Moscow several days ago. As for me, it went rather well: people who were not convinced before left satisfied with the software they already used (time-proven TrueCrypt, GnuPG and Pidgin's OTR plugin) and with newly installed privacy-saving software of the same kind (dm-crypt/TrueCrypt, GnuPG and Pidgin's OTR plugin). Not only privacy related issues were talked about, but cryptoanarchism related ones too. The theoretical lecture and software installation workshop lasted for nearly four hours. After that there was a rave music afterparty. Many kudos to the organizers (not perfect, but good enough for the first time with so big an audience) and to the people interested in their own privacy and anonymity. I hope most of them enjoyed it and will visit the future ones.

Photograph from cryptoparty

Here is my more detailed opinion and my answers to some of the critique (originally written in Russian).

I would like to share my subjective opinion on how the recent cryptoparty in Moscow went (although I am of course not a neutral party, I was involved in the preparations to a lesser degree than others). Third-party blogs carry mostly negative reviews like "could not have been worse", "complete fail" and the like. I rather completely disagree with that. Perhaps people were simply exaggerating, or else wanted too much, the impossible.

Which failures were named:

  • the opening speeches with "we ourselves will find and punish the criminals". Yes, this is certainly censorship, a manifestation of something authoritarian, exactly what cypherpunks fight against. However even I (and I know how to nitpick) decided to let it go, since it is clear what they wanted to convey: cypherpunks do not do all this to cover for criminals, they despise all kinds of scoundrels too, but any technology will always carry both good and evil. The malicious parties (those who want to deprive us of privacy) will always show only the bad side.
  • the opening speech with "Facebook and email are not secure anymore" of course also begs the question "and when were they, and who ever thought they were?", but again I personally decided not to nitpick, and I can quite believe that most non-technical people were seriously convinced of the opposite and that Snowden probably shocked them, although he merely confirmed the fact that yes, no assumptions needed, you are being watched and listened to
  • much more serious is that proprietary closed software was demonstrated (Microsoft Windows, Adobe Reader), that services with absolutely no respect for privacy (Google+) were used for the video stream, and that the organization was done on Facebook. This is undoubtedly wrong, but Mikhail noted more than once to the attendees that only open and free software must be used whenever possible. Not everyone can jump from their MacOsX and iOS straight to FreeBSD or GNU/Linux. I personally apologized for the demonstration involving Adobe, but it violated nobody's privacy, and in order not to delay people even longer and not to dig out my proper free-software computer, we decided to save time. Creating the presentation outside of LaTeX/beamer was something I could not allow myself, though. But it was all done in a hurry and for the first time, so to be on the safe side we decided to step back a little from proper cypherpunk software and services this once
  • the unrehearsed introduction. Well yes, there was that. It was even amusing. The first pancake always comes out lumpy; lesson learned. It was just five minutes of trying to wedge in something that would attract more people than just the techies who came to discuss BitCoins

Firstly, this is not Germany, where people really are seriously concerned about their privacy. There, as they say, people actually get punished for illegal use of BitTorrent. Here it is hard even to imagine such a thing, and people hardly know the difference between free software and freely distributable software. That is, people here do not know the price of privacy (since they have not yet lost it on such a scale) and are less educated. It is simply too early for us to hold a massive for-and-against discussion of the Cryptocat service in the hall.

Secondly, the organizers quite rightly stressed more than once that the target audience is, roughly speaking, housewives with Internet access, who need to be shown the emerging dangers and how they can be fought, in the simplest terms. During the workshop Mikhail, as it seemed to me, gave sound recommendations, showed a broad range of software (why OTR is better than the others and what its drawbacks are, alternatives to proprietary clouds), demonstrated SMP live in practice, and called on people to cooperate with each other. People came up to me asking about the strength, relevance and acceptability of TrueCrypt, PGP and GNUnet software. Arranging mass clattering on keyboards was technically impossible, since only a minority came with their own computers, and the goal of visiting the cryptoparty was clearly not to equip their tools and communication channels with cypherpunk software.

Thirdly, the organizers decided to arrange something more than herding people into a room with a projector and a microphone and torturing their hardware so that the NSA/FSB would not get at it. Instead of only cypherpunk topics, cryptoanarchy was also touched upon, a subject hardly mentioned even in other countries. An atmosphere was created, with musical accompaniment and an attempt at a show. They arranged an afterparty rave, but I cannot judge it since I am a fan of much heavier and more extreme music.

Without them there would have been nothing; hardly anyone would have lifted a finger. The first pancake came out lumpy, but everyone has taken the mistakes into account. Like a first sexual experience, it may turn out to be not quite what was expected. But the most important thing: people walked out with the software installed, people walked out understanding the need for socialist millionaire protocols, people walked out caring about the usability of TrueCrypt and of PGP implementations. So it did not pass without a trace or in vain. It can (and should) be done better, but for the first time it was more than excellent.

Why I am switching back to *BSD

June 5th, 2013

I was a FreeBSD user for six years and worked with its versions from 5.0 to 7.0. Then too much of my work turned out to involve GNU/Linux related subsystems exclusively, and it was easier for me to switch to yet another UNIX-like operating system temporarily.

I tried several distributions but settled on Debian. My requirements were:

  • a mature, stable and reliable system without any bleeding edge software. I do not worry that there is no latest version of Firefox, for example. The one included in the stable Debian distribution fully satisfies me. Maybe it is not as fast as it could be, but it is mature and working.
  • a more or less permanent overall distribution architecture without any sudden surprises after yet another package upgrade. Of course sometimes that can not be avoided, but serious changes must always come in a major software/distribution version, which is a rather seldom event.
  • a big collection and wide availability of various software. Debian has one of the biggest package collections, and all of their compiled binary versions can be easily installed using a single command. Of course you must trust its maintainers; I trust and rely on them.
  • its basic installation should not have anything that I am going to remove as a first step: just a minimal bunch of tools and daemons. Ubuntu, for example, does not provide that: I have to remove huge piles of GNOME-related things and only then install my preferred ones.

Even now Debian is the single distribution that fits those requirements. But several weeks ago I was very disappointed to hear that most of its developers support integration with systemd.

You see, modern GNU/Linuxes are not UNIX-like OSes with UNIX-way hackerish concepts anymore. UNIXes, in my opinion, were always very beautiful and smart programmers' creations with really elegant task solving. Most GNU/Linuxes have lost that property.

For several decades there were quite few interprocess communication choices. Most of the time it is either plain text or, unfortunately, binary data flowing between pipelines, pipes, domain or network sockets. Each daemon representing any subsystem can be more or less uniquely determined by a socket path or a pair of network address and port. In nearly all cases that can satisfy anybody.

Even in the very early days of UNIX systems, hackers preferred plain text and similarly driven protocols and file formats. Though relatively big SMTP responses are not as compact as binary ones could be, especially on the slow links of that time, hackers preferred human readable choices anyway, because they are simple, easy to debug, easy to maintain and easy to use.

But GNU/Linux does not like the idea of beautiful, clever decisions and long time proven software. Its developers (I can not call them hackers in most cases anymore) have to reinvent the wheel and create yet another incompatible solution, like several IPCs before and DBus itself. It requires heavy dependencies, it does not use well known socket-like paths and addresses, it uses an unreadable binary protocol, it is slow and it neither guarantees any delivery nor has any buffering queue.

Access to various low level hardware devices used to go through simple filesystem-like device nodes. Of course many of them call for standards, and audio has one: Open Sound System, represented by entries inside /dev. An easy to use, easy to implement, proven and mature system. If you want to stream audio data over the network, you can easily use UNIX power to connect it, for example, with either a pipe or a network socket.

GNU/Linux folks do not understand that elegant solution and invented ALSA, aRts, ESD, NAS and, at last, PulseAudio. So many reinvented creations for a rather simple thing. Of course OSS is not the right solution if you have to mix various sound inputs and outputs of both hardware and software modules. But JACK does this job pretty well. GNU/Linux developers do not think so, again.

What about the operating system's initialization part? You have various daemons that should be started and controlled. You have to do various filesystem related steps and somehow manage process execution. All those tasks have been done for a long time using the shell interpreter, which is intended to solve them. In fact each daemon has a small shell script used to control its behaviour. Hackers need to glue those daemons together. To me, including trivial plain-text metainformation as script comments and creating symbolic links based on that metainfo, with a number included to force the right ordering, as in System V, seems a very elegant solution.

The UNIX way is to have many small tools, where each of them does a single job, but does it well. A simple separate initialization system, a simple separate logging system, simple separate shell interpreters, simple IPC socket-oriented libraries, simple daemons, cron, inetd and so on. Looks simple, clear and nice.

You are wrong! Modern GNU/Linuxes can not accept that, because they are missing a program written in a compiled language (one that does not depend on already existing software for controlling process flows, i.e. shells), with its own IPC dependency and its own declarative language: a bloated combine of initialization, logging, cron/at-ing, inetd-ing and DBus/socket listening systems at once. Wait, systemd is pretty modular: several dozen separate executables. Hackerish SysV is just a shell interpreter with several shell scripts. Thirty years ago logs were written to rather small hard drives in plain text, but today it seems that hard drives have become much smaller and more expensive, and systemd decided to write binary logs that are unreadable by humans and unprocessable with any kind of sed/awk/shell/perl tools.

I still do not understand why GNOME and derivative distributions (I am sure that udev, systemd, dbus and GNOME are a single aggregate) do not use very simple mailcap files to decide what to do with various kinds of data. mailcap contains plain text lines with a data content type and shell script code saying what program you need to run and apply to the data. Just find the line by its content type and execute the related command line. This can be done with a single sed call. Just a simple plain text file to rule all the user's software preferences. GNOME instead has to pre-run software that will register itself on DBus (it should be already running), then other software must create a proper message and send it over DBus, hoping that someone will catch it and do what the user probably wants. It is awful.

And at last I see in Debian mailing lists that they are going to remove the local sendmail server. I see what is happening: when systems are created by very clever hackers, they are very cool for educated technicians and other hackers. When the ordinary labour crowd falls into this world, it gets ruined. Usenet was destroyed like that. Email etiquette has mostly disappeared, replaced by top-posted, hugely quoted HTML messages, after user-friendly email clients were born.

Security is not compatible with user-friendliness. Simple clever hacks are not compatible with the classical user's world view. Developers never speak the same language as users. There is always a separation between developer-friendly and user-friendly. They can not coexist, just as servers are pretty different from desktops.

The current Debian is a very developer and server friendly system, while Ubuntu aims to be user-friendly. Systemd is great for desktop requirements, so let's integrate it into desktop systems. But why would one replace cron/at, SysV/rc, inetd, sockets, syslog and devnodes with a single all-in-one bloated monolithic combine and remove sendmail? What will be left of UNIX itself? Arch Linux is going to merge /bin and /sbin into /usr/bin, so I won't even find /bin/sh in that OS. It is not a UNIX-like system anymore. It is yet another unmaintainable pile of compiled monolithic POSIX-compatible (I hope) code.

Of course there are really true hackerish UNIX-like GNU/Linux distributions, but all the ones I know require much manual work. The free software *BSDs do not, as they have cool ports collections and a well maintained, high quality overall system design (not a pile of absolutely different software pieces).

Zsh killer features

March 31st, 2013

At last I realized that the zsh shell is really much more useful and pleasant to use than tcsh, and far more so than bash. Many shells provide different very cool features that look like killer features, but in most cases, for me, all of them see seldom use. tcsh has plenty of history substitution options, and bash has a large quantity of parameter expansion techniques. But I hardly use even a small piece of them. Of course they can greatly reduce the overall character input count, but they are too bloated to remember.

One of the most often mentioned zsh features is its command completion. It is a convenient possibility to use fancy menus to select the process you are going to kill, or the git subcommand to execute, or maybe just to choose the corresponding directory or filename. Well, sometimes, in my opinion, this can be pretty useful, but in most cases exploring those menus and visually analyzing all those entries leads to too high a human-computer interaction delay, and I will enter two more characters of the filename and complete it with Tab faster. Entering part of a filename and hitting Tab is one context, but looking for the necessary entry is an incomparably different one. Context switching is an expensive operation.

Moreover, all those completions can be very relaxing: you will forget your file hierarchy, forget what options a command has, forget what targets exist in your Makefile, forget how to easily automate PID saving and killing by it, forget how to make cool shell aliases or write yet another extra useful small Perl script. Of course there is no distinct border for this unskilled relaxation: if there are files "foo" and "bar", then obviously there is no need to force a hacker to type the full name. Transparent observation and completion of remote SSH directories is another undoubtedly useful feature. But all of those completions exist in other shells besides zsh.

Anyway, there are some killer features that made me a hard zsh fan, and currently there is no talk of switching back to either tcsh or bash. Here they are:

  • multiline editing capabilities are extremely useful and avoid creating many temporary one-time separate shell scripts. And of course you can easily edit them inside an external text editor.
  • **/*-like path expansion, which saved me from a huge quantity of find invocations; together with *(.)-like things you will forget about find in most cases at all. I had several aliases and external shell scripts, all calling find, but have now thrown them out.
  • command spellchecking: tcsh already had this feature, but bash did not. With high speed typing the error rate is pretty high too, and this feature can save a lot of time and nerves.
  • the autopushd possibility: each cd to a directory acts like pushd and you can easily travel back as in your browser. This feature is particularly useful together with the Z plugin.
  • the autocd option is also present in bash. Its usefulness is a big question, as ambiguity may appear too often with it. With this option you can omit cd before a directory name and the shell automatically understands that you are going to change into it.
  • filename extension related aliases, which save a lot of time otherwise spent entering zathura to view PDF files, sxiv for images and so on.
  • it is much faster than bash. Even without turning off unused extensions it starts faster, runs faster, completes faster. Each dozen milliseconds is nice to spend not waiting for a many-bogomips powerful computer.

And there is a separate killer must-have plugin that I first met when using bash (it also works with zsh of course): the Z directory jumper. It tracks each directory change and offers quick jumping to previously visited places, identified by a regular expression and directory visiting frequency. And it works perfectly with autocd and autopushd.

The zmv feature looks very promising and it seems it will replace another bunch of Perl/shell scripts on my computer. However, it has some learning curve, of course.