Open Data – how to make it succeed, how to make it fail

This is a talk I gave on September 26th as part of an Ignite session at the hackathon Hack4DK. The hackathon was organized by the Danish Agency for Culture and centered on recently released cultural heritage data. An Ignite talk means I had to speak for exactly five minutes, accompanied by exactly twenty slides (PDF), each displayed for exactly fifteen seconds.

Below is the actual speech I gave:

As you can read in the program for this event, I’m a software engineer at Magenta and a board member at Open Space Aarhus, our local community hackerspace. I am also an active Fellow of Free Software Foundation Europe.

This means that my background is in professional free software development AND in the hacker community around Open Space Aarhus. You might say that I represent a hacker’s point of view.

In free (or “open source”) software, the things you need to be able to do with a program are quickly described: You need to be legally entitled to USE, STUDY, CHANGE and DISTRIBUTE the software you work with. This enables sharing and user freedom and avoids expensive licensing.

In the hacker community, our slogan is, somewhat more playfully:

Build what you need, share what you build

AND

Be awesome (and have fun).

From both perspectives the requirements for open data are the same: We must be legally entitled to use them AND to share them – to distribute them ourselves.

If I am to build a free software app from your data, anyone must be allowed to use it, for any purpose. If people are to share what I build, it must be legal for them to do so. If not, my users might get sued.

This means that open data must always meet the following requirements:

  • A free license, for instance the Creative Commons license used by Wikipedia
  • Redistribution and copying must be allowed
  • The data must be available in formats following open standards

Conversely, data are NOT open if they

  • have a license that limits commercial use in any way, or
  • don’t have a FREE license, or
  • don’t have any license at all, or
  • are only available in closed or patented formats.

Apps built on such data are not freely hackable and distributable in the sense set out in, for example, the Open Definition (http://opendefinition.org/okd/).

People from Wikipedia, from Creative Commons and from a plethora of excellent organizations spoke at last year’s Hack4DK event, and everybody contributing to this year’s event should be aware of these things. But if I look at this year’s contributors of data, several offer data with no license or with non-open licenses, which makes the data useless from an open data perspective.

One site affirms that its data are experimental and not to be used for commercial purposes. I wouldn’t dream of touching such data in an “open” context like a hackathon.

Worse, the data in question are apparently graphical renderings of maps that are hundreds of years old and thus in the public domain. So these contributors are not just offering data, they are simultaneously removing these data from the public domain and limiting their usefulness to the public.

On another site I find lots of nice and useful data – but, in many cases, no license!

I might claim good faith and use the data anyway, but if no license is given, this implied permission could always be revoked and my customers might get sued. I do trust their good intentions, but I frankly think that someone who chooses to call themselves “Open Data Aarhus” should know better than that.

And finally, an image of a statue offered for download by an art museum is accompanied by very hostile copyright language – which is also pointless, as that statue passed into the public domain centuries ago.

The point here is: If you want to open your data, don’t do it grudgingly. You don’t need hostile copyright language; what you do need is a nice and clear license allowing everybody to use, share, remix and distribute your data.

Cultural heritage data could play a very important part in a free and open society. The possibilities are virtually endless. But we must be free to use them.

Put your data out there under a clear, permissive and irrevocable license and allow users and businesses to share and redistribute them.

In that way a lot of very valuable knowledge and a lot of very valuable works of art may form the basis of many valuable contributions to our modern, digital culture.

Happy hacking! And thanks for having me here today.

I believe the organizers recorded the event on video, and I’ll post the video here as well when it’s available – which is, unfortunately, not just yet.