Fellowship interview with Leif-Jöran Olsson

Leif-Jöran Olsson is a language technologist and XML enthusiast with a long history in the Swedish solidarity movement. I sat down for an interview with Leif-Jöran and asked him about his background, his education and the various projects he’s been involved in.

Stian Rødven Eide: A major part of your work has centered around language technology (LT). What was your point of entry to this field, and how does it relate to your dedication to Free Software? Were you already interested in Free Software when you started your education?

Leif-Jöran Olsson: I was initially very interested in usable design, and joined the Mechanical Engineering programme. But, after two years, I found out that human communication was much more fascinating. My introduction to Free Software came after Gymnasium, where we mostly used proprietary software like Turbo C++ and Turbo Prolog. Since I come from a rather unprivileged background, and could not afford to buy software, this prompted my search for free tools. I later moved to Uppsala with my own family and attended the Master’s programme in Language Engineering, which, coincidentally with my search for education in human communication, was started in the autumn of 1994.

SRE: Some of your earlier projects have had a focus on machine translation. How does that relate to your later involvement with Språkbanken (the Language Bank) at the University of Gothenburg? Has Free Software played a part in your work?

LJO: While not the best venue for Free Software historically, machine translation was one of the primary areas when I worked in the Department of Linguistics at Uppsala University from 1998 until 2003. Here at Språkbanken, however, we have a heterogeneous research environment for language technology infrastructure, primarily focusing on Free Software, which is really great. We are not doing machine translation at all here. Being a rather shy business, machine translation has got many proprietary and secret tools involved – quite contrary to Free Software ideals. Instead, there is a focus on hard results, meaning that statistical methods, which are cheap in labour, are favoured. The machine translation research and work in Europe is mostly carried out in EC projects with large companies involved. This makes the real knowledge and gain for society rather small. In Uppsala, we did rule-based transfer translation and chart parsing, which connects more to linguistic theory than statistical methods do, and one of my tasks was to manage a controlled vocabulary, which makes the translation easier. But here, in my work at Språkbanken, Free Software has played a major and contributing role. I am also grateful to the director of Språkbanken for letting me use some of my time to work on eXist-db.

SRE: That leads me to the next question. One of your most active software projects at the moment is the eXist XML database. Why did you choose to get involved in that project? What are the advantages of using eXist-db rather than an SQL database?

LJO: We had been working with SGML and later XML technologies for a long time, annotating the corpus materials used in the research. We were using eXist-db in our work and wanted to contribute back. This resulted in an active involvement in the project. SQL databases are good for strictly regular or structured (the S in SQL) relational data. XML, on the contrary, is all about hierarchy and sequence. This is the power of the information model. Irregular relations and annotations of language material are very good examples of where XML technologies fit. Many people draw the conclusion that XML is too verbose and bloated, confusing the serialised human-readable format with the information model. Remember, there are highly compact binary serialisations too. Almost all previous and current LT tools use different input and output formats, which makes interaction hard. Being able to do things so easily, like transforming materials with standard tools, is invaluable. Since we are working on infrastructure, it is a natural choice to use an XML database, since you can avoid the overhead of parsing the data every time you want to use the linguistic annotations and corpus materials in interaction with yet another tool. We also have the semantic web technologies coming. Of course, you are better off with a relational database in a data shuffling situation, but as soon as you need to do irregular (read: hierarchical and/or sequential) queries, it mostly boils down to a few easily intelligible lines in XQuery, rather than pages of SQL code.
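
To make the point about hierarchical and sequential queries concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It is not taken from Språkbanken or eXist-db: the tiny corpus, the element names (text, sentence, w), the pos attribute and the function name nouns_before_verbs are all assumptions made up for illustration. The query asks for every noun that is immediately followed by a verb within the same sentence, which relies on both nesting (words inside sentences) and document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated corpus: the element and attribute names are
# illustrative assumptions, not an actual Språkbanken format.
CORPUS = """
<corpus>
  <text id="t1">
    <sentence id="s1">
      <w pos="DT">The</w>
      <w pos="NN">bank</w>
      <w pos="VB">lends</w>
    </sentence>
    <sentence id="s2">
      <w pos="NN">Language</w>
      <w pos="NN">banks</w>
      <w pos="VB">grow</w>
    </sentence>
  </text>
</corpus>
"""


def nouns_before_verbs(xml_text):
    """Return (sentence id, noun, verb) for each noun directly followed by a verb."""
    root = ET.fromstring(xml_text)
    hits = []
    # Hierarchy: visit every sentence, however deeply it is nested.
    for sentence in root.iter("sentence"):
        words = sentence.findall("w")
        # Sequence: pair each word with its immediate successor.
        for current, following in zip(words, words[1:]):
            if current.get("pos") == "NN" and following.get("pos") == "VB":
                hits.append((sentence.get("id"), current.text, following.text))
    return hits


print(nouns_before_verbs(CORPUS))
```

In a relational layout the same question would typically need a words table with explicit position and sentence columns plus a self-join on adjacent positions; in the tree, order and nesting are already part of the data.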

SRE: You have also taken it upon yourself to maintain the recently liberated bookkeeping software JFSAccounting. As such software often needs to be adapted to every country’s specific laws, Free Software solutions are not always available without a certain amount of work involved. Do you find that there still are missing pieces in the Free Software ecosystem, with regards to the basic tools needed to run a business or organisation in Sweden?

LJO: Accounting software has certainly been one of the map’s white spots, and administrative tools for managing organisations are generally scarce. This I had experienced first hand through my involvement with solidarity movement organisations, and that’s why I took the opportunity to begin liberating the accounting and administrative tool JFSAccounting. The first publicly available liberated version is to be released at FSCONS 2009 (I will prepare fribokföring.se for the promotion of this to organisations). The piece missing in the tool is a member register (matricle), as it initially was aimed at businesses. But the customer register part can hopefully be adapted with the right terminology for a release next year.

SRE: You’re also involved in the Swedish Syndicalist movement, especially through SAC (a federation of local workers’ unions). According to its principles, SAC is built upon political independence, a decentralised structure, local democracy and solidarity. To a certain extent, this seems to mirror some of the basic values common in the Free Software movement. Do you think that workers’ organisations such as SAC can help Free Software adoption among businesses and public institutions?

LJO: Actually, it was quite a hard job to make the federation accept a policy on primarily using Free Software, something that was finally achieved during the spring of 2008. The federation’s servers have been running Debian GNU/Linux for years, but it was much harder to get a policy for using Free Software on the client machines. Fear of the unknown and the comfort of the habit were the main reasons for this. There are still quite a few Free Software advocates in the different local unions, so yes, I think it can help the adoption of Free Software in other organisations. Many are engaged in several local, national or international organisations beyond their union. Eventually, people get used to the concept of Free Software and regard its freedoms in the same sense as the working class struggle. They realise the common ground they share with the Free Software hackers, and then, they don’t want to go back to proprietary software.

SRE: Through your involvement with SAC and your own company, aptly named Friprogramvarusyndikatet (The Free Software Syndicate), you have also established Serengeti, a network for solidarity and Free Software that offers free hosting for non-profit organisations, as well as a mailing list for discussions. Can you tell us a bit about the background for Serengeti? Do you have plans to expand its activities?

LJO: As I mentioned, I have met quite a bit of fear of technology, and an ignorance of the negative consequences of putting yourself completely in the hands of proprietary market actors. At the same time, many people that are attracted to Free Software are afraid of politics and only see Free Software as neutral and apolitical. This resulted in us forming a loose network called Serengeti. We are aiming for a more stable network that can promote the use of Free Software in solidarity movements, and also help bridge the surplus of knowledge from therein. Building on tradition, we do it bottom up, starting out with a mailing list.

Our warm thanks to Leif-Jöran for taking the time to answer our questions. You can read more about him and his projects on his Gothenburg University page.

Fellowship interview with Andreas Tolf Tolfsen

Andreas Tolf Tolfsen is a web technologist, developer and aspiring musicologist – who works at Opera Software, and regularly fights for digital freedoms. I sat down for a Jabber session with Andreas, asking him about his work, his life and his music.

Stian Rødven Eide: Through your employment at Opera Software, you work quite a lot with web standards. What are the difficulties in making a browser display pages correctly? Has HTML 5 posed particular problems?

Andreas Tolf Tolfsen: The great thing about the web is that anyone can do it. The concept behind it is the principle of universal accessibility; that anyone should be able to read its contents. I am convinced that the web will have a greater impact on the world than the advent of the printing press, in that everyone, irrespective of their technical experience, is invited to participate.

The bad thing about the web is that anyone can do it. The more people contribute, the higher the chance that someone will break something. The use of invalid code syntax, lack of standards compliance, proprietary formats, and uncharted behaviour are all challenges facing anyone who attempts to make sense of various de facto web tag soups.

Luckily we have browsers which actually facilitate, and at times encourage, this kind of behaviour. Traditionally, web standards have advocated one way of doing things, while web browsers, on the other hand, have tried to make the best out of what they are presented with. Because probably as much as 94 % of the web consists of pages with invalid syntax, we should rather be asking ourselves if there is a better way of designing standards.

Web standards have generally been about telling people how to do things, and not so much about what the expected results are. In particular, web standards do not cover how browsers should handle exceptions to the sets of strict rules in the event that the syntax is not semantically correct. Additionally, few web standards are concerned with backwards compatibility, which is a major concern for web browser manufacturers.

So, the biggest part of the job with getting web pages to be displayed correctly is actually to figure out what the correct behaviour is. In this respect, HTML 5 solves more problems than it creates. A big part of this involves fixing HTML 4, which contains parts known to be wrong. HTML 5 will still be a big advance in attaining open standards on the web. For the first time, all browser manufacturers, and dozens of volunteers, are involved in the drafting of the specification.

SRE: Like Mozilla, Opera has decided to implement Theora and Vorbis support for the <video>- and <audio>-tags. Do you think HTML5 has a chance of making Ogg Theora and Vorbis more established standards, even though they were left out of the official specification?

ATT: Currently, there is no good way of embedding video and audio in web pages. A web developer must follow different approaches dependent upon operating system and web browser species. This is inconvenient, and most fall back to using a proprietary Flash solution. This is unfortunate, because it defies the entire point of open web standards.

With the HTML 5 specification, using the Ogg codecs was initially proposed. Apple, however, decided not to implement Ogg in Safari, citing “submarine” patents as a key issue. The result is that we end up in a “plugin prison”, where the video and audio files that are supported are entirely dependent upon what codecs Quicktime (or Windows Media Player, for that matter) supports.

There is no point in HTML 5 specifying something that we know browsers cannot implement. But in reference to your question, I think what Chromium, Mozilla and Opera do will have only limited effect, seeing as they control only a small segment of the market, compared to Microsoft Internet Explorer. Still, with Ogg being the standard of the world’s largest website, Wikipedia, I think Ogg has come to stay. It’s good to see Ogg natively implemented in the majority of browsers, but the goal of having a universal video and audio codec for the web will take a few more years.

However, I find it interesting that the Chromium Project has implemented Ogg support in their fork of WebKit. I hope that the folks over in the official WebKit Project camp will follow Chromium’s example, and do the same thing. Even if Safari is without Ogg support, there is no reason why the free software alternative WebKit should be.

SRE: You also work at E-tjenesten, a Free Software cooperative that you co-founded and that focusses on web development. Can you briefly describe the projects that you work on there, such as talko and Bikube? What are the long-term goals for the cooperative?

ATT: Bikube (Norwegian for “beehive”) is a tool for collaboration. It lets you keep track of work and deadlines, share files, discuss, and get stuff done. talko is actually the software that runs beneath this website, which is yet to be launched.

At E-tjenesten SA we are trying to phase out the various consultancy work we have been doing, and to focus more on web application development. One of our goals is to develop useful tools that let people do what they want, the way they want to do it: we don’t force our own beliefs on to our customers.

SRE: As a dedicated communist, you have been active in the political party Rødt (“Red” in English), particularly working on campaigns for Free Software, integrity and filesharing. Do you regard these causes as a natural part of contemporary socialist ideology? Is the dedication to such issues widespread among the Norwegian Left?

ATT: Certainly! The common thread is that knowledge should be made accessible to all, and that the fantastic things made possible through the internet and collaboration might lay the foundation for a new form of society. I think this concept is quite widespread, in the sense that if people are given the right tools, and access to free knowledge, one is taking large portions of market-governed areas out of capitalistic control, and into communistic control.

File-sharing benefits society, but violates the old model of payment for film and music. Unfortunately, the industry is waging war on their own customers instead of exploiting the possibilities that new technology offers.

Many see file-sharing as a question of right and wrong according to today’s legislation, but this is not what the campaign for legalizing file-sharing is about. According to present legislation, file-sharing copyrighted material is almost always illegal. The campaign, however, raises a political question of whether this legislation holds any function today.

Through collaboration, millions of people all over the world have built the world’s largest library and made it accessible to all. One is able to share music, film, software, and knowledge in a scale not before possible. The distribution of this material is for all practical purposes free. Ten years ago, it was virtually impossible to have access to all the world’s culture 24 hours a day, but as my generation grows up, it’s seen as a necessity.

Most will agree that the internet is the future for distributing film, music and digital content. Subsequently, most will also agree that the industry needs money to continue production of good music and film. The most important divide is, however, between those who want to apply the same old models of financing that we have today on the internet, and those who understand that a market economy with a “pay by track” solution doesn’t work, and is never going to do so.

The number of digital copies is not limited. What limits the distribution of arbitrary copies of a song is the speed of the network you are on, and modern peer-to-peer file-sharing programs have solved this issue elegantly: millions of computers in ordinary homes ensure that everything is available, at any time, and thus also solve the problem of net neutrality with “high-speed” lanes to facilitate distribution of this, and other kinds of online content.

A digital copy that is distributed in this way is an abundance. In the real world, when a person buys a CD from the record store, there is one less CD for the rest of us to acquire. On the internet, on the other hand, when someone downloads a CD from someone else, it becomes multiplied. The irony is that the more people who are interested in something, the more accessible it becomes. This should be an ideal situation, but for the record industry it becomes a nightmare when their business model collapses.

The question of file-sharing is largely tied up to the question of copyright. Recently, the people behind the Swedish torrent tracker Pirate Bay were convicted of violating copyright legislation. Among other things, I created the widely popular Filesharer.org campaign to support the accused, and it had an overwhelming response. In just a few days, almost 4000 people uploaded a picture of themselves to show the industry who the “real” criminals were. The campaign got covered by the media all across the world, and even made national television in a couple of countries.

My point here is to show that today’s copyright legislation is outdated and needs to be revised. Richard Stallman has made sensible suggestions as to how we can approach this issue. Interestingly, all political parties in Norway answered “yes” to the question “[i]s the current copyright legislation sufficiently adjusted to today’s digital society?” in a campaign by EFN (Electronic Frontier Foundation, Norway) and FriBit. This means that the climate for a new copyright debate in Norway is good.

SRE: You’re also personally involved in EFN on similar issues. How much momentum has the organisation gained, and how difficult has it been to work for these issues in Norway?

ATT: I’d like to first explain what the EFN is: Electronic Frontier Foundation in Norway is a loosely organized discussion list concerned with civil liberties, privacy and freedom of expression in the digital society. Over the past year, EFN has organized several events, such as a debate on file-sharing where Cory Doctorow was present and a demonstration for a free internet in front of the Norwegian parliament. It has also commented on the Norwegian government’s proposal for the use of open standards in the public sector, and been involved in battling the Data Retention Directive.

EFN plays an important role in Norway, but unfortunately often as a single critical voice in the information politics discussion. I would say they are regarded as a group of enthusiasts worth listening to. There are a lot of highly talented people in EFN, who have been working hard since 1995 (180 members) to build the organization to what it is today (around 1000 members).

With a possible Norwegian implementation of the EU’s Data Retention Directive (Directive 2006/24/EC), requiring telecommunications companies to store traffic data on citizens’ electronic communication (e-mail, SMS, telephone, internet) for up to two years, Norwegians’ right to privacy will be grossly violated. This is an issue EFN, and many of EFN’s members, have been deeply involved in.

The Data Retention Directive was adopted by the EU on 15 March 2006, but the Norwegian government has not officially decided whether the directive should be made Norwegian law or not. According to the EEA agreement, Norway holds a reservation right, as we are not members of the EU. This right has thus far never been exercised. But then, we have never faced a directive representing as great a threat to democracy’s fundamental values as the Data Retention Directive does at present.

The director of the Norwegian Data Inspectorate, Georg Apenes, has warned about yielding to “totalitarian passion”, and Thomas Finneid, board member in EFN, is calling it “[t]he most important debate about democracy in [N]orway since the war”.

SRE: As a pianist, composer and musicology student, you have no doubt been exploring Free Software alternatives for music production and notation. Do you find that Free Software solutions are sufficient for your musical needs? Are there any particular programs you’d recommend to others in your situation?

ATT: Oh, absolutely! I would argue that the most aesthetically beautiful notation software out there is GNU Lilypond. It beats the proprietary alternatives by a good margin. It’s an absolutely fantastic piece of software, as is often the case with GNU software in general.

When I write papers, I use the tool lilypond-book to compile LaTeX articles with Lilypond notation embedded, which is much better than having to export graphic files from proprietary alternatives. I don’t think Lilypond’s gained much hold in the musicological field yet, but it’s certainly encouraging that it’s the best out there.

I also use a piece of software called SPEAR (Sinusoidal Partial Editing Analysis and Resynthesis), which allows you to edit and manipulate partials in sound files. There’s also Audacity, a tool for recording and editing sounds, which I use a lot.

In recent musicological research, especially with work related to music cognition and movement, there has been a surge of new, interesting software developed as a result of a need to find better and more accurate ways of empirically documenting body movement in relation to music. In particular, the research centre fourMs at the Institute of Musicology at the University of Oslo have made some very interesting software that’s used, among other things, for movement analysis of video, real-time audiative analysis, production of sound with embedded control devices (such as a game controller), and for producing sound based on motiongrams of video recordings.

Of course, this software is available under the GPL. This not only encourages others to use and improve it, but also allows critical readers and other musicologists to verify the empirical data collected with the tools. Today, research projects are often granted funding even though the results of the research are not possible to verify (or even to falsify, to check that experiments can be reproduced), because one needs to buy access to closed platforms, or even licenses for the research material itself. This is a good reason why we, by principle, should not trust research done with proprietary, closed-source tools.

SRE: You have been involved in FSFE for several years as a translator and web developer. What is your personal take on FSFE’s current web presence, and what do you think should be improved?

ATT: FSFE is a wonderful organization that does a lot of exemplary work. One of its biggest strengths is its diversity: That we have translations in 30 languages of our website is an incredible achievement, and the fact that we are able to influence debates on free software and open standards on a European level is a proof of the significance of FSFE.

However, despite good results, I don’t always think we are good enough at showing off the results of our work. Another point is that we likely have a good potential to involve our Fellows and other activists better in our activities.

FSFE’s web presence consists mostly of one-way communication through newsletters and news articles on our homepage. There are many great resources there, although as discussed on the web-list some time ago, most of it is poorly organized, and a lot of content is hidden away.

I think it would be good if we started a discussion, not only on web presence, but about communication in general in FSFE, with emphasis on developing tools to help Fellows and other sympathizers, and on improving the general structure of our website.

Even though our policies are good, how we present them also counts. I’m a bit reluctant to go into too much detail on what I see as the biggest areas for improvement, but I hope that people will either heartily disagree, or wearily agree with me, and be inspired to participate in such an effort. Either is good, really, for I think a good discussion on this is needed.

Many thanks go to Andreas for his insightful comments. You can read more about him and his projects on his home page at E-tjenesten.