Paul Boddie's Free Software-related blog


Archive for the ‘Python’ Category

On forms of apparent progress

Sunday, May 8th, 2022

Over the years, I have had a few things to say about technological change, churn, and the appearance of progress, a few of them touching on the evolution and development of the Python programming language. Some of my articles have probably seemed a bit outspoken, perhaps even unfair. It was somewhat reassuring, then, to encounter the reflections of a longstanding author of Python books and his use of rather stronger language than I think I ever used. It was also particularly reassuring because I apparently complain about things in far too general a way, not giving specific examples of phenomena for anything actionable to be done about them. So let us see whether we can emerge from the other end of this article in better shape than we are at this point in it.

Now, the longstanding author in question is none other than Mark Lutz, whose books “Programming Python” and “Learning Python” must surely have been bestsellers for their publisher over the years. As someone who has, for many years, been teaching Python to a broad audience of newcomers to the language and to programming in general, he holds views that overlap with mine about how Python has become increasingly incoherent and overly complicated, as its creators or stewards pursue some kind of agenda of supposed improvement without properly taking into account the needs of the broadest reaches of its user community. Instead, as with numerous Free Software projects, an inscrutable “vision” is used to impose change based on aesthetics and contemporary fashions, unrooted in functional need, by self-appointed authorities who often lack an awareness or understanding of historical precedent or genuine user need.

Such assertions are perhaps less kind to Python’s own developers than they should be. Those choosing to shoehorn new features into Python arguably have more sense of precedent than, say, the average desktop environment developer imitating Apple in what could uncharitably be described as an ongoing veiled audition for a job in Cupertino. Nevertheless, I feel that language developers would be rather more conservative if they only considered what teaching their language to newcomers entails or what effect their changes have on the people who have written code in their language. Am I being unfair? Let us read what Mr Lutz has to say on the matter:

The real problem with Python, of course, is that its evolution is driven by narcissism, not user feedback. That inevitably makes your programs beholden to ever-shifting whims and ever-hungry egos. Such dependencies might have been laughable in the past. In the age of Facebook, unfortunately, this paradigm permeates Python, open source, and the computer field. In fact, it extends well beyond all three; narcissism is a sign of our times.

You won’t find a shortage of similar sentiments on his running commentary of Python releases. Let us, then, take a look at some experiences and try to review such assertions. Maybe I am not being so unreasonable (or impractical) in my criticism after all!

Out in the Field

In a recent job, of which more might be written another time, Python was introduced to people more familiar with languages such as R (which comes across as a terrible language, but again, another time perhaps). It didn’t help that as part of that introduction, they were exposed to things like this:

    def method(self, arg: Dict[Something, SomethingElse]):
        return arg.items()

When newcomers are already having to digest new syntax, new concepts (classes and objects!), and why there is a “self” parameter, unnecessary ornamentation, such as the type annotations included above, only increases the cognitive burden. It also doesn’t help to then say, “Oh, the type declarations are optional and Python doesn’t really check them, anyway!” What is the student supposed to do with that information? Many years ago now, Java was mocked for confronting its newcomers with boilerplate like this classic:

    public static void main(String[] args)

But exposing things that the student is then directed to ignore is simply doing precisely the same thing for which Java was criticised. Of course, in Python, the above method could simply have been written as follows:

    def method(self, arg):
        return arg.items()

Indeed, for the above method to be valid in the broadest sense, the only constraint on the nature of the “arg” parameter is that it offer an attribute called “items” that can be called with no arguments. By prescriptively imposing a limitation on “arg” as was done above, insisting that it be a dictionary, the method becomes less general and less usable. Moreover, the nature of Python itself is neglected or mischaracterised: the student might believe that only a certain type would be acceptable, just as one might suggest that the author of that code also fails to see that a range of different, conformant kinds of objects could be used with the method. Such practices discourage or conceal polymorphism and generic functionality at a point when the beginner’s mind should be opened to them.
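
To illustrate the point with a small sketch of my own (the names here are invented purely for demonstration and do not come from the code in question), any object providing a suitable “items” method works perfectly well with the un-annotated form of the method, and the annotated form would not have been enforced by Python at run time in any case:

    # A hypothetical container that is not a dictionary but still offers an
    # "items" method, which is all the un-annotated method actually requires.
    class PairStore:
        def __init__(self, pairs):
            self.pairs = list(pairs)

        def items(self):
            return list(self.pairs)

    class Handler:
        def method(self, arg):
            return arg.items()

    handler = Handler()
    print(handler.method({"a": 1}))               # a real dictionary works
    print(handler.method(PairStore([("a", 1)])))  # so does any conformant object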

As Mr Lutz puts it in the context of a different feature introduced in Python 3.5:

To put that another way: unless you’re willing to try explaining a new feature to people learning the language, you just shouldn’t do it.

The tragedy is that Python in its essential form is a fairly intuitive and readable language. But as he also says in the specific context of type annotations:

Thrashing (and rudeness) aside, the larger problem with this proposal’s extensions is that many programmers will code them—either in imitation of other languages they already know, or in a misguided effort to prove that they are more clever than others. It happens. Over time, type declarations will start appearing commonly in examples on the web, and in Python’s own standard library. This has the effect of making a supposedly optional feature a mandatory topic for every Python programmer.

And I can certainly say from observation that in various professional cultures, including academia where my own recent observations were made, there is a persistent phenomenon where people demonstrate “best practice” to show that they, as software development practitioners (or, indeed, practitioners of anything else related to the career in question), are aware of the latest developments, are able to communicate them to their less well-informed colleagues, and are presumably the ones who should be foremost in anyone’s consideration for any future hiring round or promotion. Unfortunately, this enthusiasm is not always tempered by considered reflection, either on the nature of the supposed innovation itself, or on the consequences its proliferation will have.

Perversely, such enthusiasm, provoked by the continual hustle for funding, positions, publications and reputation, risks causing a trail of broken programs, and yet at the same time, much is made of the need for software development to be done “properly” in academia, that people do research that is reproducible and whose computational elements are repeatable. It doesn’t help that those ambitions must also be squared with other apparent needs such as offering tools and services to others. And the need to offer such things in a robust and secure fashion sometimes has to coexist with the need to offer them in a convenient form, where appropriate. Taking all of these things into consideration is quite the headache.

A Positive Legacy

Amusingly, some have come to realise that Python’s best hope for reproducible research is precisely the thing that Python’s core developers have abandoned – Python 2.7 – and precisely because they have abandoned it. In an article about reproducing old, published results, albeit of a rather less than scientific nature, Nicholas Rougier sought to bring an old program back to life, aiming to find a way of obtaining or recovering the program’s sources, constructing an executable form of the program, and deploying and running that program on a suitable system. To run his old program, written for the Apple IIe microcomputer in Applesoft BASIC, required the use of emulators and, for complete authenticity, modern hardware expansions to transfer the software to floppy disks to run on an original Apple IIe machine.

And yet, the ability to revive and deploy a program developed 32 years earlier was possible thanks to the Apple machine’s status as a mature, well-understood platform with an enthusiastic community developing new projects and products. These initiatives were only able to offer such extensive support for a range of different “retrocomputing” activities because the platform has for a long time effectively been “frozen”. Contrasting such a static target with rapidly evolving modern programming languages and environments, Rougier concluded that “an advanced programming language that is guaranteed not to evolve anymore” would actually be a benefit for reproducible science, that few people use many of the new features of Python 3, and that Python 2.7 could equally be the same kind of “highly fertile ground for development” that the proprietary Applesoft BASIC had proven to be for a whole community of developers and users.

Naturally, no language designer ever wants to be told that their work is finished. Lutz asserts that “a bloated system that is in a perpetual state of change will eventually be of more interest to its changers than its prospective users”, which is provocative but also rings true. CPython (the implementation of Python in the C programming language) has always had various technical deficiencies – the lack of proper multithreading, for instance – but its developers who also happen to be the language designers seem to prefer tweaking the language instead. Other languages have gained in popularity at Python’s expense by seeking to address such deficiencies and to meet the frustrated expectations of Python developers. Or as Lutz notes:

While Python developers were busy playing in their sandbox, web browsers mandated JavaScript, Android mandated Java, and iOS became so proprietary and closed that it holds almost no interest to generalist developers.

In parts of academia familiar with Python, languages like Rust and Julia are now name-dropped, although I doubt that many of those doing the name-dropping realise what they are in for if they decide to write everything in Rust. Meanwhile, Python 2 code is still used, against a backdrop of insistent but often ignored requests from systems administrators for people to migrate code to Python 3 so that newer operating system distributions can be deployed. In other sectors, such migration is meant to be factored into the cost of doing business, but in places like academia where software maintenance generally doesn’t get funding, no amount of shaming or passive-aggressive coercion is magically going to get many programs updated at all.

Of course, we could advocate that everybody simply run their old software in virtual machines or containers, just as was possible with that Applesoft BASIC program from over thirty years ago. Indeed, containerisation is the hot thing in places like academia just as it undoubtedly is elsewhere. But unlike the Apple II community who will hopefully stick with what they know, I have my doubts that all those technological lubricants marketed under the buzzword “containers!” will still be delivering the desired performance decades from now. As people jump from yesterday’s hot solution to today’s and on to tomorrow’s (Docker, with or without root, to Singularity/Apptainer, and on to whatever else we have somehow deserved), just the confusion around the tooling will be enough to make the whole exercise something of an ordeal.

A Detour to the Past

Over the last couple of years, I have been increasingly interested in looking back over the course of the last few decades, back to the time when I was first introduced to microcomputers, and even back beyond that to the age of mainframes when IBM reigned supreme and the likes of ICL sought to defend their niche and to remain competitive, or even relevant, as the industry shifted beneath them. Obviously, I was not in a position to fully digest the state of the industry as a schoolchild fascinated with the idea that a computer could seemingly take over a television set and show text and graphics on the screen, and I was certainly not “taking” all the necessary computing publications to build up a sophisticated overview, either.

But these days, many publications from decades past – magazines, newspapers, academic and corporate journals – are available from sites like the Internet Archive, and it becomes possible to sample the sentiments and mood of the times, frustrations about the state of then-current, affordable technology, expectations of products to come, and so on. Those of us who grew up in the microcomputing era saw an obvious progression in computing technologies: faster processors, more memory, better graphics, more and faster storage, more sophisticated user interfaces, increased reliability, better development tools, and so on. Technologies such as Unix were “the future”, labelled as impending to the point of often being ridiculed as too expensive, too demanding or too complicated, perhaps never to see the limelight after all. People were just impatient: we got there in the end.

While all of that was going on, other trends were afoot at the lowest levels of computing. Computer instruction set architectures had become more complicated as the capabilities they offered had expanded. Although such complexity, broadly categorised using labels such as CISC, had been seen as necessary or at least desirable to be able to offer system implementers a set of convenient tools to more readily accomplish their work, the burden of delivering such complexity risked making products unreliable, costly and late. For example, the National Semiconductor 32016 processor, seeking to muscle in on the territory of Digital Equipment Corporation and its VAX line of computers, suffered delays in getting to market and performance deficiencies that impaired its competitiveness.

Although capable and in some respects elegant, it turned out that these kinds of processing architectures were not necessarily delivering what was actually important, either in terms of raw performance for end-users or in terms of convenience for developers. Realisations were had that some of the complexity was superfluous, that programmers did not use certain instructions often or at all, and that a flawed understanding of programmers’ needs had led to the retention of functionality that did not need to be inscribed in silicon with all the associated costs and risks that this would entail. Instead, simpler, more orthogonal architectures could be delivered that offered instructions that programmers or, crucially, their compilers would actually use. The idea of RISC was thereby born.

As the concept of RISC took off, pursued by the likes of IBM, UCB and Sun, Stanford University and MIPS, Acorn (and subsequently ARM), HP, and even Digital, Intel and Motorola, amongst others, the concept of the workstation became more fully realised. It may have been claimed by some commentator or other that “the personal computer killed the workstation” or words to that effect, but in fact, the personal computer effectively became the workstation during the course of the 1990s and early years of the twenty-first century, albeit somewhat delayed by Microsoft’s sluggish delivery of appropriately sophisticated operating systems to its largely captive audience.

For a few people in the 1980s, the workstation vision was the dream: the realisation of their expectations for what a computer should do. Although expectations should always be updated to take new circumstances and developments into account, it is increasingly difficult to see the same rate of progress in this century’s decades that we saw in the final decades of the last century, at least in terms of general usability, stability and the emergence of new and useful computational capabilities. Some might well argue that graphics and video processing or networked computing have progressed immeasurably, these certainly having delivered benefits for visualisation, gaming, communications and the provision of online infrastructure, but in other regards, we seem stuck with something very similar to that of twenty years ago but with increasingly disillusioned developers and disempowered users.

What we might take away from this historical diversion is that sometimes a focus on the essentials, on simplicity, and on the features that genuinely matter makes more of a difference than just pressing ahead with increasingly esoteric and baroque functionality that benefits few and yet brings its own set of risks and costs. And we should recognise that progress is largely acknowledged only when it delivers perceptible benefits. In terms of delivering a computer language and environment, this may necessarily entail emphasising the stability and simplicity of the language, focusing instead on remedying the deficiencies of the underlying language technology to give users the kind of progress they might actually welcome.

A Dark Currency

Mark Lutz had intended to stop commentating on newer versions of Python, reflecting on the forces at work that make Python what it now is:

In the end, the convolution of Python was not a technical story. It was a sociological story, and a human story. If you create a work with qualities positive enough to make it popular, but also encourage it to be changed for a reward paid entirely in the dark currency of ego points, time will inevitably erode the very qualities which made the work popular originally. There’s no known name for this law, but it happens nonetheless, and not just in Python. In fact, it’s the very definition of open source, whose paradigm of reckless change now permeates the computing world.

I also don’t know of a name for such a law of human behaviour, and yet I have surely mentioned such behavioural phenomena previously myself: the need to hustle, demonstrate expertise, audition for some potential job offer, demonstrate generosity through volunteering. In some respects, the cultivation of “open source” as a pragmatic way of writing software collaboratively, marginalising Free Software principles and encouraging some kind of individualistic gift culture coupled to permissive licensing, is responsible for certain traits of what Python has become. But although a work that is intrinsically Free Software in nature may facilitate chaotic, haphazard, antisocial, selfish, and many other negative characteristics in the evolution of that work, it is the social and economic environment around the work that actually promotes those characteristics.

When reflecting on the past, particularly during periods when capabilities were being built up, we can start to appreciate the values that might have been more appreciated at that time than they are now. Python originated at a time when computers in widespread use were becoming capable enough to offer such a higher-level language, one that could offer increased convenience over various systems programming languages whilst building on top of the foundations established by those languages. With considerable effort having been invested in such foundations, a mindset seemed to persist, at least in places, that such foundations might be enduring and be good for a long time.

An interesting example of such attitudes arose at a lower level with the development of the Alpha instruction set architecture. Digital, having responded ineffectively to its competitive threats, embraced the RISC philosophy and eventually delivered a processor range that could be used to support its existing product line-up, emphasising performance and longevity through a “15- to 25-year design horizon” that attempted to foresee the requirements of future systems. Sadly, Digital made some poor strategic decisions, some arguably due to Microsoft’s increasing influence over the company’s strategy, and after a parade of acquisitions, Alpha fell under the control of HP who sacrificed it, along with its own RISC architecture, to commit to Intel’s dead-end Itanium architecture. I suppose this illustrates that the chaos of “open source” is not the only hazard threatening stability and design for longevity.

Such long or distant horizons demand that newer developments remain respectful to the endeavours that have made them possible. Such existing and ongoing endeavours may have their flaws, but recognising and improving those flaws is more constructive and arguably more productive than tearing everything down and demanding that everything be redone to accommodate an apparently new way of thinking. Sadly, we see a lot of the latter these days, but it goes beyond a lack of respect for precedent and achievement, reflecting broader tendencies in our increasingly stressed societies. One such tendency is that of destructive competition, the elimination of competitors, and the pursuit of monopoly. We might be used to seeing such things in the corporate sphere – the likes of Microsoft wanting to be the only ones who provide the software for your computer, no matter where you buy it – but people have a habit of imitating what they see, especially when the economic model for our societies increasingly promotes the hustle for work and the need to carve out a lucrative niche.

So, we now see pervasive attitudes such as the pursuit of the zero-sum game. Where the deficiencies of a technology lead its users to pursue alternatives, defensiveness in the form of utterances such as “no need to invent another language” arises. Never mind that the custodians of the deficient technology – in this case, Python, of course – happily and regularly offer promotional consideration to a company who openly tout their own language for mobile development. Somehow, the primacy of the Python language is a matter for its users to bear, whereas another rule applies amongst its custodians. That is another familiar characteristic of human behaviour, particularly where power and influence accumulates.

And so, we now see hostility towards anything being perceived as competition, even if it is merely an independent endeavour undertaken by someone wishing to satisfy their own needs. We see intolerance for other solutions, but we also see a number of other toxic behaviours on display: alpha-dogging, personality worship and the cultivation of celebrity. We see chest-puffing displays of butchness about Important Matters like “security”. And, of course, the attitude to what went before is the kind of approach that involves boiling the oceans so that they may be populated by precisely the right kind of fish. None of this builds on or complements what is already there, nor does it deliver a better experience for the end-user. No wonder people say that they are jealous of colleagues who are retiring.

All these things make it unappealing to share software or even ideas with others. Fortunately, if one does not care about making a splash, one can just get on with things that are personally interesting and ignore all the “negativity from ignorant, opinionated blowhards”. Although in today’s hustle culture, this means also foregoing the necessary attention that might prompt anyone to discover your efforts and pay you to do such work. On the actual topic that has furnished us with so many links to toxic behaviour, and on the matter of the venue where such behaviour is routine, I doubt that I would want my own language-related efforts announced in such a venue.

Then again, I seem to recall that I stopped participating in that particular venue after one discussion had a participant distorting public health observations by the likes of Hans Rosling to despicably indulge in poverty denial. Once again, broader social, economic and political influences weigh heavily on our industry and communities, with people exporting their own parochial or ignorant views globally, and in the process corrupting and undermining other people’s societies, oblivious to the misery it has already caused in their own. Against this backdrop, simple narcissism is perhaps something of a lesser concern.

At the End of the Tunnel

I suppose I promised some actionable observations at the start of the article, so what might they be?

Respect Users and Investments

First of all, software developers should be respectful towards the users of their software. Such users lend validation to that software, encourage others to use it, and they potentially make it possible for the developers to work on it for a living. Their use involves an investment that, if written off by the developers, is costly for everyone concerned.

And no, the users’ demands for that investment to be protected cannot be disregarded as “entitlement”, even if they paid nothing to acquire the software, at least if the developers are happy to enjoy all the other benefits of the software’s proliferation. As is often said, power and influence bring responsibility. Just as democratically elected politicians have a responsibility towards everyone they represent, regardless of whether those people voted for them or not, software developers have a duty of care towards all of their users, even if it is merely to step out of the way and to let the users take the software in its own direction without seeking to frustrate them as we saw when Python 2 was cast aside.

Respond to User Needs Constructively

Developers should also be responsive to genuine user needs. If you believe all the folklore about the “open source” way, it should have been precisely people’s own genuine needs that persuaded them to initiate their own projects in the first place. It is entirely possible that a project may start with one kind of emphasis and demand one kind of skills only to evolve towards another emphasis or to require other skills. With Python, much of the groundwork was laid in the 1990s, building an interpreter and formulating a capable language. But beyond that initial groundwork, the more pressing challenges lay outside the language design domain and went beyond the implementation of a simple interpreter.

Improved performance and concurrency, both increasingly expected by users, required the application of other skills that might not have been present in the project. And yet, the elaboration of the language continued, with the developers susceptible to persuasion by outsiders engaging in “alpha-dogging” or even insiders with an inferiority complex, being made to feel that the language was not complete or even adequate since it lacked features from the pet languages of those outsiders or of the popular language of the day. Development communities should welcome initiatives to improve their projects in ways that actually benefit the users, and they should resist the urge to muscle in on such initiatives by seeking to demonstrate that they have the necessary solutions when their track record would indicate otherwise. (Or worse still, by reframing user needs in terms of their own narrow agenda as if to say, “Here is what you are really asking for.” Another familiar trait of the “visionary” desktop developer.)

Respect Other Solutions

Developers and commentators more generally should accept and respect the existence of other technologies and solutions. Just because they have their own favourite solution does not de-legitimise something they have just been made aware of. Maybe it is simply not meant for them. After all, not everything that happens in this reality is part of a performance exclusively for any one person’s benefit, despite what some people appear to think. And the existence of other projects doing much the same thing is not necessarily “wasted effort”: another concept introduced from some cult of economics or other.

It is entirely possible to provide similar functionality in different ways, and the underlying implementations may lend those different projects different characteristics – portability, adaptability, and so on – even if the user sees largely the same result on their screen. Maybe we do want to encourage different efforts even for fundamental technologies or infrastructure, not because anyone likes to “waste effort”, but because it gives the systems we build a level of redundancy and resilience. And maybe some people just work better with certain other people. We should let them, as opposed to forcing them to fit in with tiresome, exploitative and time-wasting development cultures, to suffer rudeness and general abuse, simply to go along with an exercise that props up some form of corporate programme of minimal investment in the chosen solution of industry and various pundits.

Develop for the Long Term and for Stability

Developers should make things that are durable so that they may be usable for many years to come. Or they should at least expect that people may want to use them years or even decades from now. Just because something is old does not mean it is bad. Much of what we use today is based on technology that is old, with much of that technology effectively coming of age decades ago. We should be able to enjoy the increased performance of our computers, not have it consumed by inefficient software that drives the hardware and other software into obsolescence. Technological fads come and go (and come back again): people in the 1990s probably thought that virtual reality would be pervasive by now, but experience should permit us to reflect and to recognise that some things were (and maybe always will be) bad ideas and that we shouldn’t throw everything overboard to pander to them, only to regret doing so later.

We live in a world where rapid and uncomfortable change has been normalised, but where consumerism has been promoted as the remedy. Perhaps some old way of doing something mundane doesn’t work any more – buying something, interacting with public agencies, fulfilling obligations, even casting votes in some kinds of elections – perhaps because someone has decided that money can be saved (and, of course, soon wasted elsewhere) if it can be done “digitally” from now on. To keep up, you just need a smartphone, or a newer smartphone, with an “app”, or the new “app”, and a subscription to a service, and another one. And so on. All of that “works” for people as long as they have the necessary interest, skills, time, and money to spend.

But as the last few years have shown, it doesn’t take much to disrupt these unsatisfactory and fragile arrangements. Nobody advocating fancy “digital” solutions evidently considered that people would not already have everything they need to access their amazing creations. And when, as they say, neither love nor money can get you the gadgets you need, it doesn’t even matter how well-off you are: suddenly you get a downgrade in experience to a level that, as a happy consumer, you probably didn’t even know still existed, even if it is still the reality for whole sections of our societies. We have all seen how narrow the margins are between everything apparently being “just fine” and there being an all-consuming crisis, both on a global level and, for many, on a personal level, too.

Recognise Responsibilities to Others

Change can be a positive thing if it carries everyone along and delivers actual progress. Meanwhile, there are those who embrace disruption as a form of change, claiming it to be a form of progress, too, but that form of change is destructive, harmful and exclusionary. It should not be a surprise that prominent advocates of a certain political movement advocate such disruptive change: for them, it doesn’t matter how many people suffer by the ruinous change they have inflicted on everyone as long as they are the ones to benefit; everyone else can wait fifty years or so to see some kind of consolation for the things taken from them, apparently.

As we deliver technology to others, we should not be the ones deepening any misery already experienced by imposing needless and costly change. We should be letting people catch up with the state of technology and allowing them to be comfortable with it. We should invest in long-term solutions that address people’s needs, and we should refuse to be shamed into playing the games of opportunists and profiteers who ridicule anything old or familiar in favour of what they happen to be promoting today. We should demand that people’s investments in hardware and software be protected, that they are not effectively coerced into constantly buying new things and seeing their living standards diminished in other ways, with such consumption burdening our planet’s ecosystem and resources.

Just as we all experience that others have power over us, so we might recognise the power we have over other people. And just as we might expect others to consider our interests, so we might consider the interests of those who have to put up with our decisions. Maybe, in the end, all I am doing is asking for people to show some consideration for the experiences of other people, that their lives not be made any harder than they might already be. Is that really too much to ask? Is that so hard to understand?

Some Attention to Detail

Thursday, February 14th, 2019

I spent some time recently looking at my Python-like language, Lichen, and its toolchain. Although my focus was on improving support for floating point numbers and arithmetic, of which more may need to be written in a future article, I ended up noticing a few things that needed correcting and had escaped my attention. One of these probably goes a long way to solving a mystery raised in a previous article.

The investigation into floating point support necessitated some scrutiny of the way floating point numbers are allocated when compiled Lichen programs are run. CPython – the C language implementation of a virtual machine for the Python language – has various strategies for reserving memory for floating point numbers, this not being particularly surprising given what it does for integers, as we previously saw. What bothered me was how much time was being spent allocating space for numbers needed to store computation results.

I spent quite a bit of time looking at the run-time support code for compiled programs, trying different strategies to “preallocate” number instances and other things, but it was when I was considering various other optimisation strategies and running generated programs in the GNU debugger (gdb) that I happened to notice something about the type definitions that are generated for instances. For example, here is what a tuple instance type looks like:

typedef struct {
    const __table * table;
    __pos pos;
    __attr attrs[__csize___builtins___tuple_tuple];
} __obj___builtins___tuple_tuple;

And here is what it should look like:

typedef struct {
    const __table * table;
    __pos pos;
    __attr attrs[__isize___builtins___tuple_tuple];
} __obj___builtins___tuple_tuple;

Naturally, I will excuse you for not necessarily noticing the crucial difference, but it is the size of the attrs array, this defining the attributes that are available via each instance of the tuple type. And here, I had used a constant prefixed with “__csize” meaning class size, as opposed to “__isize” meaning instance size. With so many things to think about when finishing off my toolchain, I had accidentally presented the wrong kind of value to the code generating these type definitions.

So, what was going to happen was that instances were going to be given the wrong number of attributes: a potentially catastrophic fault! But it is in the case of types like the tuple where things get more interesting than that. Such types tend to have lots of methods associated with them, and these methods are, of course, stored as class attributes.

Meanwhile, tuple instances are likely to have far fewer attributes, and even when the tuple data is considered, since tuples frequently have few elements, such instances are also likely to be far smaller than the size of the tuple class’s structure. Indeed, the following definitions are more or less indicative of the sizes of the tuple class and of tuple instances:

__csize___builtins___tuple_tuple = 36
__isize___builtins___tuple_tuple = 2
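
The disparity is easy to believe when considering how many methods a type like the tuple carries around compared to the modest amount of data a small tuple instance actually needs to hold. As a rough illustration in CPython terms (not Lichen, and the exact count varies between Python versions), the class namespace of the built-in tuple type contains a few dozen entries, whereas a two-element tuple instance only needs room for its two elements:

# Counting entries in the class namespace of the built-in tuple type versus
# the data held by a small instance. The exact figure depends on the Python
# version, but it is of the same order as the class size given above.
class_entries = len(dir(tuple))   # methods and special attributes on the type
instance = (123, "abc")

print("class namespace entries:", class_entries)
print("instance elements:", len(instance))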

And I had noticed this because, for some reason unknown to me at the time but obviously known to me now, floating point numbers were being allocated using far more space than I thought appropriate. Here are some definitions of interest:

__csize___builtins___float_float = 43
__isize___builtins___float_float = 1

Evidently, something was very wrong until I noticed my simple mistake: that in the code generating the definitions for program types, I had accidentally used the wrong constant for instance attribute arrays. Fixing this meant that the memory allocator probably only needed to find 16 bytes or so, as opposed to maybe 186 bytes, for each number!
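
As a back-of-the-envelope check, assuming purely illustrative field sizes for the generated structures – an eight-byte table pointer, a four-byte position field and four bytes per attribute entry, none of which should be taken as the actual sizes used by the toolchain – the arithmetic comes out in the same ballpark as the figures quoted above:

# Illustrative only: the real field sizes depend on the platform and on how
# __attr and __pos are defined in the generated C code.
TABLE_POINTER_BYTES = 8
POS_FIELD_BYTES = 4
ATTR_ENTRY_BYTES = 4

def structure_bytes(attr_count):
    return TABLE_POINTER_BYTES + POS_FIELD_BYTES + attr_count * ATTR_ENTRY_BYTES

print(structure_bytes(1))    # using the instance size: 16 bytes
print(structure_bytes(43))   # using the class size: 184 bytes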

Returning to tuples, though, it becomes interesting to see what effect this fix has on the performance of the benchmark previously discussed. We had previously seen that a program using tuples was inexplicably far slower than one employing objects to represent the same data. But with this unnecessary allocation occurring, it seems possible that this might have been making some extra work for the allocator and garbage collector.

Here is a table of measurements from running the benchmark before and after the fix had been applied:

Program Version     Time    Maximum Memory Usage
Tuples              24s     122M
Objects             15s     54M
Tuples (fixed)      17s     30M
Objects (fixed)     13s     30M

Although there is still a benefit to using objects to model data in Lichen as opposed to keeping such data in tuples, the benefit is not as pronounced as before, with the memory usage now clearly comparable as we would expect. With this fix applied, both versions of the benchmark are even faster than they were before, but it is especially gratifying that the object-based version is now ten times faster when compiled with the Lichen toolchain than the same program run by the CPython virtual machine.

The Elephant in the Room

Sunday, September 9th, 2018

I recently had my attention drawn towards a blog article about the trials of Free Software development by senior Python core developer, Brett Cannon. Now, I agree with the article’s emphasis on being nice to other people, and I sympathise with those who feel that their community-related activities are wearing them down. However, I would like to point out some aspects of his article that fall rather short of my own expectations about what Free Software, or “open source” as he calls it, should be about.

I should perhaps back up a little and mention where this article was found, which was via the “Planet Python” blog aggregator site. I do not read Planet Python, either in my browser or using a feed reader, any more. Those who would create some kind of buzz or energy around Python have somehow managed to cultivate a channel where it seems that almost every post is promoting something. I might quickly and crudely categorise the posts as follows:

  • “Look at our wonderful integrated development environment which is nice to Python (that is written in Java)! (But wouldn’t you rather use the Java-related language we are heavily promoting instead?)”
  • Stub content featuring someone’s consulting/training/publishing business.
  • Random “beginner” articles either parading the zealotry of the new convert or, of course, promoting someone’s consulting/training/publishing business.

Maybe such themes are merely a reflection of attitudes and preoccupations held amongst an influential section of the Python community, and perhaps there is something to connect those attitudes with the topics discussed below. I do recall other articles exhorting Python enthusiasts to get their name out there by doing work on “open source”, with the aim of getting some company’s attention by improving the software that company has thrown over the wall, and “Python at <insert company name>” blogging is, after all, a common Planet Python theme.

Traces of the Pachyderm

But returning to the article in question, if you read it with a Free Software perspective – that is to say that you consciously refer to “Free Software”, knowing that “open source” was coined by people who, for various reasons, wanted another term to use – then certain things seem to stand out. Most obviously, the article never seems to mention software freedom: it is all about “having fun”, attracting contributors to your projects, giving and receiving “kindnesses”, and participating in “this grand social experiment we call open source”. It is almost as if the Free Software movement and the impetus for its foundation never took place, or if it does have a place in someone’s alternative version of history, then in that false view of reality Richard Stallman was only motivated to start the GNU project because maybe he wanted to “have fun hacking a printer”.

Such omissions are less surprising if you have familiarity with attitudes amongst certain people in various Free Software communities – those typically identifying as “open source”, of course – who bear various grudges against the FSF and Richard Stallman. In the Python core development community, those grudges are sometimes related to some advice given about GPL-compatible licensing back when CPython was changing custodian and there had been concerns, apparently expressed by the entity being abandoned by the core developers, that the original “CWI licence” was not substantial enough. We might wonder whether grudges might be better directed towards those who have left CPython with its current, rather incoherent, licensing paper trail.

A Different Kind of Free

Now, as those of us familiar with the notion of Free Software should know, it is “a matter of freedom, not price”. You can very well sell Free Software, and nobody is actually obliged to distribute their Free Software works at no cost. In fact, the advice from those who formulated the very definition of Free Software is this:

Distributing free software is an opportunity to raise funds for development. Don’t waste it!

Of course, there are obligations about providing the source code for software already distributed in executable form and limitations about the fees or charges to be imposed on recipients, but these do not compel no-cost sharing, publication or distribution. Meanwhile, the Open Source Definition, for those who need an “open source” form of guidance, states the following:

The license shall not require a royalty or other fee for such sale.

This appearing, rather amusingly, in a section entitled “Free Redistribution” where “Free” apparently has the same meaning as the “Free” in Free Software: the label that the “open source” crowd were so vehemently opposed to. It is also interesting that the “Source Code” section of the definition also stipulates similar obligations to those upheld by copyleft licences.

So, in the blog article in question, it is rather interesting to see the following appear:

While open source, by definition, is monetarily free, that does not mean that the production of it is free.

Certainly, the latter part of the sentence is true, and we will return to that in a moment, but the former part is demonstrably false: the Open Source Definition states no such thing. In fact, it states that “open source” is not obliged to cost anything, which as the practitioners of logic amongst us will note is absolutely not the same thing as obliging it to always be cost-free.

Bearing the Cost

Much of the article talks about the cost of developing Free Software from the perspective of those putting in the hours to write, test, maintain and support the code. These “production” costs are acknowledged while the producers are somehow shackled to an economic model – one that is apparently misinformed, as noted above – that demands that the cost of all this work be zero to those wanting to acquire it.

So, how exactly are the production costs going to be met? One of the most useful instruments for doing so has apparently been discarded, and I imagine that a similarly misguided attitude lingers with regard to supporting Free Software produced under such misconceptions. Indeed, much of the article focuses on doing “free work”, that of responding to requests, dealing with feedback, shepherding contributions, and the resulting experience of being “stressed by strangers”.

Normally, when one hears of something of this nature taking place, when the means to live decently and to control one’s own life is being taken away from people, there is a word that springs to mind: exploitation. From what we know about certain perspectives about Free Software and “open source”, it is hardly a surprise that the word “exploitation” does not appear in the article because such words are seen by some as “political”, where “political” takes on the meaning of “something raising awkward ethical questions” that if acknowledged and addressed appropriately would actually result in people not being exploited.

But there is an ideological condition which prevents people from being “political”. According to those with such a condition, we are not supposed to embarrass those who could help us deal with the problems that trouble us because that might be “impolite”, and it might also be questioning just how they made their money, how badly they may have treated people on their way to the top, and whether personal merit had less to do with their current status than good fortune and other things that undermine the myth of their own success. We are supposed to conflate money and power with merit or to play along convincingly enough at least as long as the wallets of such potential benefactors are open.

So there is this evasion of the “political” and a pandering to those who might offer validation and maybe even some rewards for all the efforts that are being undertaken as long as their place is not challenged. And what that leaves us with is a catalogue of commiseration, where one can do no more than appeal to those in the same or similar circumstances to be nicer to each other – not a bad idea, it must be said – but where the divisive and exploitative forces at work will result in more conflict over time as people struggle even harder to keep going.

When the author writes this…

Remember, open source is done by people giving something away for free because they choose to; you could say you’re dealing with a bunch of digital hippies.

…we should also remember that “open source” is also done by people who will gladly take those things and still demand more things for free, these being people who will turn a nice profit for themselves while treating others so abominably.

Selective Sustainability

According to the article “the overall goal of open source is to attract and retain people to help maintain an open source project while enjoying the experience”. I cannot speak for those who advocate “open source”, but this stated goal is effectively orthogonal to the aim of Free Software, which is to empower users by presenting them with the means to take control of the software they use. By neglecting software freedom, the article contemplates matters of sustainability whilst ignoring crucial factors that provide the foundations of sustainability.

For instance, there seems to be some kind of distinction being drawn between Free Software projects that people are paid to work on (“corporate open source”) and those done in their own time (“community open source”). This may be a reflection of attitudes within companies: that there are some things that they may use but which, beyond “donations” and letting people spend a portion of their work time on it, they will never pay for. Maybe such software does not align entirely with the corporate goals and is therefore “something someone else can pay for”, like hospitals, schools, public services, infrastructure, and all the other things that companies of a certain size often seem to be unwilling to fund as they reduce their exposure to taxation.

Free Software, then, becomes almost like the subject of charity. Maybe the initiator of a project will get recognised and hired for complementing and enhancing a company’s proprietary product, just like the individual described in the article’s introduction whose project was picked up by the author’s employer. I find it interesting that the author notes how important people are to the sustainability of a project but then acknowledges that the project illustrating his employer’s engagement with “open source” could do just fine without other people getting involved. Nothing is said about why that might be the case.

So, with misapprehensions about whether anyone can ask for money for their Free Software work, plus cultural factors that encourage permissive licensing, “building a following” and doing things “for exposure”, and with Free Software being seen as something needing “donations”, an unsustainable market is cultivated. Those who wish to find some way of funding their activities must compete with people being misled into working for free. And it goes beyond whether people can afford the time: time is money, as they say, and the result may well be that people who have relatively little end up subsidising “gifts” for people who are substantially better off.

One may well be reminded of other exploitative trends in society where the less well-off have to sacrifice more and work harder for the causes of “productivity” and “the economy”, with the real beneficiaries being the more well-off looking to maximise their own gains and optimise their own life-enriching experiences. Such trends are generally not regarded as sustainable in any way. Ultimately, something has to give, as history may so readily remind us.

Below the Surface

It is certainly important to make sure people keep wanting to do an activity, whether that is Free Software development or anything else, but having enough people who “enjoy doing open source” is far from sufficient to genuinely sustain a Free Software project. It is certainly worthwhile investigating the issues that determine whether people derive enjoyment from such work or not, along with the issues that cause stress, dissatisfaction, disillusionment and worse.

But what good is it if no-one deals with these issues? When taking “a full month off annually from volunteering” is seen as the necessary preventative medicine to avoid “potential burnout”, and when there is even such a notion as “open source detox”, does it not indicate that the symptoms may be seeing some relief but the cause remains untreated? The author of the article seems to think that the main source of misery is the way people treat each other:

It all comes down to how people treat each other in open source.

That in itself is something of a superficial diagnosis given that some people may not experience random angry people criticising their work at all, and yet they may be dissatisfied with their situation nevertheless. Others may experience bad interactions, but these might be the result of factors that are not so readily acknowledged. I do not condone behaviour that might be offensive, let alone abusive, but when people react strongly in their interactions with others, they may be doing so as the consequence of what they perceive as ill-treatment or even a form of oppression or betrayal.

There is much talk of kindness, and I cannot exactly disagree with the recommendation that people be kind to each other. But I also have the feeling that another story is not being told, one of how people with a level of power and influence choose to discharge their responsibilities. And in the context of Python, the matter of Python 3 is never far away. People may have received the “gift” of Python, but they have invested in it, too. In a way, this goes beyond any reciprocation of a mere gift because this investment is also a form of submission to the governance of the technology, as well as a form of validation of it that persuades others of its viability and binds those others to its governance, too.

What then must someone with a substantial investment in that technology think when presented with something like the “Python 2.7 Countdown” clock? Is it a helpful tool for technological planning or a way of celebrating and trivialising disruption to widespread investment in, and commitment to, a mature technology? What about the “Python 3 Statement” with projects being encouraged to pledge to drop support for Python 2 and to deliberately not maintain any such support beyond the official “end of life” date? Is it an encouraging statement of enthusiasm or another way of passive-aggressively shaming those who would continue to use and support Python 2?

I accept that it would be unfair to demand that the Python core developers be made to continue to support Python 2. But I also think it is unfair to see destructive measures being taken to put Python 2 “beyond use”, the now-familiar campaigns of inaccurate or incorrect information to supposedly stir people into action to port their software to Python 3, the denial of the name “Python” to anyone who might step up and continue to support Python 2, the atmosphere of hostility to those who might take on that role. And, well, excuse me if I cannot really take the following statement seriously based on the strategic choices of the Python core developers:

And then there’s the fact that your change may have just made literally tons of physical books in bookstores around the world obsolete; something else I have to consider.

It is intriguing that there is an insistence that people not expect anything when they do something for the benefit of another, that “kindnesses” are more appropriate than “favours”:

I switched to using kindnesses because being kind in the cultures I’m familiar with has no expectation of something in return.

Aside from the fact that it becomes pretty demotivating to send fixes to projects and expect nothing to ever happen to them, to take an example from the article’s author, which after a while amounts to a lot of wasted time and effort, I cannot help but observe that returning the favour was precisely what the Python core developers expected when promoting Python 3. From there, one cannot help but observe that maybe there is one rule for one group and one rule for another group in the stratified realm of “open source”.

The Role of the Elephant

In developing Free Software and acknowledging it as such, we put software freedom – the elephant in this particular room – at the forefront of our efforts. Without it, as we have seen, the narrative is weaker, people’s motivations seem less convincing or unfathomable, and suggestions for improving everybody’s experience, although welcome, fail to properly grasp some of the actual causes of dissatisfaction and unhappiness. This is because the usual myths of efficiency, creative exuberancy, and an idealised “gift culture” need to be conjured up to either explain people’s behaviour or to motivate it, the latter often in a deliberately exploitative way.

It is, in fact, software freedom that gives Python 2 users any hope for their plight, even though many of them may be dissatisfied and some of them may end up committing to other languages instead of Python 3 in future. By emphasising software freedom, they and others may be educated about their right to control their technological investment, and they may be reminded that in seeking assistance to exercise that control, they might be advised to pay others to sustain their investment. At no point does the narrative need to slip off into “free stuff”, “gifts” and the like.

Putting software freedom at the centre of Free Software activities might also promote a more respectful environment. When others are obliged to uphold end-user freedoms, they might already be inclined to think about how they treat other people. We have seen a lot written about interpersonal interactions, and it is right to demand that people treat each other with respect, but maybe such respect needs to be cultivated by having people think about higher goals. And maybe such respect is absent if those goals are deliberately ignored, focusing people only on each individual transaction in isolation and leaving them to wonder why everyone acts so selfishly.

So instead of having an environment where a company might be looking for people to do free work so that they can seal it up, sell a proprietary product to hapless end-users, treat the workers like “digital hippies”, and thus exploit everyone involved, we invoke software freedom to demand fairness and respect. A culture of respecting the rights of others should help participants realise that they have a collective responsibility, that everyone is in it together, that the well-being of others does not come at the cost of each participant’s own well-being.

I realise that some of the language used above is “political” for some, but when those who object to “political” language perpetuate ignorance of the roots of Free Software and marginalise such movements for social change, they also perpetuate a culture of exploitation, whether they have this as their deliberate goal or not. This elephant has been around for some time, and having a long memory as one might expect, it stands as a witness to the perils of neglecting the ethical imperatives for what we do as Free Software developers.

It is, of course, possible to form a more complete, more coherent picture of how Free Software development occurs and how sustainability in such endeavours might be achieved, but evidently this remains out of reach for those still wishing to pretend that there is no elephant in the room.

Tuple Performance Optimisations in CPython and Lichen

Sunday, July 22nd, 2018

One of the nice things about the Python programming language is the selection of built-in data types it offers for common activities. Things like lists (collections of values) and dictionaries (key-value mappings) are very convenient and do not need much further explanation, but there is also the concept of the tuple, which is also a collection of values, like a list, but whose size and values are fixed for its entire lifespan, unlike a list. Here is what a very simple tuple looks like:

(123, "abc")

Normally, the need for a data type like this becomes apparent when programming and needing to return multiple values from a function. In languages that do not support such a convenient way of bundling things together, some extra thought is usually required to send data back to the caller, and in languages like C the technique of mutating function arguments and thus communicating such data via a function’s parameters is often used instead.
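
To illustrate the point with ordinary Python, a function can bundle several results into a tuple that the caller then unpacks again (a trivial example purely for illustration):

def min_and_max(values):
    "Return both the smallest and largest of the given values."
    return min(values), max(values)

lowest, highest = min_and_max([3, 1, 4, 1, 5])
# lowest is now 1 and highest is now 5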

Lichen, being a Python-like language, also supports tuples. The parsing of source code, employing various existing Python libraries, involves the identification of tuple literals: occurrences of tuples written directly in the code, as seen above. For such values to have any meaning, they must be supported by a particular program representation of the tuple plus the routines that provide each tuple with its familiar characteristics. In other words, we need to provide a way of translating these values into code that makes them tangible things within a running program.

Here, Lichen differs from various Python implementations somewhat. CPython, for instance, defines practically all of the nature of its tuples in C language code (the “C” in CPython), with the pertinent file in CPython versions from 1.0 all the way to 2.7 being found as Objects/tupleobject.c within the source distribution. Meanwhile, Jython defines its tuples in Java language code, with the pertinent file being found as src/org/python/core/PyTuple.java (without the outermost src directory in very early versions). Lichen, on the other hand, implements the general form of tuples in the Lichen language itself.

Tuples All The Way Down

This seems almost nonsensical! How can Lichen’s tuples be implemented in the language itself? What trickery is involved to pull off such an illusion? Well, it might be worth clarifying what kind of code is involved and which parts of the tuple functionality are really provided by the language framework generally. The Lichen code for tuples is found in lib/__builtins__/tuple.py and has the following outline:

class tuple(sequence, hashable):
    "Implementation of tuple."

    def __init__(self, args=None):
        "Initialise the tuple."

    def __hash__(self):
        "Return a hashable value for the tuple."

    def __add__(self, other):
        "Add this tuple to 'other'."

    def __str__(self):
        "Return a string representation."

    def __bool__(self):
        "Tuples are true if non-empty."

    def __iter__(self):
        "Return an iterator."

    def __get_single_item__(self, index):
        "Return the item at the normalised (positive) 'index'."

Here, the actual code within each method has been omitted, but the outline itself defines the general structure of the data type, described by a class, representing the behaviour of each tuple. As in Python, a collection of special methods is provided to support standard operations: __hash__ supports the hash built-in function and is used when tuples are employed as dictionary keys; __bool__ permits the truth value testing of tuples so that they may be considered as “true” or not; and so on.
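
In Python itself, the effect of these special methods can be observed directly (a small illustration using ordinary Python rather than Lichen):

t = (123, "abc")
d = {t: "value"}                # usable as a dictionary key thanks to __hash__
assert hash(t) == t.__hash__()  # the hash built-in defers to the special method
assert bool(()) is False        # empty tuples are considered "false"...
assert bool(t) is True          # ...while non-empty tuples are considered "true"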

Since this definition of classes (data types) is something that needs to be supported generally, it makes sense to use the mechanisms already in place to allow us to define the tuple class in this way. Particularly notable here is the way that the tuple class inherits from other “base classes” (sequence and hashable). Indeed, why should the tuple class be different from any other class? It still needs to behave like any other class with regard to supporting things like methods, and in Lichen its values (or instances) are fundamentally just like instances of other classes.

It would, of course, be possible for me to define the tuple class in C (it being the language to which Lichen programs are compiled), but another benefit of just using the normal process of parsing and compiling the code written in the Lichen language is that it saves me the bother of having to work with such a low-level representation and the accompanying need to update it carefully when changing its functionality. The functionality itself, already adequately expressed as Lichen code, would otherwise need to be hand-compiled to C: a tedious exercise indeed.

One can turn such questions around and ask why tuples are special things in various Python implementations. A fairly reasonable response is that CPython, at least, has evolved its implementation of types and objects over the years, starting out as a “scripting language” offering access to convenient data structures implemented in C and a type system built using those data structures. It was not until Python 2.2 that “type/class unification” became addressed, meaning that the built-in types implemented at the lowest levels – tuples amongst them – could then be treated more like “user-defined classes”, these classes being implemented in Python code.

Although the outline of a tuple class can be defined in the Lichen language itself, and although operations defining tuple behaviour are provided as Lichen code, this does not mean that everything can be implemented at this level. For example, programs written in Lichen do not manage the memory their objects use but instead delegate this task to “native” code. Moreover, some of the memory being managed may have representations that only sensibly exist at a lower level. We can start to investigate this by considering the method returning the size or length of a tuple, invoked when the len built-in function is called on a tuple:

    def __len__(self):
        "Return the length of the tuple."
        return list_len(self.__data__)

Here, the method delegates practically everything to another function, presenting the __data__ attribute of the instance involved in the method call (self). This other function actually isn’t implemented in Lichen: it is a native function that knows about memory and some low-level structures that support the tuple and list abstractions. It looks like this:

__attr __fn_native_list_list_len(__attr self, __attr _data)
{
    unsigned int size = _data.seqvalue->size;
    return __new_int(size);
}

And what it does is to treat the __data__ attribute as a special sequence structure, obtaining its size and passing that value back as an integer usable within Lichen code. The sequence structure is defined as part of the support library for compiled Lichen programs, along with routines to allocate such structures and to populate them. Other kinds of values are also represented at the native level, such as integers and character strings.

To an extent, such native representations are not so different from the special data types implemented in C within CPython and in Java within Jython. However, the Lichen implementation seeks to minimise the amount of native code dedicated to providing abstractions. Where functionality supporting a basic abstraction such as a tuple does not need to interact directly with native representations or perform “machine-level” operations, it is coded in Lichen, and this code can remain happily oblivious to the nature of the data passing through it.

There are interesting intellectual challenges involved here. One might wonder how minimal the layer of native code might need to be, for instance. With a suitable regime in place for translating Lichen code into native operations, might it be possible to do memory management, low-level arithmetic, character string operations, system calls, and more, all in the same language, not writing any (or hardly writing any) native code by hand? It is an intriguing question but also a distraction, and that leads me back towards the main topic of the article!

The Benchmarking Game

Quite a few years ago now, there was a project being run to benchmark different programming languages in order to compare their performance. It still exists, it would seem. But in the early days of this initiative, the programs were fairly simple translations between languages and the results relatively easy to digest. Later on, however, there seemed to be a choice of results depending on the hardware used to create them, and the programs became more complicated, perhaps as people saw their favourite language falling down the result tables and felt that they needed to employ a few tricks to boost their language’s standing.

I have been interested in Python implementations and their performance for a long time, and one of the programs that I have used from time to time has been the “binary trees” benchmark. You can find a more complicated version of this on the Python Interpreters Benchmarks site as well as on the original project’s site. It would appear that on both these sites, different versions are being run even for the same language implementation, presumably to showcase optimisations.

I prefer to keep things simple, however. As the Wikipedia page notes, the “binary trees” benchmark is presumably a test of memory allocation performance. What I discovered when compiling a modified version of this program, one that I had originally obtained without the adornments of multiprocessing and generator usage, was perhaps more interesting in its own right. The first thing I found was that my generated C program was actually slower than the original program run using CPython: it took around 140% of the CPython running time (48 seconds versus 34 seconds).

My previous article described various realisations that I had around integer performance optimisations in CPython. But when I first tried to investigate this issue, I was at a loss to explain it. It could be said that I had spent so much effort getting the toolchain and supporting library code into some kind of working order that I had little energy left for optimisation investigations, even though I had realised one of my main objectives and now had the basis for such investigations available to me. Perhaps a quick look at the “binary trees” code is in order, so here is an extract:

def make_tree(item, depth):
    if depth > 0:
        item2 = 2 * item
        depth -= 1
        return (item, make_tree(item2 - 1, depth), make_tree(item2, depth))
    else:
        return (item, None, None)

So, here we have some tuples in action, and in the above function, recursion takes place – the function calls itself – to make the tree, hence the function name. Consequently, we have a lot of tuples being created and can now understand what the Wikipedia page was claiming about the program. The result of this function gets presented to another function which unpacks the return value, inspects it, and then calls itself recursively, too:

def check_tree(tree):
    (item, left, right) = tree
    if left is not None:
        return item + check_tree(left) - check_tree(right)
    else:
        return item

I did wonder about all these tuples: in the struggle to get the language system into a working state, I had cobbled together a working tuple representation in which I didn’t really have too much confidence. I also wondered what the program would look like in the other languages involved in the benchmarking exercise and whether tuples (or some equivalent) were present in whichever original version had been written for the exercise, possibly in a language like Java or C. Sure enough, the Java versions (simple version) employ class instances and not things like arrays or other anonymous data structures comparable to tuples.

So I decided to change the program to also use classes and to give these tree nodes a more meaningful form:

class Node:
    def __init__(self, item, left, right):
        self.item = item
        self.left = left
        self.right = right

def make_tree(item, depth):
    if depth > 0:
        item2 = 2 * item
        depth -= 1
        return Node(item, make_tree(item2 - 1, depth), make_tree(item2, depth))
    else:
        return Node(item, None, None)

In fact, this is a somewhat rudimentary attempt at introducing object orientation since we might also make the function a method. Meanwhile, in the function handling the return value of the above function, the tuple unpacking was changed to instead access the attributes of the returned Node instances seen above.

def check_tree(tree):
    if tree.left is not None:
        return tree.item + check_tree(tree.left) - check_tree(tree.right)
    else:
        return tree.item

Now, I expected this to be slower in CPython purely because there is more work being done, and instance creation is probably more costly than tuple creation, but I didn’t expect it to be four times slower (at around 2 minutes 15 seconds), which it was! And curiously, running the same program compiled by Lichen was actually quicker (22 seconds), which is about 65% of the original version’s running time in CPython, half the running time of the original version compiled by Lichen, and nearly a sixth of the revised version’s running time in CPython.

One may well wonder why CPython is so much slower when dealing with instances instead of tuples, and this may have been a motivation for using tuples in the benchmarking exercise, but what was more interesting to me at this point was how the code generated by the Lichen toolchain was managing to be faster for instances, especially since tuples are really just another kind of object in the Lichen implementation. So why were tuples slower, and could there be a way of transferring some of the performance of general objects to tuples?

Unpacking More Performance

The “binary trees” benchmark is supposed to give memory allocation a workout, but after the integer performance investigation, I wasn’t about to fall for the trick of blaming the allocator (provided by the Boehm-Demers-Weiser garbage collection library) whose performance is nothing I would complain about. Instead, I considered how CPython might be optimising tuple operations and paid another visit to the interpreter source code (found in Python/ceval.c in the sources for all the different releases of Python 1 and 2) and searched for tuple-related operations.

My experiments with Python over the years have occasionally touched upon the bytecode employed by CPython to represent compiled programs, each bytecode instruction being evaluated by the CPython interpreter. I already knew that some program operations were supported by specific bytecodes, and sure enough, it wasn’t long before I encountered a tuple-specific example: the UNPACK_SEQUENCE instruction (and its predecessors in Python 1.5 and earlier, UNPACK_TUPLE and UNPACK_LIST). This instruction is generated when source code like the following is used:

(item, left, right) = tree

The above would translate to something like this:

              0 LOAD_FAST                0 (tree)
              3 UNPACK_SEQUENCE          3
              6 STORE_FAST               1 (item)
              9 STORE_FAST               2 (left)
             12 STORE_FAST               3 (right)

In CPython, you can investigate the generated bytecode instructions using the dis module, putting the code of interest in a function, and running the dis.dis function on the function object, which is how I generated the above output. Here, UNPACK_SEQUENCE makes an appearance, accessing the items in the tree sequence one by one, pushing them onto the evaluation stack, CPython’s interpreter being a stack-based virtual machine. And sure enough, the interpreter capitalises on the assumption that the operand of this instruction will most likely be a tuple, testing it and then using tuple-specific operations to get at the tuple’s items.
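
For example, something like the following produces the disassembly shown above (the function name here is mine, and the exact offsets and instruction names vary between CPython versions):

import dis

def unpack(tree):
    (item, left, right) = tree
    return item

dis.dis(unpack)  # prints the bytecode instructions for the function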

Meanwhile, the translation of the same source code by the Lichen toolchain was rather less optimal. In the translation code, the unpacking operation from the input program is rewritten as a sequence of assignments, and something like the following was being generated:

item = tree[0]
left = tree[1]
right = tree[2]

This in turn gets processed, rewriting the subscript operations (indicated by the bracketing) to the following:

item = tree.__getitem__(0)
left = tree.__getitem__(1)
right = tree.__getitem__(2)

This in turn was being translated to C for the output program. But this is not particularly efficient: it uses a generic mechanism to access each item in the tree tuple, since it is possible that the only thing we may generally assert about tree is that it may provide the __getitem__ special method. The resulting code has to perform extra work to eventually arrive at the code that actually extracts an item from the tuple, and it will be doing this over and over again.

So, the first thing to try was to see if there was any potential for a speed-up by optimising this unpacking operation. I changed the generated C code emitted for the operations above to use the native tuple-accessing functions instead and re-ran the program. This was promising: the running time decreased from 48 seconds to 23 seconds; I had struck gold! But it was all very well demonstrating the potential. What now needed to be done was to find a general way of introducing something similarly effective that would work automatically for all programs.

Of course, I could change the initial form of the unpacking operations to use the __getitem__ method directly, but this is what was being produced anyway, so there would be no change whatsoever in the resulting program. However, I had introduced a Lichen-specific special method, used within the standard library, that accesses individual items in a given sequence instance. (It should be noted that in Python and Lichen, the __getitem__ method can accept a slice object and thus return a collection of values, not just one.) Here is what the rewritten form of the unpacking would now look like:

item = tree.__get_single_item__(0)
left = tree.__get_single_item__(1)
right = tree.__get_single_item__(2)

Compiling the program and running it gave a time of 34 seconds. We were now at parity with CPython. Ostensibly, the overhead in handling different kinds of item index (integers or slice objects) was responsible for around 30% of the original version’s running time (the reduction from 48 seconds to 34 seconds). What other overhead might there be, given that 34 seconds is still rather longer than 23 seconds? What does this other special method do that my quick hack does not?

It is always worth considering what the compiler is able to know about the program in these cases. Looking at the __get_single_item__ method for a tuple reveals something of interest:

    def __get_single_item__(self, index):
        "Return the item at the normalised (positive) 'index'."
        self._check_index(index)
        return list_element(self.__data__, index)

In the above, the index used to obtain an item is checked to see if it is valid for the tuple. Then, the list_element native function (also used on tuples) obtains the item from the low-level data structure holding all the items. But is there a need to check each index? Although we do need to make sure that accesses do not try to read “off the end” of the collection, accessing items that do not exist, we do not actually need to “normalise” the index.

Such normalisation is the process of interpreting negative index values as referring to items relative to the end of the collection, with -1 referring to the last item, -2 to the next last item, and so on, all the way back to -n referring to the first item (with n being the number of items in the collection). However, the code being generated does not use negative index values, and if we introduce a test to make sure that the tuple is large enough, then we should be able to get away with operations that use the provided index values directly. So I resolved to introduce another special method for this purpose, now rewriting the code as follows:

__builtins__.sequence._test_length(tree, 3)
item = tree.__get_single_item_unchecked__(0)
left = tree.__get_single_item_unchecked__(1)
right = tree.__get_single_item_unchecked__(2)

The _test_length function will raise an exception if the length is inappropriate. Meanwhile, the newly-introduced special method is implemented in a base class of both tuples and lists, and it merely employs a call to list_element for the provided index. Compiling the code with these operations now being generated and running the result yielded a running time of 27 seconds. Some general changes to the code generation, not specific to tuples, brought this down to 24 seconds (and the original version down to 44 seconds, with the object-based version coming down to 16 seconds).
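
For the curious, the newly-introduced method might amount to little more than the following sketch, assumed from the description above rather than taken verbatim from the Lichen standard library:

    def __get_single_item_unchecked__(self, index):
        "Return the item at 'index' without normalising or bounds-checking it."
        return list_element(self.__data__, index)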

So, the progression in performance looks like this:

Program Version (Lichen Strategy)        Lichen                  CPython
Objects                                                          135 seconds
Tuples (__getitem__)                     48 seconds, 44 seconds
Tuples (__get_single_item__)             34 seconds              34 seconds
Tuples (__get_single_item_unchecked__)   27 seconds, 24 seconds
Objects                                  22 seconds, 16 seconds

Here, the added effect of these other code generation optimisations is also shown, where measured, as the second figure in the Lichen column; blank cells indicate combinations that were not measured.

Conclusions

As we saw with the handling of integers in CPython, optimisations also exist to tune tuple performance in that implementation of Python, and these also exist in other implementations such as Jython (see the unpackSequence method in the org.python.core.Py class, found in the org/python/core/Py.java file). Taking advantage of guarantees about accesses to tuples that are written explicitly into the program source, the generated code can avoid incurring unnecessary overhead, thus considerably speeding up the running time of programs employing tuple unpacking operations.

One might still be wondering why the object-based version of the program is faster than the tuple-based version for Lichen. This is most likely due to the ability of the compiler to make the attribute accesses on the tree object efficient based on deductions it has performed. Fewer low-level operations are performed to achieve the same result, and time is saved as a consequence. One might also wonder why the object-based version is slower when run by CPython. That would probably be due to the flexible but costly way objects are represented and accessed in that language implementation, and this was indeed one of my motivations for exploring other language design and implementation approaches with Lichen.

Investigating CPython’s Optimisation Trickery for Lichen

Saturday, July 7th, 2018

For those of us old enough to remember how the Python programming language was twenty or so years ago, nostalgic for a simpler and kinder time, looking to escape the hard reality of today’s feature enhancement arguments, controversies, general bitterness and recriminations, it can be informative to consider what was appealing about Python all those years ago. Some of us even take this slightly further and attempt to formulate our own take on the language, casting aside things that do not appeal or that seem superfluous, needlessly confusing, or redundant.

My own Python variant, called Lichen, strips away quite a few things from today’s Python but probably wouldn’t seem so different to twentieth century Python. Since my primary objective with Lichen is to facilitate static analysis so that observations can be made about program behaviour before running the program, certain needlessly-dynamic features have been eliminated. Usually, when such statements about feature elimination are made, people seize upon them to claim that the resulting language is statically typed, but this is deliberately not the case here. Indeed, “duck typing” is still as viable as ever in Lichen.

Ancient Dynamism

An example of needless dynamism in Python can arguably be found with the __getattr__ and __setattr__ methods, introduced as far back as Python 1.1. They allow accesses to attributes via instances to be intercepted and values supposedly provided by these attributes to be computed on demand. In effect, these methods support virtual or dynamic attributes that are not really present on an object. Here’s an extract from one of the Python 1.2 demonstration programs (Demo/pdist/client.py):

        def __getattr__(self, name):
                if name in self._methods:
                        method = _stub(self, name)
                        setattr(self, name, method) # XXX circular reference
                        return method
                raise AttributeError, name

In this code, if an instance of the Client class (from which this method is taken) is used to access an attribute called hello, then this method will see if the string “hello” is found in the instance’s _methods attribute, and if so it produces a special object that is then returned as the value for the hello attribute. Otherwise, it raises an exception to indicate that this virtual attribute is not recognised. (Here, the setattr call stores the special object as a genuine attribute in order to save this method from being called again for the same attribute.)

Admittedly, this is quite neat, and it quickly becomes tempting to use such facilities everywhere – this is very much the story of Python and its development – but such things make reasoning about programs more difficult. We cannot know what attributes the instances of this Client class may have without running the program. Indeed, to find out in this case, running the program is literally unavoidable since the _methods attribute is actually populated using the result of a message received over the network!

But even in simpler cases, it can readily be intuitively understood that finding out the supported attributes of instances whose class offers such a method might involve a complicated exercise looking at practically all the code in a program. Despite all the hard work, this exercise will nevertheless produce unreliable or imprecise results. It says something about the fragility of such facilities that properties were later added to Python 2.2 to offer a more declarative alternative.
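
For comparison, a property declares the computed attribute on the class itself, so its existence is evident without running the program. Here is a small, self-contained illustration unrelated to the Client example above:

class Temperature(object):
    def __init__(self, celsius):
        self.celsius = celsius

    @property
    def fahrenheit(self):
        # Computed on demand, but declared visibly on the class.
        return self.celsius * 9.0 / 5.0 + 32

t = Temperature(20)
# t.fahrenheit is 68.0, and the attribute is apparent from the class definition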

(It also says something about Python 3 that the cornucopia of special mechanisms for dynamically exposing attributes are apparently still present, despite much having been said about Python 3 remedying such Python 1 and 2 design artefacts.)

Hidden Motives

With static analysis, we might expect to be able to deduce which attributes are provided by class instances without running a program, this potentially allowing us to determine the structure of program objects and to detect errors around their use. But another objective with Lichen is to see how constraints on the language may be used to optimise the performance of programs. I will not deny that performance has always been an interest of mine with respect to Python and its implementations, and I imagine that many compiler and virtual machine implementers have been motivated by such concerns throughout the years.

The deductions made during static analysis can potentially allow us to generate executable programs that perform the same work more efficiently. For example, if it is known that a collection of method calls on an object identify that object as being of a certain type, we can then employ more efficient ways of calling those methods. So, for the following code…

        while number:
            digits.append(_hexdigits[number % base])
            number = number / base

…if we can assert that digits is a list, then considering that we might normally generate code for the append method call as something like this…

__load_via_class(digits, append)(...)

…where the __load_via_class operation has to go and find the append method via the class of digits (or, in some cases, even look for the append attribute on the object first), we might instead be able to generate code like this…

__fn_list_append(digits, ...)

…where __fn_list_append is a genuine C function and the digits instance is passed directly to it, together with the elided arguments. When we can get this kind of thing to happen, it can obviously be very satisfying. Various Python implementations and tools also attempt to make method calls efficient in their own ways, some possibly relying on run-time caches that short-circuit the exercise of finding the method.

Magic Numbers

It can be informative to compare the performance of code generated by the Lichen toolchain and the performance of the same program running in the CPython virtual machine, Python and Lichen being broadly compatible (but not identical). As I noted in my summary of 2017, the performance of generated programs was rather disheartening to see at first. I needed to employ profiling to discover where the time was being spent in my generated code that seemed not to be a comparable burden on CPython.

The practicalities of profiling are definitely beyond the scope of this article, but what I did notice was just how much time was being spent allocating space in memory for integers used by programs. I recalled that Python does some special things with integers itself, and so I set about looking for the details of its memory allocation strategies.

It turns out that integers are allocated in a simplified fashion for performance reasons, instead of using the more general allocator that is compatible with garbage collection. And not just that: a range of “small” integers is also allocated in advance when programs run, so that no time is wasted repeatedly allocating objects for numbers that would likely see common use. The details of this can be found in the Objects/intobject.c file in CPython 1.x and 2.x source distributions. Even CPython 1.0 employs this technique.
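
This behaviour can be observed from Python code, although it is an implementation detail of CPython and not something the language guarantees:

a = int("5")
b = int("5")
# a is b  ->  True in CPython: both names refer to the preallocated small integer

c = int("100000")
d = int("100000")
# c is d  ->  False in CPython: each call allocates a separate integer object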

At first, I thought I had discovered the remedy for my performance problems, but replicating similar allocation arrangements in my run-time code demonstrated that such a happy outcome was not to be so easily achieved. As I looked around for what other special treatment CPython does, I took a closer look at the bytecode interpreter (found in Python/ceval.c), which is the mechanism that takes the compiled form of Python programs (the bytecode) and evaluates the instructions encoded in this form.

My test programs involved simple loops like this:

i = 0
while i < 100000:
    f(i)
    i += 1

And I had a suspicion that apart from allocating new integers, the operations involved in incrementing them were more costly than they were in CPython. Now, in Python versions from 1.1 onwards, the special operator methods are supported for things like the addition, subtraction, multiplication and division operators. This could conceivably lead to integer addition being supported by the following logic in one of the simpler cases:

# c = a + b
c = a.__add__(b)

But from Python 1.5 onwards, some interesting things appear in the CPython source code:

                case BINARY_ADD:
                        w = POP();
                        v = POP();
                        if (PyInt_Check(v) && PyInt_Check(w)) {
                                /* INLINE: int + int */
                                register long a, b, i;
                                a = PyInt_AS_LONG(v);
                                b = PyInt_AS_LONG(w);
                                i = a + b;

Here, when handling the bytecode for the BINARY_ADD instruction, which is generated when the addition operator (plus, “+”) is used in programs, there is a quick test for two integer operands. If the conditions of this test are fulfilled, the result is computed directly (with some additional tests being performed for overflows not shown above). So, CPython was special-casing integers in two ways: with allocation tricks, and with “fast paths” in the interpreter for cases involving integers.

The Tag Team

My response to this was similarly twofold: find an efficient way of allocating integers, and introduce faster ways of handling integers when they are presented to operators. One option that the CPython implementers actually acknowledge in their source code is that of employing a different representation for integers entirely. CPython may have too much legacy baggage to make this transition, and Python 3 certainly didn’t help the implementers to make the break, it would seem, but I have a bit more flexibility.

The option in question is the so-called tagged pointer approach where instead of having a dedicated object for each integer, with a pointer being used to reference that object, the integers themselves are represented by a value that would normally act as a pointer. But this value is not actually a valid pointer at all, since it has its lowest bit set. On some processor architectures this violates an imposed alignment restriction; on other systems it can be a self-imposed restriction that merely rules out the positioning of objects at odd-numbered addresses.

So, we might have the following example representations on a 32-bit architecture:

hex value      31..............................0 (bits)
0x12345678 == 0b00010010001101000101011001111000 => pointer
0x12345679 == 0b00010010001101000101011001111001 => integer

Clearing bit 0 and shifting the other bits one position to the right yields the actual integer value, which in the above case will be 152709948. It is conceivable that in future I might sacrifice another bit for encoding other non-pointer values, especially since various 32-bit architectures require word-aligned addresses, where words are positioned on boundaries that are multiples of four bytes, meaning that the lowest two bits would have to be zero for a pointer to be valid.
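
Expressed in Python purely for illustration (the real thing operates on machine words in the generated C code), the encoding and decoding amount to the following:

def tag_int(value):
    "Encode an integer as a tagged value with the lowest bit set."
    return (value << 1) | 1

def untag_int(word):
    "Clear bit 0 and shift right to recover the integer value."
    return word >> 1

# tag_int(0x091A2B3C) == 0x12345679 and untag_int(0x12345679) == 152709948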

Albeit with some additional cost incurred when handling pointers, we can with such an approach distinguish integers from other types rapidly and conveniently, which makes the second part of our strategy more efficient as well. Here, we need to identify and handle integers for the arithmetic operators, but unlike CPython, where this happens to be done in an interpreter loop, we have no such loop. Instead we generate code for such operators that simply invokes some existing functions (written in the Lichen language and compiled to C code, another goal being to write as much of the language system in Lichen itself, not C).

It would be rather wasteful to generate tests for integers in addition to these operator function calls every time such a call is made, but the tests can certainly reside within those functions instead. So, here is what we might do for the addition operator:

def add(a, b):
    if is_int(a) and is_int(b):
        return int_add(a, b)
    return binary_op(a, b, lambda a: a.__add__, lambda b: b.__radd__)

This code leaves me with a bit of explaining to do! Last things first: the final statement is the general case of dispatching to the operands and calling an appropriate operator method, with the binary_op function performing the logic in conjunction with the operands and some lambda functions that defer access to the special methods until they are really needed. It is probably best just to trust me that this does the job!

Before the generic operator method dispatch, however, is the test of the operands to see if they are both integers, and this should be vaguely familiar from the CPython source code. A special function is then called to add them efficiently. Note that we couldn’t use the addition (plus, “+”) operator because this code is meant to be handling that, and it would most likely send us on an infinitely recursive loop that never gets round to performing the addition! (I don’t treat the operator as a special case in this code, either. This code is compiled exactly like any other code written in the language.)

The is_int function is what I call “native”, meaning that it is implemented using low-level operations, in this case ones that test the representation of the argument to see if it has its lowest bit set, returning a true value if so. Meanwhile, int_add is largely equivalent to the addition operation seen in the CPython source code above, just with different details involved.
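
As a sketch of the arithmetic involved, again in Python for illustration only (the actual native code must also detect overflow, which ordinary Python integers conveniently do not exhibit):

def is_int(word):
    "A word with its lowest bit set represents an integer, not a pointer."
    return word & 1 == 1

def int_add(a, b):
    # With a == 2*x + 1 and b == 2*y + 1, the sum is 2*(x + y) + 2, so
    # subtracting the surplus tag bit leaves a correctly tagged result.
    return a + b - 1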

Progress and Reflections

Such adjustments made quite a difference to the performance of my generated code. They do also make some sense, too. Integers are used a lot in programs, being used not only for general arithmetic, but also for counters, index values for things like lists, tuples, strings and other collections, plus a range of other mundane things whose performance can be overlooked until it proves to be suboptimal. Python has something of a reputation for having slow implementations, but CPython’s trickery here optimises in favour of fast results where they can be obtained, falling back on the slower, general mechanisms when these are required.

But I discovered that this is not the only optimisation trickery CPython does, as another program with interesting representation choices and some wildly varying running times was to demonstrate. More on that in the next article on this topic!

The Noble Volunteer (Again)

Sunday, March 11th, 2018

I saw that the usual refrain of “we’re all volunteers here” had another outing on a recent LWN article about the Python 2 to 3 transition, specifically referring to who it is that supposedly does all the core development work on CPython (as well as constantly changing what the Python language is meant to be). There are a few different observations to be made here, so let me establish three main topics:

  1. The funding of Python implementation development.
  2. The hiring of various Python core development contributors.
  3. Python and Free Software as a hobby or spare time effort.

I have written about how the Python Software Foundation raises and spends money before. For the most part, nothing has changed since then: the PSF appears to raise and then spend hundreds of thousands of dollars every year (apparently down from over $300000 in 2016 to under $250000 in 2017, though), directing this money mostly towards events and promotion. In fact, the largest contribution to core-related Python software development in 2017 was actually from the Mozilla Open Source Support programme, with a $170000 grant to fix up the Python Package Index infrastructure. So the PSF is clearly comfortable leaving it to others to fund the P in PSF.

Lots of people depend on the Python Package Index, but like with Free Software in general, the people making good money while leaning on these common, volunteer-run resources never seem to pitch in significantly themselves. It is true that the maintainer of this resource was allowed to work on it as his day job, but then got “downsized”, and now works in a role where he can work on it again but only as part of his day job. But I imagine that the people at Mozilla, some of whom have connections to the world of Python packaging, quite possibly relying on the package infrastructure to get their own stuff done, were getting fed up with “volunteers” as being the usual excuse for nothing getting done.

Now there certainly are Python core developers who are employed in work that influences CPython development or that has some connection to Python, perhaps related to other implementations of Python. Notably, Pyston and Pyjion were both developed by core developers working at Dropbox and Microsoft respectively. Famously, Guido van Rossum, Python’s originator, was hired by Google and then Dropbox, seemingly being able to dedicate some of his time on Python topics as part of his day job at both places. After all, it was during Van Rossum’s time at Google, accompanied by other Google-employed Python core contributors, that Python 3 started to take shape.

So it seems that some very large companies recognise the value that Python brings, they even hire influential people in the Python core development community, but maybe this does not translate to proper corporate support for Python core development. It could very well be the case that most of these people really do have to write Python code in their day jobs but cannot direct much or any time towards developing Python – the implementations or the language – in their working hours. They would be volunteers in their own time, albeit volunteers facilitated by their employment, having the stability of a relatively well-paid job and the good fortune of having Python core development as a productive and hopefully rewarding hobby.

Maybe it suits everyone being paid as a result of their reputation in the Python community to indulge in core development as a hobby. But what about everyone else? All those other volunteers who are doing the donkey work of testing and fixing the code when it stops working for them, implementing things that others have deemed a good idea, making Python 3 a reality, or whatever? Well, I suppose they get “pizza and beer soda” paid for by the PSF at their sprints.

In certain circles, it seems that a lot of effort is spent promoting a lifestyle that involves feel-good “volunteerism” and getting your name known through selfless volunteering. If you are one of those “other” volunteers, maybe the ultimate goal is to have the senior hobbyists in the community recommending you to their employers, which would explain how Python core developers seem to cluster in various companies. Maybe this is the new “open source” dream: not actually being paid to work on Free Software but merely pursuing it as a hobby, dependent on an employer for the lifestyle but not influenced by them, at least not conspicuously, retaining the ability to play the volunteer card.

And this leads me to a more general observation that came to mind when reading a remark by someone trying to establish a viable enterprise, all for the benefit of Free Software and open hardware. It was about how he was on the ground, doing all the legwork, opening up new opportunities the hard way while people in their comfortable jobs let him get on with it, throwing pennies his way and waiting for their substantial but cheaply-acquired rewards. Now, in that particular instance my sympathy is muted, for various reasons that hopefully do not need a public airing, but I see the point being made and, once you are aware of it, it is an annoyingly familiar one.

You will often see people inviting others to contribute to their projects, writing things like “how about someone fix this, make this better, implement this, do this?” It sounds so constructive, so worthy, like you can make a difference. In Norwegian, there’s even a word for the spirit of this kind of thing – “dugnad” – which is awkward to translate to English, but it effectively denotes an event or general activity where everyone pitches in collectively to get something done in a way that is relatively painless for each participant. Being a cynic, I would often translate “dugnad” as being too cheap to pay to get something done properly.

What can be even more galling is that people “howabouting” potential contributors are not only comfortable hobbyists, but some of them also solicit donations for their hobby, not because they need the money but because it might cover a few beers or pizzas, some entertainment, or whatever. And so, a notion is cultivated that everything can be done by voluntary effort, that the value of such work is effectively “beer money”, and with the likes of the PSF not willing to put its own money the way of its own technology, people start to think that if “pizza and beer soda” is enough to improve a Free Software product, why would anyone want to pay people real money to improve it?

And so the notion of the volunteer, so noble and selfless, actually cheapens the value of the work that has to be done. Why bother paying for Free Software or for anyone to work on it when the noble volunteers will get it done? The answer, of course, is that people typically don’t and so the important things typically don’t get done, either. Still, at least the hobbyists get to have some fun.

A Timely Example

In another comment on the referenced article, discussing the general Python 3 strategy and whether anyone who had criticised it might have been worth listening to, it was noted that such critics might be like a “broken clock”: wrong most of the time but coincidentally right on certain occasions. I guess that for those who don’t like to hear criticism of the Python 3 masterplan, I could be one of those broken clocks, having criticised the introduction of Python 3. But if, as the saying goes, “a broken clock is right twice a day”, maybe some of my other criticisms are also worth taking a look at: one of them is probably good.

Of course, it hardly requires special predictive powers to note that people with large investments in existing code might not like being told that it is “good for them” to have to rewrite it all. And it is hardly a surprise that people have been motivated to look at other languages partly as a consequence of that, partly because of Python’s lack of direction or progress on other fronts, as language evolution dominates over all other concerns.

Spare a thought for Guido van Rossum whose colleagues, no matter where he works, always seem to end up writing software in Go instead of in the language that presumably got him through the door. Perhaps things wouldn’t have played out that way if those benefiting from Python had also properly invested in it, instead of leaving it for the hobbyists or using “we’re all volunteers” as an excuse for not keeping Python competitive with other emerging languages and technologies.

Some Updates

I was recently contacted by Sumana Harihareswara who asked for me to clarify that the proposal for improving the Python packaging infrastructure was initiated within the PSF’s Packaging Working Group, not by Mozilla, at least as far as available information would suggest. As someone involved with this working group, Sumana appears to be in a position to make claims about this more authoritatively than I can.

Meanwhile, an invitation to a PSF-related sprint that I happened to see today advertises “an amazing evening of coding, pizza and beer”. Having read a gushing endorsement of “dugnad” culture only recently – a classic promotional piece for readers outside Norway – I cannot help but observe that putting the burden for things onto the voluntary sector, so that the state can save money (to give as tax cuts to the wealthy) and so that the private sector can get something for nothing (to maximise shareholder returns), is rather a pervasive and not-so-noble phenomenon that will readily document itself to anyone paying enough attention.

Concise Attribute Initialisation in Lichen… and Python?

Monday, January 22nd, 2018

In my review of 2017, I mentioned a project of mine to make a Python-like language called Lichen that is more amenable to compile-time analysis than Python is, while still having a feature set I might actually be able to use in “real” programs one day. There are a lot of different “moving parts” in the Lichen toolchain, and being preoccupied with various other projects and activities, I haven’t been able to get back into working on it properly in the last few months.

Recently, as I found myself writing Python code for another of my projects, I got to wondering about something in Python that can occur a lot: the initialisation of instance attributes. Here is a classic example:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# For illustration, here is how the class is used...
p = Point(640, 512)
print p.x, p.y # 640 512

In this example, having to assign the parameter values to the instance attributes is not much of a hardship. But with more verbose initialisation methods with more parameters and more attributes involved, writing everything out can be tiresome. Moreover, mistakes can be made, particularly if the interfaces and structures are evolving. Naturally, there are a range of improvements and measures that attempt to alleviate the problem. Here is the most obvious:

class Point:
    def __init__(self, x, y):
        self.x = x; self.y = y

This just puts the same statements on one line, so let us move beyond it to the next attempt:

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

Here, we are actually performing “tuple assignment”, with the parameter values being placed in a tuple whose elements are then assigned to the names in the corresponding positions on the left-hand side of the assignment.

Now, without any Python “magic”, this is probably as far as you can get. The “magic” involves introspection and a feature known as “decorators” (which Lichen doesn’t support) to let us use something like this:

class Point:
    @initialising("x", "y")
    def __init__(self, x, y):
        pass

Here, I am taking inspiration from a collection of actual suggestions and solutions, but none of them look like the above. Indeed, many of them take the approach of initialising attributes using every parameter in the method signature which isn’t always what you want, although it does seem to be requested every now and again.

Although the above example looks quite nice, the mechanism responsible for performing the attribute assignments will not look as nice, and so I won’t show it here. And unless a mode is supported where the names can be omitted, thus initialising attributes using all parameters (except self) when you do want to, it is perhaps tiresome to have to write the names out again somewhere else, even more so as strings.
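
Purely to illustrate why it would not look as nice, here is one possible shape for such a mechanism; the initialising name and its behaviour are assumptions based on the example above, not code from any particular library:

import functools

def initialising(*names):
    "Assign the named parameters to same-named attributes before calling the method."
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kw):
            # Pair each declared name with the positional argument in that position.
            for name, value in zip(names, args):
                setattr(self, name, value)
            # Also honour any of the named parameters given as keyword arguments.
            for name in names:
                if name in kw:
                    setattr(self, name, kw[name])
            return method(self, *args, **kw)
        return wrapper
    return decorator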

You will also find people advocating more transparent use of the ** catch-all parameter (also not supported by Lichen), sometimes in response to people worried that writing out lots of assignments is a sign of bad code. This yields solutions like this one:

class Point:
    def __init__(self, **kw):
        for name in ("x", "y"):
            setattr(self, name, kw.get(name))

But keeping named parameters in the signature helps to prevent certain kinds of errors, which is one reason why I don’t intend to support catch-all parameters in Lichen.

But what I wondered is why Python never supported something closer to C++’s initialisation lists. In C++, we might write the code somewhat as follows:

class Point
{
    Number x, y;
public:
    Point(Number x, Number y) : x(x), y(y) {}
};

Here, it is evident that repetition occurs just as in the “magic” Python example, which is something I might want to eliminate. Maybe we would want to have a shorthand for attribute initialisation within the parameter list itself. And then I thought of a possible syntax:

class Point:
    def __init__(self, .x, .y):
        pass

So, any parameter employing a dot before its name would result in the assignment of its value to the instance attribute having the same name. Of course, this wouldn’t support a parameter with one name having its value assigned to an attribute with another name, but I thought it best to stick to the simple cases. “Why not add this to Lichen?” I thought.

And in line with not getting too immersed in the toolchain straight away after such a long break, I decided on some rather simple semantics for this feature: dot-prefixed names would still exist as local names; dot-prefixing would just be a form of shorthand meaning that an assignment would be generated at the very start of the function body. So, the above would really translate to the very first example given at the start of this article or, indeed, the second one which is equivalent and is reproduced below:

# Lichen-only...                   # Python and Lichen...
class Point:                       class Point:
    def __init__(self, .x, .y):        def __init__(self, x, y):
        pass                               self.x = x; self.y = y

Keeping the sophistication of the feature at an unambitious level, besides letting me slowly familiarise myself again with the code, also helps to deal with potential conflicts with other mechanisms. For example, what if someone wanted to employ a name twice – once dot-prefixed, once unprefixed – like this…?

class Point:
    def __init__(self, .x, .y, x):
        self.intensity = x ** 2

By asserting that the dot-prefixed x is really just x that also initialises the attribute of the same name, we can fall back on the normal rules around parameters and forbid such duplicate names without having to think very hard about temporary names or more exotic mechanisms that might be used to initialise attributes directly. One other thing worth mentioning is that I don’t reserve the use of such parameters for the exclusive use of initialiser methods, so other applications are possible. For example:

class Point:
    def __init__(.x, .y): pass
    def update(.x, .y): pass

Here, I also omit self because Lichen defines it as always being present in methods, anyway. And we could actually make the update method an alias of the initialiser method, too, but let us not get too carried away!

Fortunately, I adopted a parser framework in Lichen that was originally written for PyPy that allows relatively straightforward modification of the language grammar. Conveniently, the grammar changes required for this feature are minimal and I don’t even have to add any extra tokens. That made me wonder whether such a syntax had been suggested for Python at some point or other. Some quick searches haven’t yielded any results, and I can’t be bothered to trawl the different mailing list archives to find mentions of such features. I can easily imagine that such a feature might have been discussed rather early in Python’s lifetime, possibly in the mid-1990s.

Arguments for new syntax in Python are often met with arguments against “syntactic sugar”, with such “sugar” introducing more convenient notation or a form of shorthand for particular operations. Over the years, people have argued for more concise ways of referencing instance attributes and class attributes instead of using the almost-special self name (that is rather more special in Lichen). Compound assignments to instance attributes have probably been discussed, too, maybe proposing things like this:

# Compound assignment idea...      # Equivalent assignment...
self.(x, y) = x, y                 self.x, self.y = x, y

In response to such suggestions, people seem to be asked how often they need to write such things, whether it is really such a burden to do so, and whether their programming tools cannot help them write out the conventional assignments semi-automatically instead. Proposed general language constructs may well risk introducing conflicts with other language features in unanticipated ways, and if such constructs only ever get used in certain, rather limited, circumstances then one can justifiably ask whether it is really worth the effort to support them. They will, after all, need people to implement them, test them, maintain them, and keep fixing them long into the future.

As is evident from the discussion of the problem of concise initialisation, Python’s community has grown accustomed to solving simple problems in fairly complicated ways using general mechanisms introduced to support broad classes of functionality. Decorators were introduced into Python as a way of inserting extra code around methods and functions to modify or extend their behaviour, allowing people to tackle such problems by getting that extra code to initialise attributes or to do many other weird, wild and wonderful things. Providing such mechanisms lets the language designers send people elsewhere when those people descend on the designers demanding a quick syntactic fix for a specific problem they might be having.

But it really does surprise me that something as simple as dot-prefixing parameter names never managed to get suggested and quickly introduced into an early version of Python. I did wonder whether other Python-inspired languages might have subconsciously inspired me, but a brief perusal of the Boo, Cobra, Delight and Genie documentation turned up nothing. And so, without any more insight into my inspiration, that is the tale of my first experiment in extending Lichen’s syntax beyond that of Python.

Update

I finally remembered where I had seen the dot-prefixed name notation before. When initialising structures in C, you can explicitly indicate a structure member when specifying a value (so-called designated initialisers), and I do this all the time in the code generated for Lichen programs. I even define macros that use this feature. For example:

#define __INTVALUE(VALUE) ((__attr) {.intvalue=((VALUE) << 1) | 1})

So I suppose it shows how long it has been since I had to look at that part of the toolchain! Of course, this is directly initialising a structure member by indicating a value, whereas the Lichen syntax enhancement associates an attribute, which is similar to a member, with a parameter received in a method call. But there are some similarities in purpose, nevertheless.

2017 in Review

Thursday, December 7th, 2017

On Planet Debian there seem to be quite a few regularly-posted articles summarising the work done by various people in Free Software over the month just passed. I thought it might be useful, personally at least, to review the different things I have been doing over the past year. The difference between this article and many of those others is that the work I describe is not commissioned or generally requested by others, relying instead mainly on my own motivation for it to happen. The rate of progress can vary somewhat as a result.

Learning KiCad

Over the years, I have been playing around with Arduino boards, sensors, displays and things of a similar nature. Although I try to avoid buying more things to play with, sometimes I manage to acquire interesting items regardless, and these aren’t always ready to use with the hardware I have. Last December, I decided to buy a selection of electronics-related items for interfacing and experimentation. Some of these items have yet to be deployed, but others were bought with the firm intention of putting different “spare” pieces of hardware to use, or at least to make them usable in future.

One thing that sits in this category of spare, potentially-usable hardware is a display circuit board that was once part of a desk telephone, featuring a two-line, bitmapped character display, driven by the Hitachi HD44780 LCD controller. It turns out that this hardware is so common and mundane that the Arduino libraries already support it, but the problem for me was being able to interface it to the Arduino. The display board uses a cable with a connector that needs a special kind of socket, so some research was required to identify that socket and to work out how it might be mounted on something else to break the connections out for use with the Arduino.

Fortunately, someone else had done all this research quite some time ago. They had even designed a breakout board to hold such a socket, making it available via the OSH Park board fabricating service. So, to make good on my plan, I ordered the mandatory minimum of three boards, also ordering some connectors from Mouser. When all of these different things arrived, I soldered the socket to the board along with some headers, wired up a circuit, wrote a program to use the LiquidCrystal Arduino library, and to my surprise it more or less worked straight away.

Breakout board for the Molex 52030 connector

Hitachi HD44780 LCD display boards driven by an Arduino

This satisfying experience led me to consider other boards that I might design and get made. Previously, I had only made a board for the Arduino using Fritzing and the Fritzing Fab service, and I had held off looking at other board design solutions, but this experience now encouraged me to look again. After some evaluation of the gEDA tools, I decided that I might as well give KiCad a try, given that it seems to be popular in certain “open source hardware” circles. And after a fair amount of effort familiarising myself with it, with a degree of frustration finding out how to do certain things (and also finding up-to-date documentation), I managed to design my own rather simple board: a breakout board for the Acorn Electron cartridge connector.

Acorn Electron cartridge breakout board (in 3D-printed case section)

In the back of my mind, I have vague plans to do other boards in future, but doing this kind of work can soak up a lot of time and be rather frustrating: you almost have to get into some modified mental state to work efficiently in KiCad. And it isn’t as if I don’t have other things to do. But at least I now know something about what this kind of work involves.

Retro and Embedded Hardware

With the above breakout board in hand, a series of experiments were conducted to see if I could interface various circuits to the Acorn Electron microcomputer. These mostly involved 7400-series logic chips (ICs, integrated circuits) and featured various logic gates and counters. Previously, I had re-purposed an existing ROM cartridge design to break out signals from the computer and make it access a single flash memory chip instead of two ROM chips.

With a dedicated prototyping solution, I was able to explore the implementation of that existing board, determine various aspects of the signal timings that remained rather unclear (despite being successfully handled by the existing board’s logic), and make it possible to consider a dedicated board for a flash memory cartridge. In fact, my brother, David, also wanting to get into board design, later adapted the prototyping cartridge to make such a board.

But this experimentation also encouraged me to tackle some other items in the electronics shipment: the PIC32 microcontrollers that I had acquired because they are MIPS-based chips with somewhat more built-in RAM than the Atmel AVR-based chips used by the average Arduino, and because they can also be used on a breadboard. I hoped that my familiarity with the SoC (system-on-a-chip) in the Ben NanoNote – the Ingenic JZ4720 – might confer some benefits when writing low-level code for the PIC32.

PIC32 on breadboard with Arduino programming circuit (and some LEDs for diagnostic purposes)

I do not need to reproduce an account of my activities here, given that I wrote about the effort involved in getting started with the PIC32 earlier in the year, and subsequently described an unusual application of such a microcontroller that seemed to complement my retrocomputing interests. I have since tried to make that particular piece of work more robust, but deducing the actual behaviour of the hardware has been frustrating, the documentation can be vague when it needs to be accurate, and much of the community discussion is focused on proprietary products and specific software tools rather than techniques. Maybe this will finally push me towards investigating programmable logic solutions in the future.

Compiling a Python-like Language

As things actually happened, the above hardware activities were distractions from something I have been working on for a long time. But at this point in the article, that work can serve as a diversion from all the things that seem to involve hardware or low-level software development. Many years ago, I started writing software in Python. Over the years since, alternative implementations of the Python language (the main implementation being CPython) have emerged and seen some use, some continuing to be developed to this day. But around fifteen years ago, it became a bit more common for people to consider whether Python could be compiled to something that runs more efficiently (and more quickly).

I followed some of these projects enthusiastically for a while. Starkiller promised compilation to C++ but never delivered any code for public consumption, although the associated academic thesis might have prompted the development of Shed Skin which does compile a particular style of Python program to C++ and is available as Free Software. Meanwhile, PyPy elevated to prominence the notion of writing a language and runtime library implementation in the language itself, previously seen with language technologies like Slang, used to implement Squeak/Smalltalk.

Although other projects have also emerged and evolved to attempt the compilation of Python to lower-level languages (Pyrex, Cython, Nuitka, and so on), my interests have largely focused on the analysis of programs so that we may learn about their structure and behaviour before we attempt to run them, this alongside any benefits that might be had in compiling them to something potentially faster to execute. But my interests have also broadened to consider the evolution of the Python language since the point fifteen years ago when I first started to think about the analysis and compilation of Python. The near-mythical Python 3000 became a real thing in the form of the Python 3 development branch, introducing incompatibilities with Python 2 and fragmenting the community writing software in Python.

With the risk of perfectly usable software becoming neglected, its use actively (and destructively) discouraged, it becomes relevant to consider how one might take control of one’s software tools for long-term stability, where tools might be good for decades of use instead of constantly changing their behaviour and obliging their users to constantly change their software. I expressed some of my thoughts about this earlier in the year having finally reached a point where I might be able to reflect on the matter.

So, the result of a great deal of work, informed by experiences and conversations over the years related to previous projects of my own and those of others, is a language and toolchain called Lichen. This language resembles Python in many ways but does not try to be a Python implementation. The toolchain compiles programs to C which can then be compiled and executed like “normal” binaries. Programs can be trivially cross-compiled by any available C cross-compilers, too, which is something that always seems to be a struggle elsewhere in the software world. Unlike other Python compilers or implementations, it does not use CPython’s libraries, nor does it generate in “longhand” the work done by the CPython virtual machine.

One might wonder why anyone should bother developing such a toolchain given its incompatibility with Python and a potential lack of any other compelling reason for people to switch. Given that I had to accept some necessary reductions in the original scope of the project and to limit my level of ambition just to feel remotely capable of making something work, one does need to ask whether the result is too compromised to be attractive to others. At one point, programs manipulating integers were slower when compiled than when they were run by CPython, and this was incredibly disheartening to see, but upon further investigation I noticed that CPython effectively special-cases integer operations. The design of my implementation permitted me to represent integers as tagged references – a classic trick of various language implementations – and this overturned the disadvantage.
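
For those unfamiliar with the trick, the following sketch shows the general idea (as an illustration only, not Lichen’s exact object representation): small integers are carried directly in the reference word, distinguished from pointers by the low bit, much as the __INTVALUE macro quoted above does in C.

def tag_int(value):
    # Shift the integer up and set the low bit to mark the reference as an integer.
    return (value << 1) | 1

def is_tagged_int(ref):
    # References to aligned objects have a clear low bit; tagged integers do not.
    return ref & 1 == 1

def untag_int(ref):
    # Discard the tag bit to recover the original integer.
    return ref >> 1

assert untag_int(tag_int(42)) == 42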

For me, just having the possibility of exploring alternative design decisions is interesting. Python’s design is largely done by consensus, with pronouncements made to settle disagreements and to move the process forward. Although this may have served the language well, depending on one’s perspective, it has also meant that certain paths of exploration have not been followed. Certain things have been improved gradually but not radically due to backwards compatibility considerations, this despite the break in compatibility between the Python 2 and 3 branches where an opportunity was undoubtedly lost to do greater things. Lichen is an attempt to explore those other paths without having to constantly justify it to a group of people who may regard such exploration as hostile to their own interests.

Lichen is not really complete: it needs floating point numbers and other useful types; its library is minimal; it could be made more robust; it could be made more powerful. But I find myself surprised that it works at all. Maybe I should have more confidence in myself, especially given all the preparation I did in trying to understand the good and bad aspects of my previous efforts before getting started on this one.

Developing for MIPS-based Platforms

A couple of years ago I found myself wondering if I couldn’t write some low-level software for the Ben NanoNote. One source of inspiration for doing this was “The CI20 bare-metal project”: a series of blog articles discussing the challenges of booting the MIPS Creator CI20 single-board computer. The Ben and the CI20 use CPUs (or SoCs) from the same family: the Ingenic JZ4720 and JZ4780 respectively.

For the Ben, I looked at the different boot payloads, principally those written to support booting from a USB host, but also the version of U-Boot deployed on the Ben. I combined elements of these things with the framebuffer driver code from the Linux kernel supporting the Ben, and to my surprise I was able to get the device to boot up and show a pattern on the screen. Progress has not always been steady, though.

For a while, I struggled to make the CPU leave its initial exception state without hanging, and with the screen as my only debugging tool, it was hard to see what might have been going wrong. Some careful study of the code revealed the problem: the code I was using to write to the framebuffer was using the wrong address region, meaning that as soon as an attempt was made to update the contents of the screen, the CPU would detect a bad memory access and an exception would occur. Such exceptions will not be delivered in the initial exception state, but with that state cleared, the CPU will happily trigger a new exception when the program accesses memory it shouldn’t be touching.

Debugging low-level code on the Ben NanoNote (the hard way)

I have since plodded along introducing user mode functionality, some page table initialisation, trying to read keypresses, eventually succeeding after retracing my steps and discovering my errors along the way. Maybe this will become a genuinely useful piece of software one day.

But one useful purpose this exercise has served is that of familiarising myself with the way these SoCs are organised, the facilities they provide, how these may be accessed, and so on. My brother has the Letux 400 notebook containing yet another SoC in the same family, the JZ4730, which seems to be almost entirely undocumented. This notebook has proven useful under certain circumstances. For instance, it has been used as a kind of appliance for document scanning, driving a multifunction scanner/printer over USB using the enduring SANE project’s software.

However, the Letux 400 is already an old machine, with products based on this hardware platform being almost ten years old, and when originally shipped it used a 2.4 series Linux kernel instead of a more recent 2.6 series kernel. Like many products whose software is shipped as “finished”, this makes the adoption of newer software very difficult, especially if the kernel code is not “upstreamed” or incorporated into the official Linux releases.

As software distributions such as Debian evolve, they depend on newer kernel features, but if a device is stuck on an older kernel (because the special functionality that makes it work on that device is specific to that kernel) then the device, unable to run the newer kernels, gradually becomes unable to run newer versions of the distribution as well. Thus, Debian Etch was the newest distribution version that would work on the 2.4 kernel used by the Letux 400 as shipped.

Fortunately, work had been done to make a 2.6 series kernel work on the Letux 400, and this made Debian Lenny functional. But time passes and even this is now considered ancient. Although David was running some software successfully, there was other software that really needed a newer distribution to be able to run, and this meant considering what it might take to support Debian Squeeze on the hardware. So he set to work adding patches to the 2.6.24 kernel to try and bring it within the realm of Squeeze support, taking it beyond the bare minimum of 2.6.29 and into the “release candidate” territory of 2.6.30. And this was indeed enough to run Squeeze on the notebook, at least supporting the devices needed to make the exercise worthwhile.

Now, at a much earlier stage in my own experiments with the Ben NanoNote, I had tried without success to reproduce my results on the Letux 400. And I had also made a rather tentative effort at modifying Ben NanoNote kernel drivers to potentially work with the Letux 400 from some 3.x kernel version. David’s success in updating the kernel version led me to look again at the tasks of familiarising myself with kernel drivers, machine details and of supporting the Letux 400 in even newer kernels.

The outcome of this is uncertain at present. Most of the work on updating the drivers and board support has been done, but actual testing of my work still needs to be done, something that I cannot really do myself. That might seem strange: why start something I cannot finish by myself? But how I got started in this effort is also rather related to the topic of the next section.

The MIPS Creator CI20 and L4/Fiasco.OC

Low-level programming on the Ben NanoNote is frustrating unless you modify the device and solder the UART connections to the exposed pads in the battery compartment, thereby enabling a serial connection and allowing debugging information to be sent to a remote display for perusal. My soldering skills are not that great, and I don’t want to damage my device. So debugging was a frustrating exercise. Since I felt that I needed a bit more experience with the MIPS architecture and the Ingenic SoCs, it occurred to me that getting a CI20 might be the way to go.

I am not really a supporter of Imagination Technologies, producer of the CI20, due to the company’s rather hostile attitude towards Free Software around their PowerVR technologies, meaning that of the different graphics acceleration chipsets, PowerVR has been increasingly isolated as a technology that is consistently unsupportable by Free Software drivers. However, the CI20 is well-documented and has been properly supported with Free Software, apart from the PowerVR parts of the hardware, of course. Ingenic were seemingly persuaded to make the programming manual for the JZ4780 used by the CI20 publicly available, unlike the manuals for other SoCs in that family. And the PowerVR hardware is not actually needed to be able to use the CI20.

The MIPS Creator CI20 single-board computer

I had hoped that the EOMA68 campaign would have offered a JZ4775 computer card, and that the campaign might have delivered such a card by now, but with both of these things not having happened I took the plunge and bought a CI20. There were a few other reasons for doing so: I wanted to see how a single-board computer with a decent amount of RAM (1GB) might perform as a working desktop machine; having another computer to offload certain development and testing tasks, rather than run virtual machines, would be useful; I also wanted to experiment with and even attempt to port other operating systems, loosening my dependence on the Linux monoculture.

One of these other operating systems involves two components: the Fiasco.OC microkernel and the L4 Runtime Environment (L4Re). Over the years, microkernels in the L4 family have seen widespread use, and at one point people considered porting GNU Hurd to one of the L4 family microkernels from the Mach microkernel it then used (and still uses). It seems to me like something worth looking at more closely, and fortunately it also seemed that this software combination had been ported to the CI20. However, it turned out that my expectations of building an image, testing the result, and then moving on to developing interesting software were a little premature.

The first real problem was that GCC produced position-independent code that was not called correctly. This meant that upon trying to get the addresses of functions, the program would end up loading garbage addresses and trying to call any code that might be there at those addresses. So some fixes were required. Then, it appeared that the JZ4780 doesn’t support a particular MIPS instruction, meaning that the CPU would encounter this instruction and cause an exception. So, with some guidance, I wrote a handler to decode the instruction and generate the rather trivial result that the instruction should produce. There were also some more generic problems with the microkernel code that had previously been patched but which had not appeared in the upstream repository. But in the end, I got the “hello” program to run.

With a working foundation I tried to explore the hardware just as I had done with the Ben NanoNote, attempting to understand things like the clock and power management hardware, general purpose input/output (GPIO) peripherals, and also the Inter-Integrated Circuit (I2C) peripherals. Some assistance was available in the form of Linux kernel driver code, although the style of code can vary significantly, and it also takes time to “decode” various mechanisms in the Linux code and to unpick the useful bits related to the hardware. I had hoped to get further, but in trying to use the I2C peripherals to talk to my monitor using the DDC protocol, I found that the data being returned was not entirely reliable. This was arguably a distraction from the more interesting task of enabling the display, given that I know what resolutions my monitor supports.

However, all this hardware-related research and detective work at least gave me an insight into mechanisms – software and hardware – that would inform the effort to “decode” the vendor-written code for the Letux 400, making certain things seem a lot more familiar and increasing my confidence that I might be understanding the things I was seeing. For example, the JZ4720 in the Ben NanoNote arranges its hardware registers for GPIO configuration and access in a particular way, but the code written by the vendor for the JZ4730 in the Letux 400 accesses GPIO registers in a different way.

Initially, I might have thought that I was missing some important detail: are the two products really so different, and if not, then why is the code so different? But then, looking at the JZ4780, I encountered another scheme for GPIO register organisation that is different again, but which does have similarities to the JZ4730. With the JZ4780 being publicly documented, the code for the Letux 400 no longer seemed quite so bizarre or unfathomable. With more experience, it is possible to have a little more confidence in one’s understanding of the mechanisms at work.

I would like to spend a bit more time looking at microkernels and alternatives to Linux. While many people presumably think that Linux is running on everything and has “won”, it is increasingly likely that the Linux one sees on devices does not completely control the hardware and is, in fact, virtualised or confined by software systems like L4/Fiasco.OC. I also have reservations about the way Linux is developed and how well it is able to handle the demands of its proliferation onto every kind of device, many of them hooked up to the Internet and being left to fend for themselves.

Developing imip-agent

Alongside Lichen, a project that has been under development for the last couple of years has been imip-agent, allowing calendar-based scheduling activities to be integrated with mail transport agents. I haven’t been able to spend quite as much time on imip-agent this year as I might have liked, although I will also admit that I haven’t always been motivated to spend much time on it, either. Still, there have been brief periods of activity tidying up, fixing, or improving the code. And some interest in packaging the software led me to reconsider some of the techniques used to deploy the software, in particular the way scheduling extensions are discovered, and the way the system configuration is processed (since Debian does not want “executable scripts” in places like /etc, even if those scripts just contain some simple configuration setting definitions).

It is perhaps fairly typical that a project that tries to assess the feasibility of a concept accumulates the necessary functionality in order to demonstrate that it could do a particular task. After such an initial demonstration, the effort of making the code easier to work with, more reliable and more extensible must occur if further progress is to be made. One intervention that kept imip-agent viable as a project was the introduction of a test suite to ensure that the basic functionality did indeed work. There were other architectural details that I felt needed remedying or improving for the code to remain manageable.

Recently, I have been refining the parts of the code that support editing of calendar objects and the exchange of updates caused by changes to calendar events. Such work is intended to make the Web client easier to understand and to expose such functionality to proper testing. One side-effect of this may be the introduction of a text-based client for people using e-mail programs like Mutt, as well as a potentially usable library for other mail clients. Such tidying up and fixing does not show off fancy new features or argue the case for developing such software in the first place, but I suppose it makes me feel better about the software I have written.

Whither Moin?

There are probably plenty of other little projects of my own that I have started or at least contemplated this year. And there are also projects that are not mine but which I use and which have had contributions from me over the years. One of these is the MoinMoin wiki software that powers a number of Free Software and other Web sites where collaborative editing is made available to the communities involved. I use MoinMoin – or Moin for short – to publish content on the Web myself, and I have encouraged others to use it in the past. However, it worries me now that the level of maintenance it is receiving has fallen to a level where updates for faults in the software are not likely to be forthcoming and where it is no longer clear where such updates should be coming from.

Earlier in the year, having previously read queries about the static export output from Moin, which can be rather basic and does not necessarily resemble the appearance of the wiki it came from, I spent some time considering my own use of Moin for documentation publishing. For some of my projects, I don’t take advantage of the “through the Web” editing of the solution when publishing the public documentation. Instead, I use Moin locally, store the pages in a separate repository, and then make page packages that get installed on a public instance of Moin. This means that I do not have to worry about Web-based authentication and can just have a wiki as a read-only resource.

Obviously, the parts of Moin that I really need here are just the things that parse the wiki formatting (which I regard as more usable than other document markup formats in various respects) and that format the content as HTML. If I could format it as static content with some pages, some stylesheets, some images, with some Web server magic to make the URLs look nice, then that would probably be sufficient. For some things like the automatic generation of SVG from Graphviz-format files, I would also need to have the relevant parsers available, too. Having a complete Web framework, which is what Moin really is, is rather unnecessary with these diminished requirements.

But I do use Moin as a full wiki solution as well, and so it made me wonder whether I shouldn’t try and bring it up to date. Of course, there is already the MoinMoin 2.0 effort that was intended to modernise and tidy up the software, but since this effort made a clean break from Moin 1.x, it was never an attractive choice for those people already using Moin in anything more than a basic sense. Since there wasn’t an established API for extensions, it was not readily usable for many existing sites that rely on such extensions. In a way, Moin 2 has suffered from something that Python 3 only avoided by having a lot more people working on it, including people being paid to work on it, together with a policy of openly shaming those people who had made Python 2 viable – by developing software for it – into spending time migrating their code to Python 3.

I don’t have an obvious plan of action here. Moin perhaps illustrates the fundamental problem facing many Free Software projects, this being a theme that I have discussed regularly this year: how they may remain viable by having people able to dedicate their time to writing and maintaining Free Software without this work being squeezed in around the edges of people’s “actual work” and thus burdening them with yet another obligation in their lives, particularly one that is not rewarded by a proper appreciation of the sacrifice being made.

Plenty of individuals and organisations benefit from Moin, but we live in an age of “comparison shopping” where people will gladly drop one thing if someone offers them something newer and shinier. This is, after all, how everyone ends up using “free” services where the actual costs are hidden. To their credit, when Moin needed to improve its password management, the Python Software Foundation stepped up and funded this work rather than dropping Moin, which is what I had expected given certain Python community attitudes. Maybe other, more well-known organisations that use Moin also support its development, but I don’t really see much evidence of it.

Maybe they should consider doing so. The notion that something else will always come along, developed by some enthusiastic developer “scratching their itch”, is misguided and exploitative. And a failure to sustain Free Software development can only undermine Free Software as a resource, as an activity or a cause, and as the basis of many of those organisations’ continued existence. Many of us like developing Free Software, as I hope this article has shown, but motivation alone does not keep that software coming forever.

Some Thoughts on Python-Like Languages

Tuesday, June 6th, 2017

A few different things have happened recently that got me thinking about writing something about Python, its future, and Python-like languages. I don’t follow the different Python implementations as closely as I used to, but certain things did catch my attention over the last few months. But let us start with things closer to the present day.

I was neither at the North American PyCon event, nor at the invitation-only Python Language Summit that occurred as part of that gathering, but LWN.net has been reporting the proceedings to its subscribers. One of the presentations of particular interest was covered by LWN.net under the title “Keeping Python competitive”, apparently discussing efforts to “make Python faster”, the challenges faced by different Python implementations, and the limitations imposed by the “canonical” CPython implementation that can frustrate performance improvement efforts.

Here is where this more recent coverage intersects with things I have noticed over the past few months. Every now and again, an attempt is made to speed Python up, sometimes building on the CPython code base and bolting on additional technology to boost performance, sometimes reimplementing the virtual machine whilst introducing similar performance-enhancing technology. When such projects emerge, especially when a large company is behind them in some way, expectations of a much faster Python are considerable.

Thus, when the Pyston reimplementation of Python became more widely known, undertaken by people working at Dropbox (who also happen to employ Python’s creator Guido van Rossum), people were understandably excited. Three years after that initial announcement, however, those ambitious employees now have to continue that work on their own initiative. One might be reminded of an earlier project, Unladen Swallow, which also sought to perform just-in-time compilation of Python code, undertaken by people working at Google (who also happened to employ Python’s creator Guido van Rossum at the time), which was then abandoned as those people were needed to go and work on other things. Meanwhile, another apparently-broadly-similar project, Pyjion, is being undertaken by people working at Microsoft, albeit as a “side project at work”.

As things stand, perhaps the most dependable alternative implementation of Python, at least if you want one with a just-in-time compiler that is actively developed and supported for “production use”, appears to be PyPy. And this is only because of sustained investment of both time and funding over the past decade and a half into developing the technology and tracking changes in the Python language. Heroically, the developers even try and support both Python 2 and Python 3.

Motivations for Change

Of course, Google, Dropbox and Microsoft presumably have good reasons to try and get their Python code running faster and more efficiently. Certainly, the first two companies will be running plenty of Python to support their services; reducing the hardware demands of delivering those services is definitely a motivation for investigating Python implementation improvements. I guess that there’s enough Python being run at Microsoft to make it worth their while, too. But then again, none of these organisations appear to be resourcing these efforts at anything close to what would be marshalled for their actual products, and I imagine that even similar infrastructure projects originating from such companies (things like Go, for example) have had many more people assigned to them on a permanent basis.

And now, noting the existence of projects like Grumpy – a Python to Go translator – one has to wonder whether there isn’t some kind of strategy change afoot: that it now might be considered easier for the likes of Google to migrate gradually to Go and steadily reduce their dependency on Python than it is to remedy identified deficiencies with Python. Of course, the significant problem remains of translating Python code to Go while still having it interface with code written in C against Python’s extension interfaces, maintaining reliability and performance in the result.

Indeed, the matter of Python’s “C API”, used by extensions written in C for Python programs to use, is covered in the LWN.net article. As people have sought to improve the performance of their software, they have been driven to rewrite parts of it in C, interfacing these performance-critical parts with the rest of their programs. Although such optimisation techniques make sense and have been a constant presence in software engineering more generally for many decades, it has almost become the path of least resistance when encountering performance difficulties in Python, even amongst the maintainers of the CPython implementation.

And so, alternative implementations need to either extract C-coded functionality and offer it in another form (maybe even written in Python, can you imagine?!), or they need to have a way of interfacing with it, one that could produce difficulties and impair their own efforts to deliver a robust and better-performing solution. Thus, attempts to mitigate CPython’s shortcomings have actually thwarted the efforts of other implementations to mitigate the shortcomings of Python as a whole.

Is “Python” Worth It?

You may well be wondering, if I didn’t manage to lose you already, whether all of these ambitious and brave efforts are really worth it. Might there be something with Python that just makes it too awkward to target with a revised and supposedly better implementation? Again, the LWN.net article describes sentiments that simpler, Python-like languages might be worth considering, mentioning the Hack language in the context of PHP, although I might also suggest Crystal in the context of Ruby, even though the latter is possibly closer to various functional languages and maybe only bears syntactic similarities to Ruby (although I haven’t actually looked too closely).

One has to be careful with languages that look dynamic but are really rather strict in how types are assigned, propagated and checked. And, should one choose to accept static typing, even with type inference, it could be said that there are plenty of mature languages – OCaml, for instance – that are worth considering instead. As people have experimented with Python-like languages, others have been quick to criticise them for not being “Pythonic”, even if the code one writes is valid Python. But I accept that the challenge for such languages and their implementations is to offer a Python-like experience without frustrating the programmer too much about things which look valid but which are forbidden.

My own tuning of a Python program to work with Shedskin had to be informed by what Shedskin was likely to allow and to reject. As far as I am concerned, as long as this is not too restrictive, and as long as guidance is available, I don’t see a reason why such a Python-like language couldn’t be as valid as “proper” Python. Python itself has changed over the years, and the version I first used probably wouldn’t measure up to what today’s newcomers would accept as Python at all, but I don’t accept that the language I used back in 1995 was not Python: that would somehow be a denial of history and of my own experiences.
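
For anyone who never saw that era, here is a rough sketch from memory of what mid-1990s Python looked like (illustrative of the style rather than a verified 1.4 sample, and it runs only under the Python 1.x and 2.x line): string operations lived in the string module, there were no list comprehensions or string methods, and print was a statement.

import string

def shout(text):
    words = []
    for word in string.split(text):
        words.append(string.upper(word))
    return string.join(words, " ")

print shout("spam and eggs")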

Could I actually use something closer to Python 1.4 (or even 1.3) now? Which parts of more recent versions would I miss? And which parts of such ancient Pythons might even be superfluous? In pursuing my interests in source code analysis, I decided to consider such questions in more detail, partly motivated by the need to keep the investigation simple, partly motivated by laziness (that something might be amenable to analysis but more effort than I considered worthwhile), and partly motivated by my own experiences developing Python-based solutions.

A Leaner Python

Usually, after a title like that, one might expect to read about how I made everything in Python statically typed, or that I decided to remove classes and exceptions from the language, or do something else that would seem fairly drastic and change the character of the language. But I rather like the way Python behaves in a fundamental sense, with its classes, inheritance, dynamic typing and indentation-based syntax.

Other languages inspired by Python have had a tendency to diverge noticeably from the general form of Python: Boo, Cobra, Delight, Genie and Nim introduce static typing and (arguably needlessly) change core syntactic constructs; Converge and Mython focus on meta-programming; MyPy is the basis of efforts to add type annotations and “optional static typing” to Python itself. Meanwhile, Serpentine is a project being developed by my brother, David, and is worth looking at if you want to write software for Android, have some familiarity with user interface frameworks like PyQt, and can accept the somewhat moderated type discipline imposed by the Android APIs and the Dalvik runtime environment.

In any case, having already made a few rounds trying to perform analysis on Python source code, I am more interested in keeping the foundations of Python intact and focusing on the less visible characteristics of programs: effectively reading between the lines of the source code by considering how it behaves during execution. Solutions like Shedskin take advantage of restrictions on programs to be able to make deductions about program behaviour. These deductions can be sufficient in helping us understand what a program might actually do when run, as well as helping the compiler make more robust or efficient programs.
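
As an illustration of the kind of restriction involved (a general sketch, not a statement of Shedskin’s precise rules), keeping each name bound to values of a single type is what makes such deductions tractable:

def awkward(flag):
    # Rebinding a name with values of different types defeats simple inference:
    # is "result" an integer or a string at the return site?
    if flag:
        result = 42
    else:
        result = "unknown"
    return result

def friendly(flag):
    # Keeping the name monomorphic lets an analysis deduce an integer type
    # and lets a compiler generate specialised code accordingly.
    if flag:
        result = 42
    else:
        result = 0
    return result

The second form tells an analysis, without any annotations at all, everything it needs to know about result.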

And the right kind of restrictions might even help us avoid introducing more disruptive restrictions such as having to annotate all the types in a program in order to tell us similar things (which appears to be one of the main directions of Python in the current era, unfortunately). I would rather lose exotic functionality that I have never really been convinced by, than retain such functionality and then have to tell the compiler about things it would otherwise have a chance of figuring out for itself.

Rocking the Boat

Certainly, being confronted with any list of restrictions, despite the potential benefits, can seem like someone is taking all the toys away. And it can be difficult to deliver the benefits to make up for this loss of functionality, no matter how frivolous some of it is, especially if there are considerable expectations in terms of things like performance. Plenty of people writing alternative Python implementations can attest to that. But there are other reasons to consider a leaner, more minimal, Python-like language and accompanying implementation.

For me, one rather basic reason is merely to inform myself about program analysis, figure out how difficult it is, and hopefully produce a working solution. But beyond that is the need to be able to exercise some level of control over the tools I depend on. Python 2 will in time no longer be maintained by the Python core development community; a degree of agitation has existed for some time to replace it with Python 3 in Free Software operating system distributions. Yet I remain unconvinced about Python 3, particularly as it evolves towards a language that offers “optional” static typing that will inevitably become mandatory (despite assertions that it will always officially be optional) as everyone sprinkles their code with annotations and hopes for the magic fairies and pixies to come along and speed it up, that latter eventuality being somewhat less certain.

There are reasons to consider alternative histories for Python in the form of Python-like languages. People argue about whether Python 3’s Unicode support makes it as suitable for certain kinds of programs as Python 2 has been, with the Mercurial project being notable in its refusal to hurry along behind the Python 3 adoption bandwagon. Indeed, PyPy was devised as a platform for such investigations, being only somewhat impaired in some respects by its rather intensive interpreter generation process (but I imagine there are ways to mitigate this).

Making a language implementation that is adaptable is also important. I like the ability to be able to cross-compile programs, and my own work attempts to make this convenient. Meanwhile, cross-building CPython has been a struggle for many years, and I feel that it says rather a lot about Python core development priorities that even now, with the need to cross-build CPython if it is to be available on mobile platforms like Android, the lack of a coherent cross-building strategy has left those interested in doing this kind of thing maintaining their own extensive patch sets. (Serpentine gets around this problem, as well as the architectural limitations of dropping CPython on an Android-based device and trying to hook it up with the different Android application frameworks, by targeting the Dalvik runtime environment instead.)

No Need for Another Language?

I found it depressingly familiar when David announced his Android work on the Python mobile-sig mailing list and got the following response:

In case you weren't aware, you can just write Android apps and services
in Python, using Kivy.  No need to invent another language.

Fortunately, various other people were more open-minded about having a new toolchain to target Android. Personally, the kind of “just use …” rhetoric reminds me of the era when everyone writing Web applications in Python was exhorted to “just use Zope”, which was a complicated (but admittedly powerful and interesting) framework whose shortcomings were largely obscured and downplayed until enough people had experienced them and felt that progress had to be made by working around Zope altogether and developing other solutions instead. Such zero-sum games – that there is one favoured approach to be promoted, with all others to be terminated or hidden – perhaps inspired by an overly-parroted “only one way to do it” mantra in the Python scene, have been rather damaging to both the community and to the adoption of Python itself.

Not being Python, not supporting every aspect of Python, has traditionally been seen as a weakness when people have announced their own implementations of Python or of Python-like languages. People steer clear of impressive works like PyPy or Nuitka because they feel that these things might not deliver everything CPython does, exactly like CPython does. Which is pretty terrible if you consider the heroic effort that the developer of Nuitka puts in to make his software work as similarly to CPython as possible, even going as far as to support Python 2 and Python 3, just as the PyPy team do.

Solutions like MicroPython have got away with certain incompatibilities with the justification that the target environment is rather constrained. But I imagine that even that project’s custodians get asked whether it can run Django, or whatever the arbitrarily-set threshold for technological validity might be. Never mind whether you would really want to run Django on a microcontroller or even on a phone. And never mind whether large parts of the mountain of code propping up such supposedly essential solutions could actually do with an audit and, in some cases, benefit from being retired and rewritten.

I am not fond of change for change’s sake, but new opportunities often bring new priorities and challenges with them. What then if Python as people insist on it today, with all the extra features added over the years to satisfy various petitioners and trends, is actually the weakness itself? What if the Python-like languages can adapt to these changes, and by having to confront their incompatibilities with hastily-written code from the 1990s and code employing “because it’s there” programming techniques, they can adapt to the changing environment while delivering much of what people like about Python in the first place? What if Python itself cannot?

“Why don’t you go and use something else if you don’t like what Python is?” some might ask. Certainly, Free Software itself is far more important to me than any adherence to Python. But I can also choose to make that other language something that carries forward the things I like about Python, not something that looks and behaves completely differently. And in doing so, at least I might gain a deeper understanding of what matters to me in Python, even if others refuse the lessons and the opportunities such Python-like languages can provide.

Rename This Project

Tuesday, December 13th, 2016

It is interesting how the CPython core developers appear to prefer spending their time choosing names for someone else’s fork of Python 2, with some rather expansionist views on trademark applicability, rather than actually winning over Python 2 users to their cause, which is to make Python 3 the only possible future of the Python language, of course. Never mind that the much broader Python language community still appears to have an overwhelming majority of Python 2 users. And not some kind of wafer-thin, “first past the post”, mandate-exaggerating, Brexit-level majority, but an actual “that doesn’t look so bad but, oh, the scale is logarithmic!” kind of majority.

On the one hand, there are core developers who claim to be very receptive to the idea of other people maintaining Python 2, because the CPython core developers have themselves decided that they cannot bear to look at that code after 2020 and will not issue patches, let alone make new releases, even for the issues that have been worthy of their attention in recent years. Telling people that they are completely officially unsupported applies yet more “stick” and even less “carrot” to those apparently lazy Python 2 users who are still letting the side down by not spending their own time and money on realising someone else’s vision. But apparently, that receptivity extends only so far into the real world.

One often reads and hears claims of “entitlement” when users complain about developers or the output of Free Software projects. Let it be said that I really appreciate what has been delivered over the decades by the Python project: the language has kept programming an interesting activity for me; I still to this day maintain and develop software written in Python; I have even worked to improve the CPython distribution at times, not always successfully. But it should always be remembered that even passive users help to validate projects, and active users contribute in numerous ways to keep projects viable. Indeed, users invest in the viability of such projects. Without such investment, many projects (like many companies) would remain unable to fulfil their potential.

Instead of inflicting burdensome change whose predictable effect is to cause a depreciation of the users’ existing investments and to demand that they make new investments just to mitigate risk and “keep up”, projects should consider their role in developing sustainable solutions that do not become obsolete just because they are not based on the “latest and greatest” of the technology realm’s toys. If someone comes along and picks up this responsibility when it is abdicated by others, then at the very least they should not be given a hard time about it. And at least this “Python 2.8” barely pretends to be anything more than a continuation of things that came before, which is not something that can be said about Python 3 and the adoption/migration fiasco that accompanies it to this day.