Paul Boddie's Free Software-related blog


Archive for January, 2018

Concise Attribute Initialisation in Lichen… and Python?

Monday, January 22nd, 2018

In my review of 2017, I mentioned a project of mine to make a Python-like language called Lichen that is more amenable to compile-time analysis than Python is, while still having a feature set I might actually be able to use in “real” programs one day. There are a lot of different “moving parts” in the Lichen toolchain, and being preoccupied with various other projects and activities, I haven’t been able to get back into working on it properly in the last few months.

Recently, as I found myself writing Python code for another of my projects, I got to wondering about something in Python that can occur a lot: the initialisation of instance attributes. Here is a classic example:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# For illustration, here is how the class is used...
p = Point(640, 512)
print p.x, p.y # 640 512

In this example, having to assign the parameter values to the instance attributes is not much of a hardship. But with more verbose initialisation methods with more parameters and more attributes involved, writing everything out can be tiresome. Moreover, mistakes can be made, particularly if the interfaces and structures are evolving. Naturally, there are a range of improvements and measures that attempt to alleviate the problem. Here is the most obvious:

class Point:
    def __init__(self, x, y):
        self.x = x; self.y = y

This just puts the same statements on one line, so let us move beyond it to the next attempt:

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

Here, we are actually performing “tuple assignment”, with the parameter values being placed in a tuple whose elements are then assigned to the names in the corresponding positions on the left-hand side of the assignment.

Now, without any Python “magic”, this is probably as far as you can get. The “magic” involves introspection and a feature known as “decorators” (which Lichen doesn’t support) to let us use something like this:

class Point:
    @initialising("x", "y")
    def __init__(self, x, y):
        pass

Here, I am taking inspiration from a collection of actual suggestions and solutions, but none of them look like the above. Indeed, many of them take the approach of initialising attributes using every parameter in the method signature which isn’t always what you want, although it does seem to be requested every now and again.

Although the above example looks quite nice, the mechanism responsible for performing the attribute assignments will not look as nice, and so I won’t show it here. And unless a mode is supported where the names can be omitted, thus initialising attributes using all parameters (except self) when you do want to, it is perhaps tiresome to have to write the names out again somewhere else, even more so as strings.

You will also find people advocating more transparent use of the ** catch-all parameter (also not supported by Lichen), sometimes in response to people worried that writing out lots of assignments is a sign of bad code. This yields solutions like this one:

class Point:
    def __init__(self, **kw):
        for name in ("x", "y"):
            setattr(self, name, kw.get(name))

But keeping named parameters in the signature helps to prevent certain kinds of errors, which is one reason why I don’t intend to support catch-all parameters in Lichen.

But what I wondered is why Python never supported something closer to C++’s initialisation lists. In C++, we might write the code somewhat as follows:

class Point
{
    Number x, y;
public:
    Point(Number x, Number y) : x(x), y(y) {};
}

Here, it is evident that repetition occurs just as in the “magic” Python example, which is something I might want to eliminate. Maybe we would want to have a shorthand for attribute initialisation within the parameter list itself. And then I thought of a possible syntax:

class Point:
    def __init__(self, .x, .y):
        pass

So, any parameter employing a dot before its name would result in the assignment of its value to the instance attribute having the same name. Of course, this wouldn’t support a parameter with one name having its value assigned to an attribute with another name, but I thought it best to stick to the simple cases. “Why not add this to Lichen?” I thought.

And in line with not getting too immersed in the toolchain straight away after such a long break, I decided on some rather simple semantics for this feature: dot-prefixed names would still exist as local names; dot-prefixing would just be a form of shorthand meaning that an assignment would be generated at the very start of the function body. So, the above would really translate to the very first example given at the start of this article or, indeed, the second one which is equivalent and is reproduced below:

# Lichen-only...                   # Python and Lichen...
class Point:                       class Point:
    def __init__(self, .x, .y):        def __init__(self, x, y):
        pass                               self.x = x; self.y = y

Keeping the sophistication of the feature at an unambitious level, besides letting me slowly familiarise myself again with the code, also helps to deal with potential conflicts with other mechanisms. For example, what if someone wanted to employ a name twice – once dot-prefixed, once unprefixed – like this…?

class Point:
    def __init__(self, .x, .y, x):
        self.intensity = x ** 2

By asserting that the dot-prefixed x is really just x that also initialises the attribute of the same name, we can fall back on the normal rules around parameters and forbid such duplicate names without having to think very hard about temporary names or more exotic mechanisms that might be used to initialise attributes directly. One other thing worth mentioning is that I don’t reserve the use of such parameters for the exclusive use of initialiser methods, so other applications are possible. For example:

class Point:
    def __init__(.x, .y): pass
    def update(.x, .y): pass

Here, I also omit self because Lichen defines it as always being present in methods, anyway. And we could actually make the update method an alias of the initialiser method, too, but let us not get too carried away!

Fortunately, I adopted a parser framework in Lichen that was originally written for PyPy that allows relatively straightforward modification of the language grammar. Conveniently, the grammar changes required for this feature are minimal and I don’t even have to add any extra tokens. That made me wonder whether such a syntax had been suggested for Python at some point or other. Some quick searches haven’t yielded any results, and I can’t be bothered to trawl the different mailing list archives to find mentions of such features. I can easily imagine that such a feature might have been discussed rather early in Python’s lifetime, possibly in the mid-1990s.

Arguments for new syntax in Python are often met with arguments against “syntactic sugar”, with such “sugar” introducing more convenient notation or a form of shorthand for particular operations. Over the years, people have argued for more concise ways of referencing instance attributes and class attributes instead of using the almost-special self name (that is rather more special in Lichen). Compound assignments to instance attributes have probably been discussed, too, maybe proposing things like this:

# Compound assignment idea...      # Equivalent assignment...
self.(x, y) = x, y                 self.x, self.y = x, y

In response to such suggestions, people seem to be asked how often they need to write such things, whether it is really such a burden to do so, and whether their programming tools cannot help them write out the conventional assignments semi-automatically instead. Proposed general language constructs may well risk introducing conflicts with other language features in unanticipated ways, and if such constructs only ever get used in certain, rather limited, circumstances then one can justifiably ask whether it is really worth the effort to support them. They will, after all, need people to implement them, test them, maintain them, and keep fixing them long into the future.

As is evident from the discussion of the problem of concise initialisation, Python’s community has grown accustomed to solving simple problems in fairly complicated ways using general mechanisms introduced to support broad classes of functionality. Decorators were introduced into Python as a way of inserting extra code around methods and functions to modify or extend their behaviour, allowing people to tackle such problems by getting that extra code to initialise attributes or to do many other weird, wild and wonderful things. Providing such mechanisms lets the language designers send people elsewhere when those people descend on the designers demanding a quick syntactic fix for a specific problem they might be having.

But it really does surprise me that something as simple as dot-prefixing parameter names never managed to get suggested and quickly introduced into an early version of Python. I did wonder whether other Python-inspired languages might have subconsciously inspired me, but a brief perusal of the Boo, Cobra, Delight and Genie documentation turned up nothing. And so, without any more insight into my inspiration, that is the tale of my first experiment in extending Lichen’s syntax beyond that of Python.

Update

I finally remembered where I had seen the dot-prefixed name notation before. When initialising structures in C, you can explicitly indicate a structure member when specifying a value, and I do this all the time in the code generated for Lichen programs. I even define macros that use this feature. For example:

#define __INTVALUE(VALUE) ((__attr) {.intvalue=((VALUE) << 1) | 1})

So I suppose it shows how long it has been since I had to look at that part of the toolchain! Of course, this is directly initialising a structure member by indicating a value, whereas the Lichen syntax enhancement associates an attribute, which is similar to a member, with a parameter received in a method call. But there are some similarities in purpose, nevertheless.