Paul Boddie's Free Software-related blog


Archive for the ‘English’ Category

Reconsidering Classic Programming Interfaces

Tuesday, June 4th, 2024

Since my last update, I have been able to spend some time gradually broadening and hopefully improving the support for classic programming interfaces in my L4Re-based experiments, centred around a standard C library implementation based on Newlib. Of course, there were some frustrations experienced along the way, and much remains to be done, not only in terms of new functionality that will need to be added, but also verification and correction of the existing functionality as I come to realise that I have made mistakes, these inevitably leading to new frustrations.

One area I previously identified for broadened support was that of process creation and the ability to allow programs to start other programs. This necessitated a review of the standard C process control functions, which are deliberately abstracted from the operating system and are much simpler and more constrained than those found in the unistd.h file that Unix programmers might be more familiar with. The traditional Unix functions are very much tied up with the Unix process model, and there are some arguments to be made that despite being “standard”, these functions are a distraction and, in various respects, undesirable from a software architecture perspective, for both applications and the operating systems that run them.

So, ignoring the idea that I might support the likes of execl, execv, fork, and so on, I returned to consideration of the much more limited system function that is part of the C language standards, this simply running an abstract command provided by a character string and returning a result code when the command has completed:

int system(const char *command);

To any casual application programmer, this all sounds completely reasonable: they embed a command in their code that is then presented to “the system”, which runs the commands and hands back a result or status code. But those of us who are accustomed to running commands at the shell and in our own programs might already be picking apart this simple situation.

First of all, “the system” needs to have what the C standards documentation calls a “command processor”. In fact, even Unix standardisation efforts have adopted the term, despite the Linux manual pages referring to “the shell”. But at this point, my own system does not have a shell or a command processor, instead providing only a process server that manages the creation of new processes. And my process server deals with arrays or “vectors” of strings that comprise a command to be used to run a given program, configured by a sequence of arguments or parameters.

Indeed, this brings us to some other matters that may be familiar to anyone who has had the need to run commands from within programs: that of parameterising command invocations by introducing our own command argument values; and that of making sure that the representation of the program name and its arguments do not cause the shell to misinterpret these elements, by letting an errant space character break the program name into two, for instance. When dealing only with command strings, matters of quoting and tokenisation enter the picture, making the exercise very messy indeed.

So, our common experience has provided us with a very good reason to follow the lead of the classic execv Unix function and to avoid the representational issues associated with command string processing. In this regard, the Python standard library has managed to show the way in some respects, introducing the subprocess module which features interfaces that are equivalent to functions like system and popen, supporting the use of both command strings and lists of command elements to represent the invoked command.

Oddly, however, nobody seems to provide a “vector” version of the system function at the C language level, but it seemed to be the most natural interface I might provide in my own system:

int systemv(int argc, const char *argv[]);

I imagine that those doing low-level process creation in a Unix-style environment would be content to use the exec family of functions, probably in conjunction with the fork function, precisely because a function like execv “shall replace the current process image with a new process image”, as the documentation states. Obviously, replacing the current process isn’t helpful when implementing the system function because it effectively terminates the calling program, whereas the system function is meant to allow the program to continue after command completion. So, fork has to get involved somehow.

The Flow of Convention

I get the impression that people venturing along a similar path to mine are often led down the trail of compatibility with the systems that have gone before, usually tempted by the idea that existing applications will eventually be content to run on their system without significant modification, and thus an implementer will be able to appeal to an established audience. In this case, the temptation is there to support the fork function, the exec semantics, and to go with the flow of convention. And sometimes, a technical obstacle seems like a challenge to be met, to show that an implementation can provide support for existing software if it needs or wants to.

Then again, having seen situations where software is weighed down by the extra complexity of features that people believe it should have, some temptations are best resisted, perhaps with a robust justification for leaving out any particular supposedly desirable feature. One of my valued correspondents pointed me to a paper by some researchers that provides a robust argument for excluding fork and for promoting alternatives. Those alternatives have their shortcomings, as noted in the paper, and they seem rather complicated when considering simple situations like merely creating a completely separate process and running a new program in it.

Of course, there may still be complexity in doing simple things. One troublesome area was that of what might happen to the input and output streams of a process that creates another one: should the new process be able to receive the input that has been sent to the creating process, and should it be able to send its output to the recipient of the creating process’s output? For something like system or systemv, the initial “obvious” answer might be the total isolation of the created process from any existing input, but this limits the usefulness of such functions. It should arguably be possible to invoke system or systemv within a program that is accepting input as part of a pipeline, and for a process created by these functions to assume the input processing role transparently.

Indeed, the Unix world’s standards documentation for system extends the C standard to assert that the system function should behave like a combination of fork and execl, invoking the shell utility, sh, to initiate the program indicated in the call to system. It all sounds a bit prescriptive, but I suppose that what it largely means is that the input and output streams should be passed to the initiated program. A less prescriptive standard might have said that, of course, but who knows what kind of vendor lobbying went on to avoid having to modify the behaviour of those vendors’ existing products?

This leads to the awkward problem of dealing with the state of an input stream when such a stream is passed to another process. If the creating process has already read part of a stream, we need the newly created process to be aware of the extent of consumed data so that it may only read unconsumed data itself. Similarly, the newly created process must be able to append output to the existing output stream instead of overwriting any data that has already been written. And when the created process terminates, we need the creating process to synchronise its own view of the input and output streams. Such exercises are troublesome but necessary to provide predictable behaviour at higher levels in the system.

Some Room for Improvement

Another function that deserves revisiting is the popen function which either employs a dedicated output stream to capture the output of a created process within a program, or a dedicated input stream so that a program can feed the process with data it has prepared. The mode indicates what kind of stream the function will provide: “r” yields an output stream passing data out of the process, “w” yields an input stream passing data into the process.

FILE *popen(const char *command, const char *mode);

This function is not in the C language standards but in Unix-related standards, but it is too useful to ignore. Like the system function, the standards documentation also defines this function in terms of fork and execl, with the shell getting involved again. Not entirely obvious from this documentation is what happens with the stream that isn’t specified, however, but we can conclude that with its talk of input and output filters, as well as the mention of those other functions, that if we request an output stream from the new process, the new process will acquire standard input from the creating process as its own input stream. Correspondingly, if we request an input stream to feed the new process, the new process will acquire standard output for itself and write output to that.

This poses some concurrency issues that the system function largely avoids. Since the system function blocks until the created process is completed, the state of the shared input and output streams can be controlled. But with popen, the created process runs concurrently and can interact with whichever stream it acquired from the creating process, just as the creating process might also be using it, at least until pclose is invoked to wait for the completion of the created process. The standards documentation and the Linux manual page both note such pitfalls, but the whole business seems less than satisfactory.

Again, the Python standard library shows what a better approach might be. Alongside the popen function, the popen2 function creates dedicated input and output pipes for interaction with the created process, the popen3 function adds an error pipe to the repertoire, and there is even a popen4 function that presumably does what some people might expect from popen2, merging the output and error streams into a single stream. Naturally, this was becoming a bit incoherent, and so the subprocess module was brought in to clean it all up.

Our own attempt at a cleaner approach might involve the following function:

pid_t popenv(int argc, const char *argv[], FILE **input, FILE **output, FILE **error);

Here, we want to invoke a program using a vector containing the program and arguments, just as we did before, but we also want to acquire the input, output and error streams. However, we might allow any of these to be specified as NULL, indicating that any such stream will not be opened for the created process. Since this might cause problems, we might need to create special “empty” or “null” streams, where appropriate, so as not to upset the C library.

Unlike popen, we might also provide the process identifier for the created process. This would allow us to monitor the process, control it in some way, and to wait for its completion. The nature of a process identifier is potentially more complicated than one might think, especially in my own system where there can be many process servers, each of them creating new processes without any regard to the others.

A Simpler Portable Environment Standard

Maybe I am just insufficiently aware of the historical precedents in this regard, but it seems that while C language standards are disappointingly tame when it comes to defining interaction with the host environment, the Unix or POSIX standardisation efforts go into too much detail and risk burdening any newly designed system with the baggage of systems that happened to be commercially significant at a particular point in time. Windows NT infamously feigned compliance with such standards to unlock the door to lucrative government contracts and to subvert public software procurement processes, generating colossal revenues that easily paid for any inconvenient compliance efforts. However, for everybody else, such standards seem to encumber system and application developers with obligations and practices that could be refined, improved and made more suitable for modern needs.

My own work depends on L4Re which makes extensive use of capabilities to provide access to entities within the system. For example, each process relies on a task that provides a private address space, within which code and data reside, along with an object space that retains the capabilities available within the task. Although the Fiasco (or L4Re) microkernel has some notion of all the tasks in the system, as well as all the threads, together with other kinds of objects, such global information is effectively private to the kernel, and “user space” programs merely deal with capabilities that reference specific objects. For such programs, there is no way to get some kind of universal list of tasks or threads, or to arbitrarily request control over any particular instances of them.

In systems with different characteristics to the ones we already know, we have to ask ourselves whether we want to reproduce legacy behaviour. To an extent, it might be desirable to have registers of resident processes and the ability to list the ones currently running in the system, introducing dedicated components to retain this information. Indeed, my process servers could quite easily enumerate and remember the details of processes they create, also providing an interface to query this register, maybe even an interface to control and terminate processes.

However, one must ask whether this is essential functionality or not. For now, the rudimentary shell-like environment I employ to test this work provides similar functionality to the job control features of the average Unix shell, remembering the processes created in this environment and offering control in a limited way over this particular section of the broader system.

And so the effort continues to try and build something a little different from, and perhaps a bit more flexible than, what we use today. Hopefully it is something that ends up being useful, too.

Some More Slow Progress

Sunday, April 7th, 2024

A couple of months have elapsed since my last, brief progress report on L4Re development, so I suppose a few words are required to summarise what I have done since. Distractions, travel, and other commitments notwithstanding, I did manage to push my software framework along a little, encountering frustrations and the occasional sensation of satisfaction along the way.

Supporting Real Hardware

Previously, I had managed to create a simple shell-like environment running within L4Re that could inspect an ext2-compatible filesystem, launch programs, and have those programs communicate with the shell – or each other – using pipes. Since I had also been updating my hardware support framework for L4Re on MIPS-based devices, I thought that it was time to face up to implementing support for memory cards – specifically, SD and microSD cards – so that I could access filesystems residing on such cards.

Although I had designed my software framework with things like disks and memory devices in mind, I had been apprehensive about actually tackling driver development for such devices, as well as about whether my conceptual model would prove too simple, necessitating more framework development just to achieve the apparently simple task of reading files. It turned out that the act of reading data, even when almost magical mechanisms like direct memory access (DMA) are used, is as straightforward as one could reasonably expect. I haven’t tested writing data yet, mostly because I am not that brave, but it should be essentially as straightforward as reading.

What was annoying and rather overcomplicated, however, was the way that memory cards have to be coaxed into cooperation, with the SD-related standards featuring layer upon layer of commands added every time they enhanced the technologies. Plenty of time was spent (or wasted) trying to get these commands to behave and to allow me to gradually approach the step where data would actually be transferred. In contrast, setting up DMA transactions was comparatively easy, particularly using my interactive hardware experimentation environment.

There were some memorable traps encountered in the exercise. One involved making sure that the interrupts signalling completed DMA transactions were delivered to the right thread. In L4Re, hardware interrupts are delivered via IRQ (interrupt request) objects to specific threads, and it is obviously important to make sure that a thread waiting for notifications (including interrupts) expects these notifications. Otherwise, they may cause a degree of confusion, which is what happened when a thread serving “blocks” of data to the filesystem components was presented with DMA interrupt occurrences. Obviously, the solution was to be more careful and to “bind” the interrupts to the thread interacting with the hardware.

Another trap involved the follow-on task of running programs that had been read from the memory card. In principle, this should have yielded few surprises: my testing environment involves QEMU and raw filesystem data being accessed in memory, and program execution was already working fine there. However, various odd exceptions were occurring when programs were starting up, forcing me to exercise the useful kernel debugging tool provided with the Fiasco.OC (or L4Re) microkernel.

Of course, the completely foreseeable problem involved caching: data loaded from the memory card was not yet available in the processor’s instruction cache, and so the processor was running code (or potentially something that might not have been code) that had been present in the cache. The problem tended to arise after a jump or branch in the code, executing instructions that did undesirable things to the values of the registers until something severe enough caused an exception. The solution, of course, was to make sure that the instruction cache was synchronised with the data cache containing the newly read data using the l4_cache_coherent function.

Replacing the C Library

With that, I could replicate my shell environment on “real hardware” which was fairly gratifying. But this only led to the next challenge: that of integrating my filesystem framework into programs in a more natural way. Until now, accessing files involved a special “filesystem client” library that largely mimics the normal C library functions for such activities, but the intention has always been to wrap these with the actual C library functions so that portable programs can be run. Ideally, there would be a way of getting the L4Re C library – an adapted version of uClibc – to use these client library functions.

A remarkable five years have passed since I last considered such matters. Back then, my investigations indicated that getting the L4Re library to interface to the filesystem framework might be an involved and cumbersome exercise due to the way the “backend” functionality is implemented. It seemed that the L4Re mechanism for using different kinds of filesystems involved programs dynamically linking to libraries that would perform the access operations on the filesystem, but I could not find much documentation for this framework, and I had the feeling that the framework was somewhat underdeveloped, anyway.

My previous investigations had led me to consider deploying an alternative C library within L4Re, with programs linking to this library instead of uClibc. C libraries generally come across as rather messy and incoherent things, accumulating lots of historical baggage as files are incorporated from different sources to support long-forgotten systems and architectures. The challenge was to find a library that could be conveniently adapted to accommodate a non-Unix-like system, with the emphasis on convenience precluding having to make changes to hundreds of files. Eventually, I chose Newlib because the breadth of its dependencies on the underlying system is rather narrow: a relatively small number of fundamental calls. In contrast, other C libraries assume a Unix-like system with countless, specialised system calls that would need to be reinterpreted and reframed in terms of my own library’s operations.

My previous effort had rather superficially demonstrated a proof of concept: linking programs to Newlib and performing fairly undemanding operations. This time round, I knew that my own framework had become more complicated, employed C++ in various places, and would create a lot of work if I were to decouple it from various L4Re packages, as I had done in my earlier proof of concept. I briefly considered and then rejected undertaking such extra work, instead deciding that I would simply dust off my modified Newlib sources, build my old test programs, and see which symbols were missing. I would then seek to reintroduce these symbols and hope that the existing L4Re code would be happy with my substitutions.

Supporting Threads

For the very simplest of programs, I was able to “stub” a few functions and get them to run. However, part of the sophistication of my framework in its current state is its use of threading to support various activities. For example, monitoring data streams from pipes and files involves a notification mechanism employing threads, and thus a dependency on the pthread library is introduced. Unfortunately, although Newlib does provide a similar pthread library to that featured in L4Re, it is not really done in a coherent fashion, and there is other pthread support present in Newlib that just adds to the confusion.

Initially, then, I decided to create “stub” implementations for the different functions used by various libraries in L4Re, like the standard C++ library whose concurrency facilities I use in my own code. I made a simple implementation of pthread_create, along with some support for mutexes. Running programs did exercise these functions and produce broadly expected results. Continuing along this path seemed like it might entail a lot of work, however, and in studying the existing pthread library in L4Re, I had noticed that although it resides within the “uclibc” package, it is somewhat decoupled from the C library itself.

Favouring laziness, I decided to see if I couldn’t make a somewhat independent package that might then be interfaced to Newlib. For the most part, this exercise involved introducing missing functions and lots of debugging, watching the initialisation of programs fail due to things like conflicts with capability allocation, perhaps due to something I am doing wrong, or perhaps exposing conditions that are fortuitously avoided in L4Re’s existing uClibc arrangement. Ultimately, I managed to get a program relying on threading to start, leaving me with the exercise of making sure that it was producing the expected output. This involved some double-checking of previous measures to allow programs using different C libraries to communicate certain kinds of structures without them misinterpreting the contents of those structures.

Further Work

There is plenty still to do in this effort. First of all, I need to rewrite the remaining test programs to use C library functions instead of client library functions, having done this for only a couple of them. Then, it would be nice to expand C library coverage to deal with other operations, particularly process creation since I spent quite some time getting that to work.

I need to review the way Newlib handles concurrency and determine what else I need to do to make everything work as it should in that regard. I am still using code from an older version of Newlib, so an update to a newer version might be sensible. In this latest round of C library evaluation, I briefly considered Picolibc which is derived from Newlib and other sources, but I didn’t fancy having to deal with its build system or to repackage the sources to work with the L4Re build system. I did much of the same with Newlib previously and, having worked through such annoyances, was largely able to focus on the actual code as opposed to the tooling.

Currently, I have been statically linking programs to Newlib, but I eventually want to dynamically link them. This does exercise different paths in the C and pthread libraries, but I also want to explore dynamic linking more broadly in my own environment, having already postponed such investigations from my work on getting programs to run. Introducing dynamic linking and shared libraries helps to reduce memory usage and increase the performance of a system when multiple programs need the same libraries.

There are also some reasonable arguments for making the existing L4Re pthread implementation more adaptable, consolidating my own changes to the code, and also for considering making or adopting other pthread implementations. Convenient support for multiple C library implementations, and for combining these with other libraries, would be desirable, too.

Much of the above has been a distraction from what I have been wanting to focus on, however. Had it been more apparent how to usefully extend uClibc, I might not have bothered looking at Newlib or other C libraries, and then I probably wouldn’t have looked into threading support. Although I have accumulated some knowledge in the process, and although some of that knowledge will eventually have proven useful, I cannot help feeling that L4Re, being a fairly mature product at this point and a considerable achievement, could be more readily extensible and accessible than it currently is.

A Tall Tale of Denied Glory

Monday, March 4th, 2024

I seem to be spending too much time looking into obscure tales from computing history, but continuing an earlier tangent from a recent article, noting the performance of different computer systems at the end of the 1980s and the start of the 1990s, I found myself evaluating one of those Internet rumours that probably started to do the rounds about thirty years ago. We will get to that rumour – a tall tale, indeed – in a moment. But first, a chart that I posted in an earlier article:

Performance evolution of the Archimedes and various competitors

Performance evolution of the Archimedes and various competitors

As this nice chart indicates, comparing processor performance in computers from Acorn, Apple, Commodore and Compaq, different processor families bestowed a competitive advantage on particular systems at various points in time. For a while, Acorn’s ARM2 processor gave Acorn’s Archimedes range the edge over much more expensive systems using the Intel 80386, showcased in Compaq’s top-of-the-line models, as well as offerings from Apple and Commodore, these relying on Motorola’s 68000 family. One can, in fact, claim that a comparison between ARM-based systems and 80386-based systems would have been unfair to Acorn: more similarly priced systems from PC-compatible vendors would have used the much slower 80286, making the impact of the ARM2 even more remarkable.

Something might be said about the evolution of these processor families, what happened after 1993, and the introduction of later products. Such topics are difficult to address adequately for a number of reasons, principally the absence of appropriate benchmark results and the evolution of benchmarking to more accurately reflect system performance. Acorn never published SPEC benchmark figures, nor did ARM (at the time, at least), and any given benchmark as an approximation to “real-world” computing activities inevitably drifts away from being an accurate approximation as computer system architecture evolves.

However, in another chart I made to cover Acorn’s Unix-based RISC iX workstations, we can consider another range of competitors and quite a different situation. (This chart also shows off the nice labelling support in gnuplot that wasn’t possible with the currently disabled MediaWiki graph extension.)

Performance of the Acorn R-series and various competitors in approximate chronological order of introduction

Performance of the Acorn R-series and various competitors in approximate chronological order of introduction: a chart produced by gnuplot and converted from SVG to PNG for Wikipedia usage.

Now, this chart only takes us from 1989 until 1992, which will not satisfy anyone wondering what happened next in the processor wars. But it shows the limits of Acorn’s ability to enter the lucrative Unix workstation market with a processor that was perceived to be rather fast in the world of personal computers. Acorn’s R140 used the same ARM2 processor introduced in the Archimedes range, but even at launch this workstation proved to be considerably slower than somewhat more expensive workstation models from Digital and Sun employing MIPS and SPARC processors respectively.

Fortunately for Acorn, adding a cache to the ARM2 (plus a few other things) to make the ARM3 unlocked a considerable boost in performance. Although the efficient utilisation of available memory bandwidth had apparently been a virtue for the ARM designers, coupling the processor to memory performance had put a severe limit on overall performance. Meanwhile, the designers of the MIPS and SPARC processor families had started out with a different perspective and had considered cache memory almost essential in the kind of computer architectures that would be using these processors.

Acorn didn’t make another Unix workstation after the R260, released in 1990, for reasons that could be explored in depth another time. One of them, however, was that ARM processor design had been spun out to a separate company, ARM Limited, and appeared to be stalling in terms of delivering performance improvements at the same rate as previously, or indeed at the same rate as other processor families. Acorn did introduce the ARM610 belatedly in 1994 in its Risc PC, which would have been more amenable to running Unix, but by then the company was arguably beginning the process of unravelling for another set of reasons to be explored another time.

So, That Tall Tale

It is against this backdrop of competitive considerations that I now bring you the tall tale to which I referred. Having been reminded of the Atari Transputer Workstation by a video about the Transputer – another fascinating topic and thus another rabbit hole to explore – I found myself investigating Atari’s other workstation product: a Unix workstation based on the Motorola 68030 known as the Atari TT030 or TT/X, augmenting the general Atari TT product with the Unix System V operating system.

On the chart above, a 68030-based system would sit at a similar performance level to Acorn’s R140, so ignoring aspirational sentiments about “high-end” performance and concentrating on a price of around $3000 (with a Unix licence probably adding to that figure), there were some hopes that Atari’s product would reach a broad audience:

As a UNIX platform, the affordable TT030 may leapfrog machines from IBM, Apple, NeXT, and Sun, as the best choice for mass installation of UNIX systems in these environments.

As it turned out, Atari released the TT without Unix in 1990 and only eventually shipped a Unix implementation in around 1992, discontinuing the endeavour not long afterwards. But the tall tale is not about Atari: it is about their rivals at Commodore and some bizarre claims that seem to have drifted around the Internet for thirty years.

Like Atari and Acorn, Commodore also had designs on the Unix workstation market. And like Atari, Commodore had a range of microcomputers, the Amiga series, based on the 68000 processor family. So, the natural progression for Commodore was to design a model of the Amiga to run Unix, eventually giving us the Amiga 3000UX, priced from around $5000, running an implementation of Unix System V Release 4 branded as “Amiga Unix”.

Reactions from the workstation market were initially enthusiastic but later somewhat tepid. Commodore’s product, although delivered in a much more timely fashion than Atari’s, will also have found itself sitting at a similar performance level to Acorn’s R140 but positioned chronologically amongst the group including Acorn’s much faster R260 and the 80486-based models. It goes without saying that Atari’s eventual product would have been surrounded by performance giants by the time customers could run Unix on it, demonstrating the need to bring products to market on time.

So what is this tall tale, then? Well, it revolves around this not entirely coherent remark, entered by some random person twenty-one years ago on the emerging resource known as Wikipedia:

The Amiga A3000UX model even got the attention of Sun Microsystems, but unfortunately Commodore did not jump at the A3000UX.

If you search the Web for this, including the Internet Archive, the most you will learn is that Sun Microsystems were supposedly interested in adopting the Amiga 3000UX as a low-cost workstation. But the basis of every report of this supposed interest always seems to involve “talk” about a “deal” and possibly “interest” from unspecified “people” at Sun Microsystems. And, of course, the lack of any eventual deal is often blamed on Commodore’s management and perennial villain of the Amiga scene…

There were talks of Sun Microsystems selling Amiga Unix machines (the prototype Amiga 3500) as a low-end Unix workstations under their brand, making Commodore their OEM manufacturer. This deal was let down by Commodore’s Mehdi Ali, not once but twice and finally Sun gave up their interest.

Of course, back in 2003, anything went on Wikipedia. People thought “I know this!” or “I heard something about this!”, clicked the edit link, and scrawled away, leaving people to tidy up the mess two decades later. So, I assume that this tall tale is just the usual enthusiast community phenomenon of believing that a favourite product could really have been a contender, that legitimacy could have been bestowed on their platform, and that their favourite company could have regained some of its faded glory. Similar things happened as Acorn went into decline, too.

Picking It All Apart

When such tales appeal to both intuition and even-handed consideration, they tend to retain a veneer of credibility: of being plausible and therefore possibly true. I cannot really say whether the tale is actually true, only that there is no credible evidence of it being true. However, it is still worth evaluating the details within such tales on their merits and determine whether the substance really sounds particularly likely at all.

So, why would Sun Microsystems be interested in a Commodore workstation product? Here, it helps to review Sun’s own product range during the 1980s, to note that Sun had based its original workstation on the Motorola 68000 and had eventually worked up the 68000 family to the 68030 in its Sun-3 products. Indeed, the final Sun-3 products were launched in 1989, not too long before the Amiga 3000UX came to market. But the crucial word in the previous sentence is “final”: Sun had adopted the SPARC processor family and had started introducing SPARC-based models two years previously. Like other workstation vendors, Sun had started to abandon Motorola’s processors, seeking better performance elsewhere.

A June 1989 review in Personal Workstation magazine is informative, featuring the 68030-based Sun 3/80 workstation alongside Sun’s SPARCstation 1. For diskless machines, the Sun 3/80 came in at around $6000 whereas the SPARCstation 1 came in at around $9000. For that extra $3000, the buyer was probably getting around four times the performance, and it was quite an incentive for Sun’s customers and developers to migrate to SPARC on that basis alone. But even for customers holding on to their older machines and wanting to augment their collection with some newer models, Sun was offering something not far off the “low-cost” price of an Amiga 3000UX with hardware that was probably more optimised for the role.

Sun will have supported customers using these Sun-3 models for as long as support for SunOS was available, eventually introducing Solaris which dropped support for the 68000 family architecture entirely. Just like other Unix hardware vendors, once a transition to various RISC architectures had been embarked upon, there was little enthusiasm for going back and retooling to support the Motorola architecture again. And, after years resisting, even Motorola was embracing RISC with its 88000 architecture, tempting companies like NeXT and Apple to consider trading up from the 68000 family: an adventure that deserves its own treatment, too.

So, under what circumstances would Sun have seriously considered adopting Commodore’s product? On the face of it, the potential compatibility sounds enticing, and Commodore will have undoubtedly asserted that they had experience at producing low-cost machines in volume, appealing to Sun’s estimate, expressed in the Personal Workstation review, that the customer base for a low-cost workstation would double for every $1000 drop in price. And surely Sun would have been eager to close the doors on manufacturing a product line that was going to be phased out sooner or later, so why not let Commodore keep making low-cost models to satisfy existing customers?

First of all, we might well doubt any claims to be able to produce workstations significantly cheaper than those already available. The Amiga 3000UX was, as noted, only $1000 or so cheaper than the Sun 3/80. Admittedly, it had a hard drive as standard, making the comparison slightly unfair, but then the Sun 3/80 was around already in 1989, meaning that to be fair to that product, we would need to see how far its pricing will have fallen by the time the Amiga 3000UX became available. Commodore certainly had experience in shipping large volumes of relatively inexpensive computers like the Amiga 500, but they were not shipping workstation-class machines in large quantities, and the eventual price of the Amiga 3000UX indicates that such arguments about volume do not automatically confer low cost onto more expensive products.

Even if we imagine that the Amiga 3000UX had been successfully cost-reduced and made more competitive, we then need to ask what benefits there would have been for the customer, for developers, and for Sun in selling such a product. It seems plausible to imagine customers with substantial investments in software that only ran on Sun’s older machines, who might have needed newer, compatible hardware to keep that software running. Perhaps, in such cases, the suppliers of such software were not interested or capable of porting the software to the SPARC processor family. Those customers might have kept buying machines to replace old ones or to increase the number of “seats” in their environment.

But then again, we could imagine that such customers, having multiple machines and presumably having them networked together, could have benefited from augmenting their old Motorola machines with new SPARC ones, potentially allowing the SPARC machines to run a suitable desktop environment and to use the old applications over the network. In such a scenario, the faster SPARC machines would have been far preferable as workstations, and with the emergence of the X Window System, a still lower-cost alternative would have been to acquire X terminals instead.

We might also question how many software developers would have been willing to abandon their users on an old architecture when it had been clear for some time that Sun would be transitioning to SPARC. Indeed, by producing versions of the same operating system for both architectures, one can argue that Sun was making it relatively straightforward for software vendors to prepare for future products and the eventual deprecation of their old products. Moreover, given the performance benefits of Sun’s newer hardware, developers might well have been eager to complete their own transition to SPARC and to entice customers to follow rapidly, if such enticement was even necessary.

Consequently, if there were customers stuck on Sun’s older hardware running applications that had been effectively abandoned, one could be left wondering what the scale of the commercial opportunity was in selling those customers more of the same. From a purely cynical perspective, given the idiosyncracies of Sun’s software platform from time to time, it is quite possible that such customers would have struggled to migrate to another 68000 family Unix platform. And even without such portability issues and with the chance of running binaries on a competing Unix, the departure of many workstation vendors to other architectures may have left relatively few appealing options. The most palatable outcome might have been to migrate to other applications instead and to then look at the hardware situation with fresh eyes.

And we keep needing to return to that matter of performance. A 68030-based machine was arguably unappealing, like 80386-based systems, clearing the bar for workstation computing but not by much. If the cost of such a machine could have been reduced to an absurdly low price point then one could have argued that it might have provided an accessible entry point for users into a vendor’s “ecosystem”. Indeed, I think that companies like Commodore and Acorn should have put Unix-like technology in their low-end products, harmonising them with higher-end products actually running Unix, and having their customers gradually migrate as more powerful computers became cheaper.

But for workstations running what one commentator called “wedding-cake configurations” of the X Window System, graphical user interface toolkits, and applications, processors like the 68030, 80386 and ARM2 were going to provide a disappointing experience whatever the price. Meanwhile, Sun’s existing workstations were a mature product with established peripherals and accessories. Any cost-reduced workstation would have been something distinct from those existing products, impaired in performance terms and yet unable to make use of things like graphics accelerators which might have made the experience tolerable.

That then raises the question of the availability of the 68040. Could Commodore have boosted the Amiga 3000UX with that processor, bringing it up to speed with the likes of the ARM3-based R260 and 80486-based products, along with the venerable MIPS R2000 and early SPARC processors? Here, we can certainly answer in the affirmative, but then we must ask what this would have done to the price. The 68040 was a new product, arriving during 1990, and although competitively priced relative to the SPARC and 80486, it was still quoted at around $800 per unit, featuring in Apple’s Macintosh range in models that initially, in 1991, cost over $5000. Such a cost increase would have made it hard to drive down the system price.

In the chart above, the HP 9000/425t represents possibly the peak of 68040 workstation performance – “a formidable entry-level system” – costing upwards of $9000. But as workstation performance progressed, represented by new generations of DECstations and SPARCstations, the 68040 stalled, unable to be clocked significantly faster or otherwise see its performance scaled up. Prominent users such as Apple jumped ship and adopted PowerPC along with Motorola themselves! Motorola returned to the architecture after abandoning further development of the 88000 architecture, delivering the 68060 before finally consigning the architecture to the embedded realm.

In the end, even if a competitively priced and competitively performing workstation had been deliverable by Commodore, would it have been in Sun’s interests to sell it? Compatibility with older software might have demanded the continued development of SunOS and the extension of support for older software technologies. SunOS might have needed porting to Commodore’s hardware, or if Sun were content to allow Commodore to add any necessary provision to its own Unix implementation, then porting of those special Sun technologies would have been required. One can question whether the customer experience would have been satisfactory in either case. And for Sun, the burden of prolonging the lifespan of products that were no longer the focus of the company might have made the exercise rather unattractive.

Companies can always choose for themselves how much support they might extend to their different ranges of products. Hewlett-Packard maintained several lines of workstation products and continued to maintain a line of 68030 and 68040 workstations even after introducing their own PA-RISC processor architecture. After acquiring Apollo Computer, who had also begun to transition to their own RISC architecture from the 68000 family, HP arguably had an obligation to Apollo’s customers and thus renewed their commitment to the Motorola architecture, particularly since Apollo’s own RISC architecture, PRISM, was shelved by HP in favour of PA-RISC.

It is perhaps in the adoption of Sun technology that we might establish the essence of this tale. Amiga Unix was provided with Sun’s OPEN LOOK graphical user interface, and this might have given people reason to believe that there was some kind of deeper alliance. In fact, the alliance was really between Sun and AT&T, attempting to define Unix standards and enlisting the support of Unix suppliers. In seeking to adhere most closely to what could be regarded as traditional Unix – that defined by its originator, AT&T – Commodore may well have been picking technologies that also happened to be developed by Sun.

This tale rests on the assumption that Sun was not able to drive down the prices of its own workstations and that Commodore was needed to lead the way. Yet workstation prices were already being driven down by competition. Already by May 1990, Sun had announced the diskless SPARCstation SPC at the magic $5000 price point, although its lowest-cost colour workstation was reportedly the SPARCstation IPC at a much more substantial $10000. Nevertheless, its competitors were quite able to demonstrate colour workstations at reasonable prices, and eventually Sun followed their lead. Meanwhile, the Amiga 3000UX cost almost $8000 when coupled with a colour monitor.

With such talk of commodity hardware, it must not be forgotten that Sun was not without other options. For example, the company had already delivered SunOS on the Sun386i workstation in 1988. Although rather expensive, costing $10000, and not exactly a generic PC clone, it did support PC architecture standards. This arguably showed the way if the company were to target a genuine commodity hardware platform, and eventually Sun followed this path when making its Solaris operating system available for the Intel x86 architecture. But had Sun had a desperate urge to target commodity hardware back in 1990, partnering with a PC clone manufacturer would have been a more viable option than repurposing an Amiga model. That clone manufacturer could have been Commodore, too, but other choices would have been more convincing.

Conclusions and Reflections

What can we make of all of this? An idle assertion with a veneer of plausibility and a hint of glory denied through the notoriously poor business practices of the usual suspects. Well, we can obviously see that nothing is ever as simple as it might seem, particularly if we indulge every last argument and pursue every last avenue of consideration. And yet, the matter of Commodore making a Unix workstation and Sun Microsystems being “interested in rebadging the A3000UX” might be as simple as imagining a rather short meeting where Commodore representatives present this opportunity and Sun’s representatives firmly but politely respond that the door has been closed on a product range not long for retirement. Thanks but no thanks. The industry has moved on. Did you not get that memo?

Given that there is the essence of a good story in all of this, I consulted what might be the first port of call for Commodore stories: David Pleasance’s book, “Commodore The Inside Story”. Sadly, I can find no trace of any such interaction, with Unix references relating to a much earlier era and Commodore’s Z8000-based Unix machine, the unreleased Commodore 900. Yet, had such a bungled deal occurred, I am fairly sure that this book would lay out the fiasco in plenty of detail. Even Dave Haynie’s chapter, which covers development of the Amiga 3000 and subsequent projects, fails to mention any such dealings. Perhaps the catalogue of mishaps at Commodore is so extensive that a lucrative agreement with one of the most prominent corporations in 1990s computing does not merit a mention.

Interestingly, the idea of a low-cost but relatively low-performance 68030-based workstation from a major Unix workstation vendor did arrive in 1989 in the form of the Apollo DN2500, costing $4000, from Hewlett-Packard. Later on, Commodore would apparently collaborate with HP on chipset development, with this being curtailed by Commodore’s bankruptcy. Commodore were finally moving off the 68000 family architecture themselves, all rather too late to turn their fortunes around. Did Sun need a competitive 68040-based workstation? Although HP’s 9000/425 range was amongst the top sellers, Sun was doing nicely enough with its SPARC-based products, shipping over twice as many workstations as HP.

While I consider this tall tale to be nothing more than folklore, like the reminiscences of football supporters whose team always had a shot at promotion to the bigger league every season, “not once but twice” has a specificity that either suggests a kernel of truth or is a clever embellishment to sustain a group’s collective belief in something that never was. Should anyone know the real story, please point us to the documentation. Or, if there never was any paper trail but you happened to be there, please write it up and let us all know. But please don’t just go onto Wikipedia and scrawl it in the tradition of “I know this!”

For the record, I did look around to see if anyone recorded such corporate interactions on Sun’s side. That yielded no evidence, but I did find something else that was rather intriguing: hints that Sun may have been advised to try and acquire Acorn or ARM. Nothing came from that, of course, but at least this is documentation of an interaction in the corporate world. Of stories about something that never happened, it might also be a more interesting one than the Commodore workstation that Sun never got to rebadge.

Update: I did find a mention of Sun Microsystems and Unix International featuring the Amiga 3000UX on their exhibition stands at the Uniforum conference in early 1991. As noted above, Sun had an interest in promoting adoption of OPEN LOOK, and Unix International – the Sun/AT&T initiative to define Unix standards – had an interest in promoting System V Release 4 and, to an extent, OPEN LOOK. So, while the model may have “even got the attention of Sun Microsystems”, it was probably just a nice way of demonstrating vendor endorsement of Sun’s technology from a vendor who admitted that what it could offer was not “competitive with Sun” and what it had to offer.

How to deal with Wikipedia’s broken graphs and charts by avoiding Web technology escalation

Thursday, February 15th, 2024

Almost a year ago, a huge number of graphs and charts on Wikipedia became unviewable because a security issue had been identified in the underlying JavaScript libraries employed by the MediaWiki Graph extension, necessitating this extension’s deactivation. Since then, much effort has been expended formulating a strategy to deal with the problem, although it does not appear to have brought about any kind of workaround, let alone a solution.

The Graph extension provided a convenient way of embedding data into a MediaWiki page that would then be presented as, say, a bar chart. Since it is currently disabled on Wikipedia, the documentation fails to show what these charts looked like, but they were fairly basic, clean and not unattractive. Fortunately, the Internet Archive has a record of older Wikipedia articles, such as one relevant to this topic, and it is able to show such charts from the period before the big switch-off:

Performance evolution of the Archimedes and various competitors

Performance evolution of the Archimedes and various competitors: a chart produced by the Graph extension

The syntax for describing a chart suffered somewhat from following the style that these kinds of extensions tend to have, but it was largely tolerable. Here is an example:

{{Image frame
 | caption=Performance evolution of the Archimedes and various competitors
 | content = {{Graph:Chart
 | width=400
 | xAxisTitle=Year
 | yAxisTitle=VAX MIPS
 | legend=Product and CPU family
 | type=rect
 | x=1987,1988,1989,1990,1991,1992,1993
 | y1=2.8,2.8,2.8,10.5,13.8,13.8,15.0
 | y2=0.5,1.4,2.8,3.6,3.6,22.2,23.3
 | y3=2.1,3.4,6.6,14.7,19.2,30,40.3
 | y4=1.6,2.1,3.3,6.1,8.3,10.6,13.1
 | y1Title=Archimedes (ARM2, ARM3)
 | y2Title=Amiga (68000, 68020, 68030, 68040)
 | y3Title=Compaq Deskpro (80386, 80486, Pentium)
 | y4Title=Macintosh II, Quadra/Centris (68020, 68030, 68040)
}}
}}

Unfortunately, rendering this data as a collection of bars on two axes relied on a library doing all kinds of potentially amazing but largely superfluous things. And, of course, this introduced the aforementioned security issue that saw the whole facility get switched off.

After a couple of months, I decided that I wasn’t going to see my own contributions diminished by a lack of any kind of remedy, and so I did the sensible thing: use an established tool to generate charts, and upload the charts plus source data and script to Wikimedia Commons, linking the chart from the affected articles. The established tool of choice for this exercise was gnuplot.

Migrating the data was straightforward and simply involved putting the data into a simpler format. Here is an excerpt of the data file needed by gnuplot, with some items updated from the version shown above:

# Performance evolution of the Archimedes and various competitors (VAX MIPS by year)
#
Year    "Archimedes (ARM2, ARM3)" "Amiga (68000, 68020, 68030, 68040)" "Compaq Deskpro (80386, 80486, Pentium)" "Mac II, Quadra/Centris (68020, 68030, 68040)"
1987    2.8     0.5     2.1     1.6
1988    2.8     1.5     3.5     2.1
1989    2.8     3.0     6.6     3.3
1990    10.5    3.6     14.7    6.1
1991    13.8    3.6     19.2    8.3
1992    13.8    18.7    28.5    10.6
1993    15.1    21.6    40.3    13.1

Since gnuplot is more flexible and more capable in parsing data files, we get the opportunity to tabulate the data in a more readable way, also adding some commentary without it becoming messy. I have left out the copious comments in the actual source data file to avoid cluttering this article.

And gnuplot needs a script, requiring a little familiarisation with its script syntax. We can see that various options are required, along with axis information and some tweaks to the eventual appearance:

set terminal svg enhanced size 1280 960 font "DejaVu Sans,24"
set output 'Archimedes_performance.svg'
set title "Performance evolution of the Archimedes and various competitors"
set xlabel "Year"
set ylabel "VAX MIPS"
set yrange [0:*]
set style data histogram
set style histogram cluster gap 1
set style fill solid border -1
set key top left reverse Left
set boxwidth 0.8
set xtics scale 0
plot 'Archimedes_performance.dat' using 2:xtic(1) ti col linecolor rgb "#0080FF", '' u 3 ti col linecolor rgb "#FF8000", '' u 4 ti col linecolor rgb "#80FF80", '' u 5 ti col linecolor rgb "#FF80FF"

The result is a nice SVG file that, when uploaded to Wikimedia Commons, will be converted to other formats for inclusion in Wikipedia articles. The file can then be augmented with the data and the script in a manner that is not entirely elegant, but the result allows people to inspect the inputs and to reproduce the chart themselves. Here is the PNG file that the automation produces for embedding in Wikipedia articles:

Performance evolution of the Archimedes and various competitors

Performance evolution of the Archimedes and various competitors: a chart produced by gnuplot and converted from SVG to PNG for Wikipedia usage.

Embedding the chart in a Wikipedia article is as simple as embedding the SVG file, specifying formatting properties appropriate to the context within the article:

[[File:Archimedes performance.svg|thumb|upright=2|Performance evolution of the Archimedes and various competitors]]

The control that gnuplot provides over the appearance is far superior to that of the Graph extension, meaning that the legend in the above figure could be positioned more conveniently, for instance, and there is a helpful gallery of examples that make familiarisation and experimentation with gnuplot more accessible. So I felt rather happy and also vindicated in migrating my charts to gnuplot despite the need to invest a bit of time in the effort.

While there may be people who need the fancy JavaScript-enabled features of the currently deactivated Graph extension in their graphs and charts on Wikipedia, I suspect that many people do not. For that audience, I highly recommend migrating to gnuplot and thereby eliminating dependencies on technologies that are simply unnecessary for the application.

It would be absurd to suggest riding in a spaceship every time we wished to go to the corner shop, knowing full well that more mundane mobility techniques would suffice. Maybe we should adopt similar, proportionate measures of technology adoption and usage in other areas, if only to avoid the inconvenience of seeing solutions being withdrawn for prolonged periods without any form of relief. Perhaps, in many cases, it would be best to leave the spaceship in its hangar after all.

How does the saying go, again?

Monday, February 12th, 2024

If you find yourself in a hole, stop digging? It wasn’t hard to be reminded of that when reading an assertion that a “competitive” Web browser engine needs funding to the tune of at least $100 million a year, presumably on development costs, and “really” $200-300 million.

Web browsers have come a long way since their inception. But they now feature absurdly complicated layout engines, all so that the elements on the screen can be re-jigged at a moment’s notice to adapt to arbitrary changes in the content, and yet they still fail to provide the kind of vanity publishing visuals that many Web designers seem to strive for, ceding that territory to things like PDFs (which, of course, generally provide static content). All along, the means of specifying layout either involves the supposedly elegant but hideously overcomplicated CSS, or to have scripts galore doing all the work, presumably all pounding the CPU as they do so.

So, we might legitimately wonder whether the “modern Web” is another example of technology for technology’s sake: an effort fuelled by Valley optimism and dubiously earned money that not only undermines interoperability and choice by driving out implementers who are not backed by obscene wealth, but also promotes wastefulness in needing ever more powerful systems to host ever more complicated browsers. Meanwhile, the user experience is constantly degraded: now you, the user, get to indicate whether hundreds of data surveillance companies should be allowed to track your activities under the laughable pretense of “legitimate interest”.

It is entirely justified to ask whether the constant technological churn is giving users any significant benefits or whether they could be using less sophisticated software to achieve the same results. In recent times, I have had to use the UK Government’s Web portal to initiate various processes, and one might be surprised to learn that it provides a clear, clean and generally coherent user experience. Naturally, it could be claimed that such nicely presented pages make good use of the facilities that CSS and the Web platform have to offer, but I think that it provides us with a glimpse into a parallel reality where “less” actually does deliver “more”, because reduced technological complication allows society to focus on matters of more pressing concern.

Having potentially hundreds or thousands of developers beavering away on raising the barrier to entry for delivering online applications is surely another example of how our societies’ priorities can be led astray by self-serving economic interests. We should be able to interact with online services using far simpler technology running on far more frugal devices than multi-core systems with multiple gigabytes of RAM. People used things like Minitel for a lot of the things people are doing today, for heaven’s sake. If you had told systems developers forty years ago that, in the future, instead of just connecting to a service and interacting with it, you would end up connecting to dozens of different services (Google, Facebook, random “adtech” platforms running on dark money) to let them record your habits, siphon off data, and sell you things you don’t want, they would probably have laughed in your face. We were supposed to be living on the Moon by now, were we not?

The modern Web apologist would, of course, insist that the modern browser offers so much more: video, for instance. I was reminded of this a few years ago when visiting the Oslo Airport Express Web site which, at that time, had a pointless video of the train rolling into the station behind the user interface controls, making my browser run rather slowly indeed. As an undergraduate, our group project was to design and implement a railway timetable querying system. On one occasion, our group meeting focusing on the user interface slid, as usual, into unfocused banter where one participant helpfully suggested that behind the primary user interface controls there would have to be “dancing ladies”. To which our only female group member objected, insisting that “dancing men” would also have to be an option. The discussion developed, acknowledging that a choice of dancers would first need to be offered, along with other considerations of the user demographic, even before asking the user anything about their rail journey.

Well, is that not where we are now? But instead of being asked personal questions, a bunch of voyeurs have been watching your every move online and have already deduced the answers to those questions and others. Then, a useless video and random developer excess drains away your computer’s interactivity as you type into treacle, trying to get a sensible result from a potentially unhelpful and otherwise underdeveloped service. How is that hole coming along, again?

Slow but Gradual L4Re Progress

Friday, January 26th, 2024

It seems a bit self-indulgent to write up some of the things I have been doing lately, but I suppose it helps to keep track of progress since the start of the year. Having taken some time off, it took a while to get back into the routine, familiarise myself with my L4Re efforts, and to actually achieve something.

The Dry, Low-Level Review of Mistakes Made

Two things conspired to obstruct progress for a while, both related to the way I handle interprocess communication (IPC) in L4Re. As I may have mentioned before, I don’t use the L4Re framework’s own IPC libraries because I find them either opaque or cumbersome. However, that puts the burden on me to get my own libraries and tools right, which I failed to do. The offending area of functionality was that of message items which are used to communicate object capabilities and to map memory between tasks.

One obstacle involved memory mapping. Since I had evolved my own libraries gradually as my own understanding evolved, I had decided to allocate a capability for every item received in a message. Unfortunately, when I introduced my own program execution mechanism, when one of the components (the region mapper) would be making its own requests for memory, I had overlooked that it would be receiving flexpages – an abstraction for mapped memory – and would not need to allocate a capability for each such item received. So, very quickly, the number of capabilities became exhausted for that component. The fix for this was fairly straightforward: just don’t allocate new capabilities in cases where flexpages are to be received.

The other obstacle involved the assignment of received message items. When a thread receives items, it should have declared how they should be assigned to capabilities by putting capability indexes into what are known as buffer registers (although they are really just an array in memory, in practice). A message transmitting items will cause the kernel to associate those items with the declared capability indexes, and then the receiving thread will itself record the capability indexes for its own purposes. What I had overlooked was that if, say, two items might be expected but if the first of these is “void” or effectively not transmitting a capability, the kernel does not skip the index in the buffer register that might be associated with that expected capability. Instead, it assigns that index to the next valid or non-void capability in the message.

Since my code had assumed that the kernel would associate declared capability indexes with items based on their positions in messages, I was discovering that my programs’ view of the capability assignments differed from that of the kernel, and so operations on the capabilities they believed to be valid were failing. The fix for this was also fairly straightforward: consume declared capability indexes in order, not skipping any of them, regardless of which items in the message eventually get associated with them.

Some Slightly More Tangible Results

After fixing things up, I started to make a bit more progress. I had wanted to take advantage of a bit more interactivity when testing the software, learning from experiences developing low-level software for various single-board computers. I also wanted to get programs to communicate via pipes. Previously, I had managed to get them to use an output pipe instead of just outputting to the console via the “log” capability, but now I also wanted to be able to present those pipes to other programs as those programs’ input pipes.

Getting programs to use pipes would allow them to be used to process, inspect and validate the output of other programs, hopefully helping with testing and validation of program behaviour. I already had a test program that was able to execute operations on the filesystem, and so it seemed like a reasonable idea to extend this to allow it to be able to run programs from the filesystem, too. Once I solved some of the problems I had previously created for myself, this test program started to behave a bit more like a shell.

The following potentially confusing transcript shows a program being launched to show the contents of a text file. Here, I have borrowed a command name from VMS – an operating system I probably used only a handful of times in the early 1990s – although “spawn” is a pretty generic term, widely used in a similar sense throughout modern computing. The output of the program is piped to another program whose role is to “clip” a collection of lines from a file or, as is the case here, an input stream and to send those lines to its output pipe. Waiting for this program to complete yields the extracted lines.

> spawn bin/cat home/paulb/LICENCE.txt
[0]+ bin/cat [!]
> pipe + bin/clip - 5 5
> jobs
[0]  bin/cat
[1]+ bin/clip [!]
> wait 1
Completed with signal 0 value 0
 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble
Completed with signal 0 value 0
> jobs
>

Obviously, this is very rudimentary, but it should be somewhat useful for testing. I don’t want to get into writing an actual shell because this would be a huge task in itself, apparent when considering the operation of the commands illustrated above. The aim will be to port a shell once the underlying library functionality is mature enough. Still, I think it would be an amusing and a tantalising prospect to define one’s own shell environment.

Revisiting L4Re System Development Efforts

Thursday, December 14th, 2023

I had been meaning to return to my investigations into L4Re, running programs in a configurable environment, and trying to evolve some kind of minimal computing environment, but other efforts and obligations intervened and rather delayed such plans. Some of those other efforts had been informative in their own way, though, giving me a bit more confidence that I might one day get to where I want to be with all of this.

For example, experimenting with various hardware devices had involved writing an interactive program that allows inspection of the low-level hardware configuration. Booting straight after U-Boot, which itself provides a level of interactive support for inspecting the state of the hardware, this program (unlike a weighty Linux payload) facilitates a fairly rapid, iterative process of developing and testing device driver routines. I had believed that such interactivity via the text console was more limited in L4Re, and so this opens up some useful possibilities.

But as for my previous work paging in filesystem content and running programs from the filesystem, it had been deferred to a later point in time with fewer distractions and potentially a bit more motivation on my part, particularly since it can take a while to be fully reacquainted with a piece of work with lots of little details that are easily forgotten. Fortuitously, this later moment in time arrived in conjunction with an e-mail I received asking about some of the mechanisms in L4Re involved with precisely the kinds of activities I had been investigating.

Now, I personally do not regard myself as any kind of expert on L4Re and its peculiarities: after a few years of tinkering, I still feel like I am discovering new aspects of the software and its design, encountering its limitations in forms that may be understandable, excusable, both, or neither of these things. So, I doubt that I am any kind of expert, particularly as I feel like I am muddling along trying to implement something sensible myself.

However, I do appreciate that I am possibly the only person publicly describing work of this nature involving L4Re, which is quite unfortunate from a technology adoption perspective. It may not matter one bit to those writing code for and around L4Re professionally whether anyone talks about the technology publicly, and there may be plenty of money to be made conducting business as usual for such matters to be of any concern whatsoever, but history suggests that technologies have better chances of success (and even survival) if they are grounded in a broader public awareness.

So, I took a bit of time trying to make sense out of what I already did, this work being conducted most intensively earlier in the year, and tried to summarise it in a coherent fashion. Hopefully, there were a few things of relevance in that summary that benefited my correspondent and their own activities. In any case, I welcome any opportunity to constructively discuss my work, because it often gives me a certain impetus to return to it and an element of motivation in knowing that it might have some value to others.

I am grateful to my correspondent for initiating this exercise as it required me to familiarise myself with many of the different aspects of my past efforts, helping me to largely pick up where I had left off. In that respect, I had pretty much reached a point of demonstrating the launching of new programs, and at the time I had wanted to declare some kind of success before parking the work for a later time. However, I knew that some tidying up would be required in some areas, and there were some features that I had wanted to introduce, but I had felt that more time and energy needed to be accumulated before facing down the implementation of those features.

The first feature I had in mind was that of plumbing programs or processes together using pipes. Since I want to improve testing of this software, and since this might be done effectively by combining programs, having some programs do work and others assess the output produced in doing this work, connecting programs using pipes in the Unix tradition seems like a reasonable approach. In L4Re, programs tend to write their output to a “log” capability which can be consumed by other programs or directed towards the console output facility, but the functionality seems quite minimal and does not seem to lend itself readily to integration with my filesystem framework.

Previously, I had implemented a pipe mechanism using shared memory to transfer data through pipes, this being to support things like directory listings yielding the contents of filesystem directories. Consequently, I had the functionality available to be able to conveniently create pipes and to pass their endpoints to other tasks and threads. It therefore seemed possible that I might create a pipe when creating a new process, passing one endpoint to the new process for it to use as its output stream, retaining the other endpoint to consume that output.

Having reviewed my process creation mechanisms, I determined that I would need to modify them so that the component involved – a process server – would accept an output capability, supplying it to a new process in its environment and “mapping” the capability into the task created for the process. Then, the program to be run in the process would need to extract the capability from its environment and use it as an output stream instead of the conventional L4Re output functionality, this being provided by L4Re’s native C library. Meanwhile, any process creating another would need to monitor its own endpoint for any data emitted by the new process, also potentially checking for a signal from the new process in the event of it terminating.

Much of this was fairly straightforward work, but there was some frustration in dealing with the lifecycles of various components and capabilities. For example, it is desirable to be able to have the creating process just perform a blocking read over and over again on the reading endpoint of the pipe, only stopping when the endpoint is closed, with this closure occurring when the created process terminates.

But there were some problems with getting the writing endpoint of the pipe to be discarded by the created process, even if I made the program being run explicitly discard or “unmap” the endpoint capability. It turned out that L4Re’s capability allocator is not entirely useful when dealing with capabilities acquired from the environment, and the task API is needed to do the unmapping job. Eventually, some success was eventually experienced: a test program could now launch another and consume the output produced, echoing it to the console.

The next step, of course, is to support input streams to created processes and to potentially consider the provision of an arbitary number of streams, as opposed to prescribing a fixed number of “standard” streams. Beyond that, I need to return to introducing a C library that supports my framework. I did this once for an earlier incarnation of this effort, putting Newlib on top of my own libraries and mechanisms. On this occasion, it might make sense to introduce Newlib initially only for programs that are launched within my own framework, letting them use C library functions that employ these input and output streams instead of calling lower-level functions.

One significant motivation for getting program launching working in the first place was to finally make Newlib usable in a broad sense, completing coverage of the system calls underpinning the library (as noted in its documentation) not merely by supporting low-level file operations like open, close, read and write, but also by supporting process-related operations such as execve, fork and wait. Whether fork and the semantics of execve are worth supporting is another matter, however, these being POSIX-related functions, and perhaps something like the system function (in stdlib.h, part of the portable C process control functions) would be adequate for portable programs.

In any case, the work will continue, hopefully at a slightly quicker pace as the functionality accumulates, with existing features hopefully making new features easier to formulate and to add. And hopefully, I will be able to dedicate a bit more time and attention to it in the coming year, too.

Firefox and Monospaced Fonts

Friday, December 8th, 2023

This has been going on for years, but a recent upgrade brought it to my attention and it rather says everything about what is wrong with the way technology is supposedly improved. If you define a style for your Web pages using a monospaced font like Courier, Firefox still decides to convert letter pairs like “fi” and “fl” to ligatures. In other words, it squashes the two letters together into a single character.

Now, I suppose that it does this in such a way that the resulting ligature only occupies the space of a single character, thereby not introducing proportional spacing that would disrupt the alignment of characters across lines, but it does manage to disrupt the distribution of characters and potentially the correspondence of characters between lines. Worst of all, though, this enforced conversion is just ugly.

Here is what WordPress seems to format without suffering from this problem, by explicitly using the “monospace” font-style identifier:

long client_flush(file_t *file);

And here is what happens when Courier is chosen as the font:

long client_flush(file_t *file);

In case theming, browser behaviour, and other factors obscure the effect I am attempting to illustrate, here it is with the ligatures deliberately introduced:

long client_flush(file_t *file);

In fact, the automatic ligatures do remain as two distinct letters crammed into a single space whereas I had to go and find the actual ligatures in LibreOffice’s “special character” dialogue to paste into the example above. One might argue that by keeping the letters distinct, it preserves the original text so that it can be copied and pasted back into a suitable environment, like a program source file or an interactive prompt or shell. But still, when the effect being sought is not entirely obtained, why is anyone actually bothering to do this?

It seems to me that this is yet another example of “design” indoctrination courtesy of the products of companies like Apple and Adobe, combined with the aesthetics-is-everything mentality that values style over substance. How awful it is that someone may put the letter “f” next to the letter “i” or “l” without pulling them closer together and using stylish typographic constructs!

Naturally, someone may jump to the defence of the practice being described here, claiming that what is really happening is kerning, as if someone like me might not have heard of it. Unfortunately for them, I spent quite a bit of time in the early 1990s – quite possibly before some of today’s “design” gurus were born – learning about desktop publishing and typography (for a system that had a coherent outline font system before platforms like the Macintosh and Windows did). Generally, you don’t tend to apply kerning to monospaced fonts like Courier: the big hint is the “monospaced” bit.

Apparently, the reason for this behaviour is something to do with the font library being used and it will apparently be fixed in future Firefox releases, or at least ones later than the one I happen to be using in Debian. Workarounds using configuration files reminiscent of the early 2000s Linux desktop experience apparently exist, although I don’t think they really work.

But anyway, well done to everyone responsible for this mess, whether it was someone’s great typographic “design” vision being imposed on everyone else, or whether it was just that yet more technologies were thrown into the big cauldron and stirred around without any consideration of the consequences. I am sure yet more ingredients will be thrown in to mask the unpleasant taste, also conspiring to make all our computers run more slowly.

Sometimes I think that “modern Web” platform architects have it as their overriding goal to reproduce the publishing solutions of twenty to thirty years ago using hardware hundreds or even thousands of times more powerful, yet delivering something that runs even slower and still producing comparatively mediocre results. As if the aim is to deliver something akin to a turn-of-the-century Condé Nast publication on the Web with gigabytes of JavaScript.

But maybe, at least for the annoyance described here, the lesson is that if something is barely worth doing, largely because it is probably only addressing someone’s offended sense of aesthetics, maybe just don’t bother doing it. There are, after all, plenty of other things in the realm of technology and beyond that more legitimately demand humanity’s attention.

Experiments with a Screen

Sunday, November 19th, 2023

Not much to report, really. Plenty of ongoing effort to overhaul my L4Re-based software support for the MIPS-based Ingenic SoC products, plus the slow resumption of some kind of focus on my more general framework to provide a demand-paged system on top of L4Re. And then various distractions and obligations on top of that.

Anyway, here is a picture of some kind of result:

MIPS Creator CI20 and Pirate Audio Mini Speaker board

The MIPS Creator CI20 driving the Pirate Audio Mini Speaker board’s screen.

It shows the MIPS Creator CI20 using a Raspberry Pi “hat”, driving the screen using the SPI peripheral built into the CI20’s JZ4780 SoC. Although the original Raspberry Pi had a 26-pin expansion header that the CI20 adopted for compatibility, the Pi range then adopted a 40-pin header instead. Hopefully, there weren’t too many unhappy vendors of accessories as a result of this change.

What it means for the CI20 is that its primary expansion header cannot satisfy the requirements of the expansion connector provided by this “hat” or board in its entirety. Instead, 14 pins of the board’s connector are left unconnected, with the board hanging over the side of the CI20 if mounted directly. Another issue is that the pinout of the board employs a pin as a data/command pin instead of as its designated function as a SPI data input pin. Whether the Raspberry Pi can configure itself to utilise this pin efficiently in this way might help to explain the configuration, but it isn’t compatible with the way such pins are assigned on the CI20.

Fortunately, the CI20’s designers exposed a SPI peripheral via a secondary header, including a dedicated data/command pin, meaning that a few jumper wires can connect the relevant pins to the appropriate connector pins. After some tedious device driver implementation and accompanying frustration, the screen could be persuaded to show an image. With the SPI peripheral being used instead of “bit banging”, or driving the data transfer to the screen controller directly in software, it became possible to use DMA to have the screen image repeatedly sent. And with that, the screen can be used to continuously reflect the contents of a generic framebuffer, becoming like a tiny monitor.

The board also has a speaker that can be driven using I2S communication. The CI20 doesn’t expose I2S signals via the header pins, instead routing I2S audio via the HDMI connector, analogue audio via the headphone socket, and PCM audio via the Wi-Fi/Bluetooth chip, presumably supporting Bluetooth audio. Fortunately, I have another means of testing the speaker, so I didn’t waste half of my money buying this board!

Continuing Explorations into Filesystems and Paging with L4Re

Saturday, April 8th, 2023

Towards the end of last year, I spent a fair amount of time trying to tidy up and document the work I had been doing on integrating a conventional filesystem into the L4 Runtime Environment (or L4Re Operating System Framework, as it now seems to be called). Some of that effort was purely administrative, such as giving the work a more meaningful name and changing references to the naming in various places, whereas other aspects were concerned with documenting mundane things like how the software might be obtained, built and used. My focus had shifted somewhat towards sharing the work and making it slightly more accessible to anyone who might be interested (even if this is probably a very small audience).

Previously, in seeking to demonstrate various mechanisms such as the way programs might be loaded and run, with their payloads paged into memory on demand, I had deferred other work that I felt was needed to make the software framework more usable. For example, I was not entirely happy with the way that my “client” library for filesystem access hid the underlying errors, making troubleshooting less convenient than it could be. Instead of perpetuating the classic Unix “errno” practice, I decided to give file data structures their own error member to retain any underlying error, meaning that a global variable would not be involved in any error reporting.

Other matters needed attending to, as well. Since acquiring a new computer in 2020 based on the x86-64 architecture, the primary testing environment for this effort has been a KVM/QEMU instance invoked by the L4Re build process. When employing the same x86-64 architecture for the instance as the host system, the instance should in theory be very efficient, but for some reason the startup time of such x86-64 instances is currently rather long. This was not the case at some point in the past, but having adopted the Git-based L4Re distribution, this performance regression made an appearance. Maybe at some stage in the future I will discover why it sits there for half a minute spinning up at the “Booting ROM” stage, but for now a reasonable workaround is to favour QEMU instances for other architectures when testing my development efforts.

Preserving Portability

Having long been aware of the necessity of software portability, I have therefore been testing the software in QEMU instances emulating the classic 32-bit x86 architecture as well as MIPS32, in which I have had a personal interest for several years. Surprisingly, testing on x86 revealed a few failures that were not easily explained, but I eventually tracked them down to interoperability problems with the L4Re IPC library, where that library was leaving parts of IPC message values uninitialised and causing my own IPC library to misinterpret the values being sent. This investigation also led me to discover that the x86 Application Binary Interface is rather different in character to the ABI for other architectures. On those other architectures, the alignment of members in structures (and of parameters in parameter lists) needs to be done more carefully due to the way values in memory are accessed. On x86, meanwhile, it seems that values of different sizes can be more readily packed together.

In any case, I came to believe that the L4Re IPC library is not following the x86 ABI specification in the way IPC messages are prepared. I did wonder whether this was deliberate, but I think that it is actually inadvertent. One of my helpful correspondents confirmed that there was indeed a discrepancy between the L4Re code and the ABI, but nothing came of any enquiries into the matter, so I imagine that in any L4Re systems deployed on x86 (although I doubt that there can be many), the use of the L4Re code on both sides of any given IPC transaction manages to conceal this apparent deficiency. The consequence for me was that I had to introduce a workaround in the cases where my code needs to interact with various existing L4Re components.

Several other portability changes were made to resolve a degree of ambiguity around the sizes of various types. This is where the C language family and various related standards and technologies can be infuriating, with care required when choosing data types and then using these in conjunction with libraries that might have their own ideas about which types should be used. Although there probably are good reasons for some types to be equivalent to a “machine word” in size, such types sit uncomfortably with types of other, machine-independent sizes. I am sure I will have to revisit these choices over and over again in future.

Enhancing Component Interface Descriptions

One thing I had been meaning to return to was the matter of my interface description language (IDL) tool and its lack of support for composing interfaces. For example, a component providing file content might expose several different interfaces for file operations, dataspace operations, and so on. These compound interfaces had been defined by specifying arguments for each invocation of the IDL tool that indicate all the interfaces involved, and thus the knowledge of each compound interface ended up being encoded as definitions within Makefiles like this:

mapped_file_object_INTERFACES = dataspace file flush mapped_file notification

A more natural approach involved defining these interfaces in the interface description language itself, but this was going to require putting in the effort to extend the tool, which would not be particularly pleasant, being written in C using Flex and Bison.

Eventually, I decided to just get on with remedying the situation, adding the necessary tool support, and thus tidying up and simplifying the Makefiles in my L4Re build system package. This did raise the complexity level in the special Makefiles provided to support the IDL tool – nothing in the realm of Makefiles is ever truly easy – but it hopefully confines such complexity out of sight and keeps the main project Makefiles as concise as can reasonably be expected. For reference, here is how a file component interface looks with this new tool support added:

interface MappedFileObject composes Dataspace, File, Flush, MappedFile, Notification;

And for reference, here is what one of the constituent interfaces looks like:

interface Flush
{
  /* Flush data and update the size, if appropriate. */

  [opcode(5)] void flush(in offset_t populated_size, out offset_t size);
};

I decided to diverge from previous languages of this kind and to use “composes” instead of language like “inherits”. These compound interface descriptions deliberately do not seek to combine interfaces in a way that entirely resembles inheritance as supported by various commonly used programming languages, and an interface composing other interfaces cannot also add operations of its own: it can merely combine other interfaces. The main reason for such limitations is the deliberate simplicity or lack of capability of the tool: it only really transcribes the input descriptions to equivalent forms in C or C++ and neglects to impose many restrictions of its own. One day, maybe I will revisit this and at least formalise these limitations instead of allowing them to emerge from the current state of the implementation.

A New Year

I had hoped to deliver something for broader perusal late last year, but the end of the year arrived and with it some intriguing but increasingly time-consuming distractions. Having written up the effective conclusion of those efforts, I was able to turn my attention to this work again. To start with, that involved reminding myself where I had got to with it, which underscores the need for some level of documentation, because documentation not only communicates the nature of a work to others but it also communicates it to one’s future self. So, I had to spend some time rediscovering the finer detail and reminding myself what the next steps were meant to be.

My previous efforts had demonstrated the ability to launch new programs from my own programs, reproducing some of what L4Re already provides but in a form more amenable to integrating with my own framework. If the existing L4Re code had been more obviously adaptable in a number of different ways throughout my long process of investigation and development for it, I might have been able to take some significant shortcuts and save myself a lot of effort. I suppose, however, that I am somewhat wiser about the technologies and techniques involved, which might be beneficial in its own way. The next step, then, was to figure out how to detect and handle the termination of programs that I had managed to launch.

In the existing L4Re framework, a component called Ned is capable of launching programs, although not being able to see quite how I might use it for my own purposes – that being to provide a capable enough shell environment for testing – had led me along my current path of development. It so happens that Ned supports an interface for “parent” tasks that is used by created or “child” tasks, and when a program terminates, the general support code for the program that is brought along by the C library includes the invocation of an operation on this parent interface before the program goes into a “wait forever” state. Handling this operation and providing this interface seemed to be the most suitable approach for replicating this functionality in my own code.

Consolidation and Modularisation

Before going any further, I wanted to consolidate my existing work which had demonstrated program launching in a program written specifically for that purpose, bringing along some accompanying abstractions that were more general in nature. First of all, I decided to try and make a library from the logic of the demonstration program I had written, so that the work involved in setting up the environment and resources for a new program could be packaged up and re-used. I also wanted the functionality to be available through a separate server component, so that programs wanting to start other programs would not need to incorporate this functionality but could instead make a request to this separate “process server” component to do the work, obtaining a reference to the new program in response.

One might wonder why one might bother introducing a separate component to start programs on another program’s behalf. As always when considering the division of functionality between components in a microkernel-based system, it is important to remember that components can have different configurations that afford them different levels of privilege within a system. We might want to start programs with one level of privilege from other programs with a different level of privilege. Another benefit of localising program launching in one particular component is that it might provide an overview of such activities across a number of programs, thus facilitating support for things like job and process control.

Naturally, an operating system does not need to consolidate all knowledge about running programs or processes in one place, and in a modular microkernel-based system, there need not even be a single process server. In fact, it seems likely that if we preserve the notion of a user of the system, each user might have their own process server, and maybe even more than one of them. Such a server would be configured to launch new programs in a particular way, having access only to resources available to a particular user. One interesting possibility is that of being able to run programs provided by one filesystem that then operate on data provided by another filesystem. A program would not be able to see the filesystem from which it came, but it would be able to see the contents of a separate, designated filesystem.

Region Mapper Deficiencies

A few things conspired to make the path of progress rather less direct than it might have been. Having demonstrated the launching of trivial programs, I had decided to take a welcome break from the effort. Returning to the effort, I decided to test access to files served up by my filesystem infrastructure, and this caused programs to fail. In order to support notification events when accessing files, I employ a notification thread to receive such events from other components, but the initialisation of threading in the C library was failing. This turned out to be due to the use of a region mapper operation that I had not yet supported, so I had to undertake a detour to implement an appropriate data structure in the region mapper, which in C++ is not a particularly pleasant experience.

Later on, the region mapper caused me some other problems. I had neglected to implement the detach operation, which I rely on quite heavily for my file access library. Attempting to remedy these problems involved reacquainting myself with the region mapper interface description which is buried in one of the L4Re packages, not to be confused with plenty of other region mapper abstractions which do not describe the actual interface employed by the IPC mechanism. The way that L4Re has abandoned properly documented interface descriptions is very annoying, requiring developers to sift through pages of barely commented code and to be fully aware of the role of that code. I implemented something that seemed to work, quite sure that I still did not have all the details correct in my implementation, and this suspicion would prove correct later on.

Local and Non-Local Capabilities

Another thing that I had not fully understood, when trying to put together a library handling IPC that I could tolerate working with, was the way that capabilities may be transferred in IPC messages within tasks. Capabilities are references to components in the system, and when transferred between tasks, the receiving task is meant to allocate a “slot” for each received capability. By choosing a slot denoted by an index, the task (or the program running in it) can tell the kernel where to record the capability in its own registry for the task, and by employing this index in its own registry, the program will be able to maintain a record of available capabilities consistent with that of the kernel.

The practice of allocating capability slots for received capabilities is necessary for transfers between tasks, but when the transfer occurs within a task, there is no need to allocate a new slot: the received capability is already recorded within the task, and so the item describing the capability in the message will actually encode the capability index known to the task. Previously, I was not generally sending capabilities in messages within tasks, and so I had not knowingly encountered any issues with my simplistic “general case” support for capability transfers, but having implemented a region mapper that resides in the same task as a program being run, it became necessary to handle the capabilities presented to the region mapper from within the same task.

One counterintuitive consequence of the capability management scheme arises from the general, inter-task transfer case. When a task receives a capability from another task, it will assign a new index to the capability ahead of time, since the kernel needs to process this transfer as it propagates the message. This leaves the task with a new capability without any apparent notion of whether it has seen that capability before. Maybe there is a way of asking the kernel if two capabilities refer to the same object, but it might be worthwhile just not relying on such facilities and designing frameworks around such restrictions instead.

Starting and Stopping

So, back to the exercise of stopping programs that I had been able to start! It turned out that receiving the notification that a program had finished was only the start; what then needed to happen was something of a mystery. Intuitively, I knew that the task hosting the program’s threads would need to be discarded, but I envisaged that the threads themselves probably needed to be discarded first, since they are assigned to the task and probably cannot have that task removed from under them, even if they are suspended in some sense.

But what about everything else referenced by the task? After all, the task will have capabilities for things like dataspaces that provide access to regions of files and to the program stack, for things like the filesystem for opening other files, for semaphore and IRQ objects, and so on. I cannot honestly say that I have the definitive solution, and I could not easily find much in the way of existing guidance, so I decided in the end to just try and tidy all the resources up as best I could, hopefully doing enough to make it possible to release the task and have the kernel dispose of it. This entailed a fairly long endeavour that also encouraged me to evolve the way that the monitoring of the process termination is performed.

When the started program eventually reaches the end and sends a message to its “parent” component, that component needs to record any termination state communicated in the message so that it may be passed on to the program’s creator or initiator, and then it also needs to commence the work of wrapping up the program. Here, I decided on a distinct component separate from one responsible for any paging activities to act as the contact point for the creating or initiating program. When receiving a termination message or signal, this component disconnects the terminating program from its internal pager by freeing the capability, and this then causes the internal pager to terminate, itself sending a signal to its own parent.

One important aspect of starting and terminating processes is that of notifying the party that sought to start a process in the first place. For filesystem operations, I had already implemented support for certain notification events related to opening, modifying and closing files and pipes, with these being particularly important for pipes. I wanted to extend this support to processes so that it might be possible to monitor files, pipes and processes together using a kind of select or poll operation. This led to a substantial detour where I became dissatisfied with the existing support, modified it, had to debug it, and remain somewhat concerned that it might need more work in the future.

Testing on the different architectures under QEMU also revealed that I would need to handle the possibility that a program might be started and run to completion before its initiator had even received a reference to the program for notification purposes. Fortunately, a similar kind of vanishing resource problem arose when I was developing the file paging system, and so I had a technique available to communicate the reference to the process monitor component to the initiator of the program, ensuring that the process monitor becomes established in the kernel’s own records, before the program itself gets started, runs and completes, avoiding the process monitor being tidied up before its existence becomes known to the wider system.

Wrapping Up Again

A few concerns remain with the state of the work so far. I experienced problems with filesystem access that I traced to the activity of repeatedly attaching and detaching dataspaces, which is something my filesystem access library does deliberately, but the error suggested that the L4Re region mapper had somehow failed to attach the appropriate region. This may well be caused by issues within my own code, and my initial investigation did indeed uncover a problem in my own code where the size of the attached region of a file would gradually increase over time. With this mistake fixed, the situation was improved, but the underlying problem was not completely eliminated, judging from occasional errors. A workaround has been adopted for now.

Various other problems arose and were hopefully resolved. I would say that some of them were due to oversights when getting things done takes precedence over a more complete consideration of all the issues, particularly when working in a language like C++ where lower-level chores like manual memory management enter the picture. The differing performance when emulating various architectures under QEMU also revealed a deficiency with my region mapper implementation. It turned out that detach operations were not returning successfully, leading the L4Re library function to return without invalidating memory pages, and so my file access operations were returning pages of incorrect content instead of the expected file content for the first few accesses until the correct pages had been paged in and were almost continuously resident.

Here, yet more digging around in the L4Re code revealed an apparent misunderstanding about the return value associated with one of the parameters to the detach operation, that of the detached dataspace. I had concluded that a genuine capability was meant to be returned, but it seems that a simple index value is returned in a message word instead of a message item, and so there is no actual capability transferred to the caller, not even a local one. The L4Re IPC framework does not really make the typing semantics very clear, or at least not to me, and the code involved is quite unfathomable. Again, a formal interface specification written in a clearly expressed language would have helped substantially.

Next Steps

I suppose progress of sorts has been made in the last month or so, for which I can be thankful. Although tidying up the detritus of my efforts will remain an ongoing task, I can now initiate programs and wait for them to finish, meaning that I can start building up test suites within the environment, combining programs with differing functionality in a Unix-like fashion to hopefully validate the behaviour of the underlying frameworks and mechanisms.

Now, I might have tried much of this with L4Re’s Lua-based scripting, but it is not as straightforward as a more familiar shell environment, appearing rather more low-level in some ways, and it is employed in a way that seems to favour parallel execution instead of the sequential execution that I might desire when composing tests: I want tests to involve programs whose results feed into subsequent programs, as opposed to just running a load of programs at once. Also, without more extensive documentation, the Lua-based scripting support remains a less attractive choice than just building something where I get to control the semantics. Besides, I also need to introduce things like interprocess pipes, standard input and output, and such things familiar from traditional software platforms. Doing that for a simple shell-like environment would be generally beneficial, anyway.

Should I continue to make progress, I would like to explore some of the possibilities hinted at above. The modular architecture of a microkernel-based system should allow a more flexible approach in partitioning the activities of different users, along with the configuration of their programs. These days, so much effort is spent in “orchestration” and the management of containers, with a veritable telephone directory of different technologies and solutions competing for the time and attention of developers who are now also obliged to do the work of deployment specialists and systems administrators. Arguably, much of that involves working around the constraints of traditional systems instead of adapting to those systems, with those systems themselves slowly adapting in not entirely convincing or satisfactory ways.

I also think back to my bachelor’s degree dissertation about mobile software agents where the idea was that untrusted code might be transmitted between systems to carry out work in a safe and harmless fashion. Reducing the execution environment of such agent programs to a minimum and providing decent support for monitoring and interacting with them would be something that might be more approachable using the techniques explored in this endeavour. Pervasive, high-speed, inexpensively-accessed networks undermined the envisaged use-cases for mobile agents in general, although the practice of issuing SQL queries to database servers or having your browser run JavaScript programs deployed in Web pages demonstrates that the general paradigm is far from obsolete.

In any case, my “to do” list for this project will undoubtedly remain worryingly long for the foreseeable future, but I will hopefully be able to remedy shortcomings, expand the scope and ambition of the effort, and continue to communicate my progress. Thank you to those who have made it to the end of this rather dry article!