Some More Slow Progress
Sunday, April 7th, 2024
A couple of months have elapsed since my last, brief progress report on L4Re development, so I suppose a few words are required to summarise what I have done since. Distractions, travel, and other commitments notwithstanding, I did manage to push my software framework along a little, encountering frustrations and the occasional sensation of satisfaction along the way.
Supporting Real Hardware
Previously, I had managed to create a simple shell-like environment running within L4Re that could inspect an ext2-compatible filesystem, launch programs, and have those programs communicate with the shell – or each other – using pipes. Since I had also been updating my hardware support framework for L4Re on MIPS-based devices, I thought that it was time to face up to implementing support for memory cards – specifically, SD and microSD cards – so that I could access filesystems residing on such cards.
Although I had designed my software framework with things like disks and memory devices in mind, I had been apprehensive about actually tackling driver development for such devices, as well as about whether my conceptual model would prove too simple, necessitating more framework development just to achieve the apparently simple task of reading files. It turned out that the act of reading data, even when almost magical mechanisms like direct memory access (DMA) are used, is as straightforward as one could reasonably expect. I haven’t tested writing data yet, mostly because I am not that brave, but it should be essentially as straightforward as reading.
What was annoying and rather overcomplicated, however, was the way that memory cards have to be coaxed into cooperation, with the SD-related standards featuring layer upon layer of commands added every time they enhanced the technologies. Plenty of time was spent (or wasted) trying to get these commands to behave and to allow me to gradually approach the step where data would actually be transferred. In contrast, setting up DMA transactions was comparatively easy, particularly using my interactive hardware experimentation environment.
There were some memorable traps encountered in the exercise. One involved making sure that the interrupts signalling completed DMA transactions were delivered to the right thread. In L4Re, hardware interrupts are delivered via IRQ (interrupt request) objects to specific threads, and it is obviously important to make sure that a thread waiting for notifications (including interrupts) expects these notifications. Otherwise, they may cause a degree of confusion, which is what happened when a thread serving “blocks” of data to the filesystem components was presented with DMA interrupt occurrences. Obviously, the solution was to be more careful and to “bind” the interrupts to the thread interacting with the hardware.
Another trap involved the follow-on task of running programs that had been read from the memory card. In principle, this should have yielded few surprises: my testing environment involves QEMU and raw filesystem data being accessed in memory, and program execution was already working fine there. However, various odd exceptions were occurring when programs were starting up, forcing me to exercise the useful kernel debugging tool provided with the Fiasco.OC (or L4Re) microkernel.
Of course, the completely foreseeable problem involved caching: data loaded from the memory card was not yet available in the processor’s instruction cache, and so the processor was running code (or potentially something that might not have been code) that had been present in the cache. The problem tended to arise after a jump or branch in the code, executing instructions that did undesirable things to the values of the registers until something severe enough caused an exception. The solution, of course, was to make sure that the instruction cache was synchronised with the data cache containing the newly read data using the l4_cache_coherent function.
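The ordering that matters here, copying the code into place and then synchronising the caches before anything jumps into it, can be sketched as follows. This is only an illustration under stated assumptions: the helper names load_code_region and sync_instruction_cache are hypothetical, and the GCC/Clang builtin __builtin___clear_cache stands in for l4_cache_coherent so that the sketch builds on an ordinary host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Make newly written code visible to instruction fetch. On L4Re, the
   corresponding step would be a call to l4_cache_coherent over the same
   address range; the compiler builtin is used here only so that the
   sketch can be built outside L4Re. */
static void sync_instruction_cache(void *start, size_t length)
{
    char *begin = (char *) start;

    __builtin___clear_cache(begin, begin + length);
}

/* Hypothetical helper: copy a block of program code read from storage
   into its destination region, then synchronise the caches before any
   code is allowed to execute from that region. Returns the number of
   bytes copied. */
static size_t load_code_region(void *region, const void *block, size_t length)
{
    assert(region != NULL && block != NULL);

    memcpy(region, block, length);
    sync_instruction_cache(region, length);
    return length;
}
```

Omitting the synchronisation step leaves the copy itself perfectly valid as data, which is why the problem only shows up once the processor starts fetching instructions from the region.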
Replacing the C Library
With that, I could replicate my shell environment on “real hardware”, which was fairly gratifying. But this only led to the next challenge: that of integrating my filesystem framework into programs in a more natural way. Until now, accessing files involved a special “filesystem client” library that largely mimics the normal C library functions for such activities, but the intention has always been to wrap these with the actual C library functions so that portable programs can be run. Ideally, there would be a way of getting the L4Re C library – an adapted version of uClibc – to use these client library functions.
A remarkable five years have passed since I last considered such matters. Back then, my investigations indicated that getting the L4Re library to interface to the filesystem framework might be an involved and cumbersome exercise due to the way the “backend” functionality is implemented. It seemed that the L4Re mechanism for using different kinds of filesystems involved programs dynamically linking to libraries that would perform the access operations on the filesystem, but I could not find much documentation for this framework, and I had the feeling that the framework was somewhat underdeveloped, anyway.
My previous investigations had led me to consider deploying an alternative C library within L4Re, with programs linking to this library instead of uClibc. C libraries generally come across as rather messy and incoherent things, accumulating lots of historical baggage as files are incorporated from different sources to support long-forgotten systems and architectures. The challenge was to find a library that could be conveniently adapted to accommodate a non-Unix-like system, with the emphasis on convenience precluding having to make changes to hundreds of files. Eventually, I chose Newlib because the breadth of its dependencies on the underlying system is rather narrow: a relatively small number of fundamental calls. In contrast, other C libraries assume a Unix-like system with countless, specialised system calls that would need to be reinterpreted and reframed in terms of my own library’s operations.
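To give an impression of how narrow that dependency is, here is a minimal sketch of two such fundamental calls being forwarded to a client library. Everything in it is an assumption made for illustration: client_write and client_read are hypothetical stand-ins backed by a single in-memory "file", not functions from my framework, and the wrappers are named sketch_write and sketch_read to avoid clashing with host libraries. In a Newlib port, the functions Newlib looks for would be named _write and _read and take the same three arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a filesystem client library: one in-memory
   "file" with a write position at the end and a separate read position. */
#define STORE_SIZE 64

static char store[STORE_SIZE];
static size_t store_length = 0;
static size_t store_position = 0;

/* Append data to the store, returning the number of bytes accepted. */
static int client_write(const char *data, size_t length)
{
    size_t space = STORE_SIZE - store_length;
    size_t count = length < space ? length : space;

    memcpy(store + store_length, data, count);
    store_length += count;
    return (int) count;
}

/* Read from the current position, returning the number of bytes read,
   with zero indicating that no more data is available. */
static int client_read(char *data, size_t length)
{
    size_t available = store_length - store_position;
    size_t count = length < available ? length : available;

    memcpy(data, store + store_position, count);
    store_position += count;
    return (int) count;
}

/* Wrappers with the argument layout Newlib expects of _write and _read.
   The file descriptor is ignored because this sketch has only one file. */
int sketch_write(int file, char *ptr, int len)
{
    (void) file;
    assert(ptr != NULL && len >= 0);
    return client_write(ptr, (size_t) len);
}

int sketch_read(int file, char *ptr, int len)
{
    (void) file;
    assert(ptr != NULL && len >= 0);
    return client_read(ptr, (size_t) len);
}
```

A real port would map the file descriptor to a client library file object and report errors through errno, but the shape of the work stays the same: a small set of functions to supply, rather than hundreds of files to modify.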
My previous effort had rather superficially demonstrated a proof of concept: linking programs to Newlib and performing fairly undemanding operations. This time round, I knew that my own framework had become more complicated, employed C++ in various places, and would create a lot of work if I were to decouple it from various L4Re packages, as I had done in my earlier proof of concept. I briefly considered and then rejected undertaking such extra work, instead deciding that I would simply dust off my modified Newlib sources, build my old test programs, and see which symbols were missing. I would then seek to reintroduce these symbols and hope that the existing L4Re code would be happy with my substitutions.
Supporting Threads
For the very simplest of programs, I was able to “stub” a few functions and get them to run. However, part of the sophistication of my framework in its current state is its use of threading to support various activities. For example, monitoring data streams from pipes and files involves a notification mechanism employing threads, and thus a dependency on the pthread library is introduced. Unfortunately, although Newlib does provide a similar pthread library to that featured in L4Re, it is not really done in a coherent fashion, and there is other pthread support present in Newlib that just adds to the confusion.
Initially, then, I decided to create “stub” implementations for the different functions used by various libraries in L4Re, like the standard C++ library whose concurrency facilities I use in my own code. I made a simple implementation of pthread_create, along with some support for mutexes. Running programs did exercise these functions and produce broadly expected results. Continuing along this path seemed like it might entail a lot of work, however, and in studying the existing pthread library in L4Re, I had noticed that although it resides within the “uclibc” package, it is somewhat decoupled from the C library itself.
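For a flavour of what a stub at this level might look like, the following is a minimal sketch of a mutex built on a C11 atomic flag. It is not the code I wrote: the stub_mutex names are hypothetical, the lock simply spins instead of blocking, and none of the attributes or error checking of a real pthread mutex are present.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* A deliberately minimal mutex: a single flag that is set while the
   mutex is held. */
typedef struct
{
    atomic_flag held;
} stub_mutex_t;

#define STUB_MUTEX_INITIALIZER { ATOMIC_FLAG_INIT }

/* Try to take the mutex, returning 0 on success or EBUSY if it is
   already held. */
int stub_mutex_trylock(stub_mutex_t *mutex)
{
    assert(mutex != NULL);

    return atomic_flag_test_and_set_explicit(&mutex->held, memory_order_acquire)
           ? EBUSY : 0;
}

/* Take the mutex, spinning until it becomes available. A real
   implementation would block the calling thread instead. */
int stub_mutex_lock(stub_mutex_t *mutex)
{
    while (stub_mutex_trylock(mutex) == EBUSY)
        ; /* busy-wait */

    return 0;
}

/* Release the mutex so that another caller can take it. */
int stub_mutex_unlock(stub_mutex_t *mutex)
{
    assert(mutex != NULL);

    atomic_flag_clear_explicit(&mutex->held, memory_order_release);
    return 0;
}
```

Stubs like this are enough to see which functions a program actually calls, but the gap between them and a complete pthread library is exactly the amount of work that made reusing the existing implementation attractive.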
Favouring laziness, I decided to see if I couldn’t make a somewhat independent package that might then be interfaced to Newlib. For the most part, this exercise involved introducing missing functions and lots of debugging, watching the initialisation of programs fail due to things like conflicts with capability allocation, perhaps due to something I am doing wrong, or perhaps exposing conditions that are fortuitously avoided in L4Re’s existing uClibc arrangement. Ultimately, I managed to get a program relying on threading to start, leaving me with the exercise of making sure that it was producing the expected output. This involved some double-checking of previous measures to allow programs using different C libraries to communicate certain kinds of structures without them misinterpreting the contents of those structures.
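The structure problem arises because two C libraries need not agree on the layout of something like struct stat. One common way of avoiding misinterpretation, sketched below under stated assumptions, is to exchange a structure with explicitly sized fields and to convert to and from each library's native structure at the boundary. The wire_stat_t type, its fields, and the conversion function are hypothetical and are not taken from my framework.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical exchange format for file metadata. Every field has a
   fixed width, and padding is spelled out, so that programs built
   against different C libraries see the same layout. */
typedef struct
{
    uint64_t size;      /* file size in bytes */
    uint64_t modified;  /* modification time in seconds since the epoch */
    uint32_t mode;      /* file type and permission bits */
    uint32_t reserved;  /* explicit padding, always zero */
} wire_stat_t;

/* Fail at compile time if the layout is not the one expected. */
_Static_assert(sizeof(wire_stat_t) == 24, "unexpected wire_stat_t size");
_Static_assert(offsetof(wire_stat_t, mode) == 16, "unexpected mode offset");

/* Convert from whatever struct stat the local C library defines. */
void wire_stat_from_native(wire_stat_t *out, const struct stat *in)
{
    assert(out != NULL && in != NULL);

    out->size = (uint64_t) in->st_size;
    out->modified = (uint64_t) in->st_mtime;
    out->mode = (uint32_t) in->st_mode;
    out->reserved = 0;
}
```

Each side only ever converts between its own native structure and the exchange format, so neither needs to know which C library the other was built against.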
Further Work
There is plenty still to do in this effort. First of all, I need to rewrite the remaining test programs to use C library functions instead of client library functions, having done this for only a couple of them. Then, it would be nice to expand C library coverage to deal with other operations, particularly process creation, since I spent quite some time getting that to work.
I need to review the way Newlib handles concurrency and determine what else I need to do to make everything work as it should in that regard. I am still using code from an older version of Newlib, so an update to a newer version might be sensible. In this latest round of C library evaluation, I briefly considered Picolibc, which is derived from Newlib and other sources, but I didn’t fancy having to deal with its build system or to repackage the sources to work with the L4Re build system. I did much the same with Newlib previously and, having worked through such annoyances, was largely able to focus on the actual code as opposed to the tooling.
So far, I have been statically linking programs to Newlib, but I eventually want to dynamically link them. Doing so will exercise different paths in the C and pthread libraries, but I also want to explore dynamic linking more broadly in my own environment, having already postponed such investigations from my work on getting programs to run. Introducing dynamic linking and shared libraries can reduce memory usage when multiple programs need the same libraries, since a single copy of a library’s code can be shared between them, and this can in turn benefit the performance of the system.
There are also some reasonable arguments for making the existing L4Re pthread implementation more adaptable, consolidating my own changes to the code, and also for considering making or adopting other pthread implementations. Convenient support for multiple C library implementations, and for combining these with other libraries, would be desirable, too.
Much of the above has been a distraction from what I have been wanting to focus on, however. Had it been more apparent how to usefully extend uClibc, I might not have bothered looking at Newlib or other C libraries, and then I probably wouldn’t have looked into threading support. Although I have accumulated some knowledge in the process, and although some of that knowledge will eventually prove useful, I cannot help feeling that L4Re, being a fairly mature product at this point and a considerable achievement, could be more readily extensible and accessible than it currently is.