Paul Boddie's Free Software-related blog
Paul's activities and perspectives around Free Software
Porting L4Re and Fiasco.OC to the Ben NanoNote (Part 7)
Having tracked down lots of places in the Fiasco.OC kernel that needed changing to run on the Ben NanoNote, including some places I had already visited, changed, and then changed again to fix my earlier mistakes, I was on the “home straight”, needing only to activate the sigma0 thread and the boot thread before the kernel would enter its normal operational state. But the sigma0 thread was proving problematic: control was apparently passed to it, but the kernel would never get control back.
Could this be something peculiar to sigma0, I wondered? I decided to swap the order of the activation, putting the boot thread first and seeing if the kernel “kept going”. Indeed, it did seem to, showing an indication on screen that I had inserted into the code to test whether the kernel regained control after activating the boot thread. So this indeed meant that something about sigma0 was causing the problem. I decided to trace execution through various method calls to see what might be going wrong.
The Big Switch
The process of activating a thread is rather interesting, involving the following methods:
- Context::activate (in kernel/fiasco/src/kern/context.cpp)
- Context::switch_to_locked
- Context::schedule_switch_to_locked
- Context::switch_exec_locked
- Context::switch_cpu (in kernel/fiasco/src/kern/mips/context-mips.cpp)
This final, architecture-specific method saves a few registers on the stack belonging to the current thread, doing so because we are now going to switch away from that thread. One of these registers ($31 or $ra) holds the return address which, by convention, is the address the processor jumps to after returning from a subroutine, function, method or something of that sort. Here, the return address, before it is stored, is set to a label just after a jump (or branch) instruction that is coming up. The flow of the code is such that we are to imagine that this jump will occur and then the processor will return to the label immediately following the jump:
- Set return address register to refer to the instruction after the impending jump
- Store registers on the stack
- Take the jump…
- Continue after the jump
However, there is a twist. Just before the jump is taken, the stack pointer is switched to one belonging to another thread. On top of this, as the jump is taken, the return address is also replaced by loading a value from this other thread’s stack. The processor then jumps to the Context::switchin_context method. When the processor encounters the end of the generated code for this method, it encounters an instruction that makes it jump to the address in the return address register. Here is a more accurate summary:
- Set return address register to refer to the instruction after the impending jump
- Store registers on the stack
- Switch the stack pointer, referencing another thread’s stack
- Switch the return address to the other thread’s value
- Take the jump…
Let us consider what might happen if this other thread was, in fact, the one we started with: the original kernel thread. Then the return address register would now hold the address of the label after the jump. Control would now pass back into the Context::switch_cpu method, and back out of that stack of invocations shown above, going from bottom to top.
But this is not what happens at the moment: the return address is something that was conjured up for a new thread and execution will instead proceed in the place indicated by that address. A switch occurs, leaving the old thread dormant while a new thread starts its life somewhere else. The problem I now faced was figuring out where the processor was “returning” to at the end of Context::switchin_context and what was happening next.
Following the Threads
I already had some idea of where the processor would end up, but inserting some debugging code in Context::switchin_context and reading the return address from the $ra register allowed me to see what had been used without chasing down how the value had got there in the first place. Then, there is a useful tool that can help people like me find out the significance of a program address. Indeed, I had already used it a few times by this stage: the sibling of objdump known as addr2line. Here is an example of its use:
mipsel-linux-gnu-addr2line -e mybuild/fiasco.debug 814165f0
This indicated that the processor was “returning” to the Thread::user_invoke method (in kernel/fiasco/src/kern/mips/thread-mips.cpp). In fact, looking at the Thread::Thread constructor function, it becomes apparent that this information is specified using the Context::prepare_switch_to method, setting this particular destination up for the above activation “switch” operation. And here we encounter some more architecture-specific tricks.
One thing that happens of importance in user_invoke is the disabling of interrupts. But if sigma0 is to be activated, they have to be enabled again so that sigma0 is then interrupted when a timer interrupt occurs. And I couldn’t find how this was going to be achieved in the code: nothing was actually setting the interrupt enable (IE) flag in the status register.
The exact mechanism escaped me for a while, and it was only after some intensive debugging interventions that I realised that the status register is also set up way back in the Thread::Thread constructor function. There, using the result of the Cp0_status::status_eret_to_user_ei method (found in kernel/fiasco/src/kern/mips/cp0_status.cpp), the stored status register is set for future deployment. The status_eret_to_user_ei method initially looks like it might do things directly with the status register, but it just provides a useful value for the point at which control is handed over to a “user space” program.
And, indeed, in the routine called ret_from_user_invoke (found in kernel/fiasco/src/kern/mips/exception.S), implemented by a pile of macros, there is a macro called restore_cp0_status that finally sets the status register to this useful value. A few instructions later, the “return from exception” instruction called eret appears and, with that, control passes to user space and hopefully some valid program code.
Finding the Fault
I now wondered whether the eret instruction was “returning” to a valid address. This caused me to take a closer look at the data structure, Entry_frame (found in kernel/fiasco/src/kern/mips/entry_frame-mips.cpp), used to hold the details of thread execution states. Debugging in user_invoke, by invoking the ip method on the current context (itself obtained from the stored stack information) yielded an address of 0x2000e0. I double-checked this by doing some debugging in the different macros implementing the routine returning control to user space.
Looking around, the Makefile for sigma0 (found as l4/pkg/l4re-core/sigma0/server/src/Makefile) provided this important clue as to its potential use of this address:
DEFAULT_RELOC_mips := 0x00200000
Using our old friend objdump on the sigma0 binary (found as mybuild/bin/mips_32/l4f/sigma0), it was confirmed that 0x2000e0 is the address of the _start routine for sigma0. So we could at least suppose that the eret instruction was passing control to the start of sigma0.
I had a suspicion that some instruction was causing an exception but not getting handled. But I had checked the generated code already and found no new unsupported instructions. This now meant the need to debug an unhandled exception condition. This can be done with care, as always, in the assembly language file containing the various “bare metal” handlers (exception.S, mentioned above), and such an intervention was enough to discover that the cause of the exception was an invalid memory access: an address exception.
Now, the handling of such exceptions can be traced from the handlers into the kernel. There is what is known as a “vector table” which lists the different exception causes and the corresponding actions to be taken when they occur. One of the entries in the table looks like this:
ENTRY_ADDR(slowtrap) # AdEL
This indicates that for an address exception upon load or instruction fetch, the slowtrap routine will be used. And this slowtrap routine itself employs a jump to a function in the kernel called thread_handle_trap (found in kernel/fiasco/src/kern/mips/thread-mips.cpp). Here, unsurprisingly, I found that attempts to handle the exception would fail and that the kernel would enter the Thread::halt method (in kernel/fiasco/src/kern/thread.cpp). This was a natural place for my next debugging intervention!
I now emitted bit patterns for several saved registers associated with the thread/context. One looked particularly interesting: the stored exception program counter which contained the value 0x2092ac. I had a look at the code for sigma0 using objdump and saw the following (with some extra annotations added for explanatory purposes):
209290: 3c02fff3 lui v0,0xfff3 209294: 24422000 addiu v0,v0,8192 // 0xfff32000 209298: afc2011c sw v0,284(s8) 20929c: 7c02e83b 0x7c02e83b // rdhwr v0, $29 (ULR) 2092a0: afc20118 sw v0,280(s8) 2092a4: 8fc20118 lw v0,280(s8) 2092a8: 8fc30158 lw v1,344(s8) 2092ac: 8c508ff4 lw s0,-28684(v0) // 0xfff2aff4
Of particular interest was the instruction at 0x20929c, which I annotated above as corresponding to our old favourite, rdhwr, along with some other key values. Now, the final instruction above is where the error occurs, and it is clear that the cause is the access to an address that is completely invalid (as annotated). The origin of this address information occurs in the second instruction above. You may have realised by now that the rdhwr instruction was failing to set the v0 (or $v0) register with the value retrieved from the indicated hardware register.
How could this be possible?! Support for this rdhwr variant was already present in Fiasco.OC, so could it be something I had done to break it? Perhaps the first rule of debugging is not to assume your own innocence for any fault, particularly if you have touched the code in the recent past. So I invoked my first strategy once again and started to look in the “reserved instruction” handler for likely causes of problems. Sure enough, I found this:
ASM_INS $at, zero, 0, THREAD_BLOCK_SHIFT, k1 # TCB addr in $at
I had introduced a macro for the ins instruction that needed a temporary register to do its work, it being specified by the last argument. But k1 (or $k1) still holds the instruction that is being inspected, and it gets used later in the routine. By accidentally overwriting it, the wrong target register gets selected, and then the hardware register value is transferred to the wrong register, leaving the correct register unaffected. This is exactly what we saw above!
What I meant to write was this:
ASM_INS $at, zero, 0, THREAD_BLOCK_SHIFT, k0 # TCB addr in $at
We can afford to overwrite k0 (or $k0) since it gets overwritten shortly after, anyway. And sure enough, with this modification, sigma0 started working, with control being returned to the kernel after its activation.
Returning to User Space
Verifying that the kernel boot process completed was slightly tricky in that further debugging interventions seemed to be unreliable. Although I didn’t investigate this thoroughly, I suspect that with sigma0 initialised, memory may have been cleared, and this clearing activity erases the descriptor structure used by the LCD peripheral. So it then becomes necessary to reinitialise the peripheral by choosing somewhere for the descriptor, writing the appropriate members, and enabling the peripheral again. Since we might now be overwriting other things critical for the system’s proper functioning, we cannot then expect execution to continue after we have written something informative to the framebuffer, so an infinite loop deliberately hangs the kernel and lets us see our debugging output on the screen.
I felt confident that the kernel was now booting and going about its normal business of switching between threads, handling interrupts and exceptions, and so on. For testing purposes, I had chosen the “hello” example as a user space program to accompany the kernel in the deployed payload, but this example is useless on the Ben without connecting up the serial console connection, which I hadn’t done. So now, I needed to start preparing examples that would actually show something on the screen, working towards running the “spectrum” example provided in the L4Re distribution.