21 July, 2024

Adding system calls.

Part 6 Adding system calls

The focus of the previous posts has been on the implementation of the Owl-2820 CPU. However, a CPU doesn’t work in isolation. It works in combination with its execution environment, which is essentially the machine or mechanisms that it interacts with to get things done.

In the case of the Owl-2820 CPU, the execution environment is not physical - it is not implemented by hardware. The Owl-2820 execution environment is our host program. In other words, our humble host program is an implementation of an Owl-2820 virtual machine, or VM.

In the previous post we gave our Owl-2820 VM the ability to handle memory. In this post, we’re going to give it the ability to do I/O.

To do that, we need to address the question of how programs running on an Owl-2820 VM will do I/O. How will they get input from the user? How will they read data into memory from a device such as a disk? How will they write data back to such a device? How will they display information on the screen? How will they make LEDs blink on and off?

There are a number of ways of doing all of these things, but in our implementation we’re going to do it by means of system calls. The code for that is going to turn out to be surprisingly easy. But first, let’s go into some background as to why we might need system calls in the first place.

Performing I/O

I/O on a single process system

If the Owl-2820 CPU were a real, physical CPU or a microcontroller, then it would have physical interfaces to do I/O. For example, making a character appear on a screen might be a case of writing a value to an address in a screen buffer whose contents are written to the screen. Similarly, reading characters from a keyboard might be a case of polling some memory locations to see which bits are set and comparing them against previously saved values to detect key presses and key releases, then translating the results into keystrokes.

To a large extent, this is how the 8-bit and 16-bit microcomputers of the 80s worked. For example, if you wanted to put a character onto the screen of a TRS-80, which was a microcomputer popular in the late 70s and early 80s, then you’d poke the relevant ASCII value into the 1K screen buffer starting at address 15360. If you wanted to know if the space bar was currently pressed, then you’d read the 8-bit value at address 14400 and check if bit 7 was set. Note that these were all physical memory addresses as these machines did not have virtual memory.

Early PCs running MS-DOS rather than Windows were similar. If you wanted characters to appear on the screen then you’d set the screen into text mode, then poke those characters into memory at a particular address - once you’d got your head around the byzantine horror that was the segmented memory architecture of the 8086. If you wanted pixel graphics, then you’d make a BIOS call to switch the display into graphics mode, then you’d set and clear bits in the screen buffer to change the colours of pixels. Yes, you could make a DOS call to display a string, or make a BIOS call to set a pixel, but this was far slower than doing it directly.

These old systems were essentially single user, single process. If you wrote a program for a PC running MS-DOS, then you could safely assume that your program had full control over nearly everything on the system, and that it wouldn’t have to compete with other processes for access to resources such as the screen and the keyboard.

Owl-2820 as a single process system

For the Owl-2820, we could simulate this quite easily. We could simulate a memory mapped screen by reserving part of the address space for a screen buffer, then have our host program interpret the contents of that buffer as a bitmap and display it in a window. We could handle keyboard input by writing keycodes from the keyboard into a buffer for our Owl-2820 program to read, then setting a bit in a virtual I/O register to tell it that there are keystrokes available in the buffer.
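To make that concrete, here is a minimal sketch of what such a host might look like. It is not part of the Owl-2820 code: the `Machine` type, the addresses and the register layout are all invented for illustration, and it uses a text buffer rather than a bitmap to keep things short.

```cpp
#include <array>
#include <cstdint>
#include <deque>
#include <string>

// Every address and size here is hypothetical, chosen only for this sketch.
constexpr uint32_t SCREEN_BASE = 0x8000; // start of the text screen buffer
constexpr uint32_t SCREEN_COLS = 32;     // characters per row
constexpr uint32_t KEY_STATUS = 0x9000;  // bit 0 set: a keystroke is waiting
constexpr uint32_t KEY_DATA = 0x9001;    // reading this consumes one keystroke

struct Machine
{
    std::array<uint8_t, 0x10000> memory{};
    std::deque<uint8_t> keys; // keystrokes queued by the host

    // Host side: queue a keystroke and raise the "key available" bit.
    void PressKey(uint8_t key)
    {
        keys.push_back(key);
        memory.at(KEY_STATUS) |= 1;
    }

    // Guest side: loads go through here so the host can intercept device addresses.
    uint8_t Load8(uint32_t addr)
    {
        if (addr == KEY_DATA && !keys.empty())
        {
            const uint8_t key = keys.front();
            keys.pop_front();
            if (keys.empty())
                memory.at(KEY_STATUS) &= 0xFE; // nothing left, so clear the bit
            return key;
        }
        return memory.at(addr);
    }

    // Guest side: a store to the screen buffer is just an ordinary memory write.
    void Store8(uint32_t addr, uint8_t value) { memory.at(addr) = value; }

    // Host side: interpret one row of the screen buffer as printable text.
    std::string RenderRow(uint32_t row) const
    {
        std::string line;
        for (uint32_t col = 0; col < SCREEN_COLS; ++col)
        {
            const uint8_t ch = memory.at(SCREEN_BASE + row * SCREEN_COLS + col);
            line += (ch >= 32 && ch < 127) ? static_cast<char>(ch) : ' ';
        }
        return line;
    }
};
```

A guest program would poll `KEY_STATUS`, read `KEY_DATA` when bit 0 is set, and write characters into the buffer at `SCREEN_BASE`, while the host would call `RenderRow` whenever it redraws its window.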

This would be workable if we only wanted the Owl-2820 VM to be a single user, single process system. Any program running on this system would be able to assume unfettered access to its resources, so if it wanted to, say, read from an I/O register, then it could safely assume that it was the only program doing so and that the data wouldn’t be lost because another program had read it first.

Such an approach would be perfectly acceptable if we wanted to use the Owl-2820 VM to create a virtual console such as the PICO-8, or the TIC-80. But it would not be a suitable approach if we wanted our Owl-2820 VM to be capable of running a multi process system.

I/O on a multi process system

What if we wanted to run more than one program at once? We take this for granted on desktops and servers with Windows, Linux, and macOS, and on our mobile phones with Android and iOS. That was most definitely not the case on those old systems, as they had one very big limitation. They were single process. That is, they could only run one program at a time.

If you were using a program such as a word processor on an MS-DOS based PC, then that word processor would be the only program running on that machine. To switch to another program, you would have to stop the current program and start the new one.

Here’s an example to make it clear how limiting this was.

If you were editing a document in a word processor and you realised that you needed to open a spreadsheet to get the results of something you’d saved previously, then you would have to exit the word processor, start the spreadsheet program, load the results that you’d saved previously and maybe print them out or scribble them down on a piece of paper. Then you would exit the spreadsheet program, start the word processor, reload your document, and only then would you be able to record the results in your document.

For those of you thinking, “Why didn’t you just copy it to the clipboard and paste it into the word processor?”… there was no clipboard. There was no copy-paste.

Similarly, if you wanted to do a simple mathematical calculation while running your word processor, then you would probably reach for your hand-held calculator, because even switching to a calculator program was beyond the capabilities of MS-DOS. The only exception was if you were also running a program such as the original versions of Borland Sidekick which firstly used a clever trick known as a TSR (Terminate and Stay Resident) that allowed it to stay in memory, and secondly intercepted the keyboard-handling interrupt so that it could be activated with a hot-key.

What ran on those old PCs and home micros was barely an operating system by today’s standards. Modern operating systems such as Windows, Linux, and macOS support multiple programs at once. This is clearly a benefit for us, the users, because now we can switch programs without first having to stop one and load the other. We don’t think twice about running email, instant messaging, a spreadsheet, a web browser, and a multitude of other programs all at once. Even now, as I write this on my PC, I have a command prompt open, I have two instances of Visual Studio Code running, I’m running a browser with several tabs open, and I’m playing some music in the background in one of those browser tabs. None of this would have been possible on a PC of the 80s.

However, this increase in flexibility also comes with a decrease in what any given process is permitted to do. What would happen if my command prompt decided that it wanted to steal all of the input, so whenever I typed something it would only appear in the command window? What would happen if a rogue plugin in VS Code decided that it wanted to fill the screen with emojis? Worse still, what would happen if VS Code could read any memory on the system and a malicious plugin decided to read passwords from my password manager?

Clearly, with multiple processes, not only is there competition for resources, but processes have to be protected from each other and given access to only the resources that they have requested. And even then, access to those resources is limited by what the process is allowed to do with them. For example, VS Code running on my Windows machine isn’t allowed to stop and start Windows services, or unload device drivers.

In short, one of the roles of a multi process operating system is sharing resources of the system between processes.

Owl-2820 as a multi process system with a kernel

If we want to scale things up so that Owl-2820 supports multiple processes, then we need some way of sharing the resources of the system between those processes. As we’ve just seen, that’s one of the roles of the operating system.

Typically, processes gain access to resources via system calls (or syscalls), which switch the CPU from a low-privilege user mode, which has no direct access to resources, into a high-privilege machine mode or supervisor mode used by the kernel, which does.

Most processes running on a system are user mode processes and have no direct access to resources. System calls are a mechanism by which the high-privilege kernel provides services to low-privilege user mode processes to give them access to those resources in a controlled manner.

On the Owl-2820 CPU, a syscall is implemented by setting a register to the syscall number, then using the ecall instruction to invoke the syscall itself.

In the example below, the syscall number is held in the a7 register, and we have the following syscalls available to us.

Syscall (number in a7)	Description
0	Exit the process. a0 contains the exit status.
1	Displays a string to stdout. a0 contains the address of the string to display.

The example program makes two syscalls. It uses syscall 1 to display a string, then it exits the process by invoking syscall 0.

    ; display a message (syscall 1)
    lui     a0, %hi(message)
    addi    a0, a0, %lo(message)    ; a0 contains the address of `message`
    li      a7, 1                   ; a7 is the system call number
    ecall                           ; ask the kernel to perform the syscall to display the message

    ; exit (syscall 0)
    li      a0, 0                   ; a0 is the exit code
    li      a7, 0                   ; a7 is the system call number
    ecall                           ; ask the kernel to stop this process

message:
    .string "This program cannot be run in DOS mode.\n\n"

That’s what it looks like from user mode. But what does it look like from the kernel? How would we go about implementing syscalls in the Owl-2820 VM?

One option is to write a simple kernel in Owl-2820 machine code. To do this, we would have to give our Owl-2820 CPU a machine mode which has access to all of the machine’s resources, and a user mode that can only get access to those resources by requesting them from machine mode via a syscall.

To implement this approach, we would have to write the kernel in Owl-2820 assembly language, as we don’t have a compiler yet, and provide emulated virtual devices so that the kernel could access resources on the host machine. As this would be a VM, even the kernel wouldn’t actually have access to the resources of the host system except where the host made them available, so we would also have to define how that worked.

However interesting this might be to implement, it is a lot of work to write a kernel, and we’re not ready for that yet.

Implementing system calls in the host program

There is another, simpler option, which is to implement system calls directly in the host program. In other words, rather than implementing and emulating a kernel that provides syscalls, we can adapt the host program to carry out this role.

Implementing system calls in the Owl-2820 VM

What would that look like? Well, we already made a start on this in part 5 when we implemented print_fib by hard-coding ecall to print a message. In effect, we implemented a system that supports a single syscall.

    while (!done)
    {
        // ...

        switch (opcode)
        {
        case Opcode::Ecall: {
            std::cout << std::format("fib({}) = {}\n", x[a0], x[a1]);
            break;
        }

        // ...

We can easily extend this to handle multiple system calls. We’ll stick with the convention that we used in the previous example in which the syscall number is passed in register a7, as that means that we can stick to the calling convention from part 4 without having to shuffle the arguments for any function that invokes a syscall.

Here’s what that might look like, implemented for Exit and Print.

    while (!done)
    {
        // ...

        switch (opcode)
        {
        case Opcode::Ecall: {
            const auto syscall = Syscall(x[a7]);
            switch (syscall)
            {
            case Syscall::Exit: {
                std::cout << std::format("Exiting with status {}\n", x[a0]);
                done = true;
                break;
            }
            case Syscall::Print: {
                // For illustration. We'd obviously be a lot more cautious, because we don't want to
                // accept just any data from user mode.
                char* message = reinterpret_cast<char*>(memory.data() + x[a0]);
                std::cout << message;
                break;
            }
            }
            break;
        }

        // ...

This is far easier than implementing a kernel in Owl-2820 assembly language. It also gives us a way of bootstrapping capabilities that our Owl-2820 VM doesn’t have yet, because we can backfill them with a syscall. Indeed, that’s exactly what we’ve been doing with our print_fib subroutine.
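The comment in the Print case above says that we’d be more cautious about data coming from user mode. As a rough idea of what that caution might look like, here is a sketch of a helper that only returns a string if the address is in range and a terminating NUL is found before the end of memory. The name `ReadGuestString` is hypothetical, and the sketch assumes that `memory` is a `std::vector<std::byte>`, which may not match the real code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical helper, not part of the Owl-2820 code.
// Returns the NUL-terminated string starting at guest address `addr`, or
// std::nullopt if `addr` is out of range or the string is not terminated
// before the end of guest memory.
std::optional<std::string_view> ReadGuestString(const std::vector<std::byte>& memory, uint32_t addr)
{
    if (addr >= memory.size())
        return std::nullopt;

    // Search for the terminator without ever reading past the end of memory.
    const auto first = memory.begin() + addr;
    const auto last = std::find(first, memory.end(), std::byte{0});
    if (last == memory.end())
        return std::nullopt;

    // The view refers to guest memory, so it is only valid while `memory` is unchanged.
    const char* text = reinterpret_cast<const char*>(memory.data() + addr);
    return std::string_view(text, static_cast<std::size_t>(last - first));
}
```

With something like this in place, the Print case could check the result and only write to `std::cout` when a string was actually found. What it should do otherwise (ignore the call, report an error, or stop the process) is a design decision that this sketch leaves open.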

Code changes

The code changes to implement syscalls are genuinely straightforward.

I’ve modified the main interpreter loop of the Owl-2820 CPU to implement Exit and PrintFib as system calls, as follows.

    while (!done)
    {
        // ...

        switch (opcode)
        {
        case Opcode::Ecall: {
            const auto syscall = Syscall(x[a7]);
            switch (syscall)
            {
            case Syscall::Exit:
                std::cout << std::format("Exiting with status {}\n", x[a0]);
                done = true;
                break;

            case Syscall::PrintFib:
                std::cout << std::format("fib({}) = {}\n", x[a0], x[a1]);
                break;
            }
            break;
        }
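The loop above relies on a `Syscall` enumeration. Here is a sketch of how it might be declared, using the numbers from the `li a7, 0` and `li a7, 1` comments in the assembly listings. To make the dispatch easy to test in isolation, the sketch also pulls it out into a free function; the `HandleSyscall` function and the `SyscallResult` type are invented for illustration, as is the decision to stop the VM on an unknown syscall number. It builds strings with `std::to_string` rather than `std::format` purely so that it compiles on older toolchains.

```cpp
#include <cstdint>
#include <string>

// Syscall numbers as passed in a7.
enum class Syscall : uint32_t
{
    Exit = 0,
    PrintFib = 1,
};

// Hypothetical result type: the text to print, and whether the VM should stop.
struct SyscallResult
{
    std::string output;
    bool done = false;
};

// Hypothetical helper: `number` is the value of a7, `arg0` and `arg1` are a0 and a1.
SyscallResult HandleSyscall(uint32_t number, uint32_t arg0, uint32_t arg1)
{
    switch (Syscall(number))
    {
    case Syscall::Exit:
        return {"Exiting with status " + std::to_string(arg0) + "\n", true};

    case Syscall::PrintFib:
        return {"fib(" + std::to_string(arg0) + ") = " + std::to_string(arg1) + "\n", false};
    }

    // Assumption: an unrecognised syscall number stops the VM rather than being ignored.
    return {"Unknown syscall " + std::to_string(number) + "\n", true};
}
```

The Ecall case in the interpreter loop would then reduce to calling `HandleSyscall(x[a7], x[a0], x[a1])`, printing the `output`, and copying `done` into the loop’s own flag.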

I’ve changed the assembly language so that it invokes the Exit syscall to stop the VM, rather than doing so by emitting an illegal instruction.

    Label main = a.MakeLabel();
    a.Call(main);

    // Invoke the `Exit` syscall. There's no coming back from this, so there's no `ret`
    a.Li(a7, Syscall::Exit);    // li      a7, 0
    a.Ecall();                  // ecall

And the print_fib subroutine now invokes the PrintFib syscall.

    // print_fib:
    a.BindLabel(print_fib);
    
    // Invoke the `PrintFib` syscall.
    a.Li(a7, Syscall::PrintFib);// li       a7, 1
    a.Ecall();                  // ecall

    a.Ret();                    // ret                             ; return from print_fib

And that’s it.

Summary

Much of this post has been spent explaining the need for syscalls in our VM, and why we would want to implement them directly in the host program rather than writing a kernel. The implementation that we have come up with is surprisingly simple, yet it puts us in a good position for running multiple processes on our Owl-2820 VM.

However, we haven’t yet implemented all of the Owl-2820 instruction set. That will be the topic of the next post.

Full code

Here is a link to the full code for this post on the inimitable and excellent Compiler Explorer.