k23 - Experimental WASM Operating System

Welcome to the official k23 manual! This manual will guide you through the installation and usage of k23, an experimental WASM microkernel operating system.


Watch my talk at RustNL 2024 about k23

What is k23?

k23 is an active research project exploring a more secure and modular operating system that is easy to develop for, using WebAssembly as the primary execution environment. The project is still in its early stages and is not yet ready for production use.

Why?

As the world has changed, so has the way we interact with computers. When UNIX was invented in the 1960s, the world was a very different place. Time-sharing, the concept of multiple users sharing a single computer, was the hot new thing and having a world-spanning connected system was a pipe dream. And while countless people have worked incredibly hard to adapt the old systems to the new world, it is clear that the old systems are not up to the task.

In today's massively interconnected world, where security is paramount, maybe, just maybe, there is an opportunity for a new OS to rethink how we can build secure, scalable and understandable systems for the 21st century.

How?

k23 is built around the idea of using WebAssembly as the primary execution environment. This allows for a number of benefits:

  • Security: WebAssembly is designed to run in a sandboxed environment, making it much harder to exploit.
  • Modularity: WebAssembly modules can depend on each other, importing and exporting functionality and data, forming a modular system where dependency management is a first class citizen.
  • Portability: WebAssembly is designed to be very portable. Forget questions like "is this binary compiled for amd64 or arm?". k23 programs just run wherever.
  • Static Analysis: WebAssembly is famous for being very easy to analyze. This means we can check for bad programs without even running them.

k23 also uses a microkernel architecture where only the most core kernel functionality and the WASM runtime run in privileged mode. Everything else is implemented as a WebAssembly module, running in a strongly sandboxed environment.

The JIT compiler

The core thesis of k23 is that by directly integrating the compiler into the kernel, the two enter into a symbiotic relationship: the kernel's knowledge of the physical machine can inform specific optimizations in the compiler, and the compiler's total knowledge of all programs running on the system can inform various speedups in the kernel. Cool stuff that only becomes possible because of this:

  • Zero-cost IPC calls: By leveraging the total knowledge of all programs, the kernel can reduce the cost of IPC calls to almost the cost of regular function calls.
  • Machine specific optimizations: The kernel knows the exact capabilities of the machine, of each core and much more. Being tightly integrated allows these details to feed into compiler optimization passes.
  • Program aware scheduling: The compiler collects information about each program, such as instruction use and possibly hot loops. This information can be fed back into the scheduler to allow it to make more informed decisions, like choosing between performance cores and efficiency cores.
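
As a rough illustration of program aware scheduling, the sketch below maps a compiler-produced profile to a core class. The ProgramProfile fields, the CoreClass enum and the thresholds are all hypothetical and chosen only for this example; they are not k23's actual data structures or heuristics.

```rust
/// Hypothetical per-program summary that a compiler could collect.
#[derive(Debug, Clone, Copy)]
pub struct ProgramProfile {
    /// Number of loops the compiler guesses are hot.
    pub hot_loops: u32,
    /// Share of emitted instructions that are vector/SIMD, in percent.
    pub simd_percent: u8,
}

/// The two kinds of cores the scheduler can choose between.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CoreClass {
    Performance,
    Efficiency,
}

/// Picks a core class from the profile. The thresholds are made up.
pub fn pick_core_class(profile: ProgramProfile) -> CoreClass {
    if profile.hot_loops >= 4 || profile.simd_percent >= 25 {
        CoreClass::Performance
    } else {
        CoreClass::Efficiency
    }
}
```

A scheduler could call pick_core_class once when a program is loaded and revisit the decision as the profile is refined.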

k23 uses cranelift as its JIT compiler backend.

How to Build and Run k23

Prerequisites:

The following tools are required to build and run k23:

  • Rust - k23 is written entirely in Rust
  • just - Just is the simple command runner that k23 uses
  • QEMU - QEMU is used to run the kernel in a virtual machine
  • Nix OPTIONAL - Nix is used to manage the development environment

Running

Type just to see the available actions to run. The one you are probably looking for is just run-riscv64, which will build k23 for riscv64 and run it inside QEMU. Note that this currently just runs a few basic tests and then exits. Other actions include:

  • just preflight which will run all lints and checks
  • just run which will run k23 in QEMU

Debugging k23

The rest of this guide assumes you are using LLDB, but the same principles apply to GDB and "command translation guides" are available online.

Attaching to the Kernel

You can run the kernel with the --debug or --dbg (or --gdb for typos) flag to start the kernel in a paused state. You can then launch and attach to the kernel with LLDB using the following commands:

rust-lldb target/riscv64gc-unknown-k23-kernel/debug/kernel

# In LLDB
gdb-remote localhost:1234

Catching Panics

Quite often, you will need to stop the kernel when a panic occurs, to inspect the state of the system. For this you can set a breakpoint on the rust_panic symbol, which is a special unmangled function for exactly this purpose (this technique mirrors Rust's std library and is implemented in the kstd crate).

Using LLDB you can set a breakpoint with the following command:

b rust_panic

and then use e.g. the bt command to print a backtrace.

Pretty Printing

To make debugging easier, you can add pretty printers for the vmm::PhysicalAddress and vmm::VirtualAddress types. This can be done through the following commands in LLDB:

type summary add --summary-string "vmm::PhysicalAddress(${var.0%x})" vmm::PhysicalAddress
type summary add --summary-string "vmm::VirtualAddress(${var.0%x})" vmm::VirtualAddress
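
The ${var.0%x} part of these summaries formats field 0 of the value as hexadecimal, which fits address types that are tuple structs wrapping a single integer. The definition below is an assumed shape for illustration, not the actual vmm crate source; its Debug impl prints text similar to what the LLDB summary shows.

```rust
use core::fmt;

/// Assumed shape: a newtype over a machine word; field `0` is the raw address.
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct PhysicalAddress(pub usize);

impl fmt::Debug for PhysicalAddress {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Same layout as the summary string: type name, then field 0 in hex.
        write!(f, "vmm::PhysicalAddress({:#x})", self.0)
    }
}
```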

Overview of k23's Architecture

k23 has three main components: the bootloader, which is responsible for loading the kernel; the kernel itself, which is the main operating system; and the WASM runtime, which is responsible for running WebAssembly programs. The last two components are highly intertwined by design.

Bootloader

The bootloader is responsible for loading the kernel, verifying its integrity, decompressing it and setting up the necessary environment. That means collecting early information about the system, setting up the stack for each hart, setting up the page tables, and finally jumping to the kernel's entry point.

The bootloader has to be generic over the payloads it accepts, since the kernel is not the only thing that can be loaded. When running tests, each test is compiled as a separate binary and run in a separate VM. The bootloader has to be able to load these binaries as well.

For this, payloads can declare their entry points and a few options through the loader_api crate's #[entry] macro. The bootloader then uses this information to set up the environment for the payload. This macro also enforces a type signature for the entry point, which means that payloads can completely forgo the usual assembly trampolines and just declare a Rust function as their entry point.
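
One way to enforce an entry-point signature at compile time is to coerce the function to a fixed function-pointer type. The declarative macro below is a simplified stand-in for that idea, assuming a hypothetical BootInfo type and EntryFn signature; the real loader_api #[entry] is an attribute macro whose exact signature and options are not shown here.

```rust
/// Hypothetical boot information passed to a payload.
pub struct BootInfo {
    pub hart_id: usize,
}

/// Signature every payload entry point must match (assumed for illustration).
pub type EntryFn = fn(&'static BootInfo) -> !;

/// Registers `$f` as the entry point. The typed static fails to compile
/// if `$f` does not have exactly the `EntryFn` signature.
macro_rules! entry {
    ($f:path) => {
        pub static PAYLOAD_ENTRY: EntryFn = $f;
    };
}

fn kmain(info: &'static BootInfo) -> ! {
    // A real payload would initialize itself here; the sketch just parks.
    let _ = info.hart_id;
    loop {
        core::hint::spin_loop();
    }
}

entry!(kmain);
```

Changing kmain to take no arguments, or to return (), turns the entry!(kmain) line into a type error, which is the kind of check that makes a hand-written assembly trampoline unnecessary.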

Kernel

The kernel is relatively minimal at the moment, and as a microkernel will likely stay that way. Much of the kernel's functionality, such as memory management and syscalls, is implemented in the runtime. This leaves only the most basic functions in the kernel, such as interrupt handling, physical memory management and the like.

WASM Runtime

The WASM runtime is the heart of k23; it is responsible for running WebAssembly programs. It is not a standalone crate, but is implemented as part of the kernel since it is so core to the system. The runtime uses the wasmparser and cranelift crates to parse and compile the WASM programs.
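
Before any parsing or compilation, a runtime can cheaply reject inputs that are not WASM modules at all. The sketch below checks the 8-byte module preamble (the "\0asm" magic followed by version 1) using only the standard library. The function name is made up for this example, and it only illustrates the very first step of what a parser such as wasmparser does.

```rust
/// The 4-byte magic number that starts every WebAssembly binary: "\0asm".
const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6d];
/// Binary format version 1, encoded little-endian.
const WASM_VERSION: [u8; 4] = [0x01, 0x00, 0x00, 0x00];

/// Returns true if `bytes` begins with a valid WASM module preamble.
/// Hypothetical helper: it checks the header only, not the module body.
pub fn has_wasm_preamble(bytes: &[u8]) -> bool {
    bytes.len() >= 8 && bytes[..4] == WASM_MAGIC && bytes[4..8] == WASM_VERSION
}
```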

Currently, the runtime is quite simple; it only supports the most basic WASM instructions and features.

TODO this section will expand with more info.

Boot Flow

k23 uses a two-stage boot flow, with a smaller bootloader stage before the actual kernel. This bootloader is responsible for setting up the environment, including virtual memory mapping, TLS (thread local storage) and more.

Loader Boot Flow

Kernel Boot Flow

Address Space Layout Randomization

Address-space layout randomization is a security technique that, as the name implies, randomizes the placement of various objects in virtual memory. This well-known technique defends against attacks such as return-oriented programming (ROP), where an attacker chains together instruction sequences of legitimate programs (called "gadgets") to achieve privilege escalation.

Randomizing the placement of objects makes these techniques much harder since now an attacker has to correctly guess the address from a potentially huge number of possibilities.

KASLR

k23 randomizes the location of the kernel, stacks, TLS regions and heap at boot time.

ASLR in k23

k23 implements more advanced userspace ASLR than other operating systems. It not only randomizes the placement of WASM executable code, tables, globals, and memories, but also the location of individual WASM functions at each program startup (a similar technique, function-grained kernel address space layout randomization (FGKASLR), exists for the Linux kernel).
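
Function-level randomization can be pictured as shuffling the order in which compiled functions are laid out before assigning them addresses. The sketch below does this with a Fisher-Yates shuffle driven by a small xorshift generator. The generator, the layout_functions name and the back-to-back placement are assumptions for illustration; a real implementation would draw from a proper entropy source and handle alignment.

```rust
/// Tiny xorshift64 generator. Illustrative only, not cryptographically secure.
/// The seed must be non-zero.
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Given the size in bytes of each function, returns the offset assigned to
/// each function (indexed like `sizes`) after shuffling the layout order.
pub fn layout_functions(sizes: &[u64], rng: &mut XorShift64) -> Vec<u64> {
    // Start from the identity order, then Fisher-Yates shuffle it.
    // The slight modulo bias is ignored in this sketch.
    let mut order: Vec<usize> = (0..sizes.len()).collect();
    for i in (1..order.len()).rev() {
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        order.swap(i, j);
    }

    // Place the functions back to back in the shuffled order.
    let mut offsets = vec![0u64; sizes.len()];
    let mut cursor = 0u64;
    for &func in &order {
        offsets[func] = cursor;
        cursor += sizes[func];
    }
    offsets
}
```

Seeding the generator differently at each program startup yields a different function order each time, so an address leaked from one run says little about the next.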

TODO explain more in detail

ASLR Entropy

Entropy determines how "spread out" allocations are in the address space: higher values mean a sparser address space. This is configured through the entropy_bits option (TODO).

Ideally the number would be as high as possible, since more entropy makes ASLR harder to defeat. However, a sparser address space requires more memory for page tables, and a higher entropy value means allocating virtual memory takes longer (the function that searches for free gaps misses more often). The maximum entropy value also depends on the target architecture and chosen memory mode:
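
To make the trade-off concrete, the sketch below picks a randomized, page-aligned address using a given number of entropy bits. The function name, the 4 KiB page size and the way random bits are supplied are assumptions for this example; the search for a free gap that a real allocator has to perform is omitted.

```rust
/// Page size assumed for the example (4 KiB).
pub const PAGE_SIZE: u64 = 4096;

/// Returns a page-aligned address in
/// `[region_start, region_start + 2^entropy_bits * PAGE_SIZE)`,
/// selected by the low `entropy_bits` bits of `random`.
pub fn randomized_base(region_start: u64, entropy_bits: u32, random: u64) -> u64 {
    assert!(entropy_bits < 52, "slot index times page size must fit in 64 bits");
    assert!(region_start % PAGE_SIZE == 0, "region must start on a page boundary");
    // Keep only the requested number of random bits to select a page slot.
    let slot = random & ((1u64 << entropy_bits) - 1);
    region_start + slot * PAGE_SIZE
}
```

Each additional entropy bit doubles the number of candidate slots an attacker has to guess among, and also doubles the span of address space the allocations are spread over.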

Architecture          Virtual Address Usable Bits   Max Entropy Bits
Riscv32 Sv32          32                            19
Riscv64 Sv39          39                            26
Riscv64 Sv48          48                            35
Riscv64 Sv57          57                            44
x86_64                48                            35
aarch64 3 TLB lvls    39                            26
aarch64 4 TLB lvls    48                            35
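
Every row of the table satisfies the same relation: the maximum entropy is the number of usable virtual address bits minus 13. A plausible reading, which is an assumption and not something stated by k23, is that 12 bits are the offset within a 4 KiB page and one more bit is lost to the split into two address-space halves. The helper below encodes the relation, with the table rows copied alongside it for checking.

```rust
/// Bits subtracted from the usable virtual address bits in every table row.
/// Reading them as 12 page-offset bits plus 1 half-selection bit is an
/// assumption made for this example.
const NON_RANDOM_BITS: u32 = 13;

/// Maximum entropy bits for a given number of usable virtual address bits.
pub const fn max_entropy_bits(usable_va_bits: u32) -> u32 {
    usable_va_bits - NON_RANDOM_BITS
}

/// (memory mode, usable VA bits, max entropy bits) as listed in the table.
pub const ENTROPY_TABLE: [(&str, u32, u32); 7] = [
    ("Riscv32 Sv32", 32, 19),
    ("Riscv64 Sv39", 39, 26),
    ("Riscv64 Sv48", 48, 35),
    ("Riscv64 Sv57", 57, 44),
    ("x86_64", 48, 35),
    ("aarch64 3 TLB lvls", 39, 26),
    ("aarch64 4 TLB lvls", 48, 35),
];
```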

In conclusion, the best value for entropy_bits depends on a lot of factors and should be tuned for best results trading off sparseness and runtime complexity for better security.

Note also that for e.g. Riscv64 Sv57 it might not even be desirable to use all 44 bits of available entropy since the address space itself is already huge and performance might degrade too much.

RISC-V

This section describes RISC-V specific details of k23.

Virtual Memory Layout on RISC-V

This page outlines the virtual memory layout used by k23 depending on the selected memory mode. Currently supported memory modes are Riscv64Sv39, Riscv64Sv48 and Riscv64Sv57. Note that addresses marked as <dynamic> are not fixed and depend on the number of harts (hardware threads) in the system.

The code implementing this memory layout can be found in loader/src/mapping.rs.
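
The user-space and kernel-space boundaries in the tables below follow directly from the number of virtual address bits N: an address is canonical when all bits above bit N-1 are copies of bit N-1. The sketch computes both boundaries for an N-bit mode. The function names are made up for this example and are not taken from loader/src/mapping.rs.

```rust
/// Highest address of the lower (user-space) half for a `va_bits`-bit mode.
pub const fn user_space_end(va_bits: u32) -> u64 {
    (1u64 << (va_bits - 1)) - 1
}

/// Lowest address of the upper (kernel-space) half for the same mode.
pub const fn kernel_space_start(va_bits: u32) -> u64 {
    !user_space_end(va_bits)
}

/// True if `addr` is canonical, i.e. lies outside the hole between the halves.
pub const fn is_canonical(addr: u64, va_bits: u32) -> bool {
    addr <= user_space_end(va_bits) || addr >= kernel_space_start(va_bits)
}
```

For example, user_space_end(39) is 0x0000003fffffffff and kernel_space_start(39) is 0xffffffc000000000, matching the first and third rows of the Sv39 table.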

Sv39

Address Range                               Size        Description
0x0000000000000000..=0x0000003fffffffff     256 GB      user-space virtual memory
0x0000004000000000..=0xffffffbfffffffff     ~16K PB     hole of non-canonical virtual memory addresses
-- kernel-space virtual memory --
0xffffffc000000000..=<dynamic>              ~96 GB      unused
<dynamic>..=<dynamic>                       <dynamic>   kernel stacks
<dynamic>..=0xffffffd7ffffffff              <dynamic>   kernel TLS (thread local storage)
0xffffffd800000000..=0xffffffe080000000     124 GB      direct mapping of all physical memory (PHYS_OFFSET)
0xffffffff80000000..=0xffffffffffffffff     2 GB        kernel (KERN_OFFSET)

Sv48

Address Range                               Size        Description
0x0000000000000000..=0x00007fffffffffff     128 TB      user-space virtual memory
0x0000800000000000..=0xffff7fffffffffff     ~16K PB     hole of non-canonical virtual memory addresses
-- kernel-space virtual memory --
0xffff800000000000..=0xffffbfff7ffefffe     ~64 TB      unused
<dynamic>..=<dynamic>                       <dynamic>   kernel stacks
<dynamic>..=0xffffbfff7fffffff              <dynamic>   kernel TLS (thread local storage)
0xffffbfff80000000..=0xffffffff7fffffff     64 TB       direct mapping of all physical memory (PHYS_OFFSET)
0xffffffff80000000..=0xffffffffffffffff     2 GB        kernel (KERN_OFFSET)

Sv57

Address Range                               Size        Description
0x0000000000000000..=0x00ffffffffffffff     64 PB       user-space virtual memory
0x0100000000000000..=0xfeffffffffffffff     ~16K PB     hole of non-canonical virtual memory addresses
-- kernel-space virtual memory --
0xff00000000000000..=0xff7fffff7ffefffe     ~32 PB      unused
<dynamic>..=<dynamic>                       <dynamic>   kernel stacks
<dynamic>..=0xff7fffff7fffffff              <dynamic>   kernel TLS (thread local storage)
0xff7fffff80000000..=0xffffffff7fffffff     32 PB       direct mapping of all physical memory (PHYS_OFFSET)
0xffffffff80000000..=0xffffffffffffffff     2 GB        kernel (KERN_OFFSET)

Supported WASM Features & Proposals

This page documents all the WASM features, APIs and proposals that k23 supports. This list will be revised as time progresses and features are implemented.

Standardized Features

These features have been adopted into the WebAssembly standard and k23 aims to support all applicable features.

Proposals

These features are proposals for the WebAssembly standard. Many proposals change very frequently and support for them will range from limited to non-existent. Additionally some proposals may not be applicable to k23.

Explainer

  • ✅: Implemented
  • ❌: Not Implemented
  • ?: The applicability of this feature is unclear, e.g. due to the lack of a detailed proposal.
  • N/A: Not Applicable

WASI Features & Proposals

In addition to the main WASM features, k23 applications will interact with the host through WASI (WebAssembly System Interface) APIs. The following table lists all current proposals and their implementation status.

Explainer

  • ✅: Implemented
  • ❌: Not Implemented
  • ?: The applicability of this feature is unclear, e.g. due to the lack of a detailed proposal.
  • N/A: Not Applicable