k23 - Experimental WASM Operating System
Welcome to the official k23 manual! This manual will guide you through the installation and usage of k23, an experimental WASM microkernel operating system. GitHub repo
Presentations
k23 @ RustNL 2024
k23 & Wasm in Operating Systems @ WasmIO 2025
k23 & Rust Operating Systems Development @ Rust Dortmund Meetup
What is k23?
k23 is an active research project exploring a more secure, modular, and easy to develop for operating system by using WebAssembly as the primary execution environment. The project is still in its early stages and is not yet ready for production use.
Why?
As the world has changed, so has the way we interact with computers. When UNIX was invented in the 1960s, the world was a very different place. Time-sharing, the concept of multiple users sharing a single computer, was the hot new thing, and having a world-spanning connected system was a pipe dream. And while countless people have worked incredibly hard to adapt the old systems to the new world, it is clear that the old systems are not up to the task.
In today's massively interconnected world, where security is paramount, maybe, just maybe, there is an opportunity for a new OS to rethink how we can build secure, scalable, and understandable systems for the 21st century.
How?
k23 is built around the idea of using WebAssembly as the primary execution environment. This allows for a number of benefits:
- Security: WebAssembly is designed to run in a sandboxed environment, making it much harder to exploit.
- Modularity: WebAssembly modules can depend on each other, importing and exporting functionality and data, forming a modular system where dependency management is a first class citizen.
- Portability: WebAssembly is designed to be very portable. Forget questions like “is this binary compiled for amd64 or arm?”. k23 programs just run wherever.
- Static Analysis: WebAssembly is famous for being very easy to analyze. This means we can check for bad programs without even running them.
k23 also uses a microkernel architecture where only the most core kernel functionality and the WASM runtime run in privileged mode. Everything else is implemented as a WebAssembly module, running in a strongly sandboxed environment.
The JIT compiler
The core thesis of k23 is that by directly integrating the compiler into the kernel, the two enter into a symbiotic relationship: the kernel's knowledge of the physical machine can inform specific optimizations in the compiler, and the compiler's total knowledge of all programs running on the system can inform various speedups in the kernel. Cool stuff that only becomes possible because of this includes:
- Zero-cost IPC calls. By leveraging the total knowledge of all programs the kernel can reduce the cost of IPC calls to almost the cost of regular function calls.
- Machine specific optimizations The kernel knows the exact capabilities of the machine, of each core, and much more. Being tightly integrated allows these details to feed into compiler optimization passes.
- Program aware scheduling The compiler collects information about each program such as instruction use, information about possibly hot loops etc. This information can be fed back into the scheduler to allow it to make more informed decisions, like using performance cores vs efficiency cores.
k23 uses cranelift as its JIT compiler backend.
How to Build and Run k23
Prerequisites
Building k23 needs a lot of tools (rustc, buck2, qemu, and more) at specific versions[^1]. Rather than ask you to chase all of these down by hand, we lean on Nix to pin them all.
This makes Nix the one thing you do have to install yourself. Grab it via the Determinate Installer or upstream Nix, and enable flakes (set `experimental-features = nix-command flakes` in `~/.config/nix/nix.conf`; the Determinate Installer does this for you).
Linux and macOS are supported on x86_64 and aarch64; on Windows you’ll want to develop from inside WSL2 as Nix doesn’t run natively there.
Entering the dev shell
With Nix installed, running `nix develop -c $SHELL` drops you into a shell with every required tool in PATH[^2].
Running
Inside the dev shell, run `just run //sys:k23-qemu-riscv64`, which builds k23 for riscv64 and boots it under QEMU.
Type just (no args) to list every recipe available. This includes convenient recipes for running linters, tests and more. Every recipe is a thin wrapper around buck2 (just run //sys:k23-qemu-riscv64 is roughly buck2 run //sys:k23-qemu-riscv64).
[^1]: We've had issues with wildly outdated QEMU versions in Linux package repositories, for example.

[^2]: The `-c $SHELL` part instructs Nix to use your current shell binary; otherwise it defaults to bash. Yuck.
IDE Setup with rust-analyzer
Since we use buck2 to build k23 instead of Cargo, rust-analyzer has nothing to work with by default.
buck2 ships a companion tool called rust-project that walks the buck2 target graph and emits a rust-project.json file that rust-analyzer will automatically load. We provide a convenient just rust-project command for generating this file.
You should run this command periodically, at least whenever you add or remove a crate or third-party dependency. It should be relatively easy to notice when this is needed, though: when autocompletion breaks, re-running `just rust-project` is in order.
All IDEs using rust-analyzer should pick up on this file automatically. If not, raise an issue please.
Debugging k23
Note: the debugging story for k23 is very much a work in progress. The flow described below works, but it is rough around the edges and the steps are still mostly manual. Improvements — better launch ergonomics, pretty printers, an LLDB/GDB init script bundled with the repo — are very welcome; if you’d like to help, please reach out.
The rest of this guide assumes you are using LLDB, but the same principles apply to GDB and “command translation guides” are available online.
Debug logging
The kernel uses the `tracing` crate to produce the kernel debuglog (as well as span information).
In order to emit messages to this debuglog you should use the following macros:
```rust
fn function() {
    tracing::trace!("Trace");
    tracing::debug!("Debug");
    tracing::info!("Info");
    tracing::warn!("Warn");
    tracing::error!("Error");
}
```
Note that the `log` macros work as well, but that support only exists to capture output from 3rd-party crates; kernel code should generally use `tracing`.
At the moment, the debuglog is printed to the semihosting stdout.
Filtering
By default, the debuglog will only print messages of severity DEBUG and higher (i.e. DEBUG, INFO, WARN, and ERROR),
but this can be filtered and configured using the same syntax as tracing's `EnvFilter`,
by passing the `log` boot argument.
For example, to enable all levels you can pass this directive in the log boot argument:
just run //sys:k23-qemu-riscv64 -- --append "log=trace"
A more reasonable configuration that omits the quite verbose output from cranelift but otherwise keeps the trace logging:
just run //sys:k23-qemu-riscv64 -- --append "log=trace,cranelift_codegen=off"
Attaching to the Kernel
There is no convenience flag for this yet — you wire it up by hand using QEMU’s gdbstub. Forward -s -S to QEMU to
have it expose a gdb server on localhost:1234 and halt the CPU at startup:
just run //sys:k23-qemu-riscv64 -- -s -S
The equivalent buck2 invocation is buck2 run //sys:k23-qemu-riscv64 -- -s -S.
In a second terminal, ask Buck2 for the path to the freshly built kernel ELF and launch LLDB against it:
```shell
rust-lldb "$(buck2 build --show-output //sys/kernel:kernel | awk '{print $2}')"
# In LLDB
gdb-remote localhost:1234
```
Catching Panics
Quite often you will need to stop the kernel when a panic occurs, to inspect the state of the system. For this you can set a breakpoint on the `rust_panic` symbol, which is a special unmangled function that exists for exactly this purpose (this technique mirrors Rust's std library and is implemented in the panic-unwind crate here).
Using LLDB you can set a breakpoint with the following command:
b rust_panic
and then use e.g. the bt command to print a backtrace.
Pretty Printing
To make debugging easier, you can add pretty printers for the `mem_core::PhysicalAddress` and `mem_core::VirtualAddress` types. This can be done through the following commands in LLDB:

```
type summary add --summary-string "mem_core::PhysicalAddress(${var.0%x})" mem_core::PhysicalAddress
type summary add --summary-string "mem_core::VirtualAddress(${var.0%x})" mem_core::VirtualAddress
```
Boot Arguments
Boot arguments configure various aspects of the kernel's behaviour. They are read from the
/chosen/bootargs property of the
flattened device tree that is passed to the kernel by the previous-stage bootloader.
The format is a simple key=value;key=value;.. list of semicolon-separated key-value pairs.
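As a concrete illustration, parsing that format can be sketched in a few lines of Rust (a standalone sketch, not the kernel's actual parser; `parse_bootargs` is a hypothetical name):

```rust
/// Hypothetical sketch: parse a bootargs string of the form
/// "key=value;key=value". Empty segments and malformed pairs are skipped.
fn parse_bootargs(raw: &str) -> impl Iterator<Item = (&str, &str)> + '_ {
    raw.split(';')
        .filter(|segment| !segment.is_empty())
        .filter_map(|pair| pair.split_once('='))
}

fn main() {
    for (key, value) in parse_bootargs("log=trace;backtrace=full") {
        println!("{key} = {value}");
    }
}
```

So `log=trace;backtrace=full` yields the two pairs `("log", "trace")` and `("backtrace", "full")`.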
log
Allows configuring the verbosity and filtering of debug messages.
# Enable the most verbose logging messages
just run //sys:k23-qemu-riscv64 -- --append "log=trace"
# A more reasonable configuration that keeps trace messages enabled, but silences the very spammy ones
just run //sys:k23-qemu-riscv64 -- --append "log=trace,cranelift_codegen=off,sharded_slab=off"
The underlying buck2 invocation is buck2 run //sys:k23-qemu-riscv64 -- --append "..."; everything after -- is forwarded to QEMU.
backtrace
Allows configuring the verbosity of kernel panic backtraces. There are two possible values: short (default) and full.
short will print an abridged backtrace that omits frames related to the unwinding and panic machinery itself.
# To print shorter panic backtraces (the default)
just run //sys:k23-qemu-riscv64 -- --append "backtrace=short"
# To print more verbose panic backtraces
just run //sys:k23-qemu-riscv64 -- --append "backtrace=full"
The buck2 build system
Why buck2
k23 is not a typical Rust project. We produce many different artifacts:
- The kernel, built for a custom Rust target and with from-source-rebuilt `core` and `alloc` crates
- The loader binary, built with different Rust flags and for a different Rust target
- The full disk image(s): a combination of both binaries, with an initial ramdisk and possibly drivers/apps
- Additionally, many different kinds of tests (unittest, fuzz, loom, wasm-spec, selftests, etc.) that all require different modes and apply only to subsets of libraries
Cargo’s build model (one --target, one profile, one feature resolution per workspace) is unfortunately not well equipped to handle this. k23 needs a build system that is flexible, can deal with the same source node appearing multiple times with different configuration, and where post-processing steps are easy to express.
That’s what buck2 gives us. Complex tooling (rust, c++, mdbook, qemu, python) is wired in as ordinary build rules. The build graph is hermetic and content-addressed. Buck2 can schedule much more optimal builds across the entire build graph and aggressively cache results along the way. Because buck2 isn’t Rust-specific, a full “image” (kernel + loader + ramdisk + apps) and QEMU runner can be declared as an elegant chaining of rules.
High-level components
Tree layout
```text
k23/
├── sys/          non-standalone subsystems (only make sense inside k23)
│   ├── loader/   bootloader binary
│   ├── kernel/   kernel binary
│   └── async/    kasync — the async runtime
├── lib/          standalone, potentially-publishable libraries
│                 (range-tree, wavltree, cpu-local, spin, fdt, …)
├── third-party/  reindeer-generated BUCK rules; the one source of truth
│                 for every non-first-party dep
├── tests/        wasm spec testsuite + handwritten .wast fixtures
├── platforms/    target platforms (riscv64, aarch64, x86_64) bundling
│                 constraint values
├── manual/       the mdbook you are reading (//manual:manual)
├── build/        the buck2 build infrastructure itself (see below)
├── fuzz/         running corpus (gitignored) + committed crash repros
│                 in fuzz/artifacts/
├── bench/        criterion baselines; gitignored; cached on main in CI
└── buck-out/     buck2's everything; gitignored; `buck2 clean` clears it
```
Build infrastructure (build/)
Everything that defines how k23 is built (as opposed to what gets built) lives in build/:
```text
build/
├── BUCK             declares kcfg options, target JSON, and the named
│                    transitions (loader, kernel, rust_bootstrap, fuzz)
├── constraints/     constraint enums (opt-level, debuginfo, strip,
│                    rust-std, env, sanitizer)
├── toolchains/      toolchain rules (rust, cxx, qemu, mdbook, python, …)
│                    plus flake.bzl, which exposes nix-flake packages
├── targets/         Rust target-spec JSON files
├── transitions.bzl  the generic configuration-transition rule
├── kcfg.bzl         typed buckconfig wrapper + kcfg_docs rule that
│                    auto-generates the config reference in this manual
├── qemu.bzl         qemu_binary — wraps a kernel ELF into a QEMU command
├── bench.bzl        rust_benchmark macro (criterion)
└── fuzz.bzl         rust_fuzz macro (libfuzzer + persistent corpus)
```
Cargo to buck2 cheat sheet
| Cargo Command | k23 Equivalent |
|---|---|
| `cargo check` / `cargo build` | `just check` |
| `cargo build -p foo` | `just check //lib/foo:foo` |
| `cargo test` / `cargo test -p foo` | `just unittests` / `just unittests //lib/foo:foo` |
| `cargo bench` | `just benchmark` |
| `cargo fuzz run target` | `just fuzz` |
| `cargo clippy` | `just clippy` |
| `cargo fmt --check` / `cargo fmt` | `just check-fmt` / `just fmt` |
| `cargo doc` | `just doc` |
| edit `Cargo.toml` / `cargo update` | edit `third-party/Cargo.toml`, then `just buckify` |
| `[package]` | `rust_library` / `rust_binary` in a `BUCK` file |
| `[dependencies]` | `deps = [...]` attribute of a `rust_library`/`rust_binary` rule |
| `[features]` | `features = [...]` attribute of a `rust_library`/`rust_binary` rule |
| profiles (debug/release) | constraints (opt-level/debuginfo/strip) + named modifier aliases in `PACKAGE` |
| `--target=…` | constraints (`prelude//cpu:riscv64`, …) bundled into a platform under `platforms/` |
| `RUSTFLAGS=-Cfoo` | `rustc_flags = ["-Cfoo"]` attribute of a `rust_library`/`rust_binary` rule |
| `cfg(test)` | a dedicated `rust_test` target; can carry its own deps |
| `cfg(loom)` | `rust_test` with `rustc_flags = ["--cfg=loom"]` and `labels = ["loom"]` |
Adding a Crate
This document guides you through adding a new crate to the project.
Decide where it lives
The first step is deciding where the crate should live:
- `lib/` for standalone libraries that could plausibly be useful outside of k23
- `sys/` for subsystem crates that only make sense as part of k23 (e.g. kernel subsystems like the virtual memory subsystem)
- `build/` for tools that run as part of the build process (e.g. disk image creation). These should generally only be simple tools; any complicated logic likely belongs in `lib/`
If you’re unsure about where to put a crate, default to lib/.
Crate Layout
Crates generally look the same under buck2 as they do under Cargo: a src folder containing your Rust code, with a src/lib.rs or src/main.rs entrypoint. The biggest difference is the BUCK file (written in Starlark): it is our equivalent of Cargo.toml and where you declare all the crate's metadata to the build system.
```python
# declare the crate so the build system knows about it
rust_library(
    # the name of the crate. rust code imports from this name.
    # the convention is to match the crate dir
    name = "mycrate",
    # buck2 requires you to explicitly declare all source files
    srcs = glob(["**/*.rs"]),
    # and dependencies
    deps = [
        "//lib/util:util",
        "//third-party:cfg-if",
    ],
    # we also require you to explicitly list which targets provide tests
    # for this crate (see below)
    tests = [":mycrate_unittests"],
    # mark this crate as visible to others in this project (so we can depend on it)
    visibility = ["PUBLIC"],
)

# make the unit tests in this crate visible to buck2 as well.
# without this, `just unittests` won't run the unit tests for this crate
rust_test(
    name = "mycrate_unittests",
    srcs = glob(["**/*.rs"]),
    deps = [
        "//lib/util:util",
        "//third-party:cfg-if",
        "//third-party:proptest",  # or whatever the tests need
    ],
    visibility = ["PUBLIC"],
)
```
Of course, files like README.md or CHANGELOG.md also belong in the crate directory.
Depending on your crate
To pull your crate into a consumer, simply add your crate's buck path to the consumer's `deps` array:

```python
deps = [
    "//lib/mycrate:mycrate",
    ...
]
```
Verify your changes
- Check your Rust code by running `just check //lib/mycrate:mycrate`. This is the equivalent of running `cargo check -p mycrate`.
- Run the new tests you added by running `just unittests //lib/mycrate:mycrate`.
- `just preflight` will run as much of the full CI suite as possible locally. Run this before you push! You can also run the full suite for just your crate by running `just preflight //lib/mycrate:mycrate`.
Conventions & Tips
If your crate has architecture-specific dependencies, you can gate them using select():
```python
# and dependencies
deps = [
    "//lib/util:util",
    "//third-party:cfg-if",
] + select({
    # if the riscv64 constraint matches, add the riscv dependency
    "prelude//cpu/constraints:riscv64": ["//lib/riscv:riscv"],
    # otherwise nothing
    "DEFAULT": [],
})
```
If your crate has special features depending on whether it's used in the kernel or the loader (this tends to happen sometimes), you can also use select():
```python
features = select({
    # when running inside the kernel, thread-locals are available, so let's use them
    "constraints//:env[kernel]": ["thread-local"],
    # otherwise we use some fallback mechanism
    "DEFAULT": [],
})
```
Removing a crate
When you remove a crate, simply delete its directory and remove the crate from any consumer's `deps`. You can use the following buck2 query command to list all direct dependents of your crate: `buck2 uquery "rdeps(//..., //lib/mycrate:mycrate, 1)"`.
You will also want to regenerate the rust-project.json file by running just rust-project so your rust-analyzer suggestions are up-to-date.
Adding Tests
k23 has quite a number of test flavors, each with its own use case.
Unit tests (rust_test)
The most straightforward and common kind of test. These are just regular Rust unit tests that you write either inside the modules directly or in test/ files, using regular Rust `#[test]`-annotated functions:
```rust
#[test]
fn foo() {
    assert!(true);
}
```
You declare them in the crate’s BUCK file with the rust_test rule:
```python
rust_test(
    name = "mycrate_unittests",
    srcs = glob(["**/*.rs"]),
    deps = [...],
    visibility = ["PUBLIC"],
)
```
Lastly, don't forget to add the test to the crate's `tests` array! The test runner will not pick up on your tests otherwise!

Run the tests with `just unittests` or `just unittests //lib/mycrate:mycrate`.
`just miri` will automatically run the tests under miri as well.
Loom tests (concurrency model checking)
Loom is a very useful tool for checking concurrent and asynchronous code. It will explore many possible concurrent executions of your code to find deadlocks, panics, race conditions and more. If your crate touches anything concurrency related, you must absolutely add loom tests.
You declare them using the rust_loom_test rule:
```python
rust_loom_test(
    name = "mycrate_loom_tests",
    srcs = glob(["**/*.rs"]),
    deps = [..., "//third-party:loom"],
)
```
rust_loom_test automatically sets the correct compiler flags (--cfg=loom and others) and makes the loom tests visible to the build system. Run the loom tests with `just loom //lib/mycrate:mycrate`.
See lib/spin/BUCK for a complete example.
Fuzz tests
Fuzz tests drive a function with random inputs to find non-obvious bugs. Any crate that deals with user input (especially parsers or data structures) should have a fuzz testing suite. Each fuzz target lives under <crate>/fuzz/<name>.rs.
Declare the target in the crate’s BUCK file using our rust_fuzz rule:
```python
load("//build:fuzz.bzl", "rust_fuzz")

rust_fuzz(
    name = "mycrate_fuzz",
    srcs = ["./fuzz/myfuzz.rs"],
    crate_root = "./fuzz/myfuzz.rs",
    deps = [
        ":mycrate",
        "//third-party:libfuzzer-sys",
        "//third-party:arbitrary",
    ],
    visibility = ["PUBLIC"],
)
```
The rust_fuzz rule automatically sets the correct compiler flags and makes the fuzz tests visible to the build system. Fuzz targets use libfuzzer-sys for the harness and typically derive structured inputs with arbitrary. Run the fuzz tests with `just fuzz //lib/mycrate:mycrate_fuzz`. You can pass arguments such as the max runtime through the named fuzz_args argument: `just fuzz_args='--test-arg=-max_total_time=60' fuzz //lib/mycrate:mycrate_fuzz`.
Fuzz tests produce two directories in the project root.
- `fuzz/corpus/` is the running corpus. It persists exploration state between runs and makes them more useful.
- `fuzz/artifacts/` holds crashes the fuzz test found. Commit these so that we run them as regression tests in the future. When CI finds a crash, copy the file from the uploaded `fuzz-artifacts` bundle into `fuzz/artifacts/<name>/` and commit it.
See lib/range-tree/fuzz/range_tree.rs and lib/range-tree/BUCK for a complete example.
Benchmarks
Benchmarks measure performance and catch regressions. You should probably add a benchmark to any library that is not only a build dependency. Each benchmark lives under <crate>/benches/<name>.rs and is written using criterion.
Declare the benchmark in the crate’s BUCK file using our rust_benchmark rule:
```python
load("//build:bench.bzl", "rust_benchmark")

rust_benchmark(
    name = "mycrate_benchmarks",
    srcs = ["./benches/whatever.rs"],
    crate_root = "./benches/whatever.rs",
    deps = [
        ":mycrate",
        "//third-party:criterion",
    ],
    visibility = ["PUBLIC"],
    target_compatible_with = [host_configuration.cpu, host_configuration.os],
)
```
The rust_benchmark rule sets compiler flags automatically (opt-level[3], debuginfo[line-tables-only], strip[debuginfo]) and makes the benchmark visible to the build system. Run it with just benchmark //lib/mycrate:mycrate_benchmarks.
Benchmarks produce a bench/ directory in the project root holding baselines and reports.
See lib/range-tree/benches/comparisons.rs and sys/async/benches/spawn.rs for complete examples.
Wasm tests
Wasm tests exercise the kernel’s WebAssembly engine end-to-end. They are all written in the wast language — a superset of the WebAssembly text format that adds assertions like assert_return and assert_trap for declaring expected outcomes. There are two sources of .wast files: small handwritten regression fixtures under tests/*.wast, and the upstream WebAssembly spec testsuite under tests/testsuite/ (a git submodule).
Wasm tests are registered through the wast_tests! macro in sys/kernel/src/tests/spectest.rs:
```rust
wast_tests!(
    fib "../../../tests/fib.wast",
    trap "../../../tests/trap.wast",
    // address "../../../tests/testsuite/address.wast",
    // ...
);
```
Each entry pairs a test name with a path to a .wast file. The macro generates a kernel test for each entry, so adding a new test is a matter of dropping the file into tests/ and listing it in the macro.
Run them via the kernel test harness with just run //sys:k23-qemu-riscv64.
See tests/fib.wast and tests/trap.wast for complete examples.
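For illustration, a minimal handwritten fixture could look like this (hypothetical contents, not the actual tests/fib.wast):

```wast
;; a small module under test
(module
  (func (export "add") (param i32 i32) (result i32)
    (i32.add (local.get 0) (local.get 1)))
  (func (export "div") (param i32 i32) (result i32)
    (i32.div_s (local.get 0) (local.get 1))))

;; assert an expected return value
(assert_return (invoke "add" (i32.const 1) (i32.const 2)) (i32.const 3))

;; assert that an invocation traps with the expected message
(assert_trap (invoke "div" (i32.const 1) (i32.const 0)) "integer divide by zero")
```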
Adding a Third-Party Dependency
Even though we use buck2 and not Cargo to build k23, we nonetheless use 3rd-party crates from crates.io. There are plenty of high-quality, legitimately useful libraries available, and we want to use them.

The tight integration between crates.io and Cargo (a good thing!) requires a bit of finagling, which we do using reindeer, maintained by Meta. It takes a Cargo.toml file, resolves all the crates, and generates a BUCK file that lets us reference these crates from throughout our buck2 project.
TL;DR
- add the crate to `third-party/Cargo.toml`
- run `reindeer buckify`
- depend on it from a first-party crate via `//third-party:<crate-name>`
- commit `third-party/Cargo.toml`, `third-party/Cargo.lock`, and `third-party/BUCK`
The reindeer-clean job will complain if the Cargo.toml and BUCK file are out of sync.
The fields you’ll touch
- `third-party/Cargo.toml` — the master manifest reindeer reads
  - `[dependencies]` for plain crates
  - `default-features = false` is the norm — most of our deps need to be `no_std`-friendly
  - `features = [...]` only for what you actually need
  - mark optional with `optional = true` if the crate is only pulled in by some downstream feature
- `third-party/Cargo.lock` — auto-managed; commit it as-is
- `third-party/BUCK` — generated, large, do not hand-edit
- `third-party/fixups/<crate>/fixups.toml` — optional per-crate overrides for the rare cases where reindeer needs hints (build script behavior, env vars, conditional features). Look at existing examples (`getrandom`, `rustix`, `serde`) before writing one
- `third-party/deny.toml` — license allowlist; cargo-deny CI checks against this
Adding a crate
1. Add to manifest

```toml
# third-party/Cargo.toml
[dependencies]
foo = { version = "0.4", default-features = false, features = ["bar"] }
```
If you're adding a dev or build dependency (that will be run on the host and needs access to std), you should EITHER mark it as optional = true and add it to the default feature, OR enable its std-requiring feature in the default feature.
2. Update Cargo.lock

```shell
reindeer update
```

This will update the Cargo.lock file, reusing the Cargo/crates.io resolution logic. The updated lockfile is required by the next step.
3. Regenerate buck rules

```shell
reindeer buckify
```

This will read the lockfile and generate the third-party/BUCK file. This file contains buck2 target declarations corresponding to the dependencies you added in step 1. Note that these buck2 targets fetch the libraries directly from crates.io, so you don't actually need Cargo installed at all.
4. Use it in a BUCK file

```python
deps = [
    "//third-party:foo",
    ...
]
```
You can then reference 3rd party libraries by their buck2 path. The name of the target is the same as in the Cargo.toml manifest.
5. Verify

To verify your changes are correct, you may run just check //path/to/consumer:target or just preflight //path/to/consumer:target to run all checks.
If cargo-deny complains about a 3rd-party crate's license, you may extend third-party/deny.toml if the license is MIT-compatible, or pick a different crate (it's probably best to discuss this with the maintainers and community first in any case).
Reindeer fixups
In some situations reindeer needs human help to correctly generate the dependency graph. These hints are called “fixups” and live in third-party/fixups/<crate>/fixups.toml files. You will need a fixup when:
- the crate has a required buildscript => `reindeer` does not build/run buildscripts by default. You have to explicitly opt in by setting `buildscript.run = true`
- the crate reads env vars set at compile time => you may need to declare them manually or set `cargo_env = true` for the "common" cargo env vars
- the crate has complex `cfg(...)` dependencies or features => you will need to spell them out; see the reindeer manual for help.
buck2-fixups is a community maintained list of fixups for common crates.io dependencies. It’s always worth a look.
See third-party/fixups/getrandom/fixups.toml and third-party/fixups/serde/fixups.toml for examples of complex fixups.
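Putting the first two cases together, a minimal fixup file might look like this (an illustrative sketch; check the existing fixups and the reindeer manual for the exact shape your crate needs):

```toml
# third-party/fixups/<crate>/fixups.toml (illustrative)

# opt in to building and running the crate's build script
buildscript.run = true

# expose the "common" cargo env vars (CARGO_PKG_VERSION etc.) at compile time
cargo_env = true
```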
Updating an existing dependency
Updating an existing dep is as easy as bumping its version in third-party/Cargo.toml and running reindeer update followed by reindeer buckify.
Removing a dependency
Removing a dependency means deleting it from third-party/Cargo.toml, deleting its corresponding third-party/fixups/<crate>/fixups.toml if present, and running reindeer buckify to synchronize the third-party/BUCK file.
Git dependencies
You should prefer crates.io releases, since git dependencies complicate and slow down the build. But if it's unavoidable,
pin a branch or rev in Cargo.toml and add the host to the third-party/deny.toml [sources] allow-git list.
For example, we currently pull in JonasKruckenberg/wasmtime (the cranelift no_std fork) as a git dependency.
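Sketched out, pinning a git dependency and allowlisting its host touches two files (the crate name and `rev` value here are illustrative placeholders, not the actual entries):

```toml
# third-party/Cargo.toml — pin to an exact revision
[dependencies]
wasmtime = { git = "https://github.com/JonasKruckenberg/wasmtime", rev = "<commit-sha>", default-features = false }

# third-party/deny.toml — allow fetching from this git host
[sources]
allow-git = ["https://github.com/JonasKruckenberg/wasmtime"]
```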
Overview of k23’s Architecture
k23 has three main components: the bootloader, which is responsible for loading the kernel; the kernel itself, which is the main operating system; and the WASM runtime, which is responsible for running WebAssembly programs. The last two components are highly intertwined by design.
Bootloader
The bootloader is responsible for loading the kernel, verifying its integrity, decompressing it and setting up the necessary environment. That means collecting early information about the system, setting up the stack for each hart, setting up the page tables, and finally jumping to the kernel’s entry point.
The bootloader has to be generic over the payloads it accepts, since the kernel is not the only thing that can be loaded. When running tests, each test is compiled as a separate binary and run in a separate VM. The bootloader has to be able to load these binaries as well.
For this, payloads can declare their entry points and a few options through the loader_api crate's #[entry] macro.
The bootloader then uses this information to set up the environment for the payload. This macro also enforces a type
signature for the entry point, which means that payloads can completely forgo the usual assembly trampolines and just
declare a Rust function as their entry point.
Kernel
The kernel is relatively minimal at the moment and, as a microkernel, will likely stay that way. Many of the kernel's functions, such as memory management, syscalls, etc., are implemented in the runtime. This leaves only the most basic functions in the kernel, such as interrupt handling, physical memory management, and the like.
WASM Runtime
The WASM runtime is the heart of k23; it is responsible for running WebAssembly programs. It is not a standalone crate,
but implemented as part of the kernel, since it is so core to the system. The runtime uses the wasmparser
and cranelift crates to parse and compile WASM programs.
Currently, the runtime is quite simple; it only supports the most basic WASM instructions and features.
TODO this section will expand with more info.
System Startup
Loader
k23 uses a two stage boot flow, with a smaller loader stage before the actual kernel. The loader is not a full bootloader, its only responsibility is mapping the kernel ELF file into virtual memory and jumping to it. This includes mapping the thread-local-storage (TLS) blocks declared by the kernel, as well as a stack for each detected CPU.
A compiled loader executable contains an inlined copy of the k23 kernel. It is essentially a self-extracting executable.
There are three primary reasons for splitting the early boot phase into a separate loader executable.
- Simplified kernel startup code. The kernel entrypoint is just a simple `extern "C" fn _start()` function which gets called by the loader. This lets us focus on the important kernel startup sequence, which is complex enough.
- Clean physmem to virtmem transition. When the loader starts, the MMU is disabled, and we're running in physical memory mode. During startup, we have to enable the MMU and switch into virtual memory mode; doing so, however, will invalidate all pointers into physical memory (unless we have identity-mapped that memory first!). This includes memory containing instructions or even the stack we use for function calls. Having one executable that deals primarily with physical memory (the loader) and one that only ever deals with virtual memory (the k23 kernel) makes it much harder to make stupid mistakes.
- Easy early boot cleanup. Usually boot code (and data) is only ever used during startup. This means that during runtime our kernel would be carrying around dead memory - memory which is taken up by our ELF file but never accessed! As you can see below, splitting this boot code into a separate executable lets us easily reclaim this memory for runtime usage.
When the system resets, it will jump to a fixed address in (physical) memory. The exact address depends on the CPU, but
in the diagram below it is 0x1234. The loader has been loaded at this address into memory by a previous stage
bootloader.
The loader will begin to set up the MMU page translation tables in preparation for the switch to virtual memory. This means identity mapping itself so it can continue to run after the switch, but more importantly mapping the kernel ELF file into the higher half of virtual memory where it expects to live. During this phase the loader will also map TLS (thread-local storage) blocks and stacks for each detected CPU.
At this point the loader has done its job and will hand off control to the k23 kernel. It does that by jumping to the
ELF entrypoint address it parsed during the mapping phase. The loader will pass along all information it has collected
about the system through the BootInfo struct.
Now that we’re inside the kernel, we need to reclaim the now-unused loader memory before moving on with startup. We remove the region from the MMU’s page translation tables and add the freed regions to our pool of available physical memory. We need to be very careful, though, not to add the kernel ELF’s physical memory to that pool: yes, it was part of the loader image, but it is still in use! Instead, we surgically reclaim all the memory around it.
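The surgical reclamation described above amounts to carving the kernel's physical range out of the loader region. A minimal sketch under that framing (not k23's actual implementation, which works on page-table and frame-allocator structures):

```rust
use core::ops::Range;

/// Subtract `keep` (the kernel ELF's physical range) from `region`
/// (the loader image), returning the surrounding pieces that are safe
/// to hand back to the physical-memory allocator.
/// Illustrative only; k23's real reclamation logic differs in detail.
fn reclaim_around(region: Range<usize>, keep: Range<usize>) -> Vec<Range<usize>> {
    let mut free = Vec::new();
    if keep.start > region.start {
        // Everything below the kernel image is free to reuse.
        free.push(region.start..keep.start.min(region.end));
    }
    if keep.end < region.end {
        // Everything above the kernel image is free to reuse.
        free.push(keep.end.max(region.start)..region.end);
    }
    free
}

fn main() {
    // Hypothetical layout: loader occupied 0x1000..0x9000,
    // kernel ELF lives at 0x4000..0x6000 and must stay mapped.
    let free = reclaim_around(0x1000..0x9000, 0x4000..0x6000);
    assert_eq!(free, vec![0x1000..0x4000, 0x6000..0x9000]);
    println!("reclaimed {} regions", free.len());
}
```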
Finally, we’re all up and running. The kernel is correctly mapped and the loader’s memory fully reclaimed. At this point we can move on with the system startup.
Kernel Startup
The k23 startup is split into three phases: Early, Main, and Late. Additionally, there is Per-CPU and Global initialization.
| Phase | Per-CPU | Global |
|---|---|---|
| Early | arch::per_cpu_init_early | Rng & DeviceTree init |
| Main | N/A | See below |
| Late | arch::per_cpu_init_late | See below |
During the early startup phase we mainly just set up the root random number generator, which is local to each CPU. We then call arch::per_cpu_init_early, which performs CPU-specific resets like setting the initial FPU state and enabling performance counters.
The main chunk of global initialization work happens during the Main phase. This includes setting up the tracing infrastructure, kernel heap allocator, and device tree, as well as the frame allocator and virtual memory subsystem. This phase also sets up the global state for the scheduler and the WebAssembly runtime’s global Engine state.
Lastly, during Late startup, arch::per_cpu_init_late sets up the exception handler, enables interrupts, and initializes the timer & IRQ drivers for each CPU.
At this point we are done with startup and enter the scheduling loop on each CPU.
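The Early → Main → Late ordering above can be sketched as a simple plan. The step labels below paraphrase the table and prose; they are illustrative, not the actual k23 call sequence:

```rust
/// Returns the startup steps in order for `cpu_count` CPUs, mirroring
/// the three phases described above. Step labels are illustrative.
fn startup_plan(cpu_count: usize) -> Vec<String> {
    let mut steps = Vec::new();
    // Early: per-CPU RNG and arch::per_cpu_init_early on every CPU.
    for cpu in 0..cpu_count {
        steps.push(format!("early: per_cpu_init_early(cpu{cpu}) + per-CPU rng"));
    }
    // Main: global initialization, performed once.
    steps.push("main: tracing, heap, devicetree, frame allocator, vmem, scheduler, wasm engine".into());
    // Late: arch::per_cpu_init_late on every CPU, then scheduling.
    for cpu in 0..cpu_count {
        steps.push(format!("late: per_cpu_init_late(cpu{cpu}) + timers & irq"));
    }
    steps
}

fn main() {
    for step in startup_plan(2) {
        println!("{step}");
    }
}
```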
Address Space Layout Randomization
Address-space layout randomization is a security technique that - as the name implies - randomizes the placement of various objects in virtual memory. This well-known technique defends against attacks such as return-oriented programming (ROP), where an attacker chains together instruction sequences from legitimate programs (called “gadgets”) to achieve privilege escalation.
Randomizing the placement of objects makes these techniques much harder since now an attacker has to correctly guess the address from a potentially huge number of possibilities.
KASLR
k23 randomizes the location of the kernel, stacks, TLS regions and heap at boot time.
ASLR in k23
k23 implements more advanced userspace ASLR than other operating systems: it not only randomizes the placement of WASM executable code, tables, globals, and memories, but also the location of individual WASM functions at each program startup (the Linux kernel uses a similar technique called function-granular kernel address space layout randomization, FGKASLR).
TODO explain more in detail
ASLR Entropy
Entropy determines how “spread out” allocations are in the address space: higher values mean a sparser address space. This is configured through the entropy_bits option (TODO).
Ideally this number would be as high as possible, since more entropy makes ASLR harder to defeat. However, a sparser address space requires more memory for page tables, and a higher entropy value means allocating virtual memory takes longer (more misses when searching for free gaps). The maximum entropy value also depends on the target architecture and the chosen memory mode:
| Architecture | Virtual Address Usable Bits | Max Entropy Bits |
|---|---|---|
| Riscv32 Sv32 | 32 | 19 |
| Riscv64 Sv39 | 39 | 26 |
| Riscv64 Sv48 | 48 | 35 |
| Riscv64 Sv57 | 57 | 44 |
| x86_64 | 48 | 35 |
| aarch64 3 TLB lvls | 39 | 26 |
| aarch64 4 TLB lvls | 48 | 35 |
In conclusion, the best value for entropy_bits depends on many factors and should be tuned to trade off sparseness and runtime overhead against better security.
Note also that for e.g. Riscv64 Sv57 it might not even be desirable to use all 44 bits of available entropy, since the address space is already huge and performance might degrade too much.
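Every row of the table above satisfies max entropy = usable virtual-address bits minus 13; that relationship is an observation from the table, not an official formula. A minimal sketch of how entropy bits can be mixed into a page-aligned base address (not k23's actual placement algorithm):

```rust
/// Max entropy per the table above: usable VA bits minus 13.
/// This constant offset is inferred from the table's rows.
fn max_entropy_bits(usable_va_bits: u32) -> u32 {
    usable_va_bits - 13
}

/// Mix `entropy_bits` of randomness into bits [12, 12 + entropy_bits)
/// of a page-aligned base address. A minimal sketch of ASLR-style
/// placement, assuming 4 KiB pages; not k23's actual algorithm.
fn randomize_base(base: usize, random: usize, entropy_bits: u32) -> usize {
    const PAGE_SHIFT: u32 = 12; // 4 KiB pages
    let mask = (1usize << entropy_bits) - 1;
    base + ((random & mask) << PAGE_SHIFT)
}

fn main() {
    assert_eq!(max_entropy_bits(39), 26); // Riscv64 Sv39
    assert_eq!(max_entropy_bits(48), 35); // Sv48 / x86_64 / aarch64 4 lvls
    let addr = randomize_base(0xffff_ffc0_0000_0000, 0xdead_beef, 26);
    assert_eq!(addr % 4096, 0); // the result stays page-aligned
    println!("{addr:#x}");
}
```

More entropy bits widen the mask, spreading allocations over a larger range, which is exactly the sparseness/security trade-off discussed above.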
RISC-V
This section describes RISC-V specific details of k23.
Virtual Memory Layout on RISC-V
This page outlines the virtual memory layout used by k23 depending on the selected memory mode.
Currently supported memory modes are Riscv64Sv39, Riscv64Sv48 and Riscv64Sv57.
Note that addresses marked as <dynamic> are not fixed and depend on the number of harts (hardware threads) in the
system.
The code implementing this memory layout can be found in sys/loader/src/mapping.rs.
Sv39
| Address Range | Size | Description |
|---|---|---|
| 0x0000000000000000..=0x0000003fffffffff | 256 GB | user-space virtual memory |
| 0x0000004000000000..=0xffffffbfffffffff | ~16K PB | hole of non-canonical virtual memory addresses |
| kernel-space virtual memory | ||
| 0xffffffc000000000..=<dynamic> | ~96 GB | unused |
| <dynamic>..=<dynamic> | <dynamic> | kernel stacks |
| <dynamic>..=0xffffffd7ffffffff | <dynamic> | kernel TLS (thread local storage) |
| 0xffffffd800000000..=0xffffffe080000000 | 124 GB | direct mapping of all physical memory (PHYS_OFFSET) |
| 0xffffffff80000000..=0xffffffffffffffff | 2 GB | kernel (KERN_OFFSET) |
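The user/hole/kernel split in the Sv39 table follows from sign extension: a 39-bit virtual address is canonical only if bits 63..=39 all equal bit 38. A small sketch of that check, using the boundary constants from the table above:

```rust
/// Sv39 canonicality check: bits 63..=39 must all equal bit 38
/// (sign extension); otherwise the address falls into the
/// non-canonical hole shown in the table above.
fn sv39_is_canonical(addr: u64) -> bool {
    // Shift out the upper 25 bits, then sign-extend from bit 38.
    let sign_extended = (((addr as i64) << 25) >> 25) as u64;
    sign_extended == addr
}

/// Kernel-space addresses start at 0xffffffc000000000 per the table.
fn sv39_is_kernel_space(addr: u64) -> bool {
    sv39_is_canonical(addr) && addr >= 0xffff_ffc0_0000_0000
}

fn main() {
    assert!(sv39_is_canonical(0x0000_003f_ffff_ffff)); // top of user space
    assert!(!sv39_is_canonical(0x0000_0040_0000_0000)); // first hole address
    assert!(sv39_is_kernel_space(0xffff_ffff_8000_0000)); // KERN_OFFSET
    assert!(!sv39_is_kernel_space(0x0000_0000_dead_beef)); // user address
    println!("ok");
}
```

The same check generalizes to Sv48 and Sv57 by shifting out 16 or 7 upper bits instead of 25.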
Sv48
| Address Range | Size | Description |
|---|---|---|
| 0x0000000000000000..=0x00007fffffffffff | 128 TB | user-space virtual memory |
| 0x0000800000000000..=0xffff7fffffffffff | ~16K PB | hole of non-canonical virtual memory addresses |
| kernel-space virtual memory | ||
| 0xffff800000000000..=0xffffbfff7ffefffe | ~64 TB | unused |
| <dynamic>..=<dynamic> | <dynamic> | kernel stacks |
| <dynamic>..=0xffffbfff7fffffff | <dynamic> | kernel TLS (thread local storage) |
| 0xffffbfff80000000..=0xffffffff7fffffff | 64 TB | direct mapping of all physical memory (PHYS_OFFSET) |
| 0xffffffff80000000..=0xffffffffffffffff | 2 GB | kernel (KERN_OFFSET) |
Sv57
| Address Range | Size | Description |
|---|---|---|
| 0x0000000000000000..=0x00ffffffffffffff | 64 PB | user-space virtual memory |
| 0x0100000000000000..=0xfeffffffffffffff | ~16K PB | hole of non-canonical virtual memory addresses |
| kernel-space virtual memory | ||
| 0xff00000000000000..=0xff7fffff7ffefffe | ~32 PB | unused |
| <dynamic>..=<dynamic> | <dynamic> | kernel stacks |
| <dynamic>..=0xff7fffff7fffffff | <dynamic> | kernel TLS (thread local storage) |
| 0xff7fffff80000000..=0xffffffff7fffffff | 32 PB | direct mapping of all physical memory (PHYS_OFFSET) |
| 0xffffffff80000000..=0xffffffffffffffff | 2 GB | kernel (KERN_OFFSET) |
Supported WASM Features & Proposals
This page documents all the WASM features, APIs and proposals that k23 supports. This list will be revised as time progresses and features are implemented.
Standardized Features
These features have been adopted into the WebAssembly standard and k23 aims to support all applicable features.
Proposals
These features are proposals for the WebAssembly standard. Many proposals change very frequently, and support for them will range from limited to non-existent. Additionally, some proposals may not be applicable to k23.
Explainer
- ✅: Implemented
- ❌: Not Implemented
- ?: The applicability of this feature is unclear, e.g. due to the lack of a detailed proposal.
- N/A: Not Applicable
WASI Features & Proposals
In addition to the main WASM features, k23 applications will interact with the host through WASI (WebAssembly System Interface) APIs. The following table lists all current proposals and their implementation status.
| Features | Status | Tracking Issue |
|---|---|---|
| I/O | ❌ | |
| Clocks | ❌ | |
| Random | ❌ | |
| Filesystem | ❌ | |
| Sockets | ❌ | |
| CLI | ❌ | |
| HTTP | ❌ | |
| Machine Learning | ❌ | |
| Clocks: Timezone | ❌ | |
| Blob Store | Not planned | |
| Crypto | ❌ | |
| Digital I/O | ? | |
| Distributed Lock Service | Not planned | |
| I2C | ❌ | |
| Key-value Store | ❌ | |
| Logging | ❌ | |
| Messaging | ❌ | |
| Observe | ❌ | |
| Parallel | ❌ | |
| Pattern Match | ? | |
| Runtime Config | ? | |
| SPI | ? | |
| SQL | ? | |
| SQL Embed | N/A | |
| Threads | Not planned | |
| URL | ? | |
| USB | ❌ | |
| WebGPU | ❌ |
Explainer
- ✅: Implemented
- ❌: Not Implemented
- ?: The applicability of this feature is unclear, e.g. due to the lack of a detailed proposal.
- N/A: Not Applicable