k23 - Experimental WASM Operating System
Welcome to the official k23 manual! This manual will guide you through the installation and usage of k23, an experimental WASM microkernel operating system. GitHub repo
Presentations
k23 @ RustNL 2024
k23 & Wasm in Operating Systems @ WasmIO 2025
k23 & Rust Operating Systems Development @ Rust Dortmund Meetup
What is k23?
k23 is an active research project exploring a more secure, modular, and easy to develop for operating system by using WebAssembly as the primary execution environment. The project is still in its early stages and is not yet ready for production use.
Why?
As the world has changed, so has the way we interact with computers. When UNIX was invented in the 1960s, the world was a very different place. Time-sharing, the concept of multiple users sharing a single computer, was the hot new thing, and having a world-spanning connected system was a pipe dream. And while countless people have worked incredibly hard to adapt the old systems to the new world, it is clear that the old systems are not up to the task.
In today's massively interconnected world, where security is paramount, maybe, just maybe, there is an opportunity for a new OS to rethink how we can build secure, scalable, and understandable systems for the 21st century.
How?
k23 is built around the idea of using WebAssembly as the primary execution environment. This allows for a number of benefits:
- Security: WebAssembly is designed to run in a sandboxed environment, making it much harder to exploit.
- Modularity: WebAssembly modules can depend on each other, importing and exporting functionality and data, forming a modular system where dependency management is a first class citizen.
- Portability: WebAssembly is designed to be very portable. Forget questions like “is this binary compiled for amd64 or arm?”. k23 programs just run wherever.
- Static Analysis: WebAssembly is famous for being very easy to analyze. This means we can check for bad programs without even running them.
k23 also uses a microkernel architecture where only the most core kernel functionality and the WASM runtime run in privileged mode. Everything else is implemented as a WebAssembly module, running in a strongly sandboxed environment.
The JIT compiler
The core thesis of k23 is that by directly integrating the compiler into the kernel, the two enter into a symbiotic relationship: the kernel's knowledge of the physical machine can inform specific optimizations in the compiler, and the compiler's total knowledge of all programs running on the system can inform various speedups in the kernel. Cool stuff that only becomes possible because of this includes:
- Zero-cost IPC calls. By leveraging the total knowledge of all programs the kernel can reduce the cost of IPC calls to almost the cost of regular function calls.
- Machine specific optimizations The kernel knows the exact capabilities of the machine, of each core, and much more. Being tightly integrated allows these details to feed into compiler optimization passes.
- Program aware scheduling The compiler collects information about each program such as instruction use, information about possibly hot loops etc. This information can be fed back into the scheduler to allow it to make more informed decisions, like using performance cores vs efficiency cores.
k23 uses cranelift as its JIT compiler backend.
How to Build and Run k23
Prerequisites
Building k23 needs a lot of tools (rustc, buck2, qemu, and more) at specific versions[^1]. Rather than ask you to chase all of these down by hand, we lean on Nix to pin them all.
This makes Nix the one thing you do have to install yourself. Grab it via the Determinate Installer or upstream Nix, and enable flakes (set `experimental-features = nix-command flakes` in `~/.config/nix/nix.conf`; the Determinate Installer does this for you).
Linux and macOS are supported on x86_64 and aarch64; on Windows you’ll want to develop from inside WSL2 as Nix doesn’t run natively there.
Entering the dev shell
With Nix installed, running `nix develop -c $SHELL` drops you into a shell with every required tool in PATH[^2].
Running
Inside the dev shell, run `just run //sys:k23-qemu-riscv64`, which builds k23 for riscv64 and boots it under QEMU.
Type just (no args) to list every recipe available. This includes convenient recipes for running linters, tests and more. Every recipe is a thin wrapper around buck2 (just run //sys:k23-qemu-riscv64 is roughly buck2 run //sys:k23-qemu-riscv64).
[^1]: We've had issues with wildly outdated QEMU versions in Linux package repositories, for example.

[^2]: The `-c $SHELL` part instructs Nix to use your current shell binary; otherwise it defaults to bash. Yuck.
IDE Setup with rust-analyzer
Since we use buck2 to build k23 instead of Cargo, rust-analyzer has nothing to work with by default.
buck2 ships a companion tool called rust-project that walks the buck2 target graph and emits a rust-project.json file that rust-analyzer will automatically load. We provide a convenient just rust-project command for generating this file.
You should run this command periodically, at least whenever you add or remove a crate or third-party dependency. It should be relatively easy to notice when this is needed, though: when autocompletion breaks, re-running `just rust-project` is in order.
All IDEs using rust-analyzer should pick up on this file automatically. If not, raise an issue please.
Debugging k23
Note: the debugging story for k23 is very much a work in progress. The flow described below works, but it is rough around the edges and the steps are still mostly manual. Improvements — better launch ergonomics, pretty printers, an LLDB/GDB init script bundled with the repo — are very welcome; if you’d like to help, please reach out.
The rest of this guide assumes you are using LLDB, but the same principles apply to GDB and “command translation guides” are available online.
Debug logging
The kernel uses the `tracing` crate to produce the kernel debuglog (as well as span information).
In order to emit messages to this debuglog you should use the following macros:
```rust
fn function() {
    tracing::trace!("Trace");
    tracing::debug!("Debug");
    tracing::info!("Info");
    tracing::warn!("Warn");
    tracing::error!("Error");
}
```
Note that the `log` macros work as well, but that support only exists to capture output from 3rd-party crates; kernel code should generally use `tracing`.
At the moment, the debuglog is printed to the semihosting stdout.
Filtering
By default, the debuglog will only print messages of severity DEBUG and higher (i.e. DEBUG, INFO, WARN, and ERROR),
but this can be filtered and configured using the same syntax as tracing's `EnvFilter`,
by passing the `log` boot argument.
For example, to enable all levels you can pass this directive in the log boot argument:
just run //sys:k23-qemu-riscv64 -- --append "log=trace"
A more reasonable configuration that omits the quite verbose output from cranelift but otherwise keeps the trace logging:
just run //sys:k23-qemu-riscv64 -- --append "log=trace,cranelift_codegen=off"
Attaching to the Kernel
There is no convenience flag for this yet — you wire it up by hand using QEMU’s gdbstub. Forward -s -S to QEMU to
have it expose a gdb server on localhost:1234 and halt the CPU at startup:
just run //sys:k23-qemu-riscv64 -- -s -S
The equivalent buck2 invocation is buck2 run //sys:k23-qemu-riscv64 -- -s -S.
In a second terminal, ask Buck2 for the path to the freshly built kernel ELF and launch LLDB against it:
```shell
rust-lldb "$(buck2 build --show-output //sys/kernel:kernel | awk '{print $2}')"
# In LLDB
gdb-remote localhost:1234
```
Catching Panics
Quite often you will need to stop the kernel when a panic occurs, to inspect the state of the system. For this you can set a breakpoint on the `rust_panic` symbol, which is a special unmangled function that exists for exactly this purpose (this technique mirrors Rust's std library and is implemented in the panic-unwind crate here).
Using LLDB you can set a breakpoint with the following command:
b rust_panic
and then use e.g. the bt command to print a backtrace.
Pretty Printing
To make debugging easier, you can add pretty printers for the `mem_core::PhysicalAddress` and `mem_core::VirtualAddress` types. This can be done through the following commands in LLDB:

```
type summary add --summary-string "mem_core::PhysicalAddress(${var.0%x})" mem_core::PhysicalAddress
type summary add --summary-string "mem_core::VirtualAddress(${var.0%x})" mem_core::VirtualAddress
```
Boot Arguments
Boot arguments configure various aspects of the kernel's behaviour. They are read from the
/chosen/bootargs property of the
flattened device tree that is passed to the kernel by the previous-stage bootloader.
The format is a simple key=value;key=value;.. list of semicolon-separated key-value pairs.
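As a concrete illustration, parsing that format can be sketched in a few lines of Rust (a standalone sketch, not the kernel's actual parser; `parse_bootargs` is a hypothetical name):

```rust
/// Hypothetical sketch: parse a bootargs string of the form
/// "key=value;key=value". Empty segments and malformed pairs are skipped.
fn parse_bootargs(raw: &str) -> impl Iterator<Item = (&str, &str)> + '_ {
    raw.split(';')
        .filter(|segment| !segment.is_empty())
        .filter_map(|pair| pair.split_once('='))
}

fn main() {
    for (key, value) in parse_bootargs("log=trace;backtrace=full") {
        println!("{key} = {value}");
    }
}
```

So `log=trace;backtrace=full` yields the two pairs `("log", "trace")` and `("backtrace", "full")`.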
log
Allows configuring the verbosity and filtering of debug messages.
# Enable the most verbose logging messages
just run //sys:k23-qemu-riscv64 -- --append "log=trace"
# A more reasonable configuration that keeps trace messages enabled, but silences the very spammy ones
just run //sys:k23-qemu-riscv64 -- --append "log=trace,cranelift_codegen=off,sharded_slab=off"
The underlying buck2 invocation is buck2 run //sys:k23-qemu-riscv64 -- --append "..."; everything after -- is forwarded to QEMU.
backtrace
Allows configuring the verbosity of kernel panic backtraces. There are two possible values: short (default) and full.
short will print an abridged backtrace that omits frames related to the unwinding and panic machinery itself.
# To print shorter panic backtraces (the default)
just run //sys:k23-qemu-riscv64 -- --append "backtrace=short"
# To print more verbose panic backtraces
just run //sys:k23-qemu-riscv64 -- --append "backtrace=full"
The buck2 build system
Why buck2
k23 is not a typical Rust project. We produce many different artifacts:
- The kernel, built for a custom Rust target and with from-source-rebuilt `core` and `alloc` crates
- The loader binary, built with different Rust flags and for a different Rust target
- The full disk image(s): a combination of both binaries, with an initial ramdisk and possibly drivers/apps
- Additionally, many different kinds of tests (unittest, fuzz, loom, wasm-spec, selftests, etc.) that all require different modes and apply only to subsets of libraries
Cargo’s build model (one --target, one profile, one feature resolution per workspace) is unfortunately not well equipped to handle this. k23 needs a build system that is flexible, can deal with the same source node appearing multiple times with different configuration, and where post-processing steps are easy to express.
That’s what buck2 gives us. Complex tooling (rust, c++, mdbook, qemu, python) is wired in as ordinary build rules. The build graph is hermetic and content-addressed. Buck2 can schedule much more optimal builds across the entire build graph and aggressively cache results along the way. Because buck2 isn’t Rust-specific, a full “image” (kernel + loader + ramdisk + apps) and QEMU runner can be declared as an elegant chaining of rules.
High-level components
Tree layout
```text
k23/
├── sys/          non-standalone subsystems (only make sense inside k23)
│   ├── loader/   bootloader binary
│   ├── kernel/   kernel binary
│   └── async/    kasync — the async runtime
├── lib/          standalone, potentially-publishable libraries
│                 (range-tree, wavltree, cpu-local, spin, fdt, …)
├── third-party/  reindeer-generated BUCK rules; the one source of truth
│                 for every non-first-party dep
├── tests/        wasm spec testsuite + handwritten .wast fixtures
├── platforms/    target platforms (riscv64, aarch64, x86_64) bundling
│                 constraint values
├── manual/       the mdbook you are reading (//manual:manual)
├── build/        the buck2 build infrastructure itself (see below)
├── fuzz/         running corpus (gitignored) + committed crash repros
│                 in fuzz/artifacts/
├── bench/        criterion baselines; gitignored; cached on main in CI
└── buck-out/     buck2's everything; gitignored; `buck2 clean` clears it
```
Build infrastructure (build/)
Everything that defines how k23 is built (as opposed to what gets built) lives in build/:
```text
build/
├── BUCK             declares kcfg options, target JSON, and the named
│                    transitions (loader, kernel, rust_bootstrap, fuzz)
├── constraints/     constraint enums (opt-level, debuginfo, strip,
│                    rust-std, env, sanitizer)
├── toolchains/      toolchain rules (rust, cxx, qemu, mdbook, python, …)
│                    plus flake.bzl, which exposes nix-flake packages
├── targets/         Rust target-spec JSON files
├── transitions.bzl  the generic configuration-transition rule
├── kcfg.bzl         typed buckconfig wrapper + kcfg_docs rule that
│                    auto-generates the config reference in this manual
├── qemu.bzl         qemu_binary — wraps a kernel ELF into a QEMU command
├── bench.bzl        rust_benchmark macro (criterion)
└── fuzz.bzl         rust_fuzz macro (libfuzzer + persistent corpus)
```
Cargo to buck2 cheat sheet
| Cargo Command | k23 Equivalent |
|---|---|
| `cargo check` / `cargo build` | `just check` |
| `cargo build -p foo` | `just check //lib/foo:foo` |
| `cargo test` / `cargo test -p foo` | `just unittests` / `just unittests //lib/foo:foo` |
| `cargo bench` | `just benchmark` |
| `cargo fuzz run target` | `just fuzz` |
| `cargo clippy` | `just clippy` |
| `cargo fmt --check` / `cargo fmt` | `just check-fmt` / `just fmt` |
| `cargo doc` | `just doc` |
| edit `Cargo.toml` / `cargo update` | edit `third-party/Cargo.toml`, then `just buckify` |
| `[package]` | `rust_library` / `rust_binary` in a `BUCK` file |
| `[dependencies]` | `deps = [...]` attribute of a `rust_library`/`rust_binary` rule |
| `[features]` | `features = [...]` attribute of a `rust_library`/`rust_binary` rule |
| profiles (debug/release) | constraints (opt-level/debuginfo/strip) + named modifier aliases in `PACKAGE` |
| `--target=…` | constraints (`prelude//cpu:riscv64`, …) bundled into a platform under `platforms/` |
| `RUSTFLAGS=-Cfoo` | `rustc_flags = ["-Cfoo"]` attribute of a `rust_library`/`rust_binary` rule |
| `cfg(test)` | a dedicated `rust_test` target; can carry its own deps |
| `cfg(loom)` | `rust_test` with `rustc_flags = ["--cfg=loom"]` and `labels = ["loom"]` |
Adding a Crate
This document guides you through adding a new crate to the project.
Decide where it lives
The first step is deciding where the crate should live:
- `lib/` for standalone libraries that could plausibly be useful outside of k23
- `sys/` for subsystem crates that only make sense as part of k23 (e.g. kernel subsystems like the virtual memory subsystem)
- `build/` for tools that run as part of the build process (e.g. disk image creation). These should generally only be simple tools; any complicated logic likely belongs in `lib/`
If you’re unsure about where to put a crate, default to lib/.
Crate Layout
Crates generally look the same under buck2 as they do under Cargo: a src folder containing your Rust code, with a src/lib.rs or src/main.rs entrypoint. The biggest difference is the BUCK file (written in Starlark): it is our equivalent of Cargo.toml and where you declare all the crate's metadata to the build system.
```python
# declare the crate so the build system knows about it
rust_library(
    # the name of the crate. rust code imports from this name.
    # the convention is to match the crate dir
    name = "mycrate",
    # buck2 requires you to explicitly declare all source files
    srcs = glob(["**/*.rs"]),
    # and dependencies
    deps = [
        "//lib/util:util",
        "//third-party:cfg-if",
    ],
    # we also require you to explicitly list which targets provide tests
    # for this crate (see below)
    tests = [":mycrate_unittests"],
    # mark this crate as visible to others in this project (so we can depend on it)
    visibility = ["PUBLIC"],
)

# make the unit tests in this crate visible to buck2 as well.
# without this, `just unittests` won't run the unit tests for this crate
rust_test(
    name = "mycrate_unittests",
    srcs = glob(["**/*.rs"]),
    deps = [
        "//lib/util:util",
        "//third-party:cfg-if",
        "//third-party:proptest",  # or whatever the tests need
    ],
    visibility = ["PUBLIC"],
)
```
Of course, files like README.md or CHANGELOG.md also belong in the crate directory.
Depending on your crate
To pull your crate into a consumer, simply add your crate's buck path to the consumer's `deps` array:

```python
deps = [
    "//lib/mycrate:mycrate",
    ...
]
```
Verify your changes
- Check your Rust code by running `just check //lib/mycrate:mycrate`. This is the equivalent of running `cargo check -p mycrate`.
- Run the new tests you added by running `just unittests //lib/mycrate:mycrate`.
- `just preflight` will run as much of the full CI suite as possible locally. Run this before you push! You can also run the full suite for just your crate by running `just preflight //lib/mycrate:mycrate`.
Conventions & Tips
If your crate has architecture-specific dependencies, you can gate them using select():
```python
# and dependencies
deps = [
    "//lib/util:util",
    "//third-party:cfg-if",
] + select({
    # if the riscv64 constraint matches, add the riscv dependency
    "prelude//cpu/constraints:riscv64": ["//lib/riscv:riscv"],
    # otherwise nothing
    "DEFAULT": [],
})
```
If your crate has special features depending on whether it's used in the kernel or the loader (this tends to happen sometimes), you can also use select():
```python
features = select({
    # when running inside the kernel, thread-locals are available, so let's use them
    "constraints//:env[kernel]": ["thread-local"],
    # otherwise we use some fallback mechanism
    "DEFAULT": [],
})
```
Removing a crate
When you remove a crate, simply delete its directory and remove the crate from any consumer's `deps`. You can use the following buck2 query command to list all direct dependents of your crate: `buck2 uquery "rdeps(//..., //lib/mycrate:mycrate, 1)"`.
You will also want to regenerate the rust-project.json file by running just rust-project so your rust-analyzer suggestions are up-to-date.
Adding Tests
k23 has quite a number of test flavors, each with its own use case.
Unit tests (rust_test)
The most straightforward and common kind of test. These are just regular Rust unit tests that you write either inside the modules directly or in test/ files, using regular Rust `#[test]`-annotated functions:
```rust
#[test]
fn foo() {
    assert!(true);
}
```
You declare them in the crate’s BUCK file with the rust_test rule:
```python
rust_test(
    name = "mycrate_unittests",
    srcs = glob(["**/*.rs"]),
    deps = [...],
    visibility = ["PUBLIC"],
)
```
Lastly, don't forget to add the test to the crate's `tests` array! The test runner will not pick up on your tests otherwise!

Run the tests with `just unittests` or `just unittests //lib/mycrate:mycrate`.
`just miri` will automatically run the tests under miri as well.
Loom tests (concurrency model checking)
Loom is a very useful tool for checking concurrent and asynchronous code. It will explore many possible concurrent executions of your code to find deadlocks, panics, race conditions and more. If your crate touches anything concurrency related, you must absolutely add loom tests.
You declare them using the rust_loom_test rule:
```python
rust_loom_test(
    name = "mycrate_loom_tests",
    srcs = glob(["**/*.rs"]),
    deps = [..., "//third-party:loom"],
)
```
rust_loom_test automatically sets the correct compiler flags (--cfg=loom and others) and makes the loom tests visible to the build system. Run the loom tests with `just loom //lib/mycrate:mycrate`.
See lib/spin/BUCK for a complete example.
Fuzz tests
Fuzz tests drive a function with random inputs to find non-obvious bugs. Any crate that deals with user input (especially parsers or data structures) should have a fuzz testing suite. Each fuzz target lives under <crate>/fuzz/<name>.rs.
Declare the target in the crate’s BUCK file using our rust_fuzz rule:
```python
load("//build:fuzz.bzl", "rust_fuzz")

rust_fuzz(
    name = "mycrate_fuzz",
    srcs = ["./fuzz/myfuzz.rs"],
    crate_root = "./fuzz/myfuzz.rs",
    deps = [
        ":mycrate",
        "//third-party:libfuzzer-sys",
        "//third-party:arbitrary",
    ],
    visibility = ["PUBLIC"],
)
```
The rust_fuzz rule automatically sets the correct compiler flags and makes the fuzz tests visible to the build system. Fuzz targets use libfuzzer-sys for the harness and typically derive structured inputs with arbitrary. Run the fuzz tests with `just fuzz //lib/mycrate:mycrate_fuzz`. You can pass arguments such as the max runtime through the named fuzz_args argument: `just fuzz_args='--test-arg=-max_total_time=60' fuzz //lib/mycrate:mycrate_fuzz`.
Fuzz tests produce two directories in the project root.
- `fuzz/corpus/` is the running corpus. It persists exploration state between runs and makes them more useful.
- `fuzz/artifacts/` holds crashes the fuzz test found. Commit these so that we run them as regression tests in the future. When CI finds a crash, copy the file from the uploaded `fuzz-artifacts` bundle into `fuzz/artifacts/<name>/` and commit it.
See lib/range-tree/fuzz/range_tree.rs and lib/range-tree/BUCK for a complete example.
Benchmarks
Benchmarks measure performance and catch regressions. You should probably add a benchmark to any library that is not only a build dependency. Each benchmark lives under <crate>/benches/<name>.rs and is written using criterion.
Declare the benchmark in the crate’s BUCK file using our rust_benchmark rule:
```python
load("//build:bench.bzl", "rust_benchmark")

rust_benchmark(
    name = "mycrate_benchmarks",
    srcs = ["./benches/whatever.rs"],
    crate_root = "./benches/whatever.rs",
    deps = [
        ":mycrate",
        "//third-party:criterion",
    ],
    visibility = ["PUBLIC"],
    target_compatible_with = [host_configuration.cpu, host_configuration.os],
)
```
The rust_benchmark rule sets compiler flags automatically (opt-level[3], debuginfo[line-tables-only], strip[debuginfo]) and makes the benchmark visible to the build system. Run it with just benchmark //lib/mycrate:mycrate_benchmarks.
Benchmarks produce a bench/ directory in the project root holding baselines and reports.
See lib/range-tree/benches/comparisons.rs and sys/async/benches/spawn.rs for complete examples.
Wasm tests
Wasm tests exercise the kernel’s WebAssembly engine end-to-end. They are all written in the wast language — a superset of the WebAssembly text format that adds assertions like assert_return and assert_trap for declaring expected outcomes. There are two sources of .wast files: small handwritten regression fixtures under tests/*.wast, and the upstream WebAssembly spec testsuite under tests/testsuite/ (a git submodule).
Wasm tests are registered through the wast_tests! macro in sys/kernel/src/tests/spectest.rs:
```rust
wast_tests!(
    fib "../../../tests/fib.wast",
    trap "../../../tests/trap.wast",
    // address "../../../tests/testsuite/address.wast",
    // ...
);
```
Each entry pairs a test name with a path to a .wast file. The macro generates a kernel test for each entry, so adding a new test is a matter of dropping the file into tests/ and listing it in the macro.
Run them via the kernel test harness with just run //sys:k23-qemu-riscv64.
See tests/fib.wast and tests/trap.wast for complete examples.
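For illustration, a minimal handwritten fixture could look like this (hypothetical contents, not the actual tests/fib.wast):

```wast
;; a small module under test
(module
  (func (export "add") (param i32 i32) (result i32)
    (i32.add (local.get 0) (local.get 1)))
  (func (export "div") (param i32 i32) (result i32)
    (i32.div_s (local.get 0) (local.get 1))))

;; assert an expected return value
(assert_return (invoke "add" (i32.const 1) (i32.const 2)) (i32.const 3))

;; assert that an invocation traps with the expected message
(assert_trap (invoke "div" (i32.const 1) (i32.const 0)) "integer divide by zero")
```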
Adding a Third-Party Dependency
Even though we use buck2 and not Cargo to build k23, we nonetheless use 3rd-party crates from crates.io. There are plenty of high-quality, legitimately useful libraries available, and we want to use them.

The tight integration between crates.io and Cargo (a good thing!) requires a bit of finagling, which we do using reindeer, maintained by Meta. It takes a Cargo.toml file, resolves all the crates, and generates a BUCK file that lets us reference these crates from throughout our buck2 project.
TL;DR
- add the crate to `third-party/Cargo.toml`
- run `reindeer buckify`
- depend on it from a first-party crate via `//third-party:<crate-name>`
- commit `third-party/Cargo.toml`, `third-party/Cargo.lock`, and `third-party/BUCK`
The reindeer-clean job will complain if the Cargo.toml and BUCK file are out of sync.
The fields you’ll touch
- `third-party/Cargo.toml` — the master manifest reindeer reads
  - `[dependencies]` for plain crates
  - `default-features = false` is the norm — most of our deps need to be `no_std`-friendly
  - `features = [...]` only for what you actually need
  - mark optional with `optional = true` if the crate is only pulled in by some downstream feature
- `third-party/Cargo.lock` — auto-managed; commit it as-is
- `third-party/BUCK` — generated, large, do not hand-edit
- `third-party/fixups/<crate>/fixups.toml` — optional per-crate overrides for the rare cases where reindeer needs hints (build script behavior, env vars, conditional features). Look at existing examples (`getrandom`, `rustix`, `serde`) before writing one
- `third-party/deny.toml` — license allowlist; cargo-deny CI checks against this
Adding a crate
1. Add to manifest

```toml
# third-party/Cargo.toml
[dependencies]
foo = { version = "0.4", default-features = false, features = ["bar"] }
```
If you're adding a dev or build dependency (that will be run on the host and needs access to std), you should EITHER mark it as optional = true and add it to the default feature, OR enable its std-requiring feature in the default feature.
2. Update Cargo.lock

```shell
reindeer update
```

This will update the Cargo.lock file, reusing the Cargo/crates.io resolution logic. The updated lockfile is required by the next step.
3. Regenerate buck rules

```shell
reindeer buckify
```

This will read the lockfile and generate the third-party/BUCK file. This file contains buck2 target declarations corresponding to the dependencies you added in step 1. Note that these buck2 targets fetch the libraries directly from crates.io, so you don't actually need Cargo installed at all.
4. Use it in a BUCK file

```python
deps = [
    "//third-party:foo",
    ...
]
```
You can then reference 3rd party libraries by their buck2 path. The name of the target is the same as in the Cargo.toml manifest.
5. Verify

To verify your changes are correct, you may run just check //path/to/consumer:target or just preflight //path/to/consumer:target to run all checks.
If cargo-deny complains about a 3rd-party crate's license, you may extend third-party/deny.toml if the license is MIT-compatible, or pick a different crate (it's probably best to discuss this with the maintainers and community first in any case).
Reindeer fixups
In some situations reindeer needs human help to correctly generate the dependency graph. These hints are called “fixups” and live in third-party/fixups/<crate>/fixups.toml files. You will need a fixup when:
- the crate has a required buildscript => `reindeer` does not build/run buildscripts by default. You have to explicitly opt in by setting `buildscript.run = true`
- the crate reads env vars set at compile time => you may need to declare them manually or set `cargo_env = true` for the "common" cargo env vars
- the crate has complex `cfg(...)` dependencies or features => you will need to spell them out; see the reindeer manual for help.
buck2-fixups is a community maintained list of fixups for common crates.io dependencies. It’s always worth a look.
See third-party/fixups/getrandom/fixups.toml and third-party/fixups/serde/fixups.toml for examples of complex fixups.
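Putting the first two cases together, a minimal fixup file might look like this (an illustrative sketch; check the existing fixups and the reindeer manual for the exact shape your crate needs):

```toml
# third-party/fixups/<crate>/fixups.toml (illustrative)

# opt in to building and running the crate's build script
buildscript.run = true

# expose the "common" cargo env vars (CARGO_PKG_VERSION etc.) at compile time
cargo_env = true
```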
Updating an existing dependency
Updating an existing dep is as easy as bumping its version in third-party/Cargo.toml and running reindeer update followed by reindeer buckify.
Removing a dependency
Removing a dependency means deleting it from third-party/Cargo.toml, deleting its corresponding third-party/fixups/<crate>/fixups.toml if present, and running reindeer buckify to synchronize the third-party/BUCK file.
Git dependencies
You should prefer crates.io releases, since git dependencies complicate and slow down the build. But if it's unavoidable,
pin a branch or rev in Cargo.toml and add the host to the third-party/deny.toml [sources] allow-git list.
For example, we currently pull in JonasKruckenberg/wasmtime (the cranelift no_std fork) as a git dependency.
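Sketched out, pinning a git dependency and allowlisting its host touches two files (the crate name and `rev` value here are illustrative placeholders, not the actual entries):

```toml
# third-party/Cargo.toml — pin to an exact revision
[dependencies]
wasmtime = { git = "https://github.com/JonasKruckenberg/wasmtime", rev = "<commit-sha>", default-features = false }

# third-party/deny.toml — allow fetching from this git host
[sources]
allow-git = ["https://github.com/JonasKruckenberg/wasmtime"]
```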
Overview of k23’s Architecture
k23 has three main components: the bootloader, which is responsible for loading the kernel; the kernel itself, which is the main operating system; and the WASM runtime, which is responsible for running WebAssembly programs. The last two components are highly intertwined by design.
Bootloader
The bootloader is responsible for loading the kernel, verifying its integrity, decompressing it and setting up the necessary environment. That means collecting early information about the system, setting up the stack for each hart, setting up the page tables, and finally jumping to the kernel’s entry point.
The bootloader has to be generic over the payloads it accepts, since the kernel is not the only thing that can be loaded. When running tests, each test is compiled as a separate binary and run in a separate VM. The bootloader has to be able to load these binaries as well.
For this, payloads can declare their entry points and a few options through the loader_api crate's #[entry] macro.
The bootloader then uses this information to set up the environment for the payload. This macro also enforces a type
signature for the entry point, which means that payloads can completely forgo the usual assembly trampolines and just
declare a Rust function as their entry point.
Kernel
The kernel is relatively minimal at the moment and, as a microkernel, will likely stay that way. Many of the kernel's functions, such as memory management, syscalls, etc., are implemented in the runtime. This leaves only the most basic functions in the kernel, such as interrupt handling, physical memory management, and the like.
WASM Runtime
The WASM runtime is the heart of k23; it is responsible for running WebAssembly programs. It is not a standalone crate,
but implemented as part of the kernel, since it is so core to the system. The runtime uses the wasmparser
and cranelift crates to parse and compile WASM programs.
Currently, the runtime is quite simple; it only supports the most basic WASM instructions and features.
TODO this section will expand with more info.
System Startup
Loader
k23 uses a two stage boot flow, with a smaller loader stage before the actual kernel. The loader is not a full bootloader, its only responsibility is mapping the kernel ELF file into virtual memory and jumping to it. This includes mapping the thread-local-storage (TLS) blocks declared by the kernel, as well as a stack for each detected CPU.
A compiled loader executable contains an inlined copy of the k23 kernel. It is essentially a self-extracting executable.
There are three primary reasons for splitting the early boot phase into a separate loader executable.
- Simplified kernel startup code. The kernel entrypoint is just a simple `extern "C" fn _start()` function which gets called by the loader. This lets us focus on the important kernel startup sequence, which is complex enough.
- Clean physmem to virtmem transition. When the loader starts, the MMU is disabled, and we're running in physical memory mode. During startup, we have to enable the MMU and switch into virtual memory mode; doing so, however, will invalidate all pointers into physical memory (unless we have identity-mapped that memory first!). This includes memory containing instructions or even the stack we use for function calls. Having one executable that deals primarily with physical memory (the loader) and one that only ever deals with virtual memory (the k23 kernel) makes it much harder to make stupid mistakes.
- Easy early boot cleanup. Usually boot code (and data) is only ever used during startup. This means that during runtime our kernel would be carrying around dead memory - memory which is taken up by our ELF file but never accessed! As you can see below, splitting this boot code into a separate executable lets us easily reclaim this memory for runtime usage.
When the system resets, it will jump to a fixed address in (physical) memory. The exact address depends on the CPU, but
in the diagram below it is 0x1234. The loader has been loaded at this address into memory by a previous stage
bootloader.
The loader will begin to set up the MMU page translation tables in preparation for the switch to virtual memory. This means identity mapping itself so it can continue to run after the switch, but more importantly mapping the kernel ELF file into the higher half of virtual memory where it expects to live. During this phase the loader will also map TLS (thread-local storage) blocks and stacks for each detected CPU.
At this point the loader has done its job and will hand off control to the k23 kernel. It does that by jumping to the
ELF entrypoint address it parsed during the mapping phase. The loader will pass along all information it has collected
about the system through the BootInfo struct.
Now that we’re inside the kernel, we need to reclaim the now-unused loader memory before moving on with startup. We remove the region from the MMU’s page translation tables and add the freed regions to our pool of available physical memory. We need to be very careful, though, not to add the kernel ELF’s physical memory to that pool: yes, it was part of the loader image, but it is still in use! Instead, we surgically reclaim all the memory around it.
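The surgical reclamation described above amounts to carving the kernel's physical range out of the loader region. A minimal sketch under that framing (not k23's actual implementation, which works on page-table and frame-allocator structures):

```rust
use core::ops::Range;

/// Subtract `keep` (the kernel ELF's physical range) from `region`
/// (the loader image), returning the surrounding pieces that are safe
/// to hand back to the physical-memory allocator.
/// Illustrative only; k23's real reclamation logic differs in detail.
fn reclaim_around(region: Range<usize>, keep: Range<usize>) -> Vec<Range<usize>> {
    let mut free = Vec::new();
    if keep.start > region.start {
        // Everything below the kernel image is free to reuse.
        free.push(region.start..keep.start.min(region.end));
    }
    if keep.end < region.end {
        // Everything above the kernel image is free to reuse.
        free.push(keep.end.max(region.start)..region.end);
    }
    free
}

fn main() {
    // Hypothetical layout: loader occupied 0x1000..0x9000,
    // kernel ELF lives at 0x4000..0x6000 and must stay mapped.
    let free = reclaim_around(0x1000..0x9000, 0x4000..0x6000);
    assert_eq!(free, vec![0x1000..0x4000, 0x6000..0x9000]);
    println!("reclaimed {} regions", free.len());
}
```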
Finally, we’re all up and running. The kernel is correctly mapped and the loader’s memory fully reclaimed. At this point we can move on with the system startup.
Kernel Startup
The k23 startup is split into three phases: Early, Main, and Late. Additionally, there is Per-CPU and Global initialization.
| Phase | Per-CPU | Global |
|---|---|---|
| Early | arch::per_cpu_init_early | Rng & DeviceTree init |
| Main | N/A | See below |
| Late | arch::per_cpu_init_late | See below |
During the early startup phase we mainly just set up the root random number generator, which is local to each CPU. We then call arch::per_cpu_init_early, which performs CPU-specific resets like setting the initial FPU state and enabling performance counters.
The main chunk of global initialization work happens during the Main phase. This includes setting up the tracing infrastructure, kernel heap allocator, and device tree, as well as the frame allocator and virtual memory subsystem. This phase also sets up the global state for the scheduler and the WebAssembly runtime’s global Engine state.
Lastly, during Late startup, arch::per_cpu_init_late sets up the exception handler, enables interrupts, and initializes the timer & IRQ drivers for each CPU.
At this point we are done with startup and enter the scheduling loop on each CPU.
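The Early → Main → Late ordering above can be sketched as a simple plan. The step labels below paraphrase the table and prose; they are illustrative, not the actual k23 call sequence:

```rust
/// Returns the startup steps in order for `cpu_count` CPUs, mirroring
/// the three phases described above. Step labels are illustrative.
fn startup_plan(cpu_count: usize) -> Vec<String> {
    let mut steps = Vec::new();
    // Early: per-CPU RNG and arch::per_cpu_init_early on every CPU.
    for cpu in 0..cpu_count {
        steps.push(format!("early: per_cpu_init_early(cpu{cpu}) + per-CPU rng"));
    }
    // Main: global initialization, performed once.
    steps.push("main: tracing, heap, devicetree, frame allocator, vmem, scheduler, wasm engine".into());
    // Late: arch::per_cpu_init_late on every CPU, then scheduling.
    for cpu in 0..cpu_count {
        steps.push(format!("late: per_cpu_init_late(cpu{cpu}) + timers & irq"));
    }
    steps
}

fn main() {
    for step in startup_plan(2) {
        println!("{step}");
    }
}
```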
Address Space Layout Randomization
Address-space layout randomization is a security technique that - as the name implies - randomizes the placement of various objects in virtual memory. This well-known technique defends against attacks such as return-oriented programming (ROP), where an attacker chains together instruction sequences from legitimate programs (called “gadgets”) to achieve privilege escalation.
Randomizing the placement of objects makes these techniques much harder since now an attacker has to correctly guess the address from a potentially huge number of possibilities.
KASLR
k23 randomizes the location of the kernel, stacks, TLS regions and heap at boot time.
ASLR in k23
k23 implements more advanced userspace ASLR than other operating systems: it not only randomizes the placement of WASM executable code, tables, globals, and memories, but also the location of individual WASM functions at each program startup (the Linux kernel uses a similar technique called function-granular kernel address space layout randomization, FGKASLR).
TODO explain more in detail
ASLR Entropy
Entropy determines how “spread out” allocations are in the address space: higher values mean a sparser address space. This is configured through the entropy_bits option (TODO).
Ideally this number would be as high as possible, since more entropy makes ASLR harder to defeat. However, a sparser address space requires more memory for page tables, and a higher entropy value means allocating virtual memory takes longer (more misses when searching for free gaps). The maximum entropy value also depends on the target architecture and the chosen memory mode:
| Architecture | Virtual Address Usable Bits | Max Entropy Bits |
|---|---|---|
| Riscv32 Sv32 | 32 | 19 |
| Riscv64 Sv39 | 39 | 26 |
| Riscv64 Sv48 | 48 | 35 |
| Riscv64 Sv57 | 57 | 44 |
| x86_64 | 48 | 35 |
| aarch64 3 TLB lvls | 39 | 26 |
| aarch64 4 TLB lvls | 48 | 35 |
In conclusion, the best value for entropy_bits depends on many factors and should be tuned to trade off sparseness and runtime overhead against better security.
Note also that for e.g. Riscv64 Sv57 it might not even be desirable to use all 44 bits of available entropy, since the address space is already huge and performance might degrade too much.
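Every row of the table above satisfies max entropy = usable virtual-address bits minus 13; that relationship is an observation from the table, not an official formula. A minimal sketch of how entropy bits can be mixed into a page-aligned base address (not k23's actual placement algorithm):

```rust
/// Max entropy per the table above: usable VA bits minus 13.
/// This constant offset is inferred from the table's rows.
fn max_entropy_bits(usable_va_bits: u32) -> u32 {
    usable_va_bits - 13
}

/// Mix `entropy_bits` of randomness into bits [12, 12 + entropy_bits)
/// of a page-aligned base address. A minimal sketch of ASLR-style
/// placement, assuming 4 KiB pages; not k23's actual algorithm.
fn randomize_base(base: usize, random: usize, entropy_bits: u32) -> usize {
    const PAGE_SHIFT: u32 = 12; // 4 KiB pages
    let mask = (1usize << entropy_bits) - 1;
    base + ((random & mask) << PAGE_SHIFT)
}

fn main() {
    assert_eq!(max_entropy_bits(39), 26); // Riscv64 Sv39
    assert_eq!(max_entropy_bits(48), 35); // Sv48 / x86_64 / aarch64 4 lvls
    let addr = randomize_base(0xffff_ffc0_0000_0000, 0xdead_beef, 26);
    assert_eq!(addr % 4096, 0); // the result stays page-aligned
    println!("{addr:#x}");
}
```

More entropy bits widen the mask, spreading allocations over a larger range, which is exactly the sparseness/security trade-off discussed above.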
RISC-V
This section describes RISC-V specific details of k23.
Virtual Memory Layout on RISC-V
This page outlines the virtual memory layout used by k23 depending on the selected memory mode.
Currently supported memory modes are Riscv64Sv39, Riscv64Sv48 and Riscv64Sv57.
Note that addresses marked as <dynamic> are not fixed and depend on the number of harts (hardware threads) in the
system.
The code implementing this memory layout can be found in sys/loader/src/mapping.rs.
Sv39
| Address Range | Size | Description |
|---|---|---|
| 0x0000000000000000..=0x0000003fffffffff | 256 GB | user-space virtual memory |
| 0x0000004000000000..=0xffffffbfffffffff | ~16K PB | hole of non-canonical virtual memory addresses |
| kernel-space virtual memory | ||
| 0xffffffc000000000..=<dynamic> | ~96 GB | unused |
| <dynamic>..=<dynamic> | <dynamic> | kernel stacks |
| <dynamic>..=0xffffffd7ffffffff | <dynamic> | kernel TLS (thread local storage) |
| 0xffffffd800000000..=0xffffffe080000000 | 124 GB | direct mapping of all physical memory (PHYS_OFFSET) |
| 0xffffffff80000000..=0xffffffffffffffff | 2 GB | kernel (KERN_OFFSET) |
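The user/hole/kernel split in the Sv39 table follows from sign extension: a 39-bit virtual address is canonical only if bits 63..=39 all equal bit 38. A small sketch of that check, using the boundary constants from the table above:

```rust
/// Sv39 canonicality check: bits 63..=39 must all equal bit 38
/// (sign extension); otherwise the address falls into the
/// non-canonical hole shown in the table above.
fn sv39_is_canonical(addr: u64) -> bool {
    // Shift out the upper 25 bits, then sign-extend from bit 38.
    let sign_extended = (((addr as i64) << 25) >> 25) as u64;
    sign_extended == addr
}

/// Kernel-space addresses start at 0xffffffc000000000 per the table.
fn sv39_is_kernel_space(addr: u64) -> bool {
    sv39_is_canonical(addr) && addr >= 0xffff_ffc0_0000_0000
}

fn main() {
    assert!(sv39_is_canonical(0x0000_003f_ffff_ffff)); // top of user space
    assert!(!sv39_is_canonical(0x0000_0040_0000_0000)); // first hole address
    assert!(sv39_is_kernel_space(0xffff_ffff_8000_0000)); // KERN_OFFSET
    assert!(!sv39_is_kernel_space(0x0000_0000_dead_beef)); // user address
    println!("ok");
}
```

The same check generalizes to Sv48 and Sv57 by shifting out 16 or 7 upper bits instead of 25.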
Sv48
| Address Range | Size | Description |
|---|---|---|
| 0x0000000000000000..=0x00007fffffffffff | 128 TB | user-space virtual memory |
| 0x0000800000000000..=0xffff7fffffffffff | ~16K PB | hole of non-canonical virtual memory addresses |
| kernel-space virtual memory | ||
| 0xffff800000000000..=0xffffbfff7ffefffe | ~64 TB | unused |
| <dynamic>..=<dynamic> | <dynamic> | kernel stacks |
| <dynamic>..=0xffffbfff7fffffff | <dynamic> | kernel TLS (thread local storage) |
| 0xffffbfff80000000..=0xffffffff7fffffff | 64 TB | direct mapping of all physical memory (PHYS_OFFSET) |
| 0xffffffff80000000..=0xffffffffffffffff | 2 GB | kernel (KERN_OFFSET) |
Sv57
| Address Range | Size | Description |
|---|---|---|
| 0x0000000000000000..=0x00ffffffffffffff | 64 PB | user-space virtual memory |
| 0x0100000000000000..=0xfeffffffffffffff | ~16K PB | hole of non-canonical virtual memory addresses |
| kernel-space virtual memory | ||
| 0xff00000000000000..=0xff7fffff7ffefffe | ~32 PB | unused |
| <dynamic>..=<dynamic> | <dynamic> | kernel stacks |
| <dynamic>..=0xff7fffff7fffffff | <dynamic> | kernel TLS (thread local storage) |
| 0xff7fffff80000000..=0xffffffff7fffffff | 32 PB | direct mapping of all physical memory (PHYS_OFFSET) |
| 0xffffffff80000000..=0xffffffffffffffff | 2 GB | kernel (KERN_OFFSET) |
Supported WASM Features & Proposals
This page documents all the WASM features, APIs and proposals that k23 supports. This list will be revised as time progresses and features are implemented.
Standardized Features
These features have been adopted into the WebAssembly standard and k23 aims to support all applicable features.
Proposals
These features are proposals for the WebAssembly standard. Many proposals change very frequently, and support for them will range from limited to non-existent. Additionally, some proposals may not be applicable to k23.
Explainer
- ✅: Implemented
- ❌: Not Implemented
- ?: The applicability of this feature is unclear, e.g. due to the lack of a detailed proposal.
- N/A: Not Applicable
WASI Features & Proposals
In addition to the main WASM features, k23 applications will interact with the host through WASI (WebAssembly System Interface) APIs. The following table lists all current proposals and their implementation status.
| Features | Status | Tracking Issue |
|---|---|---|
| I/O | ❌ | |
| Clocks | ❌ | |
| Random | ❌ | |
| Filesystem | ❌ | |
| Sockets | ❌ | |
| CLI | ❌ | |
| HTTP | ❌ | |
| Machine Learning | ❌ | |
| Clocks: Timezone | ❌ | |
| Blob Store | Not planned | |
| Crypto | ❌ | |
| Digital I/O | ? | |
| Distributed Lock Service | Not planned | |
| I2C | ❌ | |
| Key-value Store | ❌ | |
| Logging | ❌ | |
| Messaging | ❌ | |
| Observe | ❌ | |
| Parallel | ❌ | |
| Pattern Match | ? | |
| Runtime Config | ? | |
| SPI | ? | |
| SQL | ? | |
| SQL Embed | N/A | |
| Threads | Not planned | |
| URL | ? | |
| USB | ❌ | |
| WebGPU | ❌ |
Explainer
- ✅: Implemented
- ❌: Not Implemented
- ?: The applicability of this feature is unclear, e.g. due to the lack of a detailed proposal.
- N/A: Not Applicable