Decreasing BareMetal Cloud Cold Boot Time to Under 5ms
Cold boot time matters for VMs.
A fast cold boot matters because it turns startup from a fixed infrastructure penalty into a negligible part of execution. When a system can boot in a couple of milliseconds, compute can be provisioned only when needed, used for a specific job, and then destroyed immediately. This reduces latency, idle infrastructure, warm-pool waste, memory pressure, and attack exposure while improving autoscaling, isolation, and cost efficiency. The industries that benefit most are cloud/serverless platforms, edge computing, telecom/NFV, cybersecurity, finance, AI inference, gaming infrastructure, CI/CD, media processing, industrial control, defense, automotive, healthcare, and high-traffic e-commerce. The core value is that fast cold boot makes disposable, isolated compute practical at very small time and resource scales.
For BareMetal Cloud, booting and running a workload quickly is central to the design. It is intended for high-density, high-performance services that can be provisioned, run, and destroyed quickly.
Recent work has pushed the BareMetal Cloud cold boot time below 5ms inside KVM.
Standard KVM VMs
Before the recent changes, cold boot from firmware handoff to a running app payload took ~50ms. Changes were made in the two components that make up BareMetal Cloud: Pure64 (the kernel loader) and BareMetal (the kernel).
Pure64
Pure64 is responsible for preparing the machine before handing control to the BareMetal kernel. This means getting the system into a proper 64-bit mode if it was booted via BIOS, parsing the system memory map, and starting up any additional CPU cores into 64-bit mode as well.
The easiest 10ms win came from rewriting the ACPI RSDP search path.
Previously, the loader searched for the ACPI Root System Description Pointer (RSDP) more broadly than the specification requires. That wider search was needed for some older physical systems that didn’t adhere to the specification. It was safe, but every extra memory access costs time. The updated code follows the standard search regions: the EBDA and the BIOS ROM range from 0xE0000 to 0xFFFFF, with the signature checked on the expected 16-byte boundaries. The relevant Pure64 ACPI update changed src/init/acpi.asm and narrowed the BIOS ROM scan to 8192 iterations over that 128KiB region. (GitHub)
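The narrowed BIOS ROM scan can be sketched as follows. This is an illustrative Python model, not the Pure64 assembly in src/init/acpi.asm: `memory` is a hypothetical byte buffer standing in for physical memory, and the function names are assumptions made for this example. It checks only the 8-byte `RSD PTR ` signature at each 16-byte boundary. The EBDA search and checksum validation are left out.

```python
RSDP_SIGNATURE = b"RSD PTR "   # 8-byte signature defined by the ACPI specification
BIOS_ROM_START = 0xE0000       # first byte of the BIOS ROM search range
BIOS_ROM_END = 0xFFFFF         # last byte of the range (inclusive)
ALIGNMENT = 16                 # the signature only appears on 16-byte boundaries


def scan_count(start=BIOS_ROM_START, end=BIOS_ROM_END, step=ALIGNMENT):
    """Number of candidate addresses in [start, end] at the given alignment."""
    return (end - start + 1) // step


def find_rsdp(memory, start=BIOS_ROM_START, end=BIOS_ROM_END):
    """Return the address of the first aligned signature match, or None.

    `memory` is a bytes-like object indexed by physical address
    (a stand-in for real memory, assumed for this sketch).
    """
    for addr in range(start, end + 1, ALIGNMENT):
        if memory[addr:addr + len(RSDP_SIGNATURE)] == RSDP_SIGNATURE:
            return addr
    return None


if __name__ == "__main__":
    # 0x20000 bytes (128KiB) / 16 bytes per step
    print(scan_count())  # 8192
```

Stepping by 16 bytes is what bounds the loop at 8192 comparisons. A byte-by-byte search of the same region would need sixteen times as many.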
Pure64 previously took around 30ms. After the latest optimization work, it now ranges from roughly 0.5ms to 20ms depending on system configuration:
| Configuration | Pure64 startup time |
| --- | --- |
| Single CPU, no video output | ~0.5ms |
| Multiple CPUs, no video output | ~11ms |
| Multiple CPUs, video enabled | ~20ms |
The multi-CPU, no-video case is dominated by the existing SMP startup delays, which account for about 10.5ms (10,000µs for INIT plus 500µs for SIPI). The multi-CPU, video-enabled case adds the cost of video initialization on top. Video output can be configured by Pure64 but will not be used for BareMetal Cloud.
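The arithmetic behind those figures can be checked with a short sketch. It assumes the INIT and SIPI delays are paid once regardless of how many extra cores are started, and that the remaining multi-CPU work costs about the same as the single-CPU case. Neither assumption is stated in the measurements above, and the names are made up for this example.

```python
INIT_DELAY_US = 10_000          # INIT delay quoted in the text
SIPI_DELAY_US = 500             # SIPI delay quoted in the text
SINGLE_CPU_NO_VIDEO_MS = 0.5    # measured single-CPU, no-video startup
MULTI_CPU_VIDEO_MS = 20.0       # measured multi-CPU, video-enabled startup


def smp_delay_ms(init_us=INIT_DELAY_US, sipi_us=SIPI_DELAY_US):
    """Fixed SMP startup delay in milliseconds."""
    return (init_us + sipi_us) / 1000


def multi_cpu_estimate_ms(base_ms=SINGLE_CPU_NO_VIDEO_MS):
    """Estimated multi-CPU, no-video startup: baseline plus the SMP delays."""
    return base_ms + smp_delay_ms()


def implied_video_cost_ms(measured_ms=MULTI_CPU_VIDEO_MS):
    """Video initialization cost implied by the measured video-enabled case."""
    return measured_ms - multi_cpu_estimate_ms()


if __name__ == "__main__":
    print(smp_delay_ms())           # 10.5
    print(multi_cpu_estimate_ms())  # 11.0
    print(implied_video_cost_ms())  # 9.0
```

Under these assumptions the estimate matches the ~11ms measurement, and video initialization accounts for roughly 9ms of the ~20ms case.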
BareMetal
The BareMetal kernel also had two major boot-time improvements.
The first was removing unnecessary memory clearing.
Clearing memory feels harmless because it is simple and deterministic. But during cold boot, every cleared byte has a cost. BareMetal does not need to behave like a general-purpose kernel preparing for arbitrary future workloads and has a much narrower execution model. If a region does not need to be cleared for correctness, clearing it is waste.
The second major improvement was rewriting PCIe bus enumeration.
The previous PCIe enumeration code worked, but it was not efficient. The inefficiency came from helper functions left over from the older PCI configuration method, which goes through I/O ports 0xCF8 and 0xCFC. The new version steps through the PCIe memory range directly. The updated BareMetal kernel code will be published to the open-source BareMetal kernel repository shortly.
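The difference between the two access methods comes down to how a configuration register is addressed. The sketch below is not the BareMetal kernel code, which has not been published yet. It shows the standard address encodings for the legacy port method and for the PCIe memory-mapped range (ECAM). The `ECAM_BASE` value and all function names are hypothetical. A real kernel would read the base address from the ACPI MCFG table.

```python
ECAM_BASE = 0xB000_0000  # hypothetical base; normally taken from the ACPI MCFG table


def legacy_config_address(bus, device, function, offset):
    """Value written to port 0xCF8 before the data is read from port 0xCFC."""
    return (0x8000_0000 | (bus << 16) | (device << 11)
            | (function << 8) | (offset & 0xFC))


def ecam_address(base, bus, device, function, offset=0):
    """Memory address of a configuration register in the PCIe ECAM range."""
    return base + ((bus << 20) | (device << 15)
                   | (function << 12) | (offset & 0xFFF))


def ecam_functions(base, buses=1):
    """Yield (bus, device, function, address) for every 4KiB function slot."""
    for bus in range(buses):
        for device in range(32):
            for function in range(8):
                yield bus, device, function, ecam_address(base, bus, device, function)


if __name__ == "__main__":
    print(hex(legacy_config_address(1, 2, 3, 0x10)))      # 0x80011310
    print(hex(ecam_address(ECAM_BASE, 1, 2, 3, 0x10)))    # 0xb0113010
```

With the legacy method, every register read is a port write followed by a port read. With ECAM, each function's configuration space is a contiguous 4KiB block, so enumeration becomes a walk over memory at a fixed stride.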
These changes helped reduce the BareMetal kernel boot time from about 20ms to about 4ms.
Results
The current measured results are:
| Component | Before | After |
| --- | --- | --- |
| Pure64 | ~30ms | ~0.5ms to ~20ms |
| BareMetal | ~20ms | ~4ms |
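As a rough consistency check, the two components can be summed. This sketch assumes Pure64 and the BareMetal kernel run strictly one after the other and that the ~4ms kernel figure applies in every Pure64 configuration. The measurements above do not state the second point, and the names are illustrative.

```python
BEFORE_MS = {"Pure64": 30.0, "BareMetal": 20.0}
AFTER_BEST_MS = {"Pure64": 0.5, "BareMetal": 4.0}    # single CPU, no video
AFTER_WORST_MS = {"Pure64": 20.0, "BareMetal": 4.0}  # multiple CPUs, video enabled


def total_ms(components):
    """Total cold boot time, assuming the components run sequentially."""
    return sum(components.values())


if __name__ == "__main__":
    print(total_ms(BEFORE_MS))       # 50.0
    print(total_ms(AFTER_BEST_MS))   # 4.5
    print(total_ms(AFTER_WORST_MS))  # 24.0
```

The before total matches the ~50ms starting point, and the under-5ms figure corresponds to the single-CPU, no-video configuration.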
Firecracker Is Even Faster
Firecracker is different: it is not a typical virtual machine. There is no BIOS (the CPU starts in 64-bit mode) and no PCIe bus, since it uses VirtIO-MMIO devices instead. It is a heavily stripped-down VM that provides just enough hardware:
- VirtIO block and network devices via MMIO
- A serial console
- PS/2 keyboard controller (only used for sending Ctrl+Alt+Del)
- PIC/IOAPIC (Interrupt Controllers)
- KVM clock
Pure64 isn’t needed for booting on Firecracker. A new “BareMetal Init” was written to bridge the gap from the Firecracker handover to the BareMetal kernel start.
The VirtIO drivers for the BareMetal kernel needed to be reworked for MMIO, but most of the logic remained the same. All unused parts of the BareMetal kernel were removed: no PCIe/PCI bus, no HPET, no USB, and no linear frame buffer for graphics or VGA text mode. This resulted in a kernel binary of just under 8KiB. Total memory usage for the kernel was brought down from 4MiB to 2MiB.
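To illustrate what MMIO device discovery looks like without a PCI bus, the sketch below reads the first four registers of a VirtIO-MMIO device region as laid out in the VirtIO specification (magic value, version, device ID, vendor ID). It is not the BareMetal driver code. `region` is a hypothetical byte buffer standing in for the mapped device registers, and the function name is an assumption.

```python
import struct

VIRTIO_MMIO_MAGIC = 0x74726976            # "virt" as little-endian ASCII
DEVICE_TYPES = {1: "network", 2: "block"}  # device IDs from the VirtIO specification


def probe_virtio_mmio(region):
    """Decode the header of a VirtIO-MMIO region, or return None if absent.

    Register offsets: 0x000 magic, 0x004 version, 0x008 device ID, 0x00C vendor ID.
    """
    magic, version, device_id, vendor_id = struct.unpack_from("<IIII", region, 0)
    if magic != VIRTIO_MMIO_MAGIC:
        return None   # not a VirtIO-MMIO device
    if device_id == 0:
        return None   # placeholder slot with no device behind it
    return {
        "version": version,
        "device_id": device_id,
        "type": DEVICE_TYPES.get(device_id, "other"),
        "vendor_id": vendor_id,
    }


if __name__ == "__main__":
    # A fabricated 512-byte region describing a block device, for demonstration only.
    fake = struct.pack("<IIII", VIRTIO_MMIO_MAGIC, 2, 2, 0x554D4551) + bytes(0x1F0)
    print(probe_virtio_mmio(fake)["type"])  # block
```

Because each device sits at a known memory address, a probe like this replaces bus enumeration entirely, which fits with the removal of the PCIe/PCI code.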
These changes result in a cold start of 500-700 microseconds (<1ms) in Firecracker.
This is exactly the kind of environment BareMetal is designed for: a tiny, predictable machine model where the OS acts as the hardware abstraction layer.
Why This Matters
A sub-5ms cold boot changes how an operating system can be used.
BareMetal is not trying to be a traditional general-purpose OS. It is designed around a smaller, sharper model: provision, compute, destroy.
That makes cold boot performance directly relevant to real deployments:
- high-density VM workloads
- short-lived compute jobs
- unikernel-style services
- fast provision/compute/destroy infrastructure