isaacwoods / poplar

Microkernel and userspace written in Rust exploring modern ideas

Home Page: https://poplar.isaacwoods.dev

License: Mozilla Public License 2.0

Rust 99.09% Assembly 0.75% HTML 0.08% CSS 0.05% Shell 0.03%

operating-system os rust microkernel kernel osdev risc-v x86-64

poplar's Introduction

Poplar

Poplar is a microkernel and userspace written in Rust, exploring modern ideas. It is not a UNIX, and does not aim for compatibility with existing software. It currently supports x86_64 and RISC-V.

The best way to learn about Poplar is to read the book. The website also hosts some other useful resources.

Building and running

Operating systems tend to be complex to build and run. We've tried to make this as simple as we can, but if you encounter problems or have suggestions to make it easier, feel free to file an issue :)

Getting the source

Firstly, clone the repository and fetch the submodules:

git clone https://github.com/IsaacWoods/poplar.git
cd poplar
git submodule update --init --recursive

Things you'll need

A nightly Rust toolchain
The rust-src component (install with rustup component add rust-src)
A working QEMU installation (one that provides qemu-system-{arch})

Building

This repository includes an xtask-based build tool to simplify building and running Poplar. The tool can be configured in Poplar.toml - this can, for example, be used to set whether to build using the release profile, and the architecture to build for.

Running cargo xtask dist will build a disk image
Running cargo xtask qemu will build a disk image, and then start emulating it in QEMU

See cargo xtask --help for more information about how to invoke the build system.

poplar's Issues

Install early exception handlers

Remaining tasks:

Probably just hardcode how many IST stacks we create, and their indices
Allocate them using the kernel stack allocator (potentially for each CPU, see below)
Install all IST stacks on every CPU (wait... do we need distinct stacks for each CPU? Imagine two CPUs page-faulting and then using the same stacks. Maybe check what other people do)
Put the correct IST indices in the IDT entries for exceptions

To nicely detect kernel stack overflows:

Maybe extract symbols of the guard page for the initial kernel stack
Make some way of detecting guard pages for non-initial kernel stacks
Check for them in the #PF handler
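
A minimal sketch of the guard-page check, assuming each kernel stack has a single guard page directly below its lowest address. The KernelStack type, the overflowed_stack function and the 4KiB page size are hypothetical names and values for illustration, not the kernel's actual API:

```rust
/// Page size assumed for this sketch (4KiB pages on x86_64).
pub const PAGE_SIZE: usize = 4096;

/// A kernel stack occupying `[bottom, top)`. Assumption: its guard page is
/// the single unmapped page directly below `bottom`.
#[derive(Clone, Copy, Debug)]
pub struct KernelStack {
    pub bottom: usize,
    pub top: usize,
}

impl KernelStack {
    /// Whether `address` falls inside this stack's guard page.
    pub fn guard_page_contains(&self, address: usize) -> bool {
        let guard_start = self.bottom.saturating_sub(PAGE_SIZE);
        (guard_start..self.bottom).contains(&address)
    }
}

/// Intended to be called from a page-fault handler with the faulting address
/// (the value of CR2 on x86_64). Returns the stack that overflowed, if any.
pub fn overflowed_stack(stacks: &[KernelStack], fault_address: usize) -> Option<&KernelStack> {
    stacks.iter().find(|stack| stack.guard_page_contains(fault_address))
}
```

A fault just below a stack's bottom is then reported as an overflow of that stack, while any other address falls through to normal page-fault handling.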

Documentation landing page

Index page at pebble-os.github.io/pebble that has links to:

Kernel documentation (currently here)
the book

Local APIC timer

I am planning to use the local APIC timer for a per-CPU scheduler timer to pre-empt greedy threads, so we'd ideally always like to be able to configure it correctly. We therefore need multiple back-ups to correctly find its frequency:

Check if we're running under a hypervisor and if it gives us the frequency
Get the core crystal clock frequency
Provide a suitable backup choice on known microarchitectures
Use another timer (such as the ACPI timer, or worst case the PIT) to time it for a known period of time

This suggests that KVM has special support for giving us the APIC frequency.
This also suggests that Linux has a dedicated CPUID leaf for the hypervisor to specify timer frequencies.
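
The fallback chain above could be sketched roughly as follows. FrequencySources, calibrate and apic_frequency are hypothetical names; how each source is actually queried (CPUID leaves, MSRs, reference timers) is left out, and the sketch only shows the ordering and the calibration arithmetic:

```rust
/// Hypothetical results of the first three detection methods. Each is `None`
/// if that method is unavailable on the current machine.
pub struct FrequencySources {
    pub hypervisor_hz: Option<u64>,
    pub crystal_hz: Option<u64>,
    pub microarch_default_hz: Option<u64>,
}

/// Frequency implied by counting `ticks` timer ticks over `period_ns`
/// nanoseconds of a reference timer (e.g. the ACPI timer or the PIT).
pub fn calibrate(ticks: u64, period_ns: u64) -> Option<u64> {
    if period_ns == 0 {
        return None;
    }
    // ticks per second = ticks * 1e9 / period_ns; u128 avoids overflow.
    Some(((ticks as u128 * 1_000_000_000) / period_ns as u128) as u64)
}

/// Try each source in order, only measuring against another timer (via the
/// `measure` callback, which returns `(ticks, period_ns)`) as a last resort.
pub fn apic_frequency(
    sources: &FrequencySources,
    measure: impl FnOnce() -> (u64, u64),
) -> Option<u64> {
    sources
        .hypervisor_hz
        .or(sources.crystal_hz)
        .or(sources.microarch_default_hz)
        .or_else(|| {
            let (ticks, period_ns) = measure();
            calibrate(ticks, period_ns)
        })
}
```

For example, counting 1,000,000 ticks over 10ms gives a frequency of 100MHz.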

Move to using the acpi library

To alleviate some maintenance burden, move to using the official multiboot2-elf64 and acpi libraries. This also allows us to use Pebble as a sort-of-testing framework for acpi (until we get real tests working)

To get back to where we were functionality wise:

Get RSDP and framebuffer tags upstreamed and deployed on multiboot2
Move to using the acpi crate for parsing ACPI tables
Make sure acpi can actually parse and validate QEMU's tables correctly
Parse the MADT in acpi and actually pass the information back correctly
Use that information to configure the APICs again

Libmessage test suite

libmessage forms the foundation of the entire message passing framework, and so needs a nice test suite. We should at least test that a bunch of stuff serializes and deserializes to form the same data. bincode seems to have a similar set of tests.

As libmessage is also used from within the kernel and parses data from userland, it needs to be resistant to any malicious malformed messages. At first, just making sure we can't get it to panic would be a good start.
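
As a rough illustration of the two properties above (round-tripping and not panicking on malformed input), here is a sketch against a toy length-prefixed format. This is not libmessage's real wire format or API; encode, decode and DecodeError are made up for the example:

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    UnexpectedEnd,
}

/// Toy format: a little-endian u32 length followed by that many bytes.
pub fn encode(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Decode without ever indexing out of bounds: every slice access goes
/// through `get`, so truncated or hostile input yields an error, not a panic.
pub fn decode(bytes: &[u8]) -> Result<&[u8], DecodeError> {
    let header = bytes.get(..4).ok_or(DecodeError::UnexpectedEnd)?;
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    // `checked_add` guards against overflow from a hostile length field.
    let end = 4usize.checked_add(len).ok_or(DecodeError::UnexpectedEnd)?;
    bytes.get(4..end).ok_or(DecodeError::UnexpectedEnd)
}
```

A test suite would then assert that decode(&encode(x)) returns x for a range of inputs, and that truncated buffers produce an error rather than a panic.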

Book and kernel docs links on the website are broken

Test harness

As the kernel gets larger and more complex, it would be great if we could perform some unit and regression testing to make sure we don't break stuff. The end goal is to have a test harness using some virtualisation framework to run the kernel on some emulator, and then be able to extract diagnostics from said emulator to record for any failed tests. It would also be nice if the actual syntax of the tests was nice and didn't interfere with kernel code too much, so maybe this should wait until external test harnesses are a thing?

Some interesting resources that other people in the Rust kernel space have come across are:

Switch from recursive mapping to full physical memory mapping

Switch away from using a recursive mapping to mapping the full physical address space into the kernel P4 entry. This uses surprisingly little physical memory (to manage the page tables) if we use large pages, and simplifies a few things:

Accessing page tables - we no longer need different types of mapping or the recursive mapping logic
Apparently recursive mappings are not possible on other architectures, so this brings x86_64 more in line with what future architectures will have to do
Simplifies accessing physical memory for other parts of the kernel (e.g. ACPI tables, APIC config spaces etc.). This allows us to remove the special PhysicalRegionMapper logic we've had to have up until now.
It may have better cache properties (I haven't looked into this myself so not sure)

To do:

Rewrite paging code to remove old stuff, support multiple page sizes, and use the new physical mapping
Construct the kernel page table in the bootloader using the new system
Write code to unmap an entry and unmap the stack guard-page again
Create the full physical mapping in the bootloader
Custom debug implementation for EntryFlags
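
To make the "surprisingly little physical memory" claim concrete, here is a sketch of the address translation and of the page-table cost when the mapping uses 1GiB pages (which assumes the CPU supports them). PHYSICAL_MAPPING_BASE is a hypothetical value, and only the P3 tables that hold the 1GiB entries are counted:

```rust
/// Hypothetical virtual base of the physical mapping; the real value depends
/// on the kernel's address space layout.
pub const PHYSICAL_MAPPING_BASE: u64 = 0xffff_8000_0000_0000;

const GIB: u64 = 1 << 30;
const TABLE_SIZE: u64 = 4096;
const ENTRIES_PER_TABLE: u64 = 512;

/// With the whole physical address space mapped at a fixed base, translating
/// a physical address is a single addition.
pub fn physical_to_virtual(physical: u64) -> u64 {
    PHYSICAL_MAPPING_BASE + physical
}

/// Bytes of P3 tables needed to map `physical_bytes` of memory with 1GiB
/// pages: each 4KiB table holds 512 entries, so one table covers 512GiB.
pub fn page_table_bytes(physical_bytes: u64) -> u64 {
    let pages = (physical_bytes + GIB - 1) / GIB;
    let tables = (pages + ENTRIES_PER_TABLE - 1) / ENTRIES_PER_TABLE;
    tables * TABLE_SIZE
}
```

Under these assumptions, mapping 64GiB of physical memory costs a single 4KiB table.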

Make distribution step of xtask build a builder pattern

I think this could be a lot cleaner, as it would allow the structure to retain information needed in multiple places (e.g. list of userspace tasks to build would be used to build each one, and then to add each to the EFI image)
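
One possible shape for such a builder is sketched below. Dist and its methods are hypothetical and do not reflect the real xtask code; plan returns a description of the steps instead of running them, so that the example stays self-contained:

```rust
/// Hypothetical builder for the distribution step. The list of userspace
/// tasks is stored once and reused by every stage that needs it.
#[derive(Debug, Default)]
pub struct Dist {
    release: bool,
    user_tasks: Vec<String>,
}

impl Dist {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn release(mut self, release: bool) -> Self {
        self.release = release;
        self
    }

    pub fn user_task(mut self, name: &str) -> Self {
        self.user_tasks.push(name.to_string());
        self
    }

    /// Describe the steps that would be run: build every task first, then
    /// add each built task to the image.
    pub fn plan(&self) -> Vec<String> {
        let profile = if self.release { "release" } else { "debug" };
        let builds = self.user_tasks.iter().map(|task| format!("build {} ({})", task, profile));
        let adds = self.user_tasks.iter().map(|task| format!("add {} to image", task));
        builds.chain(adds).collect()
    }
}
```

Usage would look like Dist::new().release(true).user_task("init").plan(), with each task named only once.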

Convert rest of `multiboot2` to use proper memory address types

The Multiboot stuff under kernel/x86_64/src/multiboot2 still uses primitive types (u64 mainly) for memory addresses, when it should really use PhysicalAddress or VirtualAddress respectively (both found under kernel/x86_64/memory/paging).

elf_sections.rs has already been looked at, which has simplified some code in remap_kernel (kernel/x86_64/memory/paging/mod.rs) greatly. It would be great to get the rest of this module done!

This is probably a good first issue with only a little bit of architectural knowledge required, and would be great for learning a little more about x86_64 memory addressing and paging and whatnot! I can also serve as a mentor if needed.

Move to EFISTUB-like bootloader on x86_64

Instead of having a separate bootloader on x86_64, do what Linux can do and construct a fake PE image that contains the kernel image, and move the UEFI bootstrapping code from the current bootloader into the kernel.

Microarch-specific backups for APIC frequency

Some microarchs have non-portable methods of getting the APIC timer frequency (checking MSRs, hardcoded values, etc.). If we fail to get the TSC frequency from cpuid in x86_64::hw::cpu::CpuInfo::apic_frequency, we should match on the microarch and see if we can use one of these methods.

Allow user to choose which arch to build for

Switch UEFI bootloader to use efiapi ABI

rust-lang/rust#65809 introduced a new ABI, efiapi, to interact with UEFI functions through. We should switch from the win64 ABI to the new one.

Rework how we allocate task kernel stacks

At the moment, we allocate a large block for each AddressSpace. However, the situation where you want 1024 tasks in a single address space seems very rare, and so while we're doing a redesign, we might as well change how this works into a large slab allocator and just allocate individual task kernel stacks.

We can use the BitmapArray trait to handle the larger stack bitmap.
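
A minimal sketch of the slot-allocation idea, using a single u64 bitmap rather than the BitmapArray trait mentioned above (whose interface is not shown here). StackSlots is a hypothetical type that only tracks which of 64 stack slots are in use:

```rust
/// Tracks up to 64 kernel stack slots; bit `n` is set when slot `n` is in use.
#[derive(Debug, Default)]
pub struct StackSlots {
    bitmap: u64,
}

impl StackSlots {
    pub const fn new() -> Self {
        StackSlots { bitmap: 0 }
    }

    /// Allocate the lowest free slot, or `None` if all 64 are taken.
    pub fn alloc(&mut self) -> Option<u32> {
        // The lowest clear bit of `bitmap` is the lowest set bit of its inverse.
        let index = (!self.bitmap).trailing_zeros();
        if index == 64 {
            return None;
        }
        self.bitmap |= 1 << index;
        Some(index)
    }

    /// Mark a slot as free again.
    pub fn free(&mut self, index: u32) {
        self.bitmap &= !(1 << index);
    }
}
```

A larger allocator would chain several such words together, which is presumably where a multi-word bitmap abstraction comes in.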

Include boot info in kernel memory map

Merge error in efiloader/src/main.rs

https://github.com/IsaacWoods/pebble/blob/e0a9a24a62908ed120bef03dcd29f1109fb4fe73/kernel/efiloader/src/main.rs#L453-L454

Two versions of the fn declaration for find_volume. I'd submit a PR but I can't tell which one is correct.

Move site generation over to Github Actions

We've moved building and testing over to Github Actions, but it doesn't yet deploy the site, docs, and book onto Github Pages. Once this is done, we can disable Travis CI completely.

Make arithmetic with addresses easier

The arithmetic operations on PhysicalAddress and VirtualAddress return Option<Self> because they can be made invalid / non-canonical (respectively) by arithmetic. This was meant to encourage proper error handling, but actually just means unwrap is scattered around the use-sites a bunch.

Instead, we should just panic for PhysicalAddresses, or make the address canonical manually for VirtualAddresses.

This will also allow us to implement += and friends 🎉
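
A sketch of what this could look like, assuming 48-bit canonical virtual addresses and a 52-bit physical address limit on x86_64. The types here are simplified stand-ins for the kernel's real PhysicalAddress and VirtualAddress:

```rust
use core::ops::{Add, AddAssign};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VirtualAddress(u64);

impl VirtualAddress {
    /// Make the address canonical by sign-extending bit 47 into bits 48..64.
    pub const fn new(address: u64) -> Self {
        VirtualAddress((((address << 16) as i64) >> 16) as u64)
    }
}

impl Add<u64> for VirtualAddress {
    type Output = Self;
    fn add(self, offset: u64) -> Self {
        // Re-canonicalise instead of returning an Option.
        VirtualAddress::new(self.0.wrapping_add(offset))
    }
}

impl AddAssign<u64> for VirtualAddress {
    fn add_assign(&mut self, offset: u64) {
        *self = *self + offset;
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PhysicalAddress(u64);

impl PhysicalAddress {
    /// Assumption: physical addresses are at most 52 bits wide.
    pub const MAX: u64 = (1 << 52) - 1;

    pub fn new(address: u64) -> Self {
        assert!(address <= Self::MAX, "physical address out of range");
        PhysicalAddress(address)
    }
}

impl Add<u64> for PhysicalAddress {
    type Output = Self;
    fn add(self, offset: u64) -> Self {
        // Panic on an invalid result rather than pushing an unwrap onto callers.
        PhysicalAddress::new(self.0.checked_add(offset).expect("physical address overflow"))
    }
}
```

With these impls, stepping a virtual address across the end of the lower half lands on the first canonical address of the upper half rather than producing None.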

Move bootloader and kernel to ufmt?

Maybe move the bootloader and kernel to ufmt, which is much smaller and faster.

Would it also be possible to lint against use of core::fmt, to make sure we don't include any code from it if we do move to ufmt?

Correctly propagate flags to parent paging structures

When we add new mappings for pages whose parent structures are already present, if they add new permissions, we need to propagate these permissions up the tree.
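
A sketch of the propagation step on raw x86_64 entry values. The bit positions are the architectural ones (bit 1 writable, bit 2 user-accessible, bit 63 no-execute), but propagate_flags and its slice-of-parents interface are hypothetical and far simpler than a real page-table walk:

```rust
pub const PRESENT: u64 = 1 << 0;
pub const WRITABLE: u64 = 1 << 1;
pub const USER_ACCESSIBLE: u64 = 1 << 2;
pub const NO_EXECUTE: u64 = 1 << 63;

/// Widen the permissions of every parent entry on the path to a new leaf
/// mapping. Writable and user-accessible must be set at every level to take
/// effect; no-execute at any level blocks execution, so it must be cleared
/// on the parents if the leaf is meant to be executable.
pub fn propagate_flags(parent_entries: &mut [u64], leaf_flags: u64) {
    let to_set = leaf_flags & (WRITABLE | USER_ACCESSIBLE);
    let leaf_is_executable = (leaf_flags & NO_EXECUTE) == 0;
    for entry in parent_entries.iter_mut() {
        *entry |= to_set;
        if leaf_is_executable {
            *entry &= !NO_EXECUTE;
        }
    }
}
```

Parents are only ever made more permissive here; the restrictive bits of the leaf itself stay on the leaf entry.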

Add book section for kernel objects

Safe attribute

I was thinking of writing a proc-macro to allow unsafe blocks to be annotated with a #[safe(...)] attribute. Depending on what level of bureaucracy is wanted (this would be more important if Pebble ever got more contributors), all sorts of stuff could be useful:

#[safe(
    reason = "bit patterns are the same",
    author = "IsaacWoods",
    reviewer = "IsaacWoods"
)]
let four_as_a_float = unsafe {
    mem::transmute::<u32, f32>(4u32)
};

Eventually, a check could be made on review (possibly using the Checks API) to make sure every unsafe block has one of these attributes.

Unsafety audit

At some point, I'd like to just make sure that all the places we use unsafe are actually safe. When this is complete:

All unsafe functions must have a doc-comment explaining how to use them safely
All uses of unsafe blocks must have a comment explaining why they're safe
We should see if we can replace any unsafe code with a safe equivalent

Migrate to xtask-style build system

The build tool could be a fair bit simpler. Use the xtask ecosystem: xshell, xflags, etc.

Move from reading a bootcmd file to the UEFI image arguments

UEFI applications can be passed command line arguments, which are either passed on the shell or can be included in a boot entry. These can be accessed from the application as the LoadOptions field of the structure passed back by the LoadedImage protocol.

Looks like we'd have to implement this protocol if we migrate to rust-osdev/uefi-rs

Move TFTP server out-of-tree and split it into a library

Travis build failing. Mine too.

Not sure if this is a problem for Pebble or the Rust nightly...

https://travis-ci.org/github/IsaacWoods/pebble/builds/682635635

I get the same failure with a local build as well.

mtnygard@spark:~/work/pebble/kernel$ make
cargo xbuild --target=x86_64-unknown-uefi --manifest-path efiloader/Cargo.toml
    Updating crates.io index
   Compiling core v0.0.0 (/home/mtnygard/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore)
   Compiling compiler_builtins v0.1.27
   Compiling rustc-std-workspace-core v1.99.0 (/home/mtnygard/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/tools/rustc-std-workspace-core)
error: <inline asm>:2:33: error: unknown flag
            .section .llvmbc,"e"
                                ^


error: <inline asm>:3:34: error: unknown flag
            .section .llvmcmd,"e"
                                 ^


error: aborting due to 2 previous errors

error: could not compile `rustc-std-workspace-core`.

To learn more, run the command again with --verbose.
warning: build failed, waiting for other jobs to finish...
error: <inline asm>:2:33: error: unknown flag
            .section .llvmbc,"e"
                                ^


error: <inline asm>:3:34: error: unknown flag
            .section .llvmcmd,"e"
                                 ^


error: <inline asm>:2:33: error: unknown flag
            .section .llvmbc,"e"
                                ^


error: <inline asm>:3:34: error: unknown flag
            .section .llvmcmd,"e"
                                 ^


error: <inline asm>:55:33: error: unknown flag
            .section .llvmbc,"e"
                                ^


error: <inline asm>:56:34: error: unknown flag
            .section .llvmcmd,"e"
                                 ^


error: aborting due to 4 previous errors

error: could not compile `compiler_builtins`.

To learn more, run the command again with --verbose.
warning: build failed, waiting for other jobs to finish...
error: aborting due to 2 previous errors

error: could not compile `core`.

To learn more, run the command again with --verbose.
error: `"/home/mtnygard/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo" "rustc" "-p" "alloc" "--release" "--manifest-path" "/tmp/cargo-xbuild.POPwZ5D42GUA/Cargo.toml" "--target" "x86_64-unknown-uefi" "--" "-Z" "force-unstable-if-unmarked"` failed with exit code: Some(101)
note: run with `RUST_BACKTRACE=1` for a backtrace
Makefile:8: recipe for target 'efiloader' failed
make: *** [efiloader] Error 1

Booting on real hardware

I would like to get Pebble booting on a Thinkpad T490, just because that's what I have lying around. This unfortunately does not have a serial port AFAIK, so let's hope it has good GOP support at least.

So far, I've managed to get efiloader booting, but it crashes soon after because there are no load options specified.

Tasks:

Don't panic if no load options are specified
If no load options, use sensible defaults (e.g. assume kernel is at \kernel.elf) and that we should try and create a framebuffer of some reasonable size
Test rendering to the framebuffer from inside the kernel
Create some sort of gfxconsole layer to print stuff nicely inside the kernel. I don't think we'll get a userspace working on real hardware for a while, so I think this is a good first step.
Build Butler infrastructure for creating an image on a USB stick or something (it would be nice for it to update existing images without needing to format the whole thing as well)

Allocate heap space for userspace tasks

Allow tasks to create a MemoryObject at a given virtual address (this will later be extended to "just give me a randomised part of the address space")

Tool to build bootable images

Such a tool would have to do a few high-level things

Construct a GPT disk image with the correct headers
Build the kernel and any userspace tasks we want
Construct a suitable bootcmd file
Build a FAT partition containing the bootloader, bootcmd, kernel image, and any driver images the bootcmd requests
(In the future) Build a PFS partition containing any files we want on it, maybe including the kernel and drivers (and just have the bootloader image on the FAT partition)

Select and switch to graphics mode

Moving build system

It's starting to feel like we're outgrowing Make, but I'm not really sure what to move to. The options are:

Keep Make and work out how to write the platform support in
- Pros: we have to change the absolute minimum, we get to keep the make muscle memory when building
- Cons: this is only a stop gap, Make is not that suited to what we're trying to do
Ninja
- This feels quite extra because Cargo deals with most of the actual build work, and we don't have any C/C++ code to build at all yet
- But it would probably be the most "oven-ready" solution
GN
Custom: Rust
- Pros: we already know Rust, we could do stuff like image generation easily in the build tool
- Cons: people would have to build the build system, then everything else
- Mitigation: use a Makefile/python/shell script to build and invoke the build tool easily
Custom: Python
- Pros: people could just run the script, instead of having to build the build system first
- Cons: I don't know python well enough for it not to be nasty
Custom: Haskell
- Pros: we know Haskell, would be nice to represent data structures we need
- Cons: people might not have Haskell installed, who builds a build system in Haskell?

System objects

System objects are kernel objects created by the kernel to provide some key resources to "system" userspace tasks, such as the linear framebuffer created by the bootloader, and PCI device config spaces. They will be accessed through a new system call, request_system_object, which should be a good test for our system call infrastructure (we need to pass a few bits of data back to userspace, so we should be testing our userspace validation code). We will also be expanding the capabilities enforcement infrastructure.

Create system call and add documentation
Create representations and libpebble interface
Create MemoryObject from kernel from linear framebuffer
Make capability for accessing linear framebuffer
Get framebuffer from simple_fb driver
Get drawing into framebuffer working
Use correct cache policy for framebuffer

Vestigial complexity in bootloader page table creation

We still manually allocate a frame for the kernel P4 table here - we should instead use the BootFrameAllocator. The entire create_page_table method can then be collapsed into uefi_main.

Boot on RISC-V

Unmap efiloader remnants once we're in the kernel

We need to map the loader's code and data into the kernel's page tables in efiloader, so that we can continue fetching code after we switch to those page tables but before we jump into the kernel. However, it's not great to keep these mappings lingering around for no reason, as we want to reclaim the physical memory.

Pass sections that remain mapped in the memory map
Mark them as free in the physical memory manager
Unmap them from the kernel's page tables

Tests for physical memory manager / buddy allocator

Looks like a bug has been lying in the buddy allocator since it was written, and we just haven't been allocating enough frames for it to show itself until now. We definitely need to write more thorough tests for this area.

efiloader: only search for an old RSDP if no v2 one is found

Leading on from discussion in rust-osdev/bootloader#172, efiloader is also looking for the RSDP incorrectly: we should search first for a v2 entry, and only if not present should we look for a v1 one.
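
The intended search order can be sketched as below. AcpiTable and find_rsdp are hypothetical; in the real loader the entries would come from the UEFI configuration table and be identified by GUID rather than by an enum:

```rust
/// Which ACPI configuration-table entry an address came from.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AcpiTable {
    V1,
    V2,
}

/// Prefer a v2 RSDP wherever it appears in the table, and only fall back to
/// a v1 entry if no v2 entry exists at all.
pub fn find_rsdp(entries: &[(AcpiTable, usize)]) -> Option<usize> {
    let find = |wanted: AcpiTable| {
        entries
            .iter()
            .find(|(kind, _)| *kind == wanted)
            .map(|&(_, address)| address)
    };
    find(AcpiTable::V2).or_else(|| find(AcpiTable::V1))
}
```

The key point is that the order of entries in the table does not matter: a v1 entry listed first must not win over a later v2 entry.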

Capabilities

Basic documentation in book
Work out how we want to export caps in images
Extract 32 bytes of caps from initial images and pass them to the kernel, pad with 0x00
Build in-kernel representation for storing and querying capabilities
Build a capability format parser in the kernel
Write some tests for the parser to make sure we can't break it (might be something we want to fuzz at a later stage)
Store capabilities for each Task
Procedural macro for defining capabilities from within userspace program
Enforce our first capability - EarlyLogging to make the early_log syscall
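
The "extract 32 bytes, pad with 0x00" step might look something like this sketch. pad_capabilities is a hypothetical helper, and rejecting over-long input (rather than truncating it) is an assumption, not something the task list specifies:

```rust
/// Size of the fixed capability buffer passed to the kernel.
pub const CAPS_LEN: usize = 32;

/// Copy the capability bytes found in an image into a fixed 32-byte buffer,
/// padding the remainder with 0x00. Returns `None` if the image provides more
/// bytes than fit (assumption: this is treated as an error).
pub fn pad_capabilities(raw: &[u8]) -> Option<[u8; CAPS_LEN]> {
    if raw.len() > CAPS_LEN {
        return None;
    }
    let mut padded = [0x00u8; CAPS_LEN];
    padded[..raw.len()].copy_from_slice(raw);
    Some(padded)
}
```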

Introduce a `PhysicalMapping` type

We have lots of places in the paging system where we take a VirtualAddress for the base of the physical address space mapping, and then talk about how the entirety of the address space needs to be mapped in comments over and over (e.g. here).

We could introduce a PhysicalMapping type that would wrap that VirtualAddress to draw attention that it needs to be created with care.
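
A sketch of such a wrapper, using simplified stand-in address types. Making the constructor unsafe is one possible way to "draw attention", and is an assumption of this example rather than a settled design:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VirtualAddress(pub usize);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PhysicalAddress(pub usize);

/// The base of a mapping of the entire physical address space. Holding one of
/// these is evidence that someone has promised the mapping really exists.
#[derive(Clone, Copy, Debug)]
pub struct PhysicalMapping {
    base: VirtualAddress,
}

impl PhysicalMapping {
    /// # Safety
    /// The caller must guarantee that the whole physical address space is
    /// mapped contiguously starting at `base`.
    pub const unsafe fn new(base: VirtualAddress) -> Self {
        PhysicalMapping { base }
    }

    /// Translate a physical address into its virtual address in the mapping.
    pub fn translate(&self, physical: PhysicalAddress) -> VirtualAddress {
        VirtualAddress(self.base.0 + physical.0)
    }
}
```

The safety requirement is then documented once on the constructor, and functions that take a PhysicalMapping no longer need to restate it.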

Rewrite kernel heap allocator to be more type safe

Use usize instead of u64 in address types
Use VirtualAddress type
More documentation

Boot on ARM64

TFTP server to netboot the RPi
Boilerplate for hal_aarch64 and board features
Entry point and setup (e.g. moving to EL1 and zeroing BSS)
Support for the UART on the RPi
MMU support

Centralise list of crates in Makefiles

In the build system, it would be better to have a list of crates, and to iterate over it for targets like clean and fmt, instead of trying to keep all of the targets in sync.

Install a back-up stack for page-faults

If we page-fault because we've overflowed the stack, we currently double fault. This isn't ideal, and so we should install another stack in the IST and switch to it when page-faults occur.

Bring up APs

I'm leaning towards using the UEFI MP protocol for this, as it means you don't have to faff about with timing IPIs and whatnot. We should bring up each AP in the bootloader, and each one could use the protocol to gain information about itself, and then jump into the kernel.

Add bindings for the MP protocol
Bring all the APs up using the protocol
Add an entry point for the kernel, expose it using a special symbol
Find the symbol in the bootloader for APs to jump to
Initialize each AP (they need to do less than the BSP)

Message passing

Format

Decide how to do chars (probably just hardcode as 4 bytes long)
Decide on format for dynamically sized elements (potential message size savings vs. simplicity)
Document the format

Serialization and deserialization

Write serializer
Write deserializer
Write test suite for well-formed and malformed messages (can run on host)
Write fuzzer (can run on host)

Kernel and userspace

Construct buffers
WriteView and ReadView
Decide how to nicely access process info from system call interface
Serialize into buffer
Deserialize out of buffer

How should we load new tasks?

At the moment, all tasks are loaded by the bootloader as "initial tasks". Later on, however, we want to be able to load new tasks from disk in userspace. As we're designing system calls, we should prepare for this eventuality. So to create a new task, we need to (in some sort of loader task, or using a library in the creating task?):

Create a new AddressSpace
Create a MemoryObject for each segment
Map those MemoryObjects into our address space and copy the data from the image
Also map each MemoryObject into the new AddressSpace
Create a Task in the new AddressSpace
Schedule the Task
Release our access to the created AddressSpace and MemoryObjects

The weird stack/page-table/general memory(?) corruption issue

Since starting to flesh out the syscall layer and userspace functionality, we've been seeing an on-and-off issue that presents in a couple of different ways:

We've seen parts of the address space, especially the user and kernel stacks, becoming unmapped
Other parts of the page tables are corrupted
We return to address 0x0 upon a sysret, even though the correct RIP is saved to the stack and (seems to be) restored to RCX (after a bunch of successful system calls)
On a version of QEMU built from source from the current tree, we get a #GP in userspace on a ret instruction after a sysret instead of a #PF from returning to address 0x0. Again, this is after a bunch of successful system calls.
Suspiciously, the presence and presentation of the issue seems to depend on the order and number of tasks loaded by efiloader and switched to by the scheduler. The presence of a second task can even change the behaviour of the first task, which suggests a deeper issue.

I am working on the assumption that all these issues are caused by a single elusive root issue that is causing UB that presents in strange ways, but this is not known for certain and there could well be multiple distinct issues. The most peculiar thing about this problem is that it has been 'fixed' a few times (notably by cbed8cd, which fixed it until 2b81d5d), but it always ends up showing back up with a (seemingly) unrelated change.

I'm using this issue to track progress on fixing the problem, which I imagine will also involve expanding our kernel test coverage to try and confirm that things are working as intended.

Replace boot info with more versatile tag-based approach

Our current boot info structure is not very flexible, hard-coding various limits, and also wastes a fair bit of space in normal cases. I think we could replace it with basically a bunch of separate tags, each of which holds one thing (e.g. RSDP, memory map, one loaded image, etc.). Some tags would be universal (defined in hal), and some architecture specific (defined in e.g. hal_x86_64).

This approach was inspired by the stivale2 format, so have another look at that first.
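
To illustrate the general idea, here is a sketch of walking a buffer of tags. The layout (a little-endian u32 kind, a u32 payload length, then the payload, with no alignment padding) is invented for this example and is neither stivale2's format nor a proposed Poplar format:

```rust
/// Iterates over `(kind, payload)` tags packed back-to-back in a byte buffer.
pub struct TagIter<'a> {
    rest: &'a [u8],
}

impl<'a> TagIter<'a> {
    pub fn new(buffer: &'a [u8]) -> Self {
        TagIter { rest: buffer }
    }
}

impl<'a> Iterator for TagIter<'a> {
    type Item = (u32, &'a [u8]);

    fn next(&mut self) -> Option<Self::Item> {
        let rest = self.rest;
        // Stop cleanly on a truncated header or payload instead of panicking.
        let header = rest.get(..8)?;
        let kind = u32::from_le_bytes([header[0], header[1], header[2], header[3]]);
        let len = u32::from_le_bytes([header[4], header[5], header[6], header[7]]) as usize;
        let end = 8usize.checked_add(len)?;
        let payload = rest.get(8..end)?;
        self.rest = &rest[end..];
        Some((kind, payload))
    }
}
```

Because each tag carries its own length, a consumer can skip kinds it does not understand, which is what lets universal and architecture-specific tags share one list.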