fuse-backend-rs's Introduction

1. What is Cloud Hypervisor?

Cloud Hypervisor is an open source Virtual Machine Monitor (VMM) that runs on top of the KVM hypervisor and the Microsoft Hypervisor (MSHV).

The project focuses on running modern Cloud Workloads on specific, common hardware architectures. In this case Cloud Workloads refers to those that are run by customers inside a Cloud Service Provider. This means modern operating systems with most I/O handled by paravirtualised devices (e.g. virtio), no requirement for legacy devices, and 64-bit CPUs.

Cloud Hypervisor is implemented in Rust and is based on the Rust VMM crates.

Objectives

High Level

  • Runs on KVM or MSHV
  • Minimal emulation
  • Low latency
  • Low memory footprint
  • Low complexity
  • High performance
  • Small attack surface
  • 64-bit support only
  • CPU, memory, PCI hotplug
  • Machine to machine migration

Architectures

Cloud Hypervisor supports the x86-64 and AArch64 architectures. There are minor differences in functionality between the two architectures (see #1125).

Guest OS

Cloud Hypervisor supports 64-bit Linux and Windows 10/Windows Server 2019.

2. Getting Started

The following sections describe how to build and run Cloud Hypervisor.

Prerequisites for AArch64

  • AArch64 servers (recommended) or development boards equipped with the GICv3 interrupt controller.

Host OS

For the required KVM functionality and adequate performance, the recommended host kernel version is 5.13. The majority of the CI currently tests with kernel version 5.15.

Use Pre-built Binaries

The recommended approach to getting started with Cloud Hypervisor is by using a pre-built binary. Binaries are available for the latest release. Use cloud-hypervisor-static for x86-64 or cloud-hypervisor-static-aarch64 for the AArch64 platform.

Packages

For convenience, packages are also available targeting some popular Linux distributions. This is thanks to the Open Build Service. The OBS README explains how to enable the repository in a supported Linux distribution and install Cloud Hypervisor and accompanying packages. Please report any packaging issues in the obs-packaging repository.

Building from Source

Please see the instructions for building from source if you do not wish to use the pre-built binaries.

Booting Linux

Cloud Hypervisor supports direct kernel boot (for x86-64 the kernel must be built with PVH support, or be a bzImage) or booting via firmware (either Rust Hypervisor Firmware or an edk2 UEFI firmware called CLOUDHV / CLOUDHV_EFI).

Binary builds of the firmware files are available for the latest release of Rust Hypervisor Firmware and our edk2 repository.

The choice of firmware depends on your guest OS choice; some experimentation may be required.

Firmware Booting

Cloud Hypervisor supports booting disk images containing all needed components to run cloud workloads, a.k.a. cloud images.

The following sample commands download an Ubuntu cloud image, convert it into a format that Cloud Hypervisor can use, and fetch a firmware to boot the image with.

$ wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img
$ qemu-img convert -p -f qcow2 -O raw focal-server-cloudimg-amd64.img focal-server-cloudimg-amd64.raw
$ wget https://github.com/cloud-hypervisor/rust-hypervisor-firmware/releases/download/0.4.2/hypervisor-fw

The Ubuntu cloud images do not ship with a default password, so it is necessary to use a cloud-init disk image to customise the image on the first boot. A basic cloud-init image is generated by this script. This seeds the image with a default username/password of cloud/cloud123. It is only necessary to add this disk image on the first boot. The script also assigns a default IP address using the details in test_data/cloud-init/ubuntu/local/network-config, to be used together with the --net "mac=12:34:56:78:90:ab,tap=" option; the interface with the matching MAC address is then configured as per the network-config details.

$ sudo setcap cap_net_admin+ep ./cloud-hypervisor
$ ./create-cloud-init.sh
$ ./cloud-hypervisor \
	--kernel ./hypervisor-fw \
	--disk path=focal-server-cloudimg-amd64.raw path=/tmp/ubuntu-cloudinit.img \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="

If access to the firmware messages or interaction with the boot loader (e.g. GRUB) is required then it is necessary to switch to the serial console instead of virtio-console.

$ ./cloud-hypervisor \
	--kernel ./hypervisor-fw \
	--disk path=focal-server-cloudimg-amd64.raw path=/tmp/ubuntu-cloudinit.img \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask=" \
	--serial tty \
	--console off

Custom Kernel and Disk Image

Building your Kernel

Cloud Hypervisor also supports direct kernel boot. For x86-64, a vmlinux ELF kernel (compiled with PVH support) or a regular bzImage is supported. To support development there is a custom branch; however, provided the required options are enabled, any recent kernel will suffice.

To build the kernel:

# Clone the Cloud Hypervisor Linux branch
$ git clone --depth 1 https://github.com/cloud-hypervisor/linux.git -b ch-6.2 linux-cloud-hypervisor
$ pushd linux-cloud-hypervisor
# Use the x86-64 cloud-hypervisor kernel config to build your kernel for x86-64
$ wget https://raw.githubusercontent.com/cloud-hypervisor/cloud-hypervisor/main/resources/linux-config-x86_64
# Use the AArch64 cloud-hypervisor kernel config to build your kernel for AArch64
$ wget https://raw.githubusercontent.com/cloud-hypervisor/cloud-hypervisor/main/resources/linux-config-aarch64
$ cp linux-config-x86_64 .config  # x86-64
$ cp linux-config-aarch64 .config # AArch64
# Do native build of the x86-64 kernel
$ KCFLAGS="-Wa,-mx86-used-note=no" make bzImage -j `nproc`
# Do native build of the AArch64 kernel
$ make -j `nproc`
$ popd

For x86-64, the vmlinux kernel image will then be located at linux-cloud-hypervisor/arch/x86/boot/compressed/vmlinux.bin. For AArch64, the Image kernel image will then be located at linux-cloud-hypervisor/arch/arm64/boot/Image.

Disk image

For the disk image the same Ubuntu image as before can be used. This contains an ext4 root filesystem.

$ wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img # x86-64
$ wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-arm64.img # AArch64
$ qemu-img convert -p -f qcow2 -O raw focal-server-cloudimg-amd64.img focal-server-cloudimg-amd64.raw # x86-64
$ qemu-img convert -p -f qcow2 -O raw focal-server-cloudimg-arm64.img focal-server-cloudimg-arm64.raw # AArch64

Booting the guest VM

These sample commands boot the disk image using the custom kernel whilst also supplying the desired kernel command line.

  • x86-64
$ sudo setcap cap_net_admin+ep ./cloud-hypervisor
$ ./create-cloud-init.sh
$ ./cloud-hypervisor \
	--kernel ./linux-cloud-hypervisor/arch/x86/boot/compressed/vmlinux.bin \
	--disk path=focal-server-cloudimg-amd64.raw path=/tmp/ubuntu-cloudinit.img \
	--cmdline "console=hvc0 root=/dev/vda1 rw" \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="
  • AArch64
$ sudo setcap cap_net_admin+ep ./cloud-hypervisor
$ ./create-cloud-init.sh
$ ./cloud-hypervisor \
	--kernel ./linux-cloud-hypervisor/arch/arm64/boot/Image \
	--disk path=focal-server-cloudimg-arm64.raw path=/tmp/ubuntu-cloudinit.img \
	--cmdline "console=hvc0 root=/dev/vda1 rw" \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="

If earlier kernel messages are required the serial console should be used instead of virtio-console.

  • x86-64
$ ./cloud-hypervisor \
	--kernel ./linux-cloud-hypervisor/arch/x86/boot/compressed/vmlinux.bin \
	--console off \
	--serial tty \
	--disk path=focal-server-cloudimg-amd64.raw \
	--cmdline "console=ttyS0 root=/dev/vda1 rw" \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="
  • AArch64
$ ./cloud-hypervisor \
	--kernel ./linux-cloud-hypervisor/arch/arm64/boot/Image \
	--console off \
	--serial tty \
	--disk path=focal-server-cloudimg-arm64.raw \
	--cmdline "console=ttyAMA0 root=/dev/vda1 rw" \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="

3. Status

Cloud Hypervisor is under active development. The following stability guarantees are currently made:

  • The API (including command line options) will not be removed or changed in a breaking way without a minimum of two major releases' notice. Where possible, warnings will be given about the use of deprecated functionality, and the deprecations will be documented in the release notes.

  • Point releases will be made between individual releases where there are substantial bug fixes or security issues that need to be fixed. These point releases will only include bug fixes.

Currently the following items are not guaranteed across updates:

  • Snapshot/restore is not supported across different versions
  • Live migration is not supported across different versions
  • The following features are considered experimental and may change substantially between releases: TDX, vfio-user, vDPA.

Further details can be found in the release documentation.

As of 2023-01-03, the following cloud images are supported:

Direct kernel boot to userspace should work with a rootfs from most distributions, although you may need to enable exotic filesystem types in the reference kernel configuration (e.g. XFS or btrfs).

Hot Plug

Cloud Hypervisor supports hotplug of CPUs, passthrough devices (VFIO), virtio-{net,block,pmem,fs,vsock} and memory resizing. This document details how to add devices to a running VM.

Device Model

Details of the device model can be found in this documentation.

Roadmap

The project roadmap is tracked through a GitHub project.

4. Relationship with Rust VMM Project

In order to satisfy the design goal of having a high-performance, security-focused hypervisor the decision was made to use the Rust programming language. The language's strong focus on memory and thread safety makes it an ideal candidate for implementing VMMs.

Instead of implementing the VMM components from scratch, Cloud Hypervisor imports the Rust VMM crates and shares code and architecture with other VMMs such as Amazon's Firecracker and Google's crosvm.

Cloud Hypervisor embraces the Rust VMM project's goal, which is to share and re-use as many virtualization crates as possible.

Differences with Firecracker and crosvm

A large part of the Cloud Hypervisor code is based on either the Firecracker or the crosvm project's implementations. Both of these are VMMs written in Rust with a focus on safety and security, like Cloud Hypervisor.

The goal of the Cloud Hypervisor project differs from the aforementioned projects in that it aims to be a general purpose VMM for Cloud Workloads and not limited to container/serverless or client workloads.

The Cloud Hypervisor community thanks the communities of both the Firecracker and crosvm projects for their excellent work.

5. Community

The Cloud Hypervisor project follows the governance and community guidelines described in the Community repository.

Contribute

The project strongly believes in building a global, diverse and collaborative community around the Cloud Hypervisor project. Anyone who is interested in contributing to the project is welcome to participate.

Contributing to an open source project like Cloud Hypervisor covers a lot more than just sending code. Testing, documentation, pull request reviews, bug reports, feature requests, project improvement suggestions, etc., are all equally welcome means of contribution. See the CONTRIBUTING document for more details.

Slack

Get an invite to our Slack channel, join us on Slack, and participate in our community activities.

Mailing list

Please report bugs using the GitHub issue tracker but for broader community discussions you may use our mailing list.

Security issues

Please contact the maintainers listed in the MAINTAINERS.md file with security issues.

fuse-backend-rs's People

Contributors

00xc, adamqqqplay, akitasummer, bergwolf, cbrewster, champ-goblem, changweige, eryugey, griff, h56983577, imeoer, jiangliu, justxuewei, killagu, liubogithub, loheagn, matthiasgoergens, mofishzz, sctb512, tim-zhang, uran0sh, weizhang555, wllenyj, xuejun-xj, xujihui1985, yyyeerbo, zhangjaycee, zizhengbian, zyfjeff

fuse-backend-rs's Issues

Conflicting PERFILE_DAX flag

I only became aware of this project because Vivek Goyal pointed me to it on the fsdevel list, and I checked the flags it uses because of this patch: https://marc.info/?l=linux-fsdevel&m=165002361802294&w=2
I then noticed that the PERFILE_DAX flag conflicts with FUSE_INIT_EXT.

Looks like you are on a non-upstream kernel with patches?

linux master include/uapi/linux/fuse.h

...
#define FUSE_INIT_EXT (1 << 30)
#define FUSE_INIT_RESERVED (1 << 31)
/* bits 32..63 get shifted down 32 bits into the flags2 field */
#define FUSE_SECURITY_CTX (1ULL << 32)
#define FUSE_HAS_INODE_DAX (1ULL << 33)

Btw, is there any reason you are not using 1 << number for the flags? In my personal opinion it is so much easier to read...
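
For illustration only (the constant names and the PERFILE_DAX value below are taken from the conflict described above, not from the crate's source), the overlap boils down to two flags claiming the same bit:

fn main() {
    // Upstream include/uapi/linux/fuse.h assigns bit 30 to FUSE_INIT_EXT.
    const FUSE_INIT_EXT: u64 = 1 << 30;
    // Assumed local definition that ended up on the same bit.
    const FUSE_PERFILE_DAX: u64 = 1 << 30;

    // Prints "collision: true" while the two flags share a bit; once PERFILE_DAX
    // moves to a free bit, the same expression could become a const assertion.
    println!("collision: {}", (FUSE_INIT_EXT & FUSE_PERFILE_DAX) != 0);
}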

XFSTests to Validate Functionality

We recently came across the xfstest suite used by the Linux kernel to test and verify filesystem patches.

https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/

This suite supports various filesystem types, including fuse and virtiofs, and provides a wide range of tests to validate a number of conditions. We think it would be beneficial for this crate to integrate it as part of the testing regime, as another step to help avoid regressions and bugs that could make it into releases.

For example, we have run this set of tests in a containerised environment making use of nydus 2.2.0 (which is using fuse-backend-rs version 1.10.0) provisioned with Kata 3.0.2. In total 18 out of 589 tests failed:

Failures: generic/007 generic/013 generic/088 generic/245 generic/257 generic/258 generic/263 generic/430 generic/431 generic/432 generic/433 generic/434 generic/504 generic/564 generic/571 generic/632 generic/637 generic/639
Failed 18 of 589 tests

Nydus 2.1.0 (fuse-backend-rs 0.9) fails 21 out of 589 tests:

Failures: generic/007 generic/013 generic/088 generic/131 generic/245 generic/247 generic/257 generic/258 generic/263 generic/430 generic/431 generic/432 generic/433 generic/434 generic/478 generic/504 generic/564 generic/571 generic/632 generic/637 generic/639
Failed 21 of 589 tests

Provisioning this is fairly straightforward and can be repeated with the following steps:

  • Start a new VM with a virtiofs device attached, one which uses this crate for its functionality
  • Log in to the VM and clone the above repo and build it
  • Mount the virtiofs device into the VM
    • e.g. mount -t virtiofs sharedFS /mnt
  • In the source directory start the tests against the virtiofs device
    • TEST_DIR=/mnt TEST_DEV=sharedFS ./check -virtiofs
  • The tests should run and output the failure results, you can check the test contents and expected output by checking the files under ./tests/**
    • The results after running the tests can be found under ./results/**

For context we found that the golang fuse library has run these tests in order to verify its functionality:

https://github.com/hanwen/go-fuse/issues?q=is%3Aissue+xfstest

On a side note, we have noticed with more recent versions of nydus that there have been some problems with stateful workloads; for example, MySQL and Minio have issues starting which look to be filesystem related. We are hoping that these tests will pick up any potential edge cases as, understandably, filesystems are very complex.

Customize the permissions of the VFS mountpoint directory

The current permissions of the root directory are a default value; this cannot be customized.

    fn get_entry(&self, ino: u64) -> Entry {
        let mut attr = Attr {
            ..Default::default()
        };
        attr.ino = ino;
        #[cfg(target_os = "linux")]
        {
            attr.mode = libc::S_IFDIR | libc::S_IRWXU | libc::S_IRWXG | libc::S_IRWXO;
        }
        #[cfg(target_os = "macos")]
        {
            attr.mode = (libc::S_IFDIR | libc::S_IRWXU | libc::S_IRWXG | libc::S_IRWXO) as u32;
        }

The root inode of the VFS is a pseudo-inode, and its current implementation always returns a default permission that cannot be customized. We want the VFS to expose an interface to set the default Attr.
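
A hypothetical sketch of what such an interface could look like (the types and field below are placeholders, not the crate's actual VfsOptions API):

// Hypothetical sketch only: not the crate's API. It just illustrates the shape
// of "an interface to set the default Attr" for the pseudo root inode.
#[derive(Clone, Copy, Default)]
pub struct RootAttr {
    pub mode: u32,
    pub uid: u32,
    pub gid: u32,
}

pub struct VfsOptionsSketch {
    /// When set, get_entry() for the pseudo root would use these values
    /// instead of the hard-coded S_IFDIR | S_IRWXU | S_IRWXG | S_IRWXO default shown above.
    pub root_attr: Option<RootAttr>,
}

impl VfsOptionsSketch {
    pub fn set_root_attr(&mut self, attr: RootAttr) {
        self.root_attr = Some(attr);
    }
}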

Bug in MacOS-CI

The macOS CI reports an error in macos_session.rs which blocks all pull requests.

Should we fix the disk type in the FuseSession struct?

error: usage of an `Arc` that is not `Send` or `Sync`
   --> src/transport/fusedev/macos_session.rs:117:19
    |
117 |             disk: Arc::new(Mutex::new(None)),
    |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: the trait `Send` is not implemented for `Mutex<Option<*const __DADisk>>`
    = note: the trait `Sync` is not implemented for `Mutex<Option<*const __DADisk>>`
    = note: required for `Arc<Mutex<Option<*const __DADisk>>>` to implement `Send` and `Sync`
    = help: consider using an `Rc` instead or wrapping the inner type with a `Mutex`
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#arc_with_non_send_sync
    = note: `-D clippy::arc-with-non-send-sync` implied by `-D warnings`
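
One possible pattern for this kind of lint, sketched here with a dummy opaque type standing in for *const __DADisk (so this is only an illustration of the approach, not the actual fix for FuseSession), is to wrap the raw pointer in a newtype and assert Send/Sync explicitly after reviewing the thread-safety of the underlying object:

use std::sync::{Arc, Mutex};

// Stand-in for the DiskArbitration handle; illustration only.
#[repr(C)]
struct OpaqueDisk {
    _private: [u8; 0],
}

/// Newtype around the raw pointer so Send/Sync can be asserted explicitly.
struct DiskHandle(*const OpaqueDisk);

// SAFETY: only valid if access to the underlying object is externally
// synchronized (here, by the surrounding Mutex) and the pointee is not
// thread-affine. This must be reviewed for the real __DADisk type.
unsafe impl Send for DiskHandle {}
unsafe impl Sync for DiskHandle {}

struct SessionSketch {
    // No longer triggers clippy::arc_with_non_send_sync, because the
    // wrapped type is now Send + Sync.
    disk: Arc<Mutex<Option<DiskHandle>>>,
}

fn main() {
    let s = SessionSketch {
        disk: Arc::new(Mutex::new(None)),
    };
    assert!(s.disk.lock().unwrap().is_none());
}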

M2 async-io compilation issues

Hey team,

I'm trying to compile the project with the async-io feature enabled on an M2 (macOS) and am getting the following error:

/registry/src/index.crates.io-6f17d22bba15001f/io-uring-0.5.13/src/util.rs:19:42

19   |                 libc::MAP_SHARED | libc::MAP_POPULATE,
     |                                          ^^^^^^^^^^^^ help: a constant with a similar name exists: `MAP_PRIVATE`
     |
    ::: /Users/phristov/.cargo/registry/src/index.crates.io-6f17d22bba15001f/libc-0.2.154/src/unix/bsd/apple/mod.rs:3334:1
     |
3334 | pub const MAP_PRIVATE: ::c_int = 0x0002;

Cargo.toml

[package]
name = "my-project"
version = "0.1.0"
edition = "2021"
build = "build.rs"

[dependencies]
fuse-backend-rs = { version = "0.12.0", features = ["async-io", "fuse-t"] }
futures = "0.3.30"
libc = "0.2.68"
signal-hook = "0.3.17"
async-trait = "0.1.80"

main.rs

use std::io::{Result as StdResult};
use fuse_backend_rs::api::filesystem::{AsyncFileSystem};

#[tokio::main]
async fn main() -> StdResult<()> {
  println!("hello wordl!");

    Ok(())
}

Any ideas on the best way to work around the issue?

proposal: use mio poll instead of epoll to support macfuse

Currently fuse-backend-rs uses epoll to poll events from the fuse device, and epoll is platform specific. In order to support macfuse we need to use kqueue when polling events from the underlying fd; mio is a mature option for cross-platform support.
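
As a rough sketch of that direction (not code from the crate; it assumes mio with the "os-poll" and "os-ext" features and a fuse device fd obtained elsewhere), mio's SourceFd can wait for readability on the fd and maps to epoll on Linux and kqueue on macOS:

use std::io;
use std::os::unix::io::RawFd;

use mio::unix::SourceFd;
use mio::{Events, Interest, Poll, Token};

const FUSE_DEV: Token = Token(0);

/// Wait until `fuse_fd` becomes readable, using mio's cross-platform poller
/// instead of calling epoll directly.
fn wait_readable(fuse_fd: RawFd) -> io::Result<()> {
    let mut poll = Poll::new()?;
    let mut events = Events::with_capacity(16);

    poll.registry()
        .register(&mut SourceFd(&fuse_fd), FUSE_DEV, Interest::READABLE)?;

    loop {
        poll.poll(&mut events, None)?;
        for event in events.iter() {
            if event.token() == FUSE_DEV && event.is_readable() {
                return Ok(());
            }
        }
    }
}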

How to efficiently implement a drop file system

I'm defining a drop file system as a read-only pass-through file system that silently ignores write operations.

My plan: for open/read, we do the open and forward the reads. For create/write, we behave like /dev/null.
For read directory, if the directory exists, read it normally; otherwise treat it like an empty directory.

Reading from or writing to a file descriptor that doesn't exist is undefined; I'm thinking that it should not throw an error and instead behave like /dev/null, unless it's faster to throw an error.

Getting information for files that don't exist acts like /dev/null.

It would be a lot easier if pass-through were a trait with a default implementation that actually does pass through.

I was thinking I would have a pass-through in my struct, forward valid operations, and mask failures. Also, always disable write caching. My understanding is that there is only one trait I need to implement, and that's BackendFileSystem or FileSystem.

I was looking at the pass through file system defined here, as well as https://github.com/libfuse/libfuse/blob/master/example/passthrough_hp.cc My understanding is that passthrough_hp.cc doesn't do any unnecessary copies.

  1. Does the pass through do unnecessary copies?
  2. Do I always need synchronous io?
  3. Is there anything like memory mapping that I have to worry about? I really don't want the user to be able to write to the underlying FS under any circumstance.
  4. Should I disable write caching?
  5. Would it be more performant to implement a black hole (acts like an empty directory and ignores all writes and creations) and then set up an overlay?
  6. BackendFileSystem vs FileSystem
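
For reference, here is a minimal sketch of the wrapping idea (forward reads, silently absorb writes). It deliberately uses a simplified stand-in trait rather than the crate's FileSystem/BackendFileSystem signatures, so it only shows the structure; a real implementation would forward the corresponding fuse-backend-rs methods instead:

use std::io;

// Simplified stand-in for the real filesystem trait; signatures here are
// illustrative only and do not match fuse-backend-rs.
trait SimpleFs {
    fn read(&self, path: &str, offset: u64, size: usize) -> io::Result<Vec<u8>>;
    fn write(&self, path: &str, offset: u64, data: &[u8]) -> io::Result<usize>;
}

/// Wraps an inner (pass-through style) filesystem, forwarding reads and
/// pretending writes succeeded, similar to writing to /dev/null.
struct DropFs<T: SimpleFs> {
    inner: T,
}

impl<T: SimpleFs> SimpleFs for DropFs<T> {
    fn read(&self, path: &str, offset: u64, size: usize) -> io::Result<Vec<u8>> {
        // Forward reads; a missing file could also be masked as empty here.
        self.inner.read(path, offset, size)
    }

    fn write(&self, _path: &str, _offset: u64, data: &[u8]) -> io::Result<usize> {
        // Report success without touching the inner filesystem.
        Ok(data.len())
    }
}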

Status of async support

Hello, I am working on a virtual filesystem using fuse-backend-rs for tvix, a Rust implementation of Nix. Most of the underlying filesystem is built on top of async Rust + tokio, so it would be great to be able to use the AsyncFileSystem trait. I've tried doing this but I've hit some issues figuring out how to drive the filesystem with both FUSE and virtiofs.

For FUSE it looks like there is some code to support driving async FUSE tasks:

#[cfg(feature = "async_io")]
pub use asyncio::FuseDevTask;

#[cfg(feature = "async_io")]
/// Task context to handle fuse request in asynchronous mode.
mod asyncio {
    use std::os::unix::io::RawFd;
    use std::sync::Arc;

    use crate::api::filesystem::AsyncFileSystem;
    use crate::api::server::Server;
    use crate::transport::{FuseBuf, Reader, Writer};

    /// Task context to handle fuse request in asynchronous mode.
    ///
    /// This structure provides a context to handle fuse request in asynchronous mode, including
    /// the fuse fd, a internal buffer and a `Server` instance to serve requests.
    ///
    /// ## Examples
    /// ```ignore
    /// let buf_size = 0x1_0000;
    /// let state = AsyncExecutorState::new();
    /// let mut task = FuseDevTask::new(buf_size, fuse_dev_fd, fs_server, state.clone());
    ///
    /// // Run the task
    /// executor.spawn(async move { task.poll_handler().await });
    ///
    /// // Stop the task
    /// state.quiesce();
    /// ```
    pub struct FuseDevTask<F: AsyncFileSystem + Sync> {
        fd: RawFd,
        buf: Vec<u8>,
        state: AsyncExecutorState,
        server: Arc<Server<F>>,
    }

    impl<F: AsyncFileSystem + Sync> FuseDevTask<F> {
        /// Create a new fuse task context for asynchronous IO.
        ///
        /// # Parameters
        /// - buf_size: size of buffer to receive requests from/send reply to the fuse fd
        /// - fd: fuse device file descriptor
        /// - server: `Server` instance to serve requests from the fuse fd
        /// - state: shared state object to control the task object
        ///
        /// # Safety
        /// The caller must ensure `fd` is valid during the lifetime of the returned task object.
        pub fn new(
            buf_size: usize,
            fd: RawFd,
            server: Arc<Server<F>>,
            state: AsyncExecutorState,
        ) -> Self {
            FuseDevTask {
                fd,
                server,
                state,
                buf: vec![0x0u8; buf_size],
            }
        }

        /// Handler to process fuse requests in asynchronous mode.
        ///
        /// An async fn to handle requests from the fuse fd. It works in asynchronous IO mode when:
        /// - receiving request from fuse fd
        /// - handling requests by calling Server::async_handle_requests()
        /// - sending reply to fuse fd
        ///
        /// The async fn repeatedly return Poll::Pending when polled until the state has been set
        /// to quiesce mode.
        pub async fn poll_handler(&mut self) {
            // TODO: register self.buf as io uring buffers.
            let drive = AsyncDriver::default();

            while !self.state.quiescing() {
                let result = AsyncUtil::read(drive.clone(), self.fd, &mut self.buf, 0).await;
                match result {
                    Ok(len) => {
                        // ###############################################
                        // Note: it's a heavy hack to reuse the same underlying data
                        // buffer for both Reader and Writer, in order to reduce memory
                        // consumption. Here we assume Reader won't be used anymore once
                        // we start to write to the Writer. To get rid of this hack,
                        // just allocate a dedicated data buffer for Writer.
                        let buf = unsafe {
                            std::slice::from_raw_parts_mut(self.buf.as_mut_ptr(), self.buf.len())
                        };
                        // Reader::new() and Writer::new() should always return success.
                        let reader =
                            Reader::<()>::new(FuseBuf::new(&mut self.buf[0..len])).unwrap();
                        let writer = Writer::new(self.fd, buf).unwrap();
                        let result = unsafe {
                            self.server
                                .async_handle_message(drive.clone(), reader, writer, None, None)
                                .await
                        };

                        if let Err(e) = result {
                            // TODO: error handling
                            error!("failed to handle fuse request, {}", e);
                        }
                    }
                    Err(e) => {
                        // TODO: error handling
                        error!("failed to read request from fuse device fd, {}", e);
                    }
                }
            }

            // TODO: unregister self.buf as io uring buffers.

            // Report that the task has been quiesced.
            self.state.report();
        }
    }

    impl<F: AsyncFileSystem + Sync> Clone for FuseDevTask<F> {
        fn clone(&self) -> Self {
            FuseDevTask {
                fd: self.fd,
                server: self.server.clone(),
                state: self.state.clone(),
                buf: vec![0x0u8; self.buf.capacity()],
            }
        }
    }

    #[cfg(test)]
    mod tests {
        use std::os::unix::io::AsRawFd;

        use super::*;
        use crate::api::{Vfs, VfsOptions};
        use crate::async_util::{AsyncDriver, AsyncExecutor};

        #[test]
        fn test_fuse_task() {
            let state = AsyncExecutorState::new();
            let fs = Vfs::<AsyncDriver, ()>::new(VfsOptions::default());
            let _server = Arc::new(Server::<Vfs<AsyncDriver, ()>, AsyncDriver, ()>::new(fs));
            let file = vmm_sys_util::tempfile::TempFile::new().unwrap();
            let _fd = file.as_file().as_raw_fd();

            let mut executor = AsyncExecutor::new(32);
            executor.setup().unwrap();

            /*
            // Create three tasks, which could handle three concurrent fuse requests.
            let mut task = FuseDevTask::new(0x1000, fd, server.clone(), state.clone());
            executor
                .spawn(async move { task.poll_handler().await })
                .unwrap();
            let mut task = FuseDevTask::new(0x1000, fd, server.clone(), state.clone());
            executor
                .spawn(async move { task.poll_handler().await })
                .unwrap();
            let mut task = FuseDevTask::new(0x1000, fd, server.clone(), state.clone());
            executor
                .spawn(async move { task.poll_handler().await })
                .unwrap();
            */

            for _i in 0..10 {
                executor.run_once(false).unwrap();
            }

            // Set existing flag
            state.quiesce();
            // Close the fusedev fd, so all pending async io requests will be aborted.
            drop(file);

            for _i in 0..10 {
                executor.run_once(false).unwrap();
            }
        }
    }
}

However, this code is behind the async_io feature flag, even though the real feature flag is async-io. The code here also seems to refer to things that have been deleted like use crate::async_util::{AsyncDriver, AsyncExecutor}.

I was wondering if async is something that is supported, or if it's currently in a broken state and needs some more help to become functional again?

Large number of open files causes issues

We have a number of issues caused by the number of files that this crate opens in the context of running in Nydus.

The first issue is with workloads that perform a large number of filesystem operations: the longer the pod runs, the more file descriptors accumulate. Nydus sets the rlimit on the host, but on some systems this is capped at 2^20 (1048576) and can't go above this value. We have seen this cause issues with a workload that enters a constant crash loop and is unable to recover unless the pod is deleted and recreated. The pod constantly complains about OSError: [Errno 24] Too many open files, yet the actual workload inside the VM is not reaching the descriptor limit.

When inspecting the nr-open count and comparing this to the ulimit within the Linux namespace for the pod on the host node, we see that nr-open is maxed out at the ulimit value. The majority of these files are currently in an open state under the Nydus process.

Having so many files in a constant open state also causes the kubelet CPU usage to increase drastically. This is because the kubelet runs cadvisor which collects metrics on the open file descriptors and the type of file descriptor (eg if it's a socket or a file). We recently opened an issue with cadvisor about this metric stat collection, which can be found here (google/cadvisor#3233), but it would be good to try and solve the issue at the source.

I assume the reason the open file descriptors are “cached” is to reduce the overhead of executing the open syscall? If this is the case, is there a way to automatically close a file descriptor if it's not used often? Something like a timeout on the descriptor, so that if it hasn't been accessed after x amount of time it gets closed; it can then be reopened when it's needed again.

Any thoughts or ideas would be greatly appreciated.
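
To make the timeout idea concrete, here is a small generic sketch (not code from fuse-backend-rs or Nydus) of a descriptor cache that closes entries not accessed within a configurable idle period; dropping the std::fs::File closes the underlying descriptor:

use std::collections::HashMap;
use std::fs::File;
use std::time::{Duration, Instant};

/// Minimal idle-timeout cache for open files, keyed by inode number.
struct FdCache {
    idle_timeout: Duration,
    entries: HashMap<u64, (File, Instant)>,
}

impl FdCache {
    fn new(idle_timeout: Duration) -> Self {
        FdCache {
            idle_timeout,
            entries: HashMap::new(),
        }
    }

    /// Return the cached file for `ino`, refreshing its last-access time.
    fn get(&mut self, ino: u64) -> Option<&File> {
        match self.entries.get_mut(&ino) {
            Some((file, last_used)) => {
                *last_used = Instant::now();
                Some(&*file)
            }
            None => None,
        }
    }

    fn insert(&mut self, ino: u64, file: File) {
        self.entries.insert(ino, (file, Instant::now()));
    }

    /// Close descriptors that have been idle longer than the timeout;
    /// intended to be called periodically from a housekeeping thread.
    fn evict_idle(&mut self) {
        let now = Instant::now();
        let timeout = self.idle_timeout;
        self.entries
            .retain(|_, (_, last_used)| now.duration_since(*last_used) < timeout);
    }
}

A real implementation would probably also want an upper bound on the number of cached descriptors (LRU-style eviction) in addition to the idle timeout.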

Support non-privileged Users

Thank you for providing fuse-backend-rs!

I'm currently transitioning from fuse_mt to fuse-backend-rs and ran into a really annoying issue. fuse_mt makes use of libfuse, which allows it to mount fuse file systems without being root. This, unfortunately, is not the case for fuse-backend-rs (the root requirement, that is; not depending on libfuse is actually great, as pure Rust code simplifies compilation a lot).

The secret sauce behind not requiring root permission is a set-uid program called fusermount, which mounts the fuse FS on behalf of the user.

I have a working prototype of fuse-backend-rs using the fusermount mechanism for mounting in https://github.com/fzgregor/fuse-backend-rs. It's still very messy, though, and I wanted to touch base with you before going forward. At the moment it entirely replaces the mount system call with fusermount, which might not be great in certain environments... I thought one could check whether the appropriate permissions are available and then use either mechanism.

So, what are your thoughts about this?

The prototype currently requires polachok/passfd#10 to land.
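
A minimal sketch of the "check permissions, then pick a mechanism" idea (placeholder function bodies, assuming the libc crate; the real fusermount path additionally has to receive the opened /dev/fuse fd back over a unix socket via the _FUSE_COMMFD protocol, which is what passfd is needed for):

use std::io;
use std::path::Path;

/// True if we can call mount(2) ourselves; a real check would rather probe
/// for CAP_SYS_ADMIN in the current user namespace than just euid 0.
fn can_mount_directly() -> bool {
    unsafe { libc::geteuid() == 0 }
}

fn mount_fuse(mountpoint: &Path) -> io::Result<()> {
    if can_mount_directly() {
        mount_via_syscall(mountpoint)
    } else {
        mount_via_fusermount(mountpoint)
    }
}

// Placeholder for the existing mount(2)-based code path.
fn mount_via_syscall(_mountpoint: &Path) -> io::Result<()> {
    unimplemented!("existing direct-mount path")
}

// Placeholder for spawning the set-uid fusermount helper and receiving the
// opened /dev/fuse fd over the _FUSE_COMMFD socket (SCM_RIGHTS).
fn mount_via_fusermount(_mountpoint: &Path) -> io::Result<()> {
    unimplemented!("fusermount-based path, see the linked prototype")
}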
