unikraft / unikraft
A next-generation cloud native kernel designed to unlock best-in-class performance, security primitives and efficiency savings.
Home Page: http://unikraft.org
License: Other
The original VFScore implementation was just a file descriptor lookup table providing basic callbacks for read/write/etc. If we do not need any real filesystem, just a few sockets and maybe unnamed pipes, this implementation is good enough for those unikernels and much more specialized. The idea would be to provide ukfdtab as a vfscore alternative and maybe introduce macros that map to particular fd creation functionalities depending on which one of the libs is chosen. This way we can keep a single implementation for file descriptor openers provided by other libraries (e.g., socket(), ...).
Unikraft supports Python 3, and the Django packages can be installed via pip. Right now the port crashes because we don't have an implementation of socketpair(), though after fixing this other issues may arise. This project consists of going through these, with the aim of having Django run on Unikraft.
Hi
When VFSCORE deals with RAMFS, it uses ramfs_read and ramfs_write to read and write data. However, these methods, in turn, call vfscore_uiomove for data transfer. Could you please help me to understand the rationale behind that? vfscore_uiomove, in essence, is just memcpy, why not do this inside RAMFS?
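One possible reason for the indirection, offered as an assumption rather than as the maintainers' answer: a uio-style helper usually does more than a single memcpy, because the caller's buffer may be split across several iovec segments (as with readv/writev) and the remaining-byte bookkeeping has to be updated along the way. The sketch below shows that pattern in isolation. The names sketch_uio and sketch_uiomove_read are hypothetical and are not the vfscore API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec */

/* Hypothetical, simplified stand-in for a uio descriptor. */
struct sketch_uio {
	struct iovec *iov;   /* current segment of the caller's buffer list */
	int iovcnt;          /* number of segments left, including iov */
	size_t resid;        /* bytes the caller still expects */
};

/*
 * Copy up to n bytes from the filesystem buffer src into the caller's
 * segments, advancing the bookkeeping in uio. Returns the bytes copied.
 */
size_t sketch_uiomove_read(const char *src, size_t n, struct sketch_uio *uio)
{
	size_t done = 0;

	assert(src != NULL && uio != NULL);
	while (n > 0 && uio->resid > 0 && uio->iovcnt > 0) {
		struct iovec *iov = uio->iov;
		size_t cnt = iov->iov_len;

		if (cnt == 0) {          /* segment is full, move to the next */
			uio->iov++;
			uio->iovcnt--;
			continue;
		}
		if (cnt > n)
			cnt = n;
		memcpy(iov->iov_base, src, cnt);
		iov->iov_base = (char *)iov->iov_base + cnt;
		iov->iov_len -= cnt;
		uio->resid -= cnt;
		src += cnt;
		n -= cnt;
		done += cnt;
	}
	return done;
}
```

With a helper of this shape, each filesystem only supplies the source pointer and length, and the segment walking lives in one place instead of being repeated in every read and write routine.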
thank you.
There are a few inconsistencies; here are my suggestions:
-POSIX process-related functions
+posix-process: Process-related functions
-syscall_shim: Syscall shim layer
+syscall-shim: Syscall shim layer
-POSIX sysinfo: Information about system parameters
+posix-sysinfo: Information about system parameters
-POSIX user-related functions
+posix-user: User-related functions
"Processor optimization" in menuconfig sets compiler flags, but the entry code might not enable the right features to actually support this. Minimal example: add asm volatile ("VZEROALL\n" :::); (an AVX instruction) to lib/ukboot/boot.c:main(), and a kvm VM will crash:
Welcome to _ __ _____
__ _____ (_) /__ _______ _/ _/ /_
/ // / _ \/ / '_// __/ _ `/ _/ __/
\_,_/_//_/_/_/\_\/_/ \_,_/_/ \__/
Titan 0.2~b709bf3-custom
weak main() called. Symbol was not replaced!
[ 0.406781] ERR: [libukboot] boot.c @ 274 : weak main() called. Symbol was not replaced!
[ 0.424291] CRIT: [libkvmplat] traps.c @ 65 : Unhandled Trap 6 (invalid opcode), error code=0x0
Regs address 0x120030
RIP: 00000000001090c6 CS: 0008
RSP: 000000000013ff80 SS: 0018 EFLAGS: 00010246
RAX: 000000000013fd60 RBX: 0000000000108b20 RCX: 000000000000000f
RDX: 00000000000003d4 RSI: 000000000000000b RDI: 00000000000003d4
RBP: 000000000013ffa0 R08: 0000000000000000 R09: 000000000013fea8
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 0.521828] CRIT: [libkvmplat] traps.c @ 69 : Crashing
Unikraft halted
For this to work, we need to have additional code in the initialization code, e.g., something like this for AVX:
diff --git a/plat/common/include/x86/cpu_defs.h b/plat/common/include/x86/cpu_defs.h
index 9ecec96..9b16b9e 100644
--- a/plat/common/include/x86/cpu_defs.h
+++ b/plat/common/include/x86/cpu_defs.h
@@ -67,6 +67,7 @@
#define X86_CR4_PAE (1 << 5) /* enable PAE */
#define X86_CR4_OSFXSR (1 << 9) /* OS support for FXSAVE/FXRSTOR */
#define X86_CR4_OSXMMEXCPT (1 << 10) /* OS support for FP exceptions */
+#define X86_CR4_OSXSAVE (1 << 18) /* XSAVE and extended states enable */
/*
* Intel CPU features in EFER
diff --git a/plat/kvm/x86/entry64.S b/plat/kvm/x86/entry64.S
index 35738b6..116dda7 100644
--- a/plat/kvm/x86/entry64.S
+++ b/plat/kvm/x86/entry64.S
@@ -180,8 +180,13 @@ ENTRY(_libkvmplat_start64)
movq %rax, %cr0
movq %cr4, %rax
orq $(X86_CR4_OSXMMEXCPT | X86_CR4_OSFXSR), %rax
+ orq $(X86_CR4_OSXSAVE), %rax
movq %rax, %cr4
ldmxcsr (mxcsr_ptr)
+ xorl %ecx, %ecx
+ xgetbv
+ orq $(0x7), %rax
+ xsetbv
Then VZEROALL works. I guess we need some way to set these based on the CONFIG_* choices: we can't simply add the code to all executions, because that code in itself will lead to crashes if those CPU features are not available. (Though I guess, alternatively, we could also check the VCPU's features and set them accordingly.)
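The alternative mentioned last, checking the VCPU's features before enabling them, could look roughly like the decision logic below. This is a sketch in C for readability only; the real entry code is assembly, and the helper names are hypothetical. The bit positions are the architectural ones: CPUID leaf 1 reports XSAVE in ECX bit 26 and AVX in ECX bit 28, and XCR0 bits 0, 1 and 2 cover x87, SSE and AVX state (together the 0x7 used in the patch above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits. */
#define CPUID1_ECX_XSAVE (UINT32_C(1) << 26)
#define CPUID1_ECX_AVX   (UINT32_C(1) << 28)

/* XCR0 state-component bits. */
#define XCR0_X87 (UINT64_C(1) << 0)
#define XCR0_SSE (UINT64_C(1) << 1)
#define XCR0_AVX (UINT64_C(1) << 2)

/* Hypothetical helper: may the entry code set CR4.OSXSAVE at all? */
bool sketch_want_osxsave(uint32_t cpuid1_ecx)
{
	return (cpuid1_ecx & CPUID1_ECX_XSAVE) != 0;
}

/*
 * Hypothetical helper: which XCR0 bits to enable via xsetbv.
 * Returns 0 when XSAVE is missing, in which case xgetbv/xsetbv
 * must not be executed.
 */
uint64_t sketch_xcr0_bits(uint32_t cpuid1_ecx)
{
	uint64_t bits;

	if (!sketch_want_osxsave(cpuid1_ecx))
		return 0;
	bits = XCR0_X87 | XCR0_SSE;
	if (cpuid1_ecx & CPUID1_ECX_AVX)
		bits |= XCR0_AVX;
	return bits;
}
```

The value passed in would come from executing CPUID with EAX=1 early in boot. A build that selects an AVX-enabled processor optimization could then fail loudly when the returned mask lacks the AVX bit, instead of trapping later on the first AVX instruction.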
This is according to POSIX:
http://pubs.opengroup.org/onlinepubs/007904975/basedefs/stdint.h.html
http://pubs.opengroup.org/onlinepubs/007904975/basedefs/limits.h.html
The change itself is not very complicated (we should just make sure while we're at it that all the correct defines are being provided, considering defines never increase code size anyway), but we need to make sure this change doesn't break any builds that relied on this wrong include.
The posix-fdtab internal library will provide the file descriptor abstraction. A file descriptor table will be provided per-thread, if cloned that way.
The fdtable entry is currently held within the vfscore internal microlibrary and sees use with other contexts. This includes libraries which wish to register their own file descriptor type, including vfscore.
The library will provide the interface to register the syscalls that you can have with any file descriptor (read, write, close, etc.).
The file descriptor constructor syscalls (open, socket, eventfd, etc.) will be provided by the according subsystem libraries (e.g., vfscore, posix-socket).
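To make the split between generic operations and constructors concrete, here is a minimal sketch of such a table. Everything in it is hypothetical (names, table size, error conventions) and it uses one global table for brevity, whereas the proposal above calls for a table per thread. A subsystem's constructor installs its callbacks and gets an fd back; the generic read and close paths only dispatch through the table.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

#define SKETCH_FDTAB_SIZE 16   /* arbitrary size for the sketch */

/* Operations every file descriptor type may provide. */
struct sketch_fd_ops {
	ssize_t (*read)(void *priv, void *buf, size_t len);
	ssize_t (*write)(void *priv, const void *buf, size_t len);
	int (*close)(void *priv);
};

struct sketch_fd_entry {
	const struct sketch_fd_ops *ops;   /* NULL means the slot is free */
	void *priv;                        /* owned by the subsystem */
};

static struct sketch_fd_entry sketch_fdtab[SKETCH_FDTAB_SIZE];

/* Called by a constructor such as open() or socket(): lowest free fd. */
int sketch_fd_install(const struct sketch_fd_ops *ops, void *priv)
{
	assert(ops != NULL);
	for (int fd = 0; fd < SKETCH_FDTAB_SIZE; fd++) {
		if (sketch_fdtab[fd].ops == NULL) {
			sketch_fdtab[fd].ops = ops;
			sketch_fdtab[fd].priv = priv;
			return fd;
		}
	}
	return -EMFILE;
}

/* Generic read: validate the fd, then dispatch to the owner. */
ssize_t sketch_fd_read(int fd, void *buf, size_t len)
{
	if (fd < 0 || fd >= SKETCH_FDTAB_SIZE || !sketch_fdtab[fd].ops)
		return -EBADF;
	if (!sketch_fdtab[fd].ops->read)
		return -EINVAL;
	return sketch_fdtab[fd].ops->read(sketch_fdtab[fd].priv, buf, len);
}

/* Generic close: let the owner clean up, then free the slot. */
int sketch_fd_close(int fd)
{
	int ret = 0;

	if (fd < 0 || fd >= SKETCH_FDTAB_SIZE || !sketch_fdtab[fd].ops)
		return -EBADF;
	if (sketch_fdtab[fd].ops->close)
		ret = sketch_fdtab[fd].ops->close(sketch_fdtab[fd].priv);
	sketch_fdtab[fd].ops = NULL;
	sketch_fdtab[fd].priv = NULL;
	return ret;
}
```

A write path would mirror sketch_fd_read. The point of the layout is that the table itself knows nothing about files or sockets, so vfscore and posix-socket can both register without depending on each other.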
Add support for VMware hypervisor.
GitHub-Depends-On: #586
epoll and select represent two POSIX methods for I/O multiplexing. Currently, select is provided by the external library LwIP and does not reflect use of alternative file descriptors.
These two methods should be provided as a driver mechanism which allows external libraries to register interest in a custom implementation.
This is because there's a race condition. My description on the mailing list: "If more than one thread waits on the same other thread, then all of those waiting threads will wait in uk_waitq_event(). The first thread that wakes up after the thread has finished will then proceed to destroy the thread management structure. Every other waiting thread will try to do the same after waking up, ending up with duplicate free's and a crash of that thread."
I first thought this could be fixed by only calling uk_thread_destroy in uk_thread_wait if the waitq is empty (i.e., since we just removed ourselves from the waitq, we're the last thread waiting). However, there's also a uk_thread_destroy in the schedcoop implementation, and the race condition still exists afterwards.
Some test code to trigger the problem:
#include <stdio.h>
#include <unistd.h>
#include <uk/sched.h>
#include <uk/config.h>

void print_thread(void *arg __unused)
{
	struct uk_thread *cur = uk_thread_current();

	uk_pr_crit("This is thread 0x%p\n", cur);
	uk_pr_crit("sleeping a bit\n");
	sleep(3);
	uk_pr_crit("We're done!\n");
}

void wait_for_thread(void *arg)
{
	struct uk_thread *cur = uk_thread_current();
	struct uk_thread *waitfor = (struct uk_thread *)arg;
	int ret;

	uk_pr_crit("This is thread 0x%p\n", cur);
	uk_pr_crit("Waiting for thread 0x%p\n", waitfor);
	ret = uk_thread_wait(waitfor);
	uk_pr_crit("uk_thread_wait returned %d\n", ret);
	uk_pr_crit("We're done waiting!\n");
}

int main(int argc __unused, char *argv[] __unused)
{
	struct uk_thread *t, *t2, *t3;

	t = uk_thread_create("sleep", print_thread, NULL);
	t2 = uk_thread_create("wait1", wait_for_thread, t);
	t3 = uk_thread_create("wait2", wait_for_thread, t);
	uk_thread_wait(t3);
	uk_thread_wait(t2);
	uk_pr_crit("Before sleep\n");
	sleep(10);
	uk_pr_crit("After sleep\n");
	sleep(1);
	uk_sched_yield();
	uk_pr_crit("Thread test done\n");
	return 0;
}
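One way to reason about the double free described above is reference counting: every party that may still touch the thread structure (each waiter and the scheduler itself) holds a reference, and only the release that drops the count to zero frees it. The sketch below is not a patch against uk_thread and uses hypothetical names throughout. It also assumes a cooperative scheduler on a single CPU, so the counter is a plain int rather than an atomic.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the thread management structure. */
struct sketch_thread {
	int refcount;   /* one reference per party that may still use it */
};

/* Create the structure holding the scheduler's own reference. */
struct sketch_thread *sketch_thread_new(void)
{
	struct sketch_thread *t = calloc(1, sizeof(*t));

	if (t)
		t->refcount = 1;
	return t;
}

/* A waiter takes a reference before it goes to sleep on the thread. */
void sketch_thread_get(struct sketch_thread *t)
{
	assert(t != NULL && t->refcount > 0);
	t->refcount++;
}

/*
 * Drop one reference. Returns 1 if this call freed the structure,
 * 0 if someone else still holds it.
 */
int sketch_thread_put(struct sketch_thread *t)
{
	assert(t != NULL && t->refcount > 0);
	if (--t->refcount > 0)
		return 0;
	free(t);
	return 1;
}
```

In the test program above, both wait1 and wait2 would take a reference before blocking and drop it after waking, and the scheduler would drop its own once the thread has exited. Whichever of the three runs last performs the single free.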
The Xen architecture has the concept of "stub domains", where, in principle, dom0 functionality can be disaggregated onto multiple, separate VMs that together mimic the overall functionality of dom0. This improves reliability, performance/scalability and flexibility. This project consists of generating different stub domains based on Unikraft by porting the XenStore and QEMU to Unikraft.
placed at plat/xen/include/xen-{x86,arm}.
The files do not make sense. X86 version barely has anything in it, and patches are already out.
It is better to remove the arm version too, for the sake of consistency. Especially since the only meaningful part in it is irq manipulation functions, which do not belong there anyways.
It makes sense to remove that after arm32 build is fixed (patches are already out)
Provide stubs for semaphores and mutexes when scheduling is off. That makes code using them independent of scheduling support.
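A rough sketch of what such stubs could look like follows. CONFIG_SKETCH_HAVE_SCHED and all sketch_ names are hypothetical placeholders, not existing Unikraft symbols. The idea is that without a scheduler there is only one flow of control, so a lock can never be contended and the operations reduce to bookkeeping plus sanity checks.

```c
#include <assert.h>

struct sketch_mutex { int locked; };
struct sketch_sem { int count; };

#ifdef CONFIG_SKETCH_HAVE_SCHED
/* With scheduling enabled, the real blocking versions are used. */
void sketch_mutex_lock(struct sketch_mutex *m);
void sketch_mutex_unlock(struct sketch_mutex *m);
void sketch_sem_down(struct sketch_sem *s);
void sketch_sem_up(struct sketch_sem *s);
#else
/* No scheduler: nothing can contend, so locking never has to block. */
static inline void sketch_mutex_lock(struct sketch_mutex *m)
{
	assert(!m->locked);   /* blocking here could never be resolved */
	m->locked = 1;
}

static inline void sketch_mutex_unlock(struct sketch_mutex *m)
{
	assert(m->locked);
	m->locked = 0;
}

static inline void sketch_sem_down(struct sketch_sem *s)
{
	assert(s->count > 0);   /* nobody else could ever post */
	s->count--;
}

static inline void sketch_sem_up(struct sketch_sem *s)
{
	s->count++;
}
#endif
```

Code that takes a lock around a critical section then compiles unchanged in both configurations, and the assertions flag usage that would deadlock without a second thread.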
Hi
I am trying to benchmark sqlite with the speedtest1 benchmark on top of the linuxu build. Everything works fine, but sometimes, in very special circumstances, I receive an I/O error caused by the unlink syscall. This syscall checks the v_flags field and returns EBUSY if the field is VROOT. The problem is that sometimes this field has very unexpected values:
100 - 50000 INSERTs into table with no index...................... vp->v_flags = 45542063 0.120s
150 - CREATE INDEX five times..................................... vp->v_flags = 0vp->v_flags = 0vp->v_flags = 0 4.728s
230 - 10000 UPDATES, numeric BETWEEN, indexed..................... vp->v_flags = 69732c72 0.661s
Or, in some cases, the flag value is 1, but this is wrong (it is an ordinary file that was created early).
Is there any known bug? I am using a slightly modified environment to run Unikraft, so maybe that is the problem.
unikraft: 00bbf2c
newlib: ddc1a4308f9ec8ce742d80e6203a4e76ae5bf802
sqlite: 21ec31d578295982619a164de96b653e93e7cf9c
I don't use pthreads:
LIBS := $(UK_LIBS)/libsqlite:$(UK_LIBS)/newlib
and build sqlite with -DSQLITE_THREADSAFE=0
thank you
On Xen, ukplat_monotonic_clock returns the time elapsed (in nanoseconds) since the Xen hypervisor booted, not since ukplat_time_init (as the function docs for ukplat_monotonic_clock say).
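If the documented behaviour is the intended one (the other option is to correct the docs), the usual fix is to record the raw clock value at initialization and subtract it on every read. The sketch below shows only that arithmetic, with hypothetical function names; the raw value stands for whatever the Xen platform code reads as nanoseconds since hypervisor boot.

```c
#include <assert.h>
#include <stdint.h>

/* Raw clock value captured at init, in nanoseconds. */
static uint64_t sketch_boot_offset_ns;

/* Hypothetical init hook: remember where "time zero" is. */
void sketch_time_init(uint64_t raw_ns)
{
	sketch_boot_offset_ns = raw_ns;
}

/* Nanoseconds elapsed since sketch_time_init() was called. */
uint64_t sketch_monotonic_clock(uint64_t raw_ns)
{
	assert(raw_ns >= sketch_boot_offset_ns);
	return raw_ns - sketch_boot_offset_ns;
}
```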
The build system is currently not taking changes of compile flags or build commands into account for detecting which objects need to be recompiled. We have already fixdep taken from Linux so we should be able to adopt the build rules.
The initial release of vfscore library did not introduce closing of standard file descriptors. This is also mentioned in the code here. The underlying structure for standard file descriptors is statically allocated, unlike ordinary file descriptors. A proper fix should allocate these dynamically as well.
This issue recommends the ability to provide a checksum for an external library's source code, to be used after the make fetch stage of the unikernel compilation.
For example, checksums would be provided in-line in Makefile.uk:
LIBMYLIB_VERSION = 1.0.0
LIBMYLIB_URL = https://github.com/mylib/mylib/archive/v$(LIBMYLIB_VERSION).zip
LIBMYLIB_SHA256 = df5c1978aa5530d8edf411f5091c904386858b8cd93ee5d3bc388f450ce12997
Applications built for Linux make use of proc filesystem entries.
With procfs support, more applications can be ported.
procfs will use lib/ukstore entries, when available.
Replacing newlib with nolibc works, as does using newlib and removing vhost=on
Each platform installs a timer for scheduling. Rework the API to avoid forced ticks.
There should be a ukplat_ function to set up a timer once in order to get control back to the scheduler (e.g., to support preemption).
We are never going to support 32-bit on x86, but we have "#ifdef i386" all over the place, which hurts readability. Removing it would also shrink many files dramatically.
Uniprof (http://sysml.neclab.eu/projects/uniprof/) is a Xen VM profiler able to generate flame graphs. It would be great to extend it to support KVM as well.
Hello Unikraft team,
The Xen Project is at the forefront of unikernel projects, so why is the team working on KVM? I remember that Unikraft was introduced by the Xen Project!
Please improve Xen support.
Thanks.
Hi
I am using uk_posix_memalign_ifpages to allocate memory for a stack and get, let's say, 0x7fb2ae400000.
The corresponding metadata is stored at 0x7fb2ae3ff000 (the previous page, as stated in alloc.c).
I look inside this structure and see that the allocation starts from 0x7fb2ae000000 and includes 801 pages. I am trying to figure out where the number 801 comes from and don't understand it. I think it should be 800, because the metadata page is part of the padding.
I thought the function always returns one additional page, but when I use it to allocate a smaller memory region, let's say 0x1008 bytes aligned at 8, I receive something like:
return = 0x7fb2ac018010, metadata = 0x7fb2ac018000, metadata.begin = 0x7fb2ac018000, metadata.num_pages = 2. This result is correct, and the metadata page is included in the range.
Should this list be alphabetical?
Add support for Musl as standard C library in Unikraft.
Summary of objectives
clone system call)
Check out Bromium as related work.
Some (many?) languages/applications crash with a page fault or GPF when using KVM and setting Optimization level ---> Optimize for performance in the menu (e.g., this is the case with the Intel WAMR port). Other settings for Optimization level work fine.
Right now, if we wanted to provide a different implementation of an internal library (e.g., uklock), we can, but the other internal libraries will continue to use uklock; there's no built-in, simple-to-use mechanism to select between uklock and the new library.
We inherited bits of PVH code from MiniOS. The problem is that this code did not work even for MiniOS. There is little point in keeping these bits. However, it might be good to keep the -DCONFIG_PARAVIRT as a hint for the future.
Implement memory ballooning driver for Unikraft.
Summary of objectives
Make is not taking compile flags into account for recompilation. We have some weird cases where we have to make clean and make again to solve some problems. Examples:
newlib and internal nolibc
-D) instead of the generated KConfig _config.h header
Currently all stacks have a fixed size, configured at build time.
We want to be able to allocate stacks of different sizes, at runtime (see pthread_attr_setstacksize and friends). This would mean we cannot save the current thread on the stack as we do it now (this being a solution imported from MiniOS). We will use a solution inspired by classical OSes (Linux, FreeBSD, etc) where current thread is saved on per-cpu data memory.
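To see why storing the current thread on the stack ties stacks to a fixed size, consider how such schemes typically find it. The sketch assumes (this is not stated above) that stacks are power-of-two sized and size-aligned, so the stack base, where the thread pointer would live, is found by masking the stack pointer with a compile-time constant. The constant and function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical build-time stack size: 64 KiB, a power of two. */
#define SKETCH_STACK_SIZE ((uintptr_t)0x10000)

/*
 * Base address of the stack that contains sp. Only correct if every
 * stack is SKETCH_STACK_SIZE bytes long and aligned to that size.
 */
uintptr_t sketch_stack_base(uintptr_t sp)
{
	assert((SKETCH_STACK_SIZE & (SKETCH_STACK_SIZE - 1)) == 0);
	return sp & ~(SKETCH_STACK_SIZE - 1);
}
```

Once each thread can choose its own stack size at runtime, there is no single mask that works for all of them, which is why keeping the current-thread pointer in per-CPU data is the more flexible design.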
Fixed-size integer definitions are provided by the included libc. In principle it should be possible to use uint32_t for everything. Since a libc (including nolibc) is required anyway, those extra datatypes could be removed in order to reduce confusion (especially in APIs in the include/ folder). All code should then use the standard types. The internal definition is technically not required.
The original idea was to have the API definitions in include/ completely independent of any libraries. A clean solution is required if this dependency is added (e.g., by moving them into a library, like a libukplat API lib?).
/mnt/filesystems/roaming/workspace/Project/Unikraft/Unikraft_Review/Unikraft/plat/common/sw_ctx.c:64:2: note: in expansion of macro ‘uk_pr_debug’
uk_pr_debug("Allocating %lu bytes for sw ctx at %p\n", sz, ctx);
^~~~~~~~~~~
In file included from /mnt/filesystems/roaming/workspace/Project/Unikraft/Unikraft_Review/Unikraft/plat/common/sw_ctx.c:42:0:
/mnt/filesystems/roaming/workspace/Project/Unikraft/Unikraft_Review/Unikraft/plat/common/include/x86/cpu.h:72:3: error: impossible constraint in ‘asm’
asm volatile("xsave (%0)" :: "r"(ctx->extregs),
^~~
/mnt/filesystems/roaming/workspace/Project/Unikraft/Unikraft_Review/Unikraft/plat/common/include/x86/cpu.h:76:3: error: impossible constraint in ‘asm’
asm volatile("xsaveopt (%0)" :: "r"(ctx->extregs),