deislabs / mystikos

Tools and runtime for launching unmodified container images in Trusted Execution Environments

mystikos's Introduction

What is Mystikos?

Mystikos is a runtime and a set of tools for running Linux applications in a hardware trusted execution environment (TEE). The current release supports Intel® SGX, while other TEEs may be supported in future releases.

Goals

Enable protection of application code and data while in memory through the use of hardware TEEs. This should be combined with proper key management, attestation and hardware roots of trust, and encryption of data at rest and in transit to protect against other threats which are out of scope for this project.
Streamline the process of lifting and shifting applications, either native or containerized, into TEEs with little or no modification.
Allow users and application developers control over the makeup of the trusted computing base (TCB), ensuring that all components of the execution environment running inside the TEE are open sourced with permissive licenses.
Simplify re-targeting to other TEE architectures through a plugin architecture.

Architecture

Mystikos consists of the following components:

a C runtime based on musl libc that is glibc compatible
a "lib-os like" kernel
the kernel-target interface (TCALL)
a command-line interface
some related utilities

Today, two target implementations are provided:

The SGX target (based on the Open Enclave SDK)
The Linux target (for verification on non-SGX platforms)

The minimalist kernel of Mystikos manages essential computing resources inside the TEE, such as CPU/threads, memory, files, networks, etc. It handles most of the syscalls that a normal operating system would handle (with limits). Many syscalls are handled directly by the kernel while others are delegated to the target specified while launching Mystikos.

Installation Guide for Ubuntu

Mystikos may be built and installed on Ubuntu 20.04.

Install from Released Package

To install Mystikos using one of the released packages, please follow the appropriate guide to install on Ubuntu 20.04.

Install From Source

You may also build Mystikos from source. The build process will install the SGX driver and SGX-related packages for you.

Quick Start Docs

Eager to get started with Mystikos? We've prepared a few guides, starting from a simple "hello world" C program and increasing in complexity, including demonstrations of DotNet and Python/NumPy.

Give it a try and let us know what you think!

Simple Applications

A Simple "Hello World" in C: click here
A Simple "Hello World" in Rust: click here
Dockerizing your "Hello World" app: click here
Introducing Enclave Configuration with a DotNet program: click here
Running Python & NumPy for complex calculations: click here

Samples

The Mystikos samples cover a number of programming languages and are a good place for developers to start.

Enclave Aware Applications

Sometimes, you want to take advantage of specific properties of the Trusted Execution Environment, such as attestation. The following example shows how to write a C program which changes its behaviour when it detects that it has been securely launched inside an SGX enclave.

Getting started with a TEE-aware program: click here

More Docs!

We've got plans for a lot more documentation as the project grows, and we'd love your feedback and contributions, too.

Key features of Mystikos: click here
General concepts of Mystikos: click here
Deep dive into Mystikos architecture: [coming soon]
How to implement support for a new TEE: [coming soon]
Kernel limitations: click here
Multi-processing and multi-threading in Mystikos and limitations: [coming soon]

Developer Docs

Looking for information to help you with your first PR? You've found the right section.

Developer's jump start guide: click here
Signing and packaging applications with Mystikos: click here
Release management: click here
Notable unsupported kernel features and syscalls: [coming soon]

For more information, see the Contributing Guide.

Licensing

This project is released under the MIT License.

Reporting a Vulnerability

Please DO NOT open vulnerability reports directly on GitHub.

Security issues and bugs should be reported privately via email to the Microsoft Security Response Center (MSRC) at [email protected]. You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message.

Code of Conduct

This project has adopted the Microsoft Code of Conduct. All participants are expected to abide by these basic tenets to ensure that the community is a welcoming place for everyone.

mystikos's Issues

umount crashes with open file descriptors

The kernel crashes on shutdown if umount() is called when the affected file system still has open file descriptors.
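One possible direction, sketched here under assumptions, is to mirror Linux behavior, where umount(2) fails with EBUSY while the target is busy instead of tearing the file system down. The `toy_fs_t` type and `toy_umount` function are hypothetical and are not taken from the Mystikos kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical file-system object that counts its open files. */
typedef struct toy_fs
{
    size_t open_files; /* number of file descriptors still open */
    int mounted;       /* non-zero while the file system is mounted */
} toy_fs_t;

/* Refuse to unmount while files are open, as Linux umount(2) does. */
static int toy_umount(toy_fs_t* fs)
{
    if (!fs || !fs->mounted)
        return -EINVAL; /* not a mounted file system */

    if (fs->open_files > 0)
        return -EBUSY; /* caller must close its descriptors first */

    fs->mounted = 0;
    return 0;
}
```

With a check like this, the shutdown path can either report the error or force-close the remaining descriptors before retrying.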

RFD: suggested improvements in "TEE-aware example"

The user-getting-started-tee-aware.md example is, perhaps, the most compelling (certainly the most technically deep) of our example docs right now. However, the story doesn't quite seem to land, and I think it could be better.

Suggestion:

add a section at the top, just before the Problem Statement, which describes the overall "story". What will this demo do? What makes it valuable, unique, or interesting among all the demos?
add a section telling folks what the walkthrough will do. Describe it in simple bullet points. Something like this fictitious example sentence: "In this example, you will: create two enclave processes, define a secret, pass it to one enclave, and access it from the other".
This was the first mention in the docs of MRSIGNER. I think new users won't know what this is, so we should explain it (maybe with an inserted section), and it would be helpful to link to further reading on the topic (is there some in this repo? in OE-SDK? or on Intel's site?)

Configure OE heap size when using ext2 rootfs

SYS_prlimit set rlimit support

Missing syscall prlimit support for memcached/redis solution.

failing libc tests

Out of 339 libc tests, 282 are passing and 57 are failing.
Here are the 57 failing tests:

/src/common/runtest.exe
/src/functional/env.exe
/src/functional/fcntl.exe
/src/functional/fscanf.exe
/src/functional/fwscanf.exe
/src/functional/ipc_msg.exe
/src/functional/ipc_sem.exe
/src/functional/ipc_shm.exe
/src/functional/popen.exe
/src/functional/pthread_cancel-points.exe
/src/functional/pthread_cancel.exe
/src/functional/pthread_cond.exe
/src/functional/pthread_mutex.exe
/src/functional/pthread_mutex_pi.exe
/src/functional/pthread_robust.exe
/src/functional/pthread_tsd.exe
/src/functional/sem_init.exe
/src/functional/sem_open.exe
/src/functional/setjmp.exe
/src/functional/spawn.exe
/src/functional/sscanf_long.exe
/src/functional/stat.exe
/src/functional/strptime.exe
/src/functional/ungetc.exe
/src/functional/utime.exe
/src/functional/vfork.exe
/src/math/fmal.exe
/src/math/lgamma.exe
/src/math/lgammaf.exe
/src/math/lgammal.exe
/src/math/log2f.exe
/src/math/powf.exe
/src/regression/daemon-failure.exe
/src/regression/execle-env.exe
/src/regression/fflush-exit.exe
/src/regression/fgetwc-buffering.exe
/src/regression/flockfile-list.exe
/src/regression/lseek-large.exe
/src/regression/malloc-brk-fail.exe
/src/regression/malloc-oom.exe
/src/regression/pthread-robust-detach.exe
/src/regression/pthread_atfork-errno-clobber.exe
/src/regression/pthread_cancel-sem_wait.exe
/src/regression/pthread_cond-smasher.exe
/src/regression/pthread_cond_wait-cancel_ignored.exe
/src/regression/pthread_create-oom.exe
/src/regression/pthread_exit-cancel.exe
/src/regression/pthread_exit-dtor.exe
/src/regression/pthread_once-deadlock.exe
/src/regression/rewind-clear-error.exe
/src/regression/rlimit-open-files.exe
/src/regression/setenv-oom.exe
/src/regression/sigaltstack.exe
/src/regression/sigreturn.exe
/src/regression/statvfs.exe
/src/regression/syscall-sign-extend.exe
/src/regression/tls_get_new-dtv.exe

posix_spawn: handle child death gracefully

When a child process dies, the parent should still be alive.
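The expected behavior can be expressed as a small check like the sketch below: the parent spawns a child that kills itself, then reaps it and keeps running. The helper name `spawn_dying_child` is made up for illustration, and the sketch assumes an `sh` binary is available on the PATH.

```c
#include <assert.h>
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char** environ;

/* Spawn a child that kills itself with SIGKILL.
 * Returns the terminating signal number, 0 if the child exited normally,
 * or -1 if spawning or waiting failed. */
static int spawn_dying_child(void)
{
    char* argv[] = {"sh", "-c", "kill -9 $$", NULL};
    pid_t pid;
    int wstatus;

    if (posix_spawnp(&pid, "sh", NULL, NULL, argv, environ) != 0)
        return -1;

    /* The parent must survive the child's death and be able to reap it. */
    if (waitpid(pid, &wstatus, 0) != pid)
        return -1;

    return WIFSIGNALED(wstatus) ? WTERMSIG(wstatus) : 0;
}
```

On a regular Linux host this returns SIGKILL; the same result would be the goal when the program runs under Mystikos.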

Remove private keys from repo

For example:

/solutions/attested_tls/oe_enclave/enc/private.pem
/solutions/attested_tls/private_key.pem

Get started with dotnet doc doesn't work with .NET framework 5.0.2

Here is the error I get when trying to run the docker container:

myst/bin/dotnet /app/sum.dll 1 2 3
Failed to create CoreCLR, HRESULT: 0x8007000E
Enclave /tmp/mystQ53eX5/lib/openenclave/mystenc.so returned -2147450743

Doc Cleanup: include instructions on how to report security issues

Example text:
https://docs.opensource.microsoft.com/content/releasing/security.html

samples/rust does not build

~/mystikos/samples/rust$ make
rustc ./hello_world.rs
make: rustc: Command not found
Makefile:5: recipe for target 'all' failed
make: *** [all] Error 127

Consider formalizing tcall interface

The target interface (between the kernel and target) employs tcalls, which are defined as follows.

long tcall(long n, long params[6]);

Consider formalizing this API as an interface in C (a structure with function pointers). This way, the interface would be type safe and easier to provide new implementations for.
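A minimal sketch of what such an interface might look like follows. The structure name, the two operations, and the toy target are all hypothetical; they only illustrate the shape of a typed interface and do not reflect the actual set of tcalls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical typed target interface: one function pointer per operation
 * instead of a single tcall(n, params[6]) entry point. */
typedef struct target_ops
{
    long (*random)(void* buf, size_t size);
    ssize_t (*write_console)(int fd, const void* buf, size_t count);
} target_ops_t;

/* Toy target used only for illustration. */
static long toy_random(void* buf, size_t size)
{
    memset(buf, 0x5a, size); /* placeholder bytes, NOT random */
    return 0;
}

static ssize_t toy_write_console(int fd, const void* buf, size_t count)
{
    (void)fd;
    (void)buf;
    return (ssize_t)count; /* pretend everything was written */
}

static const target_ops_t toy_target = {
    .random = toy_random,
    .write_console = toy_write_console,
};

/* Kernel-side caller: the compiler now checks argument types. */
static long kernel_fill_random(const target_ops_t* ops, void* buf, size_t size)
{
    assert(ops && ops->random);
    return ops->random(buf, size);
}
```

A new target would then supply its own `target_ops_t` instance, and a mismatched argument would be caught at compile time rather than surfacing as a misinterpreted `long` at run time.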

Support SYS_kill

The following dotnet runtime test fails when run with Mystikos:

tracing/eventpipe/processinfo/processinfo/processinfo.dll 18:15:37
  0.0s: Test PID: 101
*** kernel panic: syscall.c(4182): libos_syscall(): unhandled syscall: SYS_kill()
0x7fdb003ae7f6: __libos_panic()
0x7fdb003c0438: libos_syscall()

Test location: https://github.com/dotnet/runtime/tree/v5.0.0-rtm.20519.4/src/tests/tracing/eventpipe/processinfo

Investigate LinuxKit for usability in Mystikos

https://github.com/linuxkit/linuxkit

Aeva: I'm trying to convert a docker image to a cpio file that is as small as possible, which will then be mapped to a ramfs. Yes, I lose all the metadata and layers. I know. It's OK. Who knows, I may regret it later.
Justin: you can actually just do this with LinuxKit directly, as its initrd file is just a cpio file, and if you just put in a single init container in the config you can just output as initrd.
Justin: LinuxKit basically converts images to flat tarballs, we did use docker export to do it previously but this does it directly from the image.

https://twitter.com/justincormack/status/1353008059375874048
https://twitter.com/justincormack/status/1353010773128306694

GCOV will generate garbled characters in file name

Currently, the gcov library (version 7.5.0) and the gcov utility code generate garbled characters in file names during code coverage runs. This will

crash the running dotnet solution, and
generate a bunch of unusable gcda and gcno files.

The error occurs systematically for the kernel and utils, and possibly for other components. It makes the code coverage results unusable.

Some thoughts and findings:

The problem was possibly introduced by a code update. The last good code coverage report that can be found is from around October 20th, 2020. After that, the nightly pipeline crashed for a long period (around half a month). Unfortunately, the nightly pipeline does not retain jobs that ran before October.
The gcov version currently in use is built from GCC 7.5.0; other versions do not generate gcda and gcno files (verified by Xuejun). The code coverage report generation script also needs an overhaul, since it may not process all gcda and gcno files correctly.

/tmp/myst* files are not cleaned up in case a program fails

If a program aborts or segfaults when run with mystikos, we don't get a chance to clean /tmp files.

Handle SYS_flock

#2 adds only a passthrough implementation for SYS_flock.

libc test failure: pthread_cancel-sem_wait

This fails occasionally.

=== start test: /src/regression/pthread_cancel-sem_wait.exe
src/regression/pthread_cancel-sem_wait.c:55: seqno == 1 failed (uncontended sem_wait)
src/regression/pthread_cancel-sem_wait.c:65: seqno == 1 failed (blocking sem_wait)
Assertion failed: WEXITSTATUS(wstatus) == 0 (run.c: _run_tests: 47)
2021-02-01T21:20:33+0000.182637Z [(H)ERROR] tid(0x7f578ecf7b80) | :OE_ENCLAVE_ABORTING [/root/OpenLibOS/third_party/openenclave/openenclave/host/calls.c:_call_enclave_function_impl:56]
/root/mystikos/build/bin/myst: error: failed to enter enclave: result=OE_ENCLAVE_ABORTING
Makefile:68: recipe for target 'tests' failed
make: *** [tests] Error 1

Cleanup of fdtables on main thread exit causes crash

We clean up fdtables on main thread exit here - https://github.com/deislabs/mystikos/blob/main/kernel/enter.c#L391-L396.

For the dotnet tests added in #2, this causes the dotnet signal handler thread to read from an invalid fd here - https://github.com/dotnet/runtime/blob/v5.0.0-rtm.20519.4/src/libraries/Native/Unix/System.Native/pal_signal.c#L89.

Stack trace of failed signal handler thread -

#0  0x00007ffe003c754e in libos_assume (cond=false) at /home/vitikoo/OpenLibOS/include/libos/assume.h:13
#1  0x00007ffe003c75e9 in _unlock (pipe=0x7ffe89af0350) at pipedev.c:60
#2  0x00007ffe003c79c9 in _pd_read (pipedev=0x7ffe005e69e0 <_pipdev.4536>, pipe=0x7ffe89af0350, buf=0x7ffe627978e7, count=1) at pipedev.c:173
#3  0x00007ffe003b69a3 in libos_syscall_read (fd=23, buf=0x7ffe627978e7, count=1) at syscall.c:746
#4  0x00007ffe003bb249 in libos_syscall (n=0, params=0x7ffe62797790) at syscall.c:2237
#5  0x00007ffe87733176 in libos_syscall (n=0, params=0x7ffe62797790) at enter.c:37
#6  0x00007ffe877aabb5 in __syscall_cp_c (nr=0, u=23, v=140730550548711, w=1, x=0, y=140730550549348, z=0) at src/thread/pthread_cancel.c:32
#7  0x00007ffe877a93d4 in __syscall_cp (nr=0, u=23, v=140730550548711, w=1, x=0, y=0, z=0) at src/thread/__syscall_cp.c:19
#8  0x00007ffe877b8138 in read (fd=23, buf=0x7ffe627978e7, count=1) at src/unistd/read.c:6
#9  0x00007ffe647545c1 in SignalHandlerLoop (arg=0x7ffe48326600) at /build/runtime/src/libraries/Native/Unix/System.Native/pal_signal.c:89
#10 0x00007ffe877ac560 in start (p=0x7ffe62797960) at src/thread/pthread_create.c:192
#11 0x00007ffe003c50a6 in _call_thread_fn () at thread.c:345
#12 0x00007ffe64754530 in ?? () at /build/runtime/src/libraries/Native/Unix/System.Native/pal_signal.c:174
#13 0x00007ffe48326600 in ?? ()
#14 0x0000000000000000 in ?? ()

On exit, the main thread performs a SYS_exit_group, which sends a signal to all threads. However, because the reader thread (the dotnet signal handler thread) is sleeping on a pipe read, it is not killed.

Consider using openssl rather than mbed tls

Remove SYS_recvfrom hack

We hacked the SYS_recvfrom system call to work around an application that incorrectly handled EAGAIN errors. This mitigation severely penalizes performance (100x). Consider ways to optionally enable the mitigation or remove it completely.

Let's use a Tagging taxonomy

To help with finding, tracking, and triaging issues and PRs, I suggest we adopt a thoughtful taxonomy for Tags -- and document it! By way of a strawman proposal, I'd like to propose the following:

Tags may be grouped with prefixes to describe the category of the Tag itself, and a suffix to describe the category of the Issue, separated by a slash ("/").

prefix status: indicates a status for an issue, such as: status/new, status/triaged, status/rejected, status/in-progress, and status/stale.
prefix type: indicates the type of issue, such as: type/bug, type/discussion, type/feature, type/process.
prefix area: indicates which part of the codebase the issue relates to, such as: area/docs, area/filesystem, area/kernel, area/cli.

In this way, Issues and PRs needing attention can be located efficiently by those with the relevant knowledge to attend to them, for example by searching for the intersection of status/new and area/kernel to identify new kernel bugs that need to be triaged.

Thoughts? Feedback? Suggestions for improvement?

Report peak memory usage

Need a way to report peak memory usage on app exit.

encrypted backing store block device

Consider adding an encrypted backing store to hold cached data blocks that have been written to and evicted. This will help limit use of EPC memory when there are many writes.

Implement multi-threading in kernel with M:N mapping

We should support M:N mapping from target threads to user threads in the kernel. Target threads come from the underlying platform: in the case of SGX, they are ethreads; in the case of OP-TEE or UEFI, it is the main thread that the platform started. In any case, the kernel needs to create a mapping so that user space still sees a possibly unlimited number of threads when it calls pthread_create.

Alongside the M:N mapping, the kernel becomes the thread manager, which removes potential security risk from the host on SGX platform. The in-enclave thread manager also makes possible trust-worthy handling of syscalls related to thread priorities and affinities, such as SYS_sched_getaffinity, SYS_sched_setaffinity, SYS_sched_setparam, SYS_sched_getparam, etc.

down-sizing of ocall output lengths may compromise compatibility

In various ocalls, the lengths returned are downsized to the maximum length of the buffer. This avoids an attack vector where the untrusted code can be manipulated to return a size larger than the buffer, possibly resulting in reads beyond the end of the buffer. In some cases this may result in incorrect behavior where the API (e.g. recvmsg) is supposed to return a larger length to indicate that the buffer has been truncated. For now we assume that these cases are rare in practice. This trades off compatibility for security. This assumption could turn out to be false in practice and would require revisiting at that time.
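The down-sizing described above amounts to a clamp on the value reported by the untrusted side. The helper below is an illustrative sketch with a made-up name, not the code used in Mystikos.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Clamp a length reported by untrusted host code to the size of the
 * enclave-side buffer. Negative values (errors) pass through unchanged. */
static ssize_t clamp_ocall_length(ssize_t reported, size_t buf_size)
{
    if (reported < 0)
        return reported;

    if ((size_t)reported > buf_size)
        return (ssize_t)buf_size; /* never trust a length beyond the buffer */

    return reported;
}
```

The compatibility cost is visible here: a caller that relies on a return value larger than the buffer to detect truncation would see only `buf_size` and could not tell that data was dropped.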

Mystikos process gets killed for bigger rootfs sizes on DC2s_v2

myst package and myst exec fail with a rootfs of 746M.

userid in enclave issue

When running the Memcached server in an enclave, the service tries to detect the userid and decide whether the root user is needed. When running under Mystikos, this detection appears to go wrong. @vtikoo can you add more details?

Here is the link to the memcached source where the issue is triggered.
https://github.com/memcached/memcached/blob/c472369fed5981ba8c004d426cee62d5165c47ca/memcached.c#L5562

Create a test to protect regressions of the debugger

We need this because debugger support in the multi-memory-region implementation is very fragile. The test should ensure that breakpoints in the application and in the myst runtime (kernel + target + enc) can be reached.

Kernel processes need their own mman region

Currently, kernel processes (created with posix_spawn) attempt to share the mman region. We attempt to release any mmap'd memory when a process exits; however, we are unable to release sbrk memory (which is allocated from the opposite end of the mman region). For this to work for many processes, each process will need its own mman region (or the single mman region could be carved up).

pthread_create occasionally returns EAGAIN

To repro:

Run tests/pthread in a loop. It fails several times per 100 tries on the second assertion below:

    for (size_t i = 0; i < max_threads; i++)
    {
        int r = pthread_create(&threads[i], NULL, _exhaust_thread, (void*)i);

        if (i + 1 == max_threads)
            assert(r == EAGAIN);
        else
            assert(r == 0);    // <---- assertion failure. r =11 (EAGAIN)
    }

rework pid/tid assignment algorithm

Currently pid's and tid's are assigned using a simple increasing integer that wraps at some point (plus 100). This could cause conflicts when more than 4 billion threads have been created.

Instead, we propose using a bit string of 65536 bits (8192 bytes). We propose scanning from the last bit assigned and wrapping at the end of the bit string.
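A rough sketch of the proposed scheme, assuming an 8192-byte bit string and hypothetical names (`id_alloc_t`, `id_alloc_get`, `id_alloc_put`), could look like this. It returns raw bit indices; any offset applied to produce real pid or tid values is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ID_BITS 65536 /* 8192 bytes */

typedef struct id_alloc
{
    uint8_t bits[ID_BITS / 8]; /* bit i set means id i is in use */
    size_t last;               /* index of the last bit assigned */
} id_alloc_t;

/* Scan forward from the last assigned bit, wrapping at the end.
 * Returns an id in [0, ID_BITS), or -1 if every id is in use. */
static long id_alloc_get(id_alloc_t* a)
{
    for (size_t n = 1; n <= ID_BITS; n++)
    {
        size_t i = (a->last + n) % ID_BITS;
        uint8_t mask = (uint8_t)(1u << (i % 8));

        if (!(a->bits[i / 8] & mask))
        {
            a->bits[i / 8] |= mask;
            a->last = i;
            return (long)i;
        }
    }

    return -1;
}

/* Release an id so that a later scan can hand it out again. */
static void id_alloc_put(id_alloc_t* a, long id)
{
    assert(id >= 0 && id < ID_BITS);
    a->bits[id / 8] &= (uint8_t)~(1u << (id % 8));
}
```

Because the scan starts after the last assigned bit, a freed id is not reused until the scan wraps around, which delays reuse of recently released ids.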

add hostname to appconfig

Allow the user to configure the hostname through the appconfig.

The C function gethostname() performs SYS_uname syscall.
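To illustrate why the hostname setting would need to reach the uname data, here is a small sketch of deriving the host name from uname(). The helper name `hostname_from_uname` is made up for this example; it is not a libc function.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Copy the node name reported by uname() into buf.
 * Returns 0 on success and -1 on failure. */
static int hostname_from_uname(char* buf, size_t len)
{
    struct utsname uts;

    if (len == 0 || uname(&uts) != 0)
        return -1;

    /* snprintf truncates safely and always NUL-terminates. */
    snprintf(buf, len, "%s", uts.nodename);
    return 0;
}
```

A hostname configured through the appconfig would therefore have to be reflected in the `nodename` field that the kernel returns for SYS_uname.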

Remove direct syscall in unmapself.S

__unmapself.S uses the syscall instruction, which is unsupported in an enclave. This function is invoked when detached pthreads exit.
@jxyang has added a C version of unmapself under build/crt-musl/thread, but this change is not integrated yet.
The assembly version is under build/crt-musl/thread/x86-64/__unmapself.S

Support fadvise64

The following dotnet runtime test fails when run with Mystikos:

*** kernel panic: syscall.c(4202): myst_syscall(): unhandled syscall: SYS_fadvise64()
0x7f4f003ae6e6: __myst_panic()
0x7f4f003c0402: myst_syscall()

Test location: https://github.com/dotnet/runtime/blob/v5.0.0-rtm.20519.4/src/tests/readytorun/crossgen2/crossgen2smoke.csproj

solution sql_ae fails with the latest Alpine Linux

It crashes if we replace the Alpine Linux version "3.10" with "3.12" in the docker files.

add cwd to appconfig

Add the current working directory (cwd) to the appconfig and set it on startup.

add mount parameters to appconfig

Mount parameters will be needed in the appconfig file so that the Mystikos kernel may automatically mount these file systems during startup. These mount parameters include (see man 2 mount):

source
target
filesystemtype
mountflags

Support legacy usage of SYS_mknod and SYS_mknodat

According to Linux man page:

POSIX.1-2001 says: "The only portable use of mknod() is to create a
FIFO-special file. If mode is not S_IFIFO or dev is not 0, the
behavior of mknod() is unspecified." However, nowadays one should
never use mknod() for this purpose; one should use mkfifo(3), a
function especially defined for this purpose.

However, we don't support mkfifo in Myst either, and it seems that some applications, such as the dotnet runtime, still use mknod to create named pipes.

Handle fchmod and fchmodat for files/pipes/non-uds sockets

Question: how do ENV vars get passed in to "myst sgx-run" ?

Docker images support the passing-in of local ENV vars at run time.

Is there equivalent functionality for myst sgx-run?

If so, let's document it; if not, can it be done?

Support SYS_msync

#2 adds a pass through implementation for SYS_msync. Per @jxyang, we should revisit msync after mmap support for files is added.

Improve code coverage measurement using hostfs

The previous code coverage measurement generated .gcda files on ramfs inside the enclave and then exported them at the end of execution. This can be improved by using the new hostfs feature and writing the gcda files directly to hostfs.

Support generation of inotify events

Currently only solicitation of inotify events is supported but no such events are ever generated.

Consider adding hostfs integrity measurements

Consider adding hostfs integrity measurements (whole or partial file hashes and a final summary hash). Mystikos could rebuild the hashes during mounting and check the summary hash (included in the signed image or obtained from a policy service). Each file, when read, would be verified against this hash metadata (either as a whole or in chunks). This simple approach would only work for read-only host file systems. It would allow hostfs to be used as a root file system.

need test for verity tampering attack

It would be helpful to have a test to verify that the verity module (Merkle tree checks on block devices) correctly rejects tampering attacks. The verity module performs the following steps:

loads the hash tree into memory
re-computes the hash tree from the hash-tree leaves
checks that the root hash matches the one in the enclave
checks the hash of each block against the hash tree leaves on every read

The last step is most likely correct, based on the following logic.

        if (memcmp(&hash, phash, hash_size) != 0)
        {
            memset(block, 0, block_size);
            ERAISE(-EIO);
        }

But it would be worthwhile adding a test in case the code is accidentally changed to avoid the check.
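Such a test might be shaped like the sketch below. It uses a toy FNV-1a hash as a stand-in for the real cryptographic hash, and the names `toy_hash`, `verify_block`, and `test_tamper_detected` are hypothetical; only the compare-zero-fail pattern is taken from the logic quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Toy stand-in for a cryptographic hash (64-bit FNV-1a). */
static uint64_t toy_hash(const uint8_t* data, size_t size)
{
    uint64_t h = 14695981039346656037ULL;

    for (size_t i = 0; i < size; i++)
    {
        h ^= data[i];
        h *= 1099511628211ULL;
    }

    return h;
}

/* Same pattern as the quoted check: on mismatch, zero the block and fail. */
static int verify_block(uint8_t* block, uint64_t expected)
{
    uint64_t hash = toy_hash(block, BLOCK_SIZE);

    if (memcmp(&hash, &expected, sizeof(hash)) != 0)
    {
        memset(block, 0, BLOCK_SIZE);
        return -EIO;
    }

    return 0;
}

/* Returns 0 if an untouched block verifies and a tampered block is
 * rejected with -EIO and zeroed; returns -1 otherwise. */
static int test_tamper_detected(void)
{
    uint8_t block[BLOCK_SIZE] = "trusted content";
    uint8_t zero[BLOCK_SIZE] = {0};
    uint64_t leaf = toy_hash(block, BLOCK_SIZE);

    if (verify_block(block, leaf) != 0)
        return -1;

    block[3] ^= 0x01; /* simulate tampering by flipping one bit */

    if (verify_block(block, leaf) != -EIO)
        return -1;

    return memcmp(block, zero, BLOCK_SIZE) == 0 ? 0 : -1;
}
```

A real test would flip a bit in the backing block device image and then expect the read through the verity device to fail, but the pass and fail conditions would be the same.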

Need a way to specify user heap size without config.json

Myst provides user applications a default heap size of 64MB. This can be customized in config.json.

However, for internal development, where we don't want to go through signing just to change the user heap size and don't care about the MRENCLAVE (which is affected by the user heap size), we should have a quick way to customize the user heap size. For example, we could add a command-line option such as "myst exec-sgx user_heap=512m ..."

Add dotnet runtime tests to pipeline

Blockers:

Due to the number of tests (p0 ~2.7k, p0+p1 ~10k), using rootfs is too slow. Each test takes ~15 seconds, so running just the p0 tests would take over 8 hours. Once the ext2 PR #4 is merged, migrate samples/coreclr to use the ext2 fs.
Resolution to #9