gwsystems / sledge-serverless-framework
SLEdge: a serverless runtime designed for the Edge.
License: GNU General Public License v2.0
In runtime/include/sandbox_run_queue.h, the delete function is not aligned as expected when auto-formatted with clang-format.
typedef struct sandbox_run_queue_config_t {
sandbox_run_queue_add_t add;
sandbox_run_queue_is_empty_t is_empty;
sandbox_run_queue_delete_t delete;
@gparmer
gparmer 23 hours ago
alignment
@bushidocodes
bushidocodes 19 hours ago Author
I have no explanation for this one, but it seems like there might be a clang-format bug. If I rename delete to anything else, it aligns properly.
Blocked by #11.
Notice the mutex_lock for NCORES == 1 in both push/pop functions, and steal calling 'pop' for NCORES == 1. Ideally you don't call steal for same-core consumption, but I just made the steal API wrap that logic, which is perhaps not good. We might just add the #if in the scheduling code, which should explicitly call "pop" for a single core, and add an assert in the "steal" function for NCORES == 1.
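A minimal sketch of the shape being discussed — the names (deque_push, deque_steal) and the NCORES guard are illustrative, not the runtime's actual API:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NCORES         1
#define DEQUE_CAPACITY 64

static int             deque[DEQUE_CAPACITY];
static int             deque_length = 0;
static pthread_mutex_t deque_lock   = PTHREAD_MUTEX_INITIALIZER;

static bool
deque_push(int item)
{
#if NCORES == 1
	pthread_mutex_lock(&deque_lock); /* single-core build falls back to a mutex */
#endif
	bool ok = deque_length < DEQUE_CAPACITY;
	if (ok) deque[deque_length++] = item;
#if NCORES == 1
	pthread_mutex_unlock(&deque_lock);
#endif
	return ok;
}

static bool
deque_pop(int *item)
{
#if NCORES == 1
	pthread_mutex_lock(&deque_lock);
#endif
	bool ok = deque_length > 0;
	if (ok) *item = deque[--deque_length];
#if NCORES == 1
	pthread_mutex_unlock(&deque_lock);
#endif
	return ok;
}

static bool
deque_steal(int *item)
{
	/* Per the suggestion above: stealing is meaningless on a single core,
	 * so assert rather than silently wrapping pop */
	assert(NCORES > 1);
	return deque_pop(item);
}
```

With this shape, the scheduling code calls pop directly in the single-core configuration, and steal traps immediately if misused.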
assert(!software_interrupt_is_enabled());
arch_context_init(&sandbox->ctxt, 0, 0);
@gparmer
gparmer 6 days ago
Is this necessary? Haven't we already switched to the sandbox by the nature of running in this function?
@phanikishoreg
phanikishoreg 6 days ago
This must be my code. It took me a while to figure out why I added it in the first place. I remember having some comments about it somewhere, but I don't see them now.
void
current_sandbox_main(void)
{
struct sandbox *current_sandbox = current_sandbox_get();
// FIXME: is this right? this is the first time this sandbox is running.. so it wont
// return to worker_thread_switch_to_sandbox() api..
// we'd potentially do what we'd in worker_thread_switch_to_sandbox() api here for cleanup..
if (!software_interrupt_is_enabled()) {
arch_context_init(&current_sandbox->ctxt, 0, 0);
worker_thread_next_context = NULL;
software_interrupt_enable();
}
<snip>
So, when setting up a sandbox, we manually set up its IP and SP so it uses the user-level switch, as mcontext_t is obviously not applicable. The IP we set is actually the address of "current_sandbox_main()", so resetting the "sp" needs to be done explicitly after the first context switch: the IP is not pointing at label 2 in the context-switch assembly, but at this "main" function. So we do this.
Also, consider consolidating to a single return statement
typedef enum
{
MOD_ARG_MODPATH = 0,
MOD_ARG_MODPORT,
MOD_ARG_MODNAME,
MOD_ARG_MODNARGS,
MOD_ARG_MAX,
} mod_argindex_t;
#define MOD_MAX_ARGS 16 // Max number of arguments
#define GLB_STDOUT "/dev/null"
#define GLB_STDERR "/dev/null"
#define GLB_STDIN "/dev/zero"
#define SBOX_RESP_STRSZ 32
#ifndef CACHELINE_SIZE
#define CACHELINE_SIZE 32
#endif
#define CACHE_ALIGNED __attribute__((aligned(CACHELINE_SIZE)))
Currently the "active" key is a string value "yes" or "no." JSON has a native boolean type that is probably a better fit for this.
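For instance, a hypothetical module entry (the surrounding keys here are illustrative) would change from `"active": "yes"` to:

```json
{ "name": "fibonacci", "active": true }
```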
The format.sh script currently formats the files in our submodules. We should blacklist anything in runtime/thirdparty. Additionally, it's unclear to me whether the .clang-format file is actually being ingested. I believe that I've successfully been able to modify settings there and impact the formatting logic, but while investigating this issue, this should be confirmed, and then the indentation keys should be looked at more closely.
There might be a conflict here. How do TabWidth and IndentWidth interact?
IndentWidth: 8
TabWidth: 8
UseTab: ForIndentation
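As a note on how these keys interact (based on the clang-format documentation, so worth confirming against the version in use): IndentWidth sets how many columns each indent level occupies, TabWidth sets how many columns a tab character counts as, and UseTab: ForIndentation fills indentation with tabs of TabWidth columns. With both set to 8 they agree, so there should be no conflict. A hypothetical mismatched variant illustrates where one would appear:

```yaml
# Hypothetical variant: each indent level is 8 columns but a tab is
# only 4, so clang-format would emit two tabs per indent level.
IndentWidth: 8
TabWidth: 4
UseTab: ForIndentation
```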
We need a test regime, at the very least unit tests and system tests for PRs, so that we both extend the test cases and re-run all the prior ones that are still applicable.
How to? I think we have a few test programs in runtime/test (FILE I/O, SOCKET I/O, FIBONACCI, etc.); perhaps we start from there?
We might also need to test a couple of things here.
@bushidocodes You have changed a lot. Although it is mostly renaming, cleaning up unused code, updating outdated documentation, etc., and not functionality changes I think, it would be good to write in the PRs what you tested!
Possible example provided by @gparmer:
#define QUEUES_INITIALIZER(g, l) { .global = g, .local = l }
struct queue_fns {
globalqueue_fn_t global;
localqueue_fn_t local;
};
const struct queue_fns fns = QUEUES_INITIALIZER(dequeue_global, fprr_local);
Related to #35
Note: This issue requires further decomposition into distinct feature issues.
Currently, sledge instantiates serverless modules once on startup, and changes to what modules run require the sledge runtime to be taken down and restarted after the modules JSON file has been modified.
Realistically, the runtime should have a set of RESTful CRUD and start/stop APIs to control modules dynamically at runtime:
Currently, there is unused and untested functionality intended to allow for dynamic add / removal of modules:
e.g. MOD_NAME_MAX to MODULE_NAME_MAXIMUM
Current behavior panics if the backing data structure is full.
/**
* Pushes a sandbox request to the global deque
* @param sandbox_request
* @returns pointer to request if added. NULL otherwise
*/
static struct sandbox_request *
global_request_scheduler_minheap_add(void *sandbox_request)
{
assert(sandbox_request);
assert(global_request_scheduler_minheap);
if (unlikely(!listener_thread_is_running())) panic("%s is only callable by the listener thread\n", __func__);
int return_code = priority_queue_enqueue(global_request_scheduler_minheap, sandbox_request);
/* TODO: Propagate -1 to caller. Issue #91 */
if (return_code == -ENOSPC) panic("Request Queue is full\n");
return sandbox_request;
}
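A sketch of the fix proposed in issue #91 — propagating the failure to the caller instead of panicking. priority_queue_enqueue is stubbed here for illustration; the real function takes the queue as an argument:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define QUEUE_CAPACITY 2

static void *queue[QUEUE_CAPACITY];
static int   queue_length = 0;

/* Stub standing in for the runtime's priority_queue_enqueue */
static int
priority_queue_enqueue(void *item)
{
	if (queue_length >= QUEUE_CAPACITY) return -ENOSPC;
	queue[queue_length++] = item;
	return 0;
}

/* Returns the request on success, NULL if the queue is full,
 * letting the caller decide how to respond (e.g. an HTTP 503) */
static void *
global_request_scheduler_add(void *sandbox_request)
{
	assert(sandbox_request);
	int return_code = priority_queue_enqueue(sandbox_request);
	if (return_code == -ENOSPC) return NULL;
	return sandbox_request;
}
```

The listener thread can then reject the request gracefully rather than taking down the whole runtime.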
In runtime/src/sandbox_request_scheduler_fifo.c:
// TODO: Running the runtime and listener cores on a single shared core is untested
// We are unsure if the locking behavior is correct, so there may be deadlocks
#if NCORES == 1
@gparmer
gparmer 5 days ago
@phanikishoreg Have these settings outlived their utility? Seems like we only test on NCORES > 1 now.
@phanikishoreg
phanikishoreg 5 days ago
I think so. It would probably be better to make sure NCORES > 1 in runtime startup code.
https://bitbucket.org/pypy/pypy/src/default/rpython/translator/c/src/stacklet/
This pypy package has context-switch reference code for different architectures.
Pasting the code here in case the link goes invalid; I also cloned it here (https://bitbucket.org/pkgadepalli/pypy/src/default/rpython/translator/c/src/stacklet/)
static void *slp_switch(void *(*save_state)(void*, void*),
void *(*restore_state)(void*, void*),
void *extra) __attribute__((noinline));
static void *slp_switch(void *(*save_state)(void*, void*),
void *(*restore_state)(void*, void*),
void *extra)
{
void *result;
/*
registers to preserve: x18-x28, x29(fp), and v8-v15
registers marked as clobbered: x0-x18, x30
Note that x18 appears in both lists; see below. We also save
x30 although it's also marked as clobbered, which might not
be necessary but doesn't hurt.
Don't assume gcc saves any register for us when generating
code for slp_switch().
The values 'save_state', 'restore_state' and 'extra' are first moved
by gcc to some registers that are not marked as clobbered, so between
x19 and x29. Similarly, gcc expects 'result' to be in a register
between x19 and x29. We don't want x18 to be used here, because of
some special meaning it might have. We don't want x30 to be used
here, because it is clobbered by the first "blr".
This means that three of the values we happen to save and restore
will, in fact, contain the three arguments, and one of these values
will, in fact, not be restored at all but receive 'result'.
*/
__asm__ volatile (
/* The stack is supposed to be aligned as necessary already.
Save 12 registers from x18 to x29, plus 8 from v8 to v15 */
"stp x18, x19, [sp, -160]!\n"
"stp x20, x21, [sp, 16]\n"
"stp x22, x23, [sp, 32]\n"
"stp x24, x25, [sp, 48]\n"
"stp x26, x27, [sp, 64]\n"
"stp x28, x29, [sp, 80]\n"
"str d8, [sp, 96]\n"
"str d9, [sp, 104]\n"
"str d10, [sp, 112]\n"
"str d11, [sp, 120]\n"
"str d12, [sp, 128]\n"
"str d13, [sp, 136]\n"
"str d14, [sp, 144]\n"
"str d15, [sp, 152]\n"
"mov x0, sp\n" /* arg 1: current (old) stack pointer */
"mov x1, %[extra]\n" /* arg 2: extra, from x19-x28 */
"blr %[save_state]\n" /* call save_state(), from x19-x28 */
/* skip the rest if the return value is null */
"cbz x0, zero\n"
"mov sp, x0\n" /* change the stack pointer */
/* From now on, the stack pointer is modified, but the content of the
stack is not restored yet. It contains only garbage here. */
"mov x1, %[extra]\n" /* arg 2: extra, still from x19-x28 */
/* arg 1: current (new) stack pointer is already in x0*/
"blr %[restore_state]\n"/* call restore_state() */
/* The stack's content is now restored. */
"zero:\n"
/* Restore all saved registers */
"ldp x20, x21, [sp, 16]\n"
"ldp x22, x23, [sp, 32]\n"
"ldp x24, x25, [sp, 48]\n"
"ldp x26, x27, [sp, 64]\n"
"ldp x28, x29, [sp, 80]\n"
"ldr d8, [sp, 96]\n"
"ldr d9, [sp, 104]\n"
"ldr d10, [sp, 112]\n"
"ldr d11, [sp, 120]\n"
"ldr d12, [sp, 128]\n"
"ldr d13, [sp, 136]\n"
"ldr d14, [sp, 144]\n"
"ldr d15, [sp, 152]\n"
"ldp x18, x19, [sp], 160\n"
/* Move x0 into the final location of 'result' */
"mov %[result], x0\n"
: [result]"=r"(result) /* output variables */
/* input variables */
: [restore_state]"r"(restore_state),
[save_state]"r"(save_state),
[extra]"r"(extra)
: "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9",
"x10", "x11", "x12", "x13", "x14", "x15", "x16", "x17", "x18",
"memory", "cc", "x30" // x30==lr
);
return result;
}
#if defined(__ARM_ARCH_4__) || defined (__ARM_ARCH_4T__)
# define call_reg(x) "mov lr, pc ; bx " #x "\n"
#else
/* ARM >= 5 */
# define call_reg(x) "blx " #x "\n"
#endif
static void *slp_switch(void *(*save_state)(void*, void*),
void *(*restore_state)(void*, void*),
void *extra) __attribute__((noinline));
static void *slp_switch(void *(*save_state)(void*, void*),
void *(*restore_state)(void*, void*),
void *extra)
{
void *result;
/*
seven registers to preserve: r2, r3, r7, r8, r9, r10, r11
registers marked as clobbered: r0, r1, r4, r5, r6, r12, lr
others: r13 is sp; r14 is lr; r15 is pc
*/
__asm__ volatile (
/* align the stack and save 7 more registers explicitly */
"mov r0, sp\n"
"and r1, r0, #-16\n"
"mov sp, r1\n"
"push {r0, r2, r3, r7, r8, r9, r10, r11}\n" /* total 8, still aligned */
#ifndef __SOFTFP__
/* We also push d8-d15 to preserve them explicitly. This assumes
* that this code is in a function that doesn't use floating-point
* at all, and so don't touch the "d" registers (that's why we mark
* it as non-inlinable). So here by pushing/popping d8-d15 we are
* saving precisely the callee-saved registers in all cases. We
* could also try to list all "d" registers as clobbered, but it
* doesn't work: there is no way I could find to know if we have 16
* or 32 "d" registers (depends on the exact -mcpu=... and we don't
* know it from the C code). If we have 32, then gcc would "save"
* d8-d15 by copying them into d16-d23 for example, and it doesn't
* work. */
"vpush {d8, d9, d10, d11, d12, d13, d14, d15}\n" /* 16 words, still aligned */
#endif
/* save values in callee saved registers for later */
"mov r4, %[restore_state]\n" /* can't be r0 or r1: marked clobbered */
"mov r5, %[extra]\n" /* can't be r0 or r1 or r4: marked clob. */
"mov r3, %[save_state]\n" /* can't be r0, r1, r4, r5: marked clob. */
"mov r0, sp\n" /* arg 1: current (old) stack pointer */
"mov r1, r5\n" /* arg 2: extra */
call_reg(r3) /* call save_state() */
/* skip the rest if the return value is null */
"cmp r0, #0\n"
"beq zero\n"
"mov sp, r0\n" /* change the stack pointer */
/* From now on, the stack pointer is modified, but the content of the
stack is not restored yet. It contains only garbage here. */
"mov r1, r5\n" /* arg 2: extra */
/* arg 1: current (new) stack pointer is already in r0*/
call_reg(r4) /* call restore_state() */
/* The stack's content is now restored. */
"zero:\n"
#ifndef __SOFTFP__
"vpop {d8, d9, d10, d11, d12, d13, d14, d15}\n"
#endif
"pop {r1, r2, r3, r7, r8, r9, r10, r11}\n"
"mov sp, r1\n"
"mov %[result], r0\n"
: [result]"=r"(result) /* output variables */
/* input variables */
: [restore_state]"r"(restore_state),
[save_state]"r"(save_state),
[extra]"r"(extra)
: "r0", "r1", "r4", "r5", "r6", "r12", "lr",
"memory", "cc"
);
return result;
}
static void *slp_switch(void *(*save_state)(void*, void*),
void *(*restore_state)(void*, void*),
void *extra)
{
void *result;
__asm__ volatile (
"daddiu $sp, $sp, -0x50\n"
"sd $s0, 0x0($sp)\n" /* push the registers specified as caller-save */
"sd $s1, 0x8($sp)\n"
"sd $s2, 0x10($sp)\n"
"sd $s3, 0x18($sp)\n"
"sd $s4, 0x20($sp)\n"
"sd $s5, 0x28($sp)\n"
"sd $s6, 0x30($sp)\n"
"sd $s7, 0x38($sp)\n"
"sd $fp, 0x40($sp)\n"
"sd $ra, 0x48($sp)\n"
"move $s0, %[rstate]\n" /* save 'restore_state' for later */
"move $s1, %[extra]\n" /* save 'extra' for later */
"move $a1, %[extra]\n"/* arg 2: extra */
"move $a0, $sp\n" /* arg 1: current (old) stack pointer */
"move $t9, %[sstate]\n"
"jalr $t9\n" /* call save_state() */
"beqz $v0, 0f\n" /* skip the rest if the return value is null */
"move $sp, $v0\n" /* change the stack pointer */
/* From now on, the stack pointer is modified, but the content of the
stack is not restored yet. It contains only garbage here. */
"move $a1, $s1\n" /* arg 2: extra */
"move $a0, $v0\n" /* arg 1: current (new) stack pointer */
"move $t9, $s0\n"
"jalr $t9\n" /* call restore_state() */
/* The stack's content is now restored. */
"0:\n"
"move %[result], $v0\n"
"ld $s0, 0x0($sp)\n"
"ld $s1, 0x8($sp)\n"
"ld $s2, 0x10($sp)\n"
"ld $s3, 0x18($sp)\n"
"ld $s4, 0x20($sp)\n"
"ld $s5, 0x28($sp)\n"
"ld $s6, 0x30($sp)\n"
"ld $s7, 0x38($sp)\n"
"ld $fp, 0x40($sp)\n"
"ld $ra, 0x48($sp)\n"
"daddiu $sp, $sp, 0x50\n"
: [result]"=&r"(result)
: [sstate]"r"(save_state),
[rstate]"r"(restore_state),
[extra]"r"(extra)
: "memory", "v0", "a0", "a1", "t9"
);
return result;
}
#if !(defined(__LITTLE_ENDIAN__) ^ defined(__BIG_ENDIAN__))
# error "cannot determine if it is ppc64 or ppc64le"
#endif
#ifdef __BIG_ENDIAN__
# define TOC_AREA "40"
#else
# define TOC_AREA "24"
#endif
/* This depends on these attributes so that gcc generates a function
with no code before the asm, and only "blr" after. */
static __attribute__((noinline, optimize("O2")))
void *slp_switch(void *(*save_state)(void*, void*),
void *(*restore_state)(void*, void*),
void *extra)
{
void *result;
__asm__ volatile (
/* By Vaibhav Sood & Armin Rigo, with some copying from
the Stackless version by Kristjan Valur Jonsson */
/* Save all 18 volatile GP registers, 18 volatile FP regs, and 12
volatile vector regs. We need a stack frame of 144 bytes for FPR,
144 bytes for GPR, 192 bytes for VR plus 48 bytes for the standard
stackframe = 528 bytes (a multiple of 16). */
"mflr 0\n" /* Save LR into 16(r1) */
"std 0, 16(1)\n"
"std 14,-288(1)\n" /* the GPR save area is between -288(r1) */
"std 15,-280(1)\n" /* included and -144(r1) excluded */
"std 16,-272(1)\n"
"std 17,-264(1)\n"
"std 18,-256(1)\n"
"std 19,-248(1)\n"
"std 20,-240(1)\n"
"std 21,-232(1)\n"
"std 22,-224(1)\n"
"std 23,-216(1)\n"
"std 24,-208(1)\n"
"std 25,-200(1)\n"
"std 26,-192(1)\n"
"std 27,-184(1)\n"
"std 28,-176(1)\n"
"std 29,-168(1)\n"
"std 30,-160(1)\n"
"std 31,-152(1)\n"
"stfd 14,-144(1)\n" /* the FPR save area is between -144(r1) */
"stfd 15,-136(1)\n" /* included and 0(r1) excluded */
"stfd 16,-128(1)\n"
"stfd 17,-120(1)\n"
"stfd 18,-112(1)\n"
"stfd 19,-104(1)\n"
"stfd 20,-96(1)\n"
"stfd 21,-88(1)\n"
"stfd 22,-80(1)\n"
"stfd 23,-72(1)\n"
"stfd 24,-64(1)\n"
"stfd 25,-56(1)\n"
"stfd 26,-48(1)\n"
"stfd 27,-40(1)\n"
"stfd 28,-32(1)\n"
"stfd 29,-24(1)\n"
"stfd 30,-16(1)\n"
"stfd 31,-8(1)\n"
"li 12,-480\n" /* the VR save area is between -480(r1) */
"stvx 20,12,1\n" /* included and -288(r1) excluded */
"li 12,-464\n"
"stvx 21,12,1\n"
"li 12,-448\n"
"stvx 22,12,1\n"
"li 12,-432\n"
"stvx 23,12,1\n"
"li 12,-416\n"
"stvx 24,12,1\n"
"li 12,-400\n"
"stvx 25,12,1\n"
"li 12,-384\n"
"stvx 26,12,1\n"
"li 12,-368\n"
"stvx 27,12,1\n"
"li 12,-352\n"
"stvx 28,12,1\n"
"li 12,-336\n"
"stvx 29,12,1\n"
"li 12,-320\n"
"stvx 30,12,1\n"
"li 12,-304\n"
"stvx 31,12,1\n"
"stdu 1,-528(1)\n" /* Create stack frame */
"std 2, "TOC_AREA"(1)\n" /* Save TOC in the "TOC save area"*/
"mfcr 12\n" /* Save CR in the "CR save area" */
"std 12, 8(1)\n"
"mr 14, %[restore_state]\n" /* save 'restore_state' for later */
"mr 15, %[extra]\n" /* save 'extra' for later */
"mr 12, %[save_state]\n" /* move 'save_state' into r12 for branching */
"mr 3, 1\n" /* arg 1: current (old) stack pointer */
"mr 4, 15\n" /* arg 2: extra */
"stdu 1, -48(1)\n" /* create temp stack space (see below) */
#ifdef __BIG_ENDIAN__
"ld 0, 0(12)\n"
"ld 11, 16(12)\n"
"mtctr 0\n"
"ld 2, 8(12)\n"
#else
"mtctr 12\n" /* r12 is fixed by this ABI */
#endif
"bctrl\n" /* call save_state() */
"addi 1, 1, 48\n" /* destroy temp stack space */
"cmpdi 3, 0\n" /* skip the rest if the return value is null */
"bt eq, zero\n"
"mr 1, 3\n" /* change the stack pointer */
/* From now on, the stack pointer is modified, but the content of the
stack is not restored yet. It contains only garbage here. */
"mr 4, 15\n" /* arg 2: extra */
/* arg 1: current (new) stack pointer
is already in r3 */
"stdu 1, -48(1)\n" /* create temp stack space for callee to use */
/* ^^^ we have to be careful. The function call will store the link
register in the current frame (as the ABI) dictates. But it will
then trample it with the restore! We fix this by creating a fake
stack frame */
#ifdef __BIG_ENDIAN__
"ld 0, 0(14)\n" /* 'restore_state' is in r14 */
"ld 11, 16(14)\n"
"mtctr 0\n"
"ld 2, 8(14)\n"
#endif
#ifdef __LITTLE_ENDIAN__
"mr 12, 14\n" /* copy 'restore_state' */
"mtctr 12\n" /* r12 is fixed by this ABI */
#endif
"bctrl\n" /* call restore_state() */
"addi 1, 1, 48\n" /* destroy temp stack space */
/* The stack's content is now restored. */
"zero:\n"
/* Epilogue */
"ld 2, "TOC_AREA"(1)\n" /* restore the TOC */
"ld 12,8(1)\n" /* restore the condition register */
"mtcrf 0xff, 12\n"
"addi 1,1,528\n" /* restore stack pointer */
"li 12,-480\n" /* restore vector registers */
"lvx 20,12,1\n"
"li 12,-464\n"
"lvx 21,12,1\n"
"li 12,-448\n"
"lvx 22,12,1\n"
"li 12,-432\n"
"lvx 23,12,1\n"
"li 12,-416\n"
"lvx 24,12,1\n"
"li 12,-400\n"
"lvx 25,12,1\n"
"li 12,-384\n"
"lvx 26,12,1\n"
"li 12,-368\n"
"lvx 27,12,1\n"
"li 12,-352\n"
"lvx 28,12,1\n"
"li 12,-336\n"
"lvx 29,12,1\n"
"li 12,-320\n"
"lvx 30,12,1\n"
"li 12,-304\n"
"lvx 31,12,1\n"
"ld 14,-288(1)\n" /* restore general purpose registers */
"ld 15,-280(1)\n"
"ld 16,-272(1)\n"
"ld 17,-264(1)\n"
"ld 18,-256(1)\n"
"ld 19,-248(1)\n"
"ld 20,-240(1)\n"
"ld 21,-232(1)\n"
"ld 22,-224(1)\n"
"ld 23,-216(1)\n"
"ld 24,-208(1)\n"
"ld 25,-200(1)\n"
"ld 26,-192(1)\n"
"ld 27,-184(1)\n"
"ld 28,-176(1)\n"
"ld 29,-168(1)\n"
"ld 30,-160(1)\n"
"ld 31,-152(1)\n"
"lfd 14,-144(1)\n" /* restore floating point registers */
"lfd 15,-136(1)\n"
"lfd 16,-128(1)\n"
"lfd 17,-120(1)\n"
"lfd 18,-112(1)\n"
"lfd 19,-104(1)\n"
"lfd 20,-96(1)\n"
"lfd 21,-88(1)\n"
"lfd 22,-80(1)\n"
"lfd 23,-72(1)\n"
"lfd 24,-64(1)\n"
"lfd 25,-56(1)\n"
"lfd 26,-48(1)\n"
"lfd 27,-40(1)\n"
"lfd 28,-32(1)\n"
"ld 0, 16(1)\n"
"lfd 29,-24(1)\n"
"mtlr 0\n"
"lfd 30,-16(1)\n"
"lfd 31,-8(1)\n"
: "=r"(result) /* output variable: expected to be r3 */
: [restore_state]"r"(restore_state), /* input variables */
[save_state]"r"(save_state),
[extra]"r"(extra)
);
return result;
}
/* This depends on these attributes so that gcc generates a function
with no code before the asm, and only "blr" after. */
static __attribute__((noinline, optimize("O2")))
void *slp_switch(void *(*save_state)(void*, void*),
void *(*restore_state)(void*, void*),
void *extra)
{
void *result;
__asm__ volatile (
/* The Stackless version by Kristjan Valur Jonsson,
ported to s390x by Richard Plangger */
"stmg 6,15,48(15)\n"
// store f8 - f15 into the stack frame that is not used!
"std 8,128(15)\n"
"std 9,136(15)\n"
"std 10,144(15)\n"
"std 11,152(15)\n"
"std 12,16(15)\n"
"std 13,24(15)\n"
"std 14,32(15)\n"
"std 15,40(15)\n"
"lgr 10, %[restore_state]\n" /* save 'restore_state' for later */
"lgr 11, %[extra]\n" /* save 'extra' for later */
"lgr 14, %[save_state]\n" /* move 'save_state' into r14 for branching */
"lgr 2, 15\n" /* arg 1: current (old) stack pointer */
"lgr 3, 11\n" /* arg 2: extra */
"lay 15,-160(15)\n" /* create stack frame */
"basr 14, 14\n" /* call save_state() */
"lay 15,160(15)\n"
"cgij 2, 0, 8, zero\n" /* skip the rest if the return value is null */
"lgr 15, 2\n" /* change the stack pointer */
/* From now on, the stack pointer is modified, but the content of the
stack is not restored yet. It contains only garbage here. */
/* arg 1: current (new) stack pointer
is already in r2 */
"lgr 3, 11\n" /* arg 2: extra */
"lay 15,-160(15)\n" /* create stack frame */
"basr 14, 10\n" /* call restore_state() */
"lay 15,160(15)\n"
/* The stack's content is now restored. */
"zero:\n"
/* Epilogue */
"ld 8,128(15)\n"
"ld 9,136(15)\n"
"ld 10,144(15)\n"
"ld 11,152(15)\n"
"ld 12,16(15)\n"
"ld 13,24(15)\n"
"ld 14,32(15)\n"
"ld 15,40(15)\n"
"lmg 6,15,48(15)\n"
: "=r"(result) /* output variable: expected to be r2 */
: [restore_state]"r"(restore_state), /* input variables */
[save_state]"r"(save_state),
[extra]"r"(extra)
);
return result;
}
current_sandbox_initialize_io_handle
sandbox_initialize_io_handle
current_sandbox_initialize_io_handle_and_set_file_descriptor
sandbox_initialize_io_handle_and_set_file_descriptor
The wasmception directory in the silverfish submodule seems to be ~2GB. We should investigate adding some of these things to a .dockerignore file to improve build times.
Blocked by aWsm work implementing WASI-SDK and follow up work integrating the new release of aWsm into sledge.
No issue in aWsm for this, but PR is at gwsystems/aWsm#19
#define EXPORT __attribute__((visibility("default")))
#define IMPORT __attribute__((visibility("default")))
#define INLINE __attribute__((always_inline))
#define WEAK __attribute__((weak))
#ifndef CACHELINE_SIZE
#define CACHELINE_SIZE 32
#endif
#ifndef PAGE_SIZE
#define PAGE_SIZE (1 << 12)
#endif
#define CACHE_ALIGNED __attribute__((aligned(CACHELINE_SIZE)))
#define PAGE_ALIGNED __attribute__((aligned(PAGE_SIZE)))
// TODO: Better document the use of these macros
/* For this family of macros, do NOT pass zero as the pow2 */
#define round_to_pow2(x, pow2) (((unsigned long)(x)) & (~((pow2)-1)))
#define round_up_to_pow2(x, pow2) (round_to_pow2(((unsigned long)x) + (pow2)-1, (pow2)))
#define round_to_page(x) round_to_pow2(x, PAGE_SIZE)
#define round_up_to_page(x) round_up_to_pow2(x, PAGE_SIZE)
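The behavior of these macros can be checked with a few concrete values (PAGE_SIZE = 4096). Note why zero must not be passed as the pow2: `~((0)-1)` is 0, so the mask wipes out every bit.

```c
#ifndef PAGE_SIZE
#define PAGE_SIZE (1 << 12)
#endif

/* The same rounding macros as above; pow2 must be a nonzero power of two */
#define round_to_pow2(x, pow2)    (((unsigned long)(x)) & (~((pow2)-1)))
#define round_up_to_pow2(x, pow2) (round_to_pow2(((unsigned long)x) + (pow2)-1, (pow2)))
#define round_to_page(x)          round_to_pow2(x, PAGE_SIZE)
#define round_up_to_page(x)       round_up_to_pow2(x, PAGE_SIZE)

/* round_to_page(5000)    == 4096  (rounds down to the page boundary)
 * round_up_to_page(5000) == 8192  (rounds up to the next boundary)
 * round_up_to_page(4096) == 4096  (already aligned, unchanged) */
```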
Relevant PR Comment:
runtime/include/arch/aarch64/context.h
lines 43-46
/*
* if ctx->regs[0] is set, this was last in a user-level context switch state!
* else restore mcontext..
*/
if (ctx->regs[0]) {
phanikishoreg
You've not incorporated your changes to aarch64. I think you should at least create an issue for this and look into it when feasible.
Related to #34
WASI provides a handful of sock_*
functions:
sock_accept
sock_recv
sock_send
sock_shutdown
This suggests that a sandbox is currently limited to polling and accepting client connections from a preopened socket bound to a particular port by the host embedding environment prior to sandbox initialization.
This particular subset of functionality does not really seem well aligned with a serverless runtime. We associate sockets with modules, which upon a client request:
- redirect WASI calls to read on STDIN as reads from the buffer containing the HTTP body,
- redirect WASI calls to write on STDOUT as writes to an outbound buffer,
- and call shutdown on the socket.
Given the short lifetime of a WebAssembly sandbox, the accept function is likely not useful for us.
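A sketch of the redirection just described. All names here are hypothetical, and the real runtime routes these through its own buffer machinery:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STDIN_FILENO  0
#define STDOUT_FILENO 1

static const char http_body[]      = "n=10";           /* inbound HTTP body */
static size_t     http_body_cursor = 0;
static char       response_buffer[1024];               /* outbound HTTP body */
static size_t     response_length  = 0;

/* WASI-style read: fd 0 is served from the HTTP request body */
static size_t
sandbox_read(int fd, char *buf, size_t len)
{
	if (fd != STDIN_FILENO) return 0;
	size_t remaining = sizeof(http_body) - 1 - http_body_cursor;
	size_t to_copy   = len < remaining ? len : remaining;
	memcpy(buf, http_body + http_body_cursor, to_copy);
	http_body_cursor += to_copy;
	return to_copy;
}

/* WASI-style write: fd 1 is captured into the outbound buffer */
static size_t
sandbox_write(int fd, const char *buf, size_t len)
{
	if (fd != STDOUT_FILENO) return 0;
	if (response_length + len > sizeof(response_buffer))
		len = sizeof(response_buffer) - response_length;
	memcpy(response_buffer + response_length, buf, len);
	response_length += len;
	return len;
}
```

Under this model the sandbox never sees a socket at all; the host shuts the connection down after flushing the outbound buffer.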
instruction_memory_grow should trigger a WebAssembly trap if linear memory is exhausted.
From WebAssembly Spec:
"The memory.grow instruction grows memory by a given delta and returns the previous size, or -1 if enough memory cannot be allocated. Both instructions operate in units of page size."
Might just be OK and ERR (-1), unless there is something actionable for OOM, in which case this would be OOM, ERR, and OK.
A debug log should be optionally generated based on a configuration. When not set, printfs and logs should not be used because of their impact on performance.
This likely should use the existing debuglog infrastructure.
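A sketch of the compile-time gating being described. The DEBUG flag and the log destination here are illustrative, not the runtime's actual configuration:

```c
#include <assert.h>
#include <stdio.h>

/* Compiled out entirely unless -DDEBUG is passed, so release builds
 * pay no formatting or I/O cost at the call sites */
#ifdef DEBUG
#define debuglog(fmt, ...) fprintf(stderr, "%s:%d " fmt, __FILE__, __LINE__, ##__VA_ARGS__)
#else
#define debuglog(fmt, ...) ((void)0)
#endif
```

Because the disabled form is `((void)0)`, every call site still type-checks as a statement but generates no code.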
NOSTDIO logic is untested
The first function seems to do two distinct things:
worker_thread_execute_runtime_maintenance_and_get_next_sandbox
The second function has a potential name conflict with what we'd refactor out of the first function
worker_thread_get_next_sandbox
This file should be wrapped in a top-level array to be valid JSON. I'm not sure whether jsmn gracefully handles this or not.
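For example, a hypothetical modules file wrapped as a top-level array (the keys shown are illustrative):

```json
[
  { "name": "fibonacci", "port": 10000 },
  { "name": "resize",    "port": 10001 }
]
```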
Once code is closer to final, revisit semantics:
local - Phani suggested perhaps per_core might be better
Relevant Comments
Phani:
I feel like "local" is not conveying what we want. It should either be "per_core" or "per_thread" or "worker_runqueue" sort...
runtime/include/types.h
struct sandbox_context_cache {
struct indirect_table_entry *module_indirect_table;
void * linear_memory_start;
u32 linear_memory_size;
@phanikishoreg
phanikishoreg 16 hours ago
Why not inline this in the struct sandbox, so all you have to do for context switch is,
worker_context_cache = sandbox->context_cache;
The struct assignment simplicity advantage I was talking about. I think it is cleaner, because we have 2 out of 3 of these variables in the sandbox struct anyway.
runtime/src/current_sandbox.c
};
worker_thread_current_sandbox = NULL;
} else {
local_sandbox_member_cache = (struct sandbox_member_cache){
@phanikishoreg
phanikishoreg 16 hours ago
This is where I'm talking about really,
local_sandbox_member_cache = sandbox->member_cache
/tmp/lto-llvm-f73ecb.o: In function `wasmf_SolveCubic':
ld-temp.o:(.text.wasmf_SolveCubic+0x49f): undefined reference to `cos'
ld-temp.o:(.text.wasmf_SolveCubic+0x4e7): undefined reference to `cos'
ld-temp.o:(.text.wasmf_SolveCubic+0x51f): undefined reference to `cos'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Makefile:36: recipe for target 'basic_math_sf' failed
make: *** [basic_math_sf] Error 1
Status Quo
uint64_t *p = worker_thread__get_memory_ptr_void(p_off, sizeof(uint64_t));
@phanikishoreg
Just a thought, instead of this function, we should have: xxxx_get_memory_ptr_uint64() which does not take size as an argument, so we don't need to worry about users passing wrong size argument here? Similarly, for other primitive data-types.
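A sketch of Phani's suggestion. Names here are hypothetical, and get_memory_ptr_void stands in for the runtime's bounds-checked accessor:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint8_t linear_memory[65536];

/* Stand-in for the runtime's bounds-checked generic accessor */
static void *
get_memory_ptr_void(uint32_t offset, uint32_t size)
{
	assert((size_t)offset + size <= sizeof(linear_memory));
	return &linear_memory[offset];
}

/* Typed wrappers bake in sizeof, so callers cannot pass a wrong size */
static uint64_t *
get_memory_ptr_uint64(uint32_t offset)
{
	return (uint64_t *)get_memory_ptr_void(offset, sizeof(uint64_t));
}

static uint32_t *
get_memory_ptr_uint32(uint32_t offset)
{
	return (uint32_t *)get_memory_ptr_void(offset, sizeof(uint32_t));
}
```

The call site from the comment then becomes `uint64_t *p = get_memory_ptr_uint64(p_off);`, with the size argument impossible to get wrong.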
runtime/src/sandbox_run_queue_fifo.c: 50
assert(sandbox);
free(sandbox_request);
sandbox->state = RUNNABLE;
sandbox_run_queue_add(sandbox);
@gparmer
I hope that preemptions are disabled for this function (in the calling context). A preemption in between sandbox_runqueue_is_empty and here or the lines after the if could cause strange things to happen.
runtime/src/sandbox_run_queue_ps.c: 103
void
sandbox_run_queue_ps_preempt(ucontext_t *user_context)
{
software_interrupt_disable(); // no nesting!
@gparmer
A little surprised to see this here, as I figured surrounding abstractions disabled interrupts for much of this file's contents.
I don't think that much of the previous code in this file is preemption-safe (if a signal occurred and changed the runqueue between spans of the previous logic, would it work?), so perhaps this should be revisited?
runtime/src/sandbox_run_queue_ps.c: 152
// And load the context of this new sandbox
// RC of 1 indicates that sandbox was last in a user-level context switch state,
// so do not enable software interrupts.
if (arch_mcontext_restore(&user_context->uc_mcontext, &next_sandbox->ctxt) == 1)
@gparmer
As we've discussed, this code needs a careful audit. We cannot do a direct switch in the signal handler. We likely want assertions to check that. We should update the signal stack registers with those of the context we want to switch to. That might be what is happening here.
@phanikishoreg
phanikishoreg 6 days ago
Yes, that is what is happening. We cannot do a direct switch. We just update the input/output argument in signal handler and when that handler returns, that mcontext is restored and the thread continues from that execution.
runtime/src/sandbox_run_queue_ps.c: 155
if (arch_mcontext_restore(&user_context->uc_mcontext, &next_sandbox->ctxt) == 1)
should_enable_software_interrupt = false;
}
if (should_enable_software_interrupt) software_interrupt_enable();
@gparmer
If this can be called in the signal handler, we should absolutely not be enabling interrupts.
I'm likely wrong that this can be called from the signal handler as that would make a lot of this logic strange. Invariants in my head include:
If we're in the signal handler:
When we enable preemptions
Is this something that we can get rid of now?
Signals that occur while softints are disabled are currently ignored, potentially causing the timer ticks that define quantums to be missed. The runtime could be modified to track whether a signal was received while softints were disabled, so that it can be deferred until softints are re-enabled.
Considerations:
phanikishoreg 30 minutes ago
Kk, it's not the configuration feature we're talking about! So it's the software interrupt handling; the signals that occur when softints are disabled are currently ignored. @sean is talking about handling those after we enable them, I see.
phanikishoreg 20 minutes ago
I think this really means tracking whether there was a timer tick while softints were disabled, and calling schedule again right after we enable interrupts. In most cases we have interrupts disabled only when we're actually scheduling. But I think it's useful in case the current interrupt handler took too long and it's actually time to schedule a new sandbox. (edited)
gparmer 19 minutes ago
....and if we disable interrupts while locks are held
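The deferral discussed above can be sketched as follows. The flag names are hypothetical, and a real implementation must make these operations signal-safe (the handler body below stands in for the actual SIGALRM handler):

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t softint_enabled  = 1;
static volatile sig_atomic_t deferred_sigalrm = 0;
static int                   schedule_count   = 0;

static void schedule(void) { schedule_count++; }

/* Body of the SIGALRM handler */
static void
on_timer_tick(void)
{
	if (!softint_enabled) {
		deferred_sigalrm = 1; /* remember the quantum we would otherwise miss */
		return;
	}
	schedule();
}

static void
software_interrupt_disable(void)
{
	softint_enabled = 0;
}

static void
software_interrupt_enable(void)
{
	softint_enabled = 1;
	if (deferred_sigalrm) { /* replay the tick we ignored */
		deferred_sigalrm = 0;
		schedule();
	}
}
```

This covers Gabe's point as well: if interrupts are disabled while locks are held, the deferred tick fires only after the critical section ends.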
Gabe Comment:
I don't have reverse lookup handy right now, so I'll make this comment assuming that this is called in a single location that is in a somewhat "random" context. Perhaps the signal handler, another thread, etc...:
I think that we want to decouple stack and register allocation/initialization from sandbox initialization at large. The rest of the sandbox (mainly linear memory) can be allocated in the context of the sandbox thread before it executes the sandbox. The goal here is that all sandbox-specific costs can be attributed to that sandbox, and scheduled as that sandbox, rather than "stealing" time in another random context that happens to try and allocate it.
A possible place this could be moved is current_sandbox_main
, which has this logic commented out.
runtime/src/sandbox.c
// Initialize the module
struct module *current_module = sandbox_get_module(sandbox);
int argument_count = module_get_argument_count(current_module);
// alloc_linear_memory();
Gabe and Phani seemed to disagree about this.
@gparmer
gparmer 5 days ago
I think that allocating linear memory in this thread is great.
We likely need to allocate the stack elsewhere (as we cannot execute here without the stack), but the linear memory allocation and initialization can be here, I believe.
@phanikishoreg
phanikishoreg 5 days ago
Allocation of linear memory and stack are both outside of this "sandbox_main" function, they're done when the "worker_thread" processes a request.
We should just remove commented lines, my bad.
Phani: I noticed somewhere, you had a sandbox__new(). It would be good to be consistent in the verb we choose for allocation/deallocation everywhere. If I'm wrong, ignore this!
Me: Good point! I think I should probably go back and review initialize, allocate, and new. I seem to recall that the new function was something like allocate + other constructor-style logic (maybe like push to runqueue or something). I'll open this as a follow-up issue.
Phani: I'm not confident you're being consistent with the "verb" or "action" as a suffix. Like softint__initialize(), current_sandbox__set are inconsistent with this convention. Something to revisit for cleanup, perhaps.
Relevant Gabe comment:
In [the .c] file, or in the .h file, you should document the purpose of this API. As it is polymorphic, it is important that clients understand how to use it, and implementors understand how to implement a new variant.
There are different preprocessor configs that I haven't been testing that Phani used in the past. Should these be retained, refactored, or eliminated?
runtime/src/sandbox.c: 70-72
assert(sandbox != NULL);
sandbox->request_response_data_length = 0;
#ifndef USE_HTTP_UVIO
@gparmer
gparmer 6 days ago
Are we ever going to test the other paths? If "no", I wonder if we need the #defines for this. @phanikishoreg Have these outlasted their purpose?
@phanikishoreg
phanikishoreg 6 days ago
For the paper, we changed the HTTP request/response paths to use system calls for recv/send because we had some latency issues with long-running applications. Hence this #define. I'm not sure whether to keep or remove it; it might be worth keeping if we reach a point where we debug performance.
In runtime/src/sandbox.c, we are asserting that the sandbox size aligns to a page. This check should be performed when a module's JSON specification is loaded.
if (linear_memory_size + sandbox_size > memory_size) return NULL;
// Control information should be page-aligned
// TODO: Should I use round_up_to_page when setting sandbox_page?
assert(round_up_to_page(sandbox_size) == sandbox_size);
@gparmer
gparmer 5 days ago
Need to validate that this is an internal invariant. If it can be impacted by the client (or the JSON spec), then this should be an externally visible error. I'd imagine this code is fine as is, but a small check to ensure that sandbox_size can only be wrong by an internal bug is healthy.