mkantor / operator

A web server for static and dynamic content.

Home Page: https://operator.mattkantor.com

License: GNU General Public License v3.0

Rust 100.00%

operator's People

Contributors

mkantor, steffahn

operator's Issues

`OPTIONS *` requests cause panics

To reproduce:

cargo run -- serve -vvv --content-directory=samples/realistic-basic --bind-to=127.0.0.1:8080
curl -X OPTIONS --request-target '*' http://localhost:8080

For more info on this somewhat-special type of request, see this MDN page.
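As a rough illustration of what handling this case explicitly might involve, here is a minimal sketch that classifies a raw request target before any routing happens. The names `RequestTarget` and `classify_target` are hypothetical and not part of Operator or actix; the sketch only assumes the asterisk-form rule from RFC 9112, namely that `*` is valid only for server-wide `OPTIONS` requests.

```rust
/// The request-target forms from RFC 9112, section 3.2 (simplified).
#[derive(Debug, PartialEq)]
enum RequestTarget {
    /// `OPTIONS *`: addresses the server as a whole rather than a resource.
    Asterisk,
    /// `/path?query`: the common case that maps onto a route.
    Origin(String),
    /// Absolute-form or authority-form; passed through untouched in this sketch.
    Other(String),
}

/// Classifies a raw request target instead of assuming it is always a path.
fn classify_target(method: &str, raw: &str) -> Result<RequestTarget, String> {
    match raw {
        "*" if method == "OPTIONS" => Ok(RequestTarget::Asterisk),
        "*" => Err(format!("asterisk-form is only valid for OPTIONS, got {method}")),
        "" => Err("empty request target".to_string()),
        _ if raw.starts_with('/') => Ok(RequestTarget::Origin(raw.to_string())),
        _ => Ok(RequestTarget::Other(raw.to_string())),
    }
}
```

With something along these lines, the `Asterisk` case could be answered directly (or rejected with an error response) rather than reaching code that expects a path.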

No release automation

I published v0.0.1 myself by running cargo publish from my laptop. It'd be better to trigger releases from a controlled environment where it can be guaranteed that CI checks have passed, etc.

This could be implemented as a GitHub Action that is triggered when a new version tag is pushed (or something along those lines). Here are a few examples of that sort of thing in the wild:

Template rendering should be async

Right now handlebars templates are rendered on the thread that's handling the request. I think this could result in denial-of-service if all threads in the actix worker pool are busy rendering templates when a new request comes in. I haven't personally observed this, but it can be confirmed with some load testing (maybe using templates that get content from a slow executable or large static file).

The handlebars library is inherently blocking, but Operator's availability could still be improved by moving rendering to another thread (and preferably deferring it until the Stream is polled).

Unfortunately there are tradeoffs regarding how render errors are surfaced. Currently clients get a nice HTTP 500 if a template explodes, but with this change they'd become stream errors (which are not as obvious in browsers). Executables already work like that (an executable which exits with failure is still 200 OK), so maybe that's okay.
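To make the idea concrete, here is a minimal sketch of moving a blocking render onto another thread using only the standard library. `render_blocking` is a hypothetical stand-in for the real handlebars call, and `render_in_background` is an invented name. Note that this version starts rendering eagerly; deferring the work until the stream is polled would need more machinery than shown here.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical stand-in for a blocking handlebars render call.
fn render_blocking(template: &str) -> Result<String, String> {
    if template.contains("{{explode}}") {
        Err("render failed".to_string())
    } else {
        Ok(template.replace("{{name}}", "world"))
    }
}

/// Starts rendering on a separate thread and returns immediately, so the
/// thread handling the request is not tied up while the template renders.
fn render_in_background(template: String) -> Receiver<Result<String, String>> {
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // A send error only means the receiver was dropped (for example,
        // because the client went away), so it is safe to ignore.
        let _ = sender.send(render_blocking(&template));
    });
    receiver
}
```

The render error arrives through the channel after the caller has already moved on, which mirrors the tradeoff described above: by the time a failure is known, the response may have started, so it can only surface as a stream error.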

Operator needs benchmarks & load tests

There's currently no programmatic way to measure how scalable/performant Operator is and therefore it's hard to notice regressions. Benchmarks and load testing would address this.

Ideally these would run in CI and pull requests that cause more than X% performance regression would get flagged. I'm concerned that the GitHub Actions runners are so heavily virtualized that benchmark numbers will be all over the place, but clap, actix, etc run benchmarks there so presumably it works for them.

Cargo provides some scaffolding for this, and criterion looks handy.
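As a sketch of the "more than X% regression" check, the following uses only the standard library rather than criterion. All function names are hypothetical, and a real setup would get its baseline numbers from stored benchmark results rather than measuring them inline.

```rust
use std::time::{Duration, Instant};

/// Runs `work` repeatedly and returns the median wall-clock time, which is
/// less sensitive to outliers on noisy CI runners than the mean.
fn median_duration<F: FnMut()>(samples: usize, mut work: F) -> Duration {
    assert!(samples > 0, "at least one sample is required");
    let mut timings: Vec<Duration> = (0..samples)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .collect();
    timings.sort();
    timings[timings.len() / 2]
}

/// Percentage change from `baseline` to `current`; positive means slower.
fn regression_percent(baseline: Duration, current: Duration) -> f64 {
    (current.as_secs_f64() - baseline.as_secs_f64()) / baseline.as_secs_f64() * 100.0
}

/// True when `current` is more than `max_percent` slower than `baseline`.
fn exceeds_threshold(baseline: Duration, current: Duration, max_percent: f64) -> bool {
    regression_percent(baseline, current) > max_percent
}
```

A CI job could fail (or just leave a comment) whenever `exceeds_threshold` returns true for any tracked benchmark.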

Websites should reflect all content changes without a server restart

Currently a running server will pick up some types of content changes but not others. For example, if you make an in-place edit to a static file or executable the update will be immediately visible on your website, but if you add a new file or edit a handlebars template you have to restart the server to see the changes. Ideally Operator would notice all types of changes and automatically refresh its internal state (updating indexes, re-compiling templates, etc) on the fly.

Restarting the server isn't a big deal for my use cases, so this issue is lower priority for me. It'd be convenient while developing sites, though (it's basically hot reloading).

This needs some design work. Some things to consider:

  • What if the changes made the content directory invalid? Should the server self-destruct or just warn and keep its old state?
  • How to keep the heavy process of crawling the filesystem and creating the registries/index from bogging down the server? Can this be given a lower priority than request handling?
  • How to update the state in a way that doesn't impact in-flight requests?
  • Should probably debounce refreshes to avoid churn when users do things like copy a bunch of files into the content directory at once.
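One possible answer to the first and third questions above is to keep the loaded content behind a swappable reference-counted snapshot. This is only a sketch: `ContentSnapshot` and `SharedContent` are hypothetical types, not Operator's actual internals, and "warn and keep the old state" is just one of the two options raised above.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for Operator's loaded content (registries, index, etc).
#[derive(Debug)]
struct ContentSnapshot {
    routes: Vec<String>,
}

/// Holds the current snapshot and allows it to be replaced atomically.
struct SharedContent {
    current: RwLock<Arc<ContentSnapshot>>,
}

impl SharedContent {
    fn new(initial: ContentSnapshot) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Each request takes a cheap clone of the `Arc` and keeps using that
    /// snapshot for its whole lifetime, even if a refresh lands meanwhile.
    fn snapshot(&self) -> Arc<ContentSnapshot> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Swaps in a freshly loaded snapshot. If loading failed (for example,
    /// because the content directory became invalid), the old state is kept
    /// and the error is returned so the caller can log a warning.
    fn refresh(&self, reloaded: Result<ContentSnapshot, String>) -> Result<(), String> {
        let snapshot = reloaded?;
        *self.current.write().unwrap() = Arc::new(snapshot);
        Ok(())
    }
}
```

The write lock is only held for the pointer swap, so the expensive filesystem crawl can happen beforehand on a separate thread without blocking request handling.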

If the watching part is problematic or flaky for some reason, an explicit poke could be used to trigger refresh instead (or maybe "as well"). Some other programs (ab)use SIGHUP for this.
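Whether refreshes are triggered by a filesystem watcher or by an explicit poke, the debouncing mentioned above could look roughly like this. `Debouncer` is a hypothetical type; timestamps are passed in explicitly so the logic stays independent of any particular watcher or signal-handling library.

```rust
use std::time::{Duration, Instant};

/// Collapses bursts of change notifications into a single refresh.
struct Debouncer {
    quiet_period: Duration,
    last_event: Option<Instant>,
}

impl Debouncer {
    fn new(quiet_period: Duration) -> Self {
        Self { quiet_period, last_event: None }
    }

    /// Records a change notification (from a watcher or an explicit poke).
    fn record_event(&mut self, now: Instant) {
        self.last_event = Some(now);
    }

    /// Returns true exactly once after the quiet period has passed since the
    /// most recent event, then resets until the next event arrives.
    fn should_refresh(&mut self, now: Instant) -> bool {
        match self.last_event {
            Some(last) if now.saturating_duration_since(last) >= self.quiet_period => {
                self.last_event = None;
                true
            }
            _ => false,
        }
    }
}
```

Copying a batch of files into the content directory would keep pushing `last_event` forward, so the refresh only fires once the copying has settled.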

Security policy is not documented

This project could use a security audit and some documentation about "this is what it means for Operator to be secure" and "this is how to keep your site from getting hacked".

The current stance is something like "Operator is only as secure as the stuff in your content directory, so be careful", but it would be nice to come up with a set of best practices to go along with that. Some ideas:

  • Always create a dedicated user with limited permissions to run the server and/or run it inside a locked-down container/chroot jail.
  • Make sure that nobody except you has write permissions for the content directory and everything inside it (and, again, don't run Operator as yourself in production).
  • Any executables you put in the content directory must be trustworthy, and you should think carefully about what effects they can have. An executable that can run rm -rf / or fork bomb your system is not good!
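The write-permission advice above can be checked mechanically. Here is a minimal Unix-only sketch that walks a directory and reports anything writable by the group or by other users. The function name is hypothetical, and flagging group-writable entries is an assumption about what "nobody except you" should mean in practice.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};

/// Permission bits that grant write access to the group or to everyone else.
const GROUP_OR_OTHER_WRITE: u32 = 0o022;

/// Returns every file or directory under `root` (including `root` itself)
/// that is writable by someone other than its owner.
fn writable_by_others(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut offenders = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(path) = pending.pop() {
        // symlink_metadata does not follow symlinks, so the walk cannot
        // wander outside the directory being audited.
        let metadata = fs::symlink_metadata(&path)?;
        if metadata.file_type().is_symlink() {
            // Symlink permission bits are not meaningful; skip them.
            continue;
        }
        if metadata.permissions().mode() & GROUP_OR_OTHER_WRITE != 0 {
            offenders.push(path.clone());
        }
        if metadata.is_dir() {
            for entry in fs::read_dir(&path)? {
                pending.push(entry?.path());
            }
        }
    }
    Ok(offenders)
}
```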

Executables are scary

Of particular concern are executables, since they could be abused to cause lasting damage, data exfiltration, or resource exhaustion.

Capability-wise, executables are intentionally very limited right now in order to sidestep some of the scarier concerns (e.g. they have no way to observe any request data, which makes request-based code injection impossible). However, these limitations mean that some clearly-desirable functionality is impossible to implement (collecting form data, user agent sniffing, inspecting cookies, using query strings, etc). It would be nice to give executables more capabilities, but this is blocked on a coherent security story.

Operator could do some sandboxing when running executables to make them safer. Unsetting environment variables might be a good start, but we could go all out with something like Linux namespaces and/or seccomp (although cross-platform-ness is an important consideration too). Operator could also perform a mini audit of the content directory during startup (e.g. logging a warning if any of its contents are writable). It might also be totally acceptable to do none of this and put the burden on users to make sure their executables are trusted; that just needs to be documented loudly.
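As an illustration of the simplest sandboxing step mentioned here (unsetting environment variables), the standard library's `Command::env_clear` is enough. `run_with_clean_environment` is a hypothetical helper, not how Operator currently spawns executables, and it does nothing about namespaces, seccomp, or resource limits.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, Output, Stdio};

/// Runs an executable with an empty environment and no stdin, so it cannot
/// read secrets (tokens, credentials, etc) from the server's own environment.
fn run_with_clean_environment(executable: &Path, args: &[&str]) -> io::Result<Output> {
    Command::new(executable)
        .args(args)
        .env_clear() // the child inherits none of the parent's variables
        .stdin(Stdio::null())
        .output()
}
```

Some executables rely on variables like `PATH`, so a real implementation would probably need an explicit allowlist instead of clearing everything.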

There's no way to use request data

Dynamic content (executables and templates) should be able to access request data (headers, the request payload, the query string, etc).

See #7 for some notes on security concerns.

Tradeoffs between handlebars partials and `get` helper stink

Currently the get helper and handlebars partial syntax ({{> blah}}) serve similar purposes (including content from other files), but they don't fully overlap:

  • Both can embed content from other templates.
  • The get helper can also embed content from executables or static files, but partials can't.
  • Handlebars partials can pass parameters and/or a block to included templates, but get can't.
  • get is passed a route, but partial syntax uses filesystem paths ({{get "/foo/bar"}} vs {{> foo/bar.html.hbs}}).
  • Handlebars partials have the concept of an "inline partial", and there's no analogue for get.

This is confusing and annoying. It would be great to eliminate these tradeoffs so that you don't need to think about when to use each one, either by combining them or making one a superset of the other.
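To illustrate the route-versus-path mismatch, here is a sketch of a mapping that would let partial syntax accept the same names as get. It assumes, based only on the single example above, that a route is the relative path with a leading slash and with everything from the first dot of the file name removed. `route_for_partial_path` is a hypothetical helper, and Operator's real routing rules may differ.

```rust
/// Converts a partial-style filesystem path such as "foo/bar.html.hbs" into
/// a get-style route such as "/foo/bar". Dotfiles and other edge cases are
/// not handled in this sketch.
fn route_for_partial_path(partial_path: &str) -> String {
    let (directory, file_name) = match partial_path.rsplit_once('/') {
        Some((directory, file_name)) => (Some(directory), file_name),
        None => (None, partial_path),
    };
    // Drop everything from the first dot onward ("bar.html.hbs" becomes "bar").
    let stem = file_name.split('.').next().unwrap_or(file_name);
    match directory {
        Some(directory) => format!("/{directory}/{stem}"),
        None => format!("/{stem}"),
    }
}
```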

Only Unix-like operating systems are supported

The way that the content directory is loaded expects Unixy filesystem semantics (e.g. executableness indicated by a permission bit, / as the path separator, etc). There may be other platform-specific stuff in the codebase too (so far I've only developed/run Operator on macOS and Ubuntu).

I don't think it'll be difficult to make things more portable, but CI should be set up to run checks on multiple platforms so that once things are working they stay working.
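For the executableness check specifically, conditional compilation can isolate the platform-specific part. This is a sketch only: `is_executable` is a hypothetical function, and the extension-based fallback for non-Unix platforms is an assumption rather than a decided design.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// On Unix, a regular file is executable if any execute permission bit is set.
#[cfg(unix)]
fn is_executable(path: &Path) -> io::Result<bool> {
    use std::os::unix::fs::PermissionsExt;
    let metadata = fs::metadata(path)?;
    Ok(metadata.is_file() && metadata.permissions().mode() & 0o111 != 0)
}

/// Elsewhere there is no execute bit, so this falls back to checking the file
/// extension. Assumption: the list of extensions here is purely illustrative.
#[cfg(not(unix))]
fn is_executable(path: &Path) -> io::Result<bool> {
    let metadata = fs::metadata(path)?;
    let has_known_extension = path
        .extension()
        .and_then(|extension| extension.to_str())
        .map(|extension| matches!(extension.to_ascii_lowercase().as_str(), "exe" | "bat" | "cmd"))
        .unwrap_or(false);
    Ok(metadata.is_file() && has_known_extension)
}
```

Path separators are a smaller concern as long as paths are built with `Path::join` and inspected with `Path::components` instead of being split on `/` by hand.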
