divviup / prio-server
A Prio server implementation.
License: Mozilla Public License 2.0
The AWS IAM role assumption policy we define for an ingestion server in Google Cloud looks like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "accounts.google.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "accounts.google.com:sub": "${var.ingestor_google_service_account_id}"
        }
      }
    }
  ]
}
So this federates identity with accounts.google.com and lets service account var.ingestor_google_service_account_id assume the role. @yuriks points out in #51 that we could include an accounts.google.com:oaud condition in the policy. To do that, we'd first have to agree with ingestion server authors on what aud value they would specify when they request auth tokens from Google, making this a protocol issue.
As a workaround for a mismatch between connection pool timeouts in Hyper and AWS S3, #36 constructs HTTP clients with a carefully chosen timeout value. For more robustness, we should implement retries.
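As a sketch of the retry idea (a generic helper, assuming we retry whole transport operations; nothing here is existing facilitator API):

use std::{thread, time::Duration};

// A minimal retry helper with linear backoff; a real implementation would
// distinguish retryable errors (timeouts, 5xx) from permanent ones and use
// exponential backoff with jitter.
fn with_retries<T, E>(mut op: impl FnMut() -> Result<T, E>, max_attempts: u32) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if attempt + 1 >= max_attempts => return Err(e),
            Err(_) => {
                attempt += 1;
                thread::sleep(Duration::from_millis(500 * u64::from(attempt)));
            }
        }
    }
}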
Our GKE cluster is currently configured to use public networks managed by Google/GKE. This means each worker node gets a public IP, and (I think) communication between worker nodes goes over the public internet. While this makes it trivial for our jobs to perform the egress they need (e.g. to AWS or GCP APIs or to fetch peer manifests), this is wasteful (ISRG is committed to environmentally responsible practices, which means reducing, reusing, recycling IPv4 addresses) and could be more secure. We should configure a private network for the GKE cluster and then narrowly control what egress and ingress we permit to worker nodes.
Running a test job on a PR today I saw GitHub throwing some warnings:
https://github.com/abetterinternet/prio-server/actions/runs/296499641
Apparently some GH Actions features are being deprecated to address a vulnerability: https://github.blog/changelog/2020-10-01-github-actions-deprecating-set-env-and-add-path-commands/
I don't see where we directly use this, so it might be the Terraform actions we depend on.
We could use avro_rs' support for compression codecs. However, the bulk of the data in the ingestion batches is encrypted, so we might not gain much from compression, and it's not impossible that we'll actually lose time doing the deflate/inflate.
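If we try it, enabling compression is a one-line change at Writer construction time, since avro_rs selects the codec there; a minimal sketch:

use avro_rs::{Codec, Schema, Writer};

// avro_rs applies block compression via the Codec passed when the Writer
// is constructed; Codec::Null is the current (uncompressed) behavior.
fn make_writer(schema: &Schema, compress: bool) -> Writer<'_, Vec<u8>> {
    let codec = if compress { Codec::Deflate } else { Codec::Null };
    Writer::with_codec(schema, Vec::new(), codec)
}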
It may not be practical to decrypt incoming packets with keys held in a product like Amazon KMS, but we might want to do it for the less frequently used Avro message signing key. This would require teaching the facilitator to use remote reference keys alongside the ring::signature stuff it does now.
We will use GitHub Actions to build and test the Rust code in prio-server/facilitator and libprio-rs. This will at least do build and test, emitting code coverage. If it's easy to do, we will emit x86_64 Linux binaries, but anyone else will have to cargo build for their own platform.
Currently, as implemented in #40, a failure in cargo fmt or clippy will immediately abort the job without running the actual cargo build or tests. This isn't ideal: while fmt or clippy failures should be addressed before a PR is merged, in most cases they wouldn't actually cause compilation errors. Executing the build and test steps regardless would save an edit-push-CI cycle, since any compilation or test errors would surface during the first build attempt rather than only after the fmt/lint problems are fixed.
I see two approaches to fixing this; one is to add if: conditions on every single later build step to instruct them to execute regardless.
Get a security assessment, review its findings and address the scary ones. Individual issues will be filed for remediation of individual findings.
For Narnia and the foreseeable future, the workflow manager won't know how to discover peer manifests or validate batch signatures, instead delegating that work to individual facilitator (data plane) jobs. That will make it harder to restrict network egress for those jobs. We can move this work into the workflow manager and have it hand static parameters to facilitator jobs it runs.
Besides the application-level signature scheme over Avro batches, Amazon S3 PUT requests can sign over the content being uploaded, which gives us an integrity check before the workflow manager evaluates an object. We should configure the bucket policies on buckets to require this so Amazon can do the signature verification for us.
https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html
#51 makes the assumption that all peer data share processors are operated by a single organization (i.e., ISRG operates all facilitators, NIH/NCI operates all PHA servers). In particular, the Terraform modules assume a single global manifest is associated with all peer data share processors. This won't hold forever, so we should refactor the representation of peer data share processors to allow for multiple operators and multiple global manifests. This would mean restructuring the peer_share_processor_names in the top-level .tfvars to either be a list of (pha-name, global manifest) pairs, or perhaps a map structure like
{
  operator-name-1 => {
    global-manifest-url => <url>
    pha-names => [pha-name1, pha-name2, ...]
  }
  operator-name-2 => {
    global-manifest-url => <url>
    pha-names => [...]
  }
}
...and then create and configure data share processors appropriately.
For Narnia and the immediate future past that, we create a single GCP service account and corresponding Kubernetes service account for each data share processor, and then have both the workflow manager and the individual facilitator jobs it dispatches run as that account. We could create service accounts for each individual workflow step and use them to construct more restrictive policies. This would let us deny the workflow manager the ability to delete certain objects, and deny facilitator jobs access to the Kubernetes API.
In order to de-bias the final data, we need to include the total number of shares that are included in the sum part. We will have to extend the PrioSumPart message to include a field for this and have the reduce step(s) record the value.
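Sketched as a Rust struct (the field name total_individual_clients is an assumption, not a settled schema change):

// Sketch only: the one new field needed for de-biasing, alongside the
// existing encoded sum. Field names are assumptions, not settled schema.
struct PrioSumPart {
    value_sum: Vec<u8>,            // existing encoded sum
    total_individual_clients: i64, // new: number of shares included in the sum
}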
Once we have a Terraformized application, we still need tooling around it that can do things like issue keys, construct keyfiles, post them to S3 buckets, get certificates, etc. This tool would fit into a PHA onboarding workflow in which we obtain a minimal set of parameters from new PHAs (e.g., S3 bucket URLs, keyfile location) and let the tool do the rest.
Right now we have some statically deployed manifest files - we should deploy these using GCP instead.
From #4
- I think we should probably split each subcommand into a separate binary, or at least a separate file. That huge main.rs command parser looks very unwieldy.
- I think we probably don't want to do all configuration through flags, because there's a lot of knobs already and it seems we'll only have more, but we'll need to figure out a config file schema or what to do instead first.
- Maybe consider using clap v3 (no stable release yet) or https://github.com/TeXitoi/structopt (which clap v3 is based on) to do the flag parsing. The canonical API being a struct also makes it a bit easier to load those values in from other sources too.
Annotated structs would definitely cut down on the volume of argument handling code. We also have to figure out how parameters will be provided when this is run by the execution manager, which could be done with command line arguments passed to the Docker container, environment variables, or a config file placed into the container.
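To illustrate the annotated-struct style, a minimal structopt sketch (the subcommand, flag, and environment variable names here are hypothetical):

use structopt::StructOpt;

// Hypothetical subcommand, shown only to illustrate the annotated-struct
// style; env = "..." shows how the same field could be fed from the
// environment when run by the execution manager.
#[derive(Debug, StructOpt)]
enum Facilitator {
    /// Generate a sample ingestion batch.
    GenerateIngestionSample {
        #[structopt(long, env = "AGGREGATION_ID")]
        aggregation_id: String,
        #[structopt(long, default_value = "100")]
        packet_count: u32,
    },
}

fn main() {
    let command = Facilitator::from_args();
    println!("{:?}", command);
}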
We need to move our entire deployment, including all the buckets we use as mailboxes, to GCP for budget reasons. This will entail some protocol changes because we will have to figure out how Apple will authenticate, and what parameters we need to discover from them to be able to configure the ingestion buckets (probably a GCP service account).
Data share processors must provide their ECIES public keys to the mobile device OS owners, and then the ingestors, facilitators and PHAs must exchange public keys. We need to work out an automation-friendly means of doing these key exchanges each time a new facilitator-PHA pair is brought online.
In the original protobuf schema defined in the IDL document, PrioSumPart contained a single bytes value_sum field and a repeated batch_uuid list. The batch_uuid list represents the UUIDs of all of the batches (tracing back to the ingestor) that participated in this aggregation.
At some point in the conversion to Avro, this got flipped, and the current schema now has only a single batch_uuid but an array of sums.
Not only does this not let us properly represent the batch_uuids when we do the final multi-batch reduction sum, the sums array will also always contain only one value (because all aggregations we'll do produce a single output sum per file, both the per-batch sum and the overall sum for a given time range). This seems like an oversight and should be fixed to match the intent of the original schema.
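Sketched as Rust types to make the flip concrete (illustrative only, not the actual schema definitions):

// Intent of the original IDL: many contributing batches, one sum.
struct SumPartIntended {
    batch_uuids: Vec<String>, // every ingestion batch in this aggregation
    value_sum: Vec<u8>,       // the single output sum
}

// What the current Avro schema expresses: flipped.
struct SumPartCurrent {
    batch_uuid: String, // cannot represent a multi-batch reduction
    sums: Vec<Vec<u8>>, // in practice always exactly one element
}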
In order to read and write messages from S3, the facilitator needs to implement an S3Transport alongside its existing FileTransport. Rusoto appears to be the crate of choice for working with AWS APIs.
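One way to slot S3 in beside local files is a shared trait that both transports implement; a sketch with illustrative names (this is not the crate's existing API):

use std::io::{Read, Write};

// Illustrative transport abstraction: FileTransport and an S3Transport
// built on Rusoto would both implement it, keyed by a /-separated path.
trait Transport {
    fn get(&self, key: &str) -> Result<Box<dyn Read>, std::io::Error>;
    fn put(&self, key: &str) -> Result<Box<dyn Write>, std::io::Error>;
}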
We can use this GitHub action to get code coverage reports from test runs in CI.
For simplicity we chose to serve manifest files over HTTPS instead of from live config endpoints. The drawback there is that stale manifests could be cached by, I dunno, proxies or CDNs or ISPs. We should investigate ways to mitigate this.
This eventually will be a Kubernetes service, so the facilitator needs to be dockerized. We will provide a Dockerfile, build docker images in CI, and store them in Dockerhub (https://hub.docker.com/u/letsencrypt).
Persistent data (ingestion batches, validation batches and sum parts) should be purged after some delay has passed to keep a lid on storage costs. Yuri suggested implementing this in the execution manager, since it already is a cronjob that periodically scans the various buckets to dispatch work. We should also carefully devise a retention policy.
Per https://docs.google.com/document/d/1eKIXOVK6W8AsSnoisw26R1rpKtPBf0rnjn9kSSEC3PE/edit?disco=AAAAHJj4NAg, Apple needs the whole cert chain given to them, not just the leaf. The ACME protocol exposes an endpoint for this, so this should be possible, but we have to figure out how to get it from CertMagic.
Complete https://docs.google.com/document/d/1MdfM3QT63ISU70l63bwzTrxr93Z7Tv7EDjLfammzo6Q/edit# and have it approved by stakeholders
We concluded in the design doc that the ingestion servers will use a single batch signing key for all messages, regardless of which facilitator-PHA pair is the recipient. I need to update the design doc and make any corresponding code changes.
The IDL document describes an "invalid UUID" file alongside the sum part emitted by the facilitator. We need to resolve open questions about how to handle these packets at different pipeline stages and how to represent these packets in the intermediate and final product. Final decisions to be recorded in the design document.
Write the tool that uses libprio_rs to construct, validate and aggregate Prio data batches. It should be possible to exercise the end to end pipeline from the command line, with realistic Avro encoded data being emitted at each step.
@yuriks points out that the crate we call facilitator could just as easily be the basis of a PHA server. We should rename it to something like "prio-data-share-processor".
Eventually, both Apple and Google servers will be populating the encryption_key_id field in the PrioDataSharePacket messages they send. For data share processors to handle them generically, we need to agree on what value goes in there, and make sure it's a value that lets the data share processor look up the appropriate packet decryption key. We would have to make sure that whatever value we use is available to both Apple and Google ingestion servers and mobile devices, as necessary.
One proposal is the serial number of the X.509 certificate used to transmit the packet encryption key to Apple.
Enable CI and testing/code cov for the deploy tool
Once we have all participants in the system advertising public keys and other params from keyfiles or manifests, we can automatically rotate keys used to sign messages written into S3 buckets.
@winstrom informs me that Apple's ingestion server will not be populating the encryption_key_id field in PrioDataSharePacket messages at first. This means:
We store the JSON Avro schema files alongside the Rust implementation here in prio-server. @yuriks points out that this means we have to rev the implementation in lockstep with the schema. This would be easier if the schema was pulled in as a versioned dependency, and it would also be better for other projects wanting to consume the schemas.
For example, the permalinks don't work and there are no annotations in the diff: https://github.com/abetterinternet/prio-server/pull/53/checks?check_run_id=1282426037
We should investigate Kubernetes or GCP level resource limits, to mitigate the risk of runaway jobs causing problems.
Reviewers of #2 pointed out we should include an explicit key identifier in headers so that message recipients can gracefully handle key rotations instead of having to try all the keys listed in a peer's manifest.
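Concretely, the header would name the key that produced its signature; a sketch with hypothetical field names:

// Illustrative header with an explicit key identifier, so recipients can
// look up the matching public key in the sender's manifest directly.
struct SignedBatchHeader {
    batch_uuid: String,
    key_identifier: String, // names the batch signing key that was used
    signature: Vec<u8>,     // signature over the rest of the header
}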
I think using Path here is actually not the right call: Path inherently deals with OS-native paths. While that's the natural key for FileTransport, it doesn't really apply for something like S3. If I run this on Windows, for example, I actually still want to keep using / as the separator when I upload to S3, not \. So I think the right way of doing this is to have the key be a generic path value (either just use str with / as the separator, or create your own newtype over it) and then inside FileTransport you can parse it and re-convert to a Path when accessing the filesystem.
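A minimal sketch of the suggested newtype (the name TransportKey is made up here):

use std::path::{Path, PathBuf};

// Transport keys are always /-separated, regardless of host OS; only
// FileTransport converts them into OS-native paths.
struct TransportKey(String);

impl TransportKey {
    fn to_os_path(&self, base: &Path) -> PathBuf {
        let mut path = base.to_path_buf();
        for segment in self.0.split('/') {
            path.push(segment);
        }
        path
    }
}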
Figure out what kind of metrics we want to emit from the server, what conditions to alert on, and where alerts should go. For instance, the facilitator could encounter failures because of bad data emitted by an ingestion server, so perhaps we should figure out how to route such alerts to the other organisation.
While the ingestion share packets are encrypted with the ECIES keys anyway, we should turn on server-side encryption of all bucket contents, ideally with a KMS key, to also protect metadata, validation shares and sum parts. This should be easy to do in Terraform.
https://docs.aws.amazon.com/AmazonS3/latest/dev/bucket-encryption.html
Right now we're using debug builds because they're faster to build. For prod deployments we'll want to do release builds. Docker has passthrough environment variables that could be good for this. Follow-up from #97.
Eventually share processors have to evaluate a sequence of (own validation, peer validation, data share) triples, and must ensure that all three packets have the same UUIDs. Can the share processor assume that packets will appear in the same order in all three sequences of packets? If not, then I have to sort each packet sequence lexicographically by UUID in O(n log n) before I even begin processing triples. If instead we require that both share processors maintain the order of packets emitted by the ingestor, we can skip that step. Further, if we replace the packet UUID with a sequence number, then share processors can easily defend themselves against malformed peer validation files by verifying that sequence numbers increase monotonically as they process validation packets.
The pair (batch_uuid, packet sequence number) remains unique globally.
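With sequence numbers, the defensive check becomes a single linear pass; a sketch (packet parsing elided):

// Returns true if packet sequence numbers strictly increase, which is all
// a share processor needs to verify in order to reject reordered or
// malformed peer validation files without an O(n log n) sort.
fn sequence_numbers_monotonic(seq: impl IntoIterator<Item = u64>) -> bool {
    let mut prev: Option<u64> = None;
    for n in seq {
        if prev.map_or(false, |p| n <= p) {
            return false;
        }
        prev = Some(n);
    }
    true
}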
The design doc describes an execution manager responsible for coordinating the map/reduce steps as Kubernetes jobs. We should write that.
The IDL document contains a semi-formal spec of the ingestion, validation and sum part batches, but it is no longer authoritative, especially since we have changed some things about the signature format in the Avro schema. We should formally document the batch file layouts, including the signatures, in the server design doc.
The implementation of aggregation in #4 performs peer validation share verification and per-batch summing in the final reduce/aggregation step, but those steps could be implemented as a map step run in parallel across the batches. We should revisit the implementation in lib/aggregation.rs and break the aggregate_share method into a separate step.
Once we have settled on a public cloud to use, write a Terraform module that can spin up a facilitator instance.
@yuriks made a great point about handling fewer credentials, and since our environment is mainly built around GCP it makes sense to reduce the dependence on CF as the only DNS provider.