Describe the feature you'd like to request
Turborepo offers remote caching, which allows teams to share cached build artifacts.
Currently, two kinds of remote cache are available:
- Remote cache hosted by Vercel
- Custom remote cache implemented by the end user, which mimics the Vercel API
Some users might want to store their remote caches in other object-store-like services, such as:
- AWS S3 (or any S3-compatible store, like MinIO)
- Google Cloud Storage
- Azure Blob Storage
- OpenStack Swift
- Anything else that can store files (HDFS, NFS, etc.)
This can be for several reasons. Here are three I can think of:
- I am not using Vercel with my team, and I don't plan to in the foreseeable future; I don't want to depend on yet another cloud provider
- I do not want to share my build artifacts on a product whose security I can't control (i.e. I want to manage my own S3 security)
- I need to store my artifacts on-prem (i.e. in my own datacenter)
Describe the solution you'd like
Turborepo would offer a configuration option to choose between supported remote cache stores. Users would configure access parameters and credentials in Turborepo's configuration file in their monorepo.
Turborepo would then read and write its cache from the configured external cache host (an S3 bucket, for example).
Implicit login would need to be supported too: in the case of S3, Turborepo should be able to authenticate its calls with the same ambient credentials the aws CLI uses, without needing an explicit API token / secret key. This is usually supported by the cloud provider's SDK.
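For illustration, the configuration could look something like the sketch below. Every key here is hypothetical; nothing like it exists in Turborepo's configuration today:

```json
{
  "remoteCache": {
    "provider": "s3",
    "bucket": "my-team-turbo-cache",
    "region": "eu-west-1",
    "credentials": "ambient"
  }
}
```

Here a value like `"credentials": "ambient"` would mean "use the provider SDK's default credential chain" (environment variables, aws CLI profile, instance role) instead of an explicit token.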
Describe alternatives you've considered
The problem with the solution above is that Turborepo would need to implement code for each supported caching provider. This could lead to a lot of code, and possibly fall outside the scope of the project.
Here are two other solutions I can think of:
A separate proxy for each wanted cache host
Since Turborepo already supports custom HTTP endpoints (that mimic the Vercel API), we could develop a proxy that acts as the Vercel API and simply forwards operations to the end cache provider (S3, Google Cloud Storage, etc.)
This has the advantage of not bloating the Turborepo main codebase and binary, and it spares users with exotic requirements from pushing a PR to Turborepo for a very specific use case (or forking Turborepo as a last resort).
The downside is that the proxy has to be hosted somewhere, and each team will need to run its own instance of it.
A plugin-based approach
Another way of doing this would be to delegate the communication with the remote cache to a second binary that lives on the same machine as Turborepo. Turborepo would be configured to delegate all cache operations to this binary.
This is similar to how Docker handles authentication against alternative registries. Here are examples of documentation: AWS ECR, Google Cloud Container Registry (their standalone credential helper is the closest I could find to what I'm suggesting).
Turborepo would fork a process running the provider's plugin (i.e. an adapter) and forward the instructions to it directly.
Again, this has the advantage of not bloating Turborepo's code, and it allows for custom implementations without PRing or forking Turborepo.
Terraform also has a plugin-based approach, but I'm not sure how it is handled there; the Docker model will probably be easier to implement.
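Under these assumptions, the call from Turborepo to an adapter could be a single fork/exec with a tiny argv + stdio protocol. The helper name (`turbo-cache-s3`) and the `get`/`put` verbs below are invented for illustration, mirroring Docker's credential-helper model:

```go
package main

import (
	"bytes"
	"io"
	"os/exec"
)

// callHelper shells out to a cache-adapter binary (e.g. a hypothetical
// turbo-cache-s3), passing the operation and artifact hash as arguments
// and streaming artifact bytes over stdin/stdout — the same pattern
// Docker uses for its registry credential helpers.
func callHelper(helper, op, hash string, artifact io.Reader) ([]byte, error) {
	cmd := exec.Command(helper, op, hash)
	cmd.Stdin = artifact // only consumed by "put"
	var out bytes.Buffer
	cmd.Stdout = &out
	if err := cmd.Run(); err != nil {
		return nil, err
	}
	return out.Bytes(), nil // "get" writes the artifact to stdout
}
```

A provider plugin would then only have to implement this small protocol on top of its cloud SDK, and Turborepo itself would stay provider-agnostic.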