Configuration management tools install and manage software on a machine that already exists. Terraform is not a configuration management tool; it lets each tool focus on its strengths, with Terraform handling bootstrapping and initializing resources.
We can drastically simplify the problems NixOps has to solve by breaking the work down further. I propose we implement a minimal Terraform plugin to interoperate with NixOps:
- Terraform does what it is good at: initial resource setup. It is not involved in regular deployments.
- NixOps does what it is good at: building NixOS configurations and deploying them to the instances. It is not involved in marshaling resources or interacting with third-party APIs.
This breakdown reduces the coupling to NixOps, allowing NixOS users to benefit from the resource marshaling support. It also dramatically expands the community focused on supporting more providers.
I think a big reason NixOps is insufficient is that it grew too many responsibilities without the community to back them up. The only reason it supports AWS well is that the primary user of NixOps is a huge user of AWS. If we strip out all the support for server providers, NixOps becomes a much simpler tool, and I think a much better one.
- Support centralized configuration management of 2-50 NixOS hosts
- Eliminate provider-specific code from NixOps
- Take advantage of NixOps’ existing support for networks, peer nodes, and automatic VPN configuration
A Terraform Provisioner is run on a system the first time it is created via Terraform.
We can use a NixOS provisioner to install NixOS when it cannot be installed through normal, supported means.
I propose we create a provisioner to replace an existing Linux installation with NixOS, based on the existing work of nixos-infect, nixos-lustrate, and nixos-kexec.
When the provisioner finishes, the instance will be:
- Running a stock NixOS configuration
- Configured with usable SSH and firewall settings
- Trusting a public SSH key used by the controlling NixOps deploy host
- Carrying an appropriately generated hardware-configuration.nix on the host
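As a rough sketch, the stock configuration the provisioner leaves behind might look like the following; the exact contents are an assumption, and only the properties listed above are the requirements:

# Hypothetical configuration.nix written by the provisioner
{ config, pkgs, ... }:

{
  imports = [ ./hardware-configuration.nix ];

  # Usable SSH and firewall settings
  services.openssh.enable = true;
  networking.firewall.allowedTCPPorts = [ 22 ];

  # Trust the public key the NixOps deploy host will use
  users.users.root.openssh.authorizedKeys.keys = [
    "ssh-ed25519 AAAA... provisioner-key"
  ];
}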
Using the provisioner from Terraform might look like:
resource "packet_device" "web_frontend" {
hostname = "web-frontend"
plan = "baremetal_1"
facility = "ewr1"
operating_system = "ubuntu"
provisioner "nixos_kexec" {
provision_key = "${file("./provisioner-key.pub")}"
ip = "${packet_device.web_fronted.ip}"
disk_commands = <<COMMANDS
parted --script /dev/sda \
mklabel gpt \
mkpart primary 0MiB 100MiB \
mkpart primary 100MiB 100GiB
mkfs.vfat -L boot /dev/sda1
mkfs.ext4 -L nixos /dev/sda2
mount /dev/disk/by-label/nixos /mnt
mkdir /mnt/boot
mount /dev/disk/by-label/boot /mnt/boot
COMMANDS
}
}
resource "digitalocean_droplet" "web_frontend" {
image = "ubuntu-14-14-x64"
name = "web-frontend"
region = "nyc2"
size = "512mb"
provisioner "nixos_infect" {
provision_key = "${file("./provisioner-key.pub")}"
ip = "${digitalocean_droplet.web_fronted.ip}"
}
}
Note: long term, the provisioner should eventually accept custom configuration, and possibly custom config-generating programs to run during provisioning, to handle complex deployments like bare metal without DHCP support (a sketch follows below). These config-generating programs should modify the hardware-configuration.nix to ensure the changes are propagated back to Terraform.
These custom config programs should likely be copied to the target with nix-copy-closure or a tarball of Nix store paths, to avoid dependency hell.
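For instance, on a bare-metal host without DHCP, such a program might emit a static networking fragment like this into hardware-configuration.nix (the addresses and interface name are placeholders, and option names vary slightly between NixOS releases):

{
  networking.useDHCP = false;

  networking.interfaces.enp1s0.ipv4.addresses = [
    { address = "203.0.113.10"; prefixLength = 24; }
  ];
  networking.defaultGateway = "203.0.113.1";
  networking.nameservers = [ "203.0.113.53" ];
}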
A Terraform Provider takes some configuration parameters, does “some stuff” (usually making API calls), and then updates Terraform’s state based on what happened.
I propose we make a provider which exports a host’s hardware-configuration.nix and loads it into the Terraform state. From there, the configuration can be written to disk inside a NixOps network.
provider "nixops" {
nixops_root = "./my-nixops-repo"
}
resource "nixos_node" "web_frontend" {
ip = "${digitalocean_droplet.web_frontend.ip}"
}
The hardware-configuration.nix will be stored in the tfstate under nixos_node.web_frontend.hardware_configuration. The configuration will also be saved under:
./my-nixops-repo
└── nodes
    └── web_frontend
        ├── default.nix
        └── hardware-configuration.nix
The hardware-configuration.nix should be obvious, and the default.nix will contain:
{
  imports = [ ./hardware-configuration.nix ];

  deployment.targetHost = "17.1.71.7";
}
At this point, the web_frontend/default.nix is ready to be imported into a network. For example:
{
  web_frontend = {
    imports = [ ./nodes/web_frontend ];
    services.openssh.enable = true;
  };
}
provider "nixops" {
nixops_root = "./my-nixops-repo"
}
resource "nixos_node" "web_frontend" {
ip = "${digitalocean_droplet.web_frontend.ip}"
config = <<NIX
services.nginx.enable = true;
NIX
}
In this case, the default.nix will contain:
{
  imports = [ ./hardware-configuration.nix ./terraform.nix ];

  deployment.targetHost = "17.1.71.7";
}
And terraform.nix will contain:
{
  services.nginx.enable = true;
}
Users should be encouraged not to embed complex Nix into the nixos_node resource, but instead to turn on a service or two described by custom modules in their NixOps network.
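For example, a user might keep a hypothetical module like the following in their network, so the nixos_node config stays a one-liner (the module path and option names are illustrative):

# ./modules/web-frontend.nix, imported by the user's network
{ config, lib, ... }:

{
  options.myorg.webFrontend.enable =
    lib.mkEnableOption "the web frontend stack";

  config = lib.mkIf config.myorg.webFrontend.enable {
    services.nginx.enable = true;
    networking.firewall.allowedTCPPorts = [ 80 443 ];
  };
}

With that module in place, the Terraform side only needs config = "myorg.webFrontend.enable = true;".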
NixOps could be convinced to load all these nodes automatically via a combination of builtins.readDir, import, and map.
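A minimal sketch of what that could look like in the network expression, assuming every directory under ./nodes contains a default.nix as generated above:

let
  entries = builtins.readDir ./nodes;

  # Keep only directories; each one holds a generated default.nix
  nodeNames = builtins.filter
    (name: entries.${name} == "directory")
    (builtins.attrNames entries);

  # Import ./nodes/<name>/default.nix as the node named <name>
  mkNode = name: {
    inherit name;
    value = import (./nodes + "/${name}");
  };
in
builtins.listToAttrs (map mkNode nodeNames)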