mantle's Introduction

🚨 Mantle for Fedora CoreOS and RHEL CoreOS has been merged into coreos-assembler. 🚨
This cl branch is for CoreOS Container Linux.

Mantle: Gluing Container Linux together

This repository is a collection of utilities for developing Container Linux. Most of the tools are for uploading, running, and interacting with Container Linux instances running locally or in a cloud.

Overview

Mantle is composed of many utilities:

  • cork for handling the Container Linux SDK
  • gangue for downloading from Google Storage
  • kola for launching instances and running tests
  • kolet an agent for kola that runs on instances
  • ore for interfacing with cloud providers
  • plume for releasing Container Linux

All of the utilities support the help command to get a full listing of their subcommands and options.

Tools

cork

Cork is a tool for working with Container Linux images and the SDK.

cork create

Download and unpack the Container Linux SDK.

cork create

cork enter

Enter the SDK chroot, and optionally run a command. The command and its arguments can be given after --.

cork enter -- repo sync

cork download-image

Download a Container Linux image into $PWD/.cache/images.

cork download-image --platform=qemu

Building Container Linux with cork

See Modifying Container Linux for an example of using cork to build a Container Linux image.

gangue

Gangue is a tool for downloading and verifying files from Google Storage with authenticated requests. It is primarily used by the SDK.

gangue get

Get a file from Google Storage and verify it using GPG.

kola

Kola is a framework for testing software integration in Container Linux instances across multiple platforms. It is primarily designed to operate within the Container Linux SDK for testing software that has landed in the OS image. Ideally, all software needed for a test should be included by building it into the image from the SDK.

Kola supports running tests on multiple platforms, currently QEMU, GCE, AWS, VMware vSphere, Packet, and OpenStack. In the future systemd-nspawn and other platforms may be added. As a design principle, local platforms do not rely on access to the Internet, which minimizes external dependencies; any network services required are built directly into kola itself. Machines on cloud platforms do not have direct access to kola, so tests may depend on Internet services such as discovery.etcd.io or quay.io instead.

Kola outputs assorted logs and test data to _kola_temp for later inspection.

Kola is still under heavy development and it is expected that its interface will continue to change.

By default, kola uses the qemu platform with the most recently built image (assuming it is run from within the SDK).

kola run

The run command invokes the main kola test harness. It runs any tests whose registered names match a glob pattern.

kola run <glob pattern>

--blacklist-test can be used if one or more tests in the pattern should be skipped. This switch may be provided once:

kola --blacklist-test linux.nfs.v3 run

multiple times:

kola --blacklist-test linux.nfs.v3 --blacklist-test linux.nfs.v4 run

and can also be used with glob patterns:

kola --blacklist-test linux.nfs* --blacklist-test crio.* run
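As a rough illustration of how a run pattern and several blacklist globs combine, the following sketch filters a list of test names with Go's standard filepath.Match. It is not kola's implementation: the helper names are hypothetical, the assumption that kola uses the same glob syntax is unverified, and crio.base and docker.base are made-up test names.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchesAny reports whether name matches at least one glob pattern.
func matchesAny(name string, patterns []string) (bool, error) {
	for _, p := range patterns {
		ok, err := filepath.Match(p, name)
		if err != nil {
			return false, err // malformed pattern
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

// selectTests keeps the names that match pattern and no blacklist entry.
func selectTests(names []string, pattern string, blacklist []string) ([]string, error) {
	var selected []string
	for _, name := range names {
		ok, err := filepath.Match(pattern, name)
		if err != nil {
			return nil, err
		}
		if !ok {
			continue
		}
		skip, err := matchesAny(name, blacklist)
		if err != nil {
			return nil, err
		}
		if !skip {
			selected = append(selected, name)
		}
	}
	return selected, nil
}

func main() {
	// Illustrative names only; the last two are not taken from kola.
	names := []string{"linux.nfs.v3", "linux.nfs.v4", "crio.base", "docker.base"}
	got, err := selectTests(names, "*", []string{"linux.nfs*", "crio.*"})
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // prints [docker.base]
}
```

When passing globs on the command line, quoting them (for example 'linux.nfs*') keeps the shell from expanding them against files in the current directory.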

kola list

The list command lists all of the available tests.

kola spawn

The spawn command launches Container Linux instances.

kola mkimage

The mkimage command creates a copy of the input image with its primary console set to the serial port (/dev/ttyS0). This causes more output to be logged on the console, which is also logged in _kola_temp. This can only be used with QEMU images and must be used with the coreos_*_image.bin image, not the coreos_*_qemu_image.img.

kola bootchart

The bootchart command launches an instance then generates an svg of the boot process using systemd-analyze.

kola updatepayload

The updatepayload command launches a Container Linux instance then updates it by sending an update to its update_engine. The update is the coreos_*_update.gz in the latest build directory.

kola subtest parallelization

Subtests can be parallelized by adding c.H.Parallel() at the top of the inline function given to c.Run. It is not recommended to utilize the FailFast flag in tests that utilize this functionality as it can have unintended results.

kola test namespacing

The top-level namespace of tests should fit into one of the following categories:

  1. Groups of tests targeting specific packages/binaries may use that namespace (ex: docker.*)
  2. Tests that target multiple supported distributions may use the coreos namespace.
  3. Tests that target singular distributions may use the distribution's namespace.

kola test registration

Registering kola tests currently requires that the tests are registered under the kola package and that the test function itself lives within the mantle codebase.

Groups of similar tests are registered in an init() function inside the kola package. Register(*Test) is called per test. A kola Test struct requires a unique name, and a single function that is the entry point into the test. Additionally, userdata (such as a Container Linux Config) can be supplied. See the Test struct in kola/register/register.go for a complete list of options.
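The registration pattern described above can be sketched with a small self-contained model. Everything here is a simplified stand-in: the Cluster and Test types, the Register function, and the example.machines test are hypothetical and only mirror the shape of the description (unique name, single entry point, optional userdata). The real definitions live in kola/register/register.go.

```go
package main

import "fmt"

// Cluster is a hypothetical stand-in for the cluster handle a test receives.
type Cluster struct{ Machines []string }

// Test is a simplified stand-in for kola's Test struct: a unique name,
// a single entry point, and optional userdata.
type Test struct {
	Name     string
	Run      func(c *Cluster) error
	UserData string
}

// tests is the package-level registry, keyed by test name.
var tests = map[string]*Test{}

// Register adds t to the registry, rejecting incomplete or duplicate entries.
func Register(t *Test) error {
	if t.Name == "" || t.Run == nil {
		return fmt.Errorf("test needs a name and a Run function")
	}
	if _, dup := tests[t.Name]; dup {
		return fmt.Errorf("test %q already registered", t.Name)
	}
	tests[t.Name] = t
	return nil
}

// init mirrors the described pattern: a group of tests registered together.
func init() {
	if err := Register(&Test{Name: "example.machines", Run: checkHasMachines}); err != nil {
		panic(err)
	}
}

// checkHasMachines fails when the cluster has no machines.
func checkHasMachines(c *Cluster) error {
	if len(c.Machines) == 0 {
		return fmt.Errorf("no machines in cluster")
	}
	return nil
}

func main() {
	err := tests["example.machines"].Run(&Cluster{Machines: []string{"m0"}})
	fmt.Println(len(tests), err) // prints 1 <nil>
}
```

Keying the registry by name is what makes the uniqueness requirement enforceable at registration time rather than at run time.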

kola test writing

A kola test is a Go function that is passed a platform.TestCluster to run code against. Its signature is func(platform.TestCluster), and it must be registered and built into the kola binary.

A TestCluster implements the platform.Cluster interface and will give you access to a running cluster of Container Linux machines. A test writer can interact with these machines through this interface.

To see test examples look under kola/tests in the mantle codebase.

For a quickstart see kola/README.md.

kola native code

For some tests, the Cluster interface is limited and it is desirable to run native Go code directly on one of the Container Linux machines. This is currently possible by using the NativeFuncs field of a kola Test struct. This works like a limited RPC interface.

NativeFuncs is used similarly to the Run field of a registered kola test. It registers and names functions in nearby packages. These functions, unlike the Run entry point, must be manually invoked inside a kola test using a TestCluster's RunNative method. The function itself is then run natively on the specified running Container Linux instances.

For more examples, look at the coretest suite of tests under kola. These tests were ported into kola and make heavy use of the native code interface.

Manhole

The platform.Manhole() function creates an interactive SSH session which can be used to inspect a machine during a test.

kolet

kolet is run on kola instances to run native functions in tests. Generally kolet is not invoked manually.

ore

Ore provides a low-level interface for each cloud provider. It has commands related to launching instances on a variety of platforms (gcloud, aws, azure, esx, and packet) within the latest SDK image. Ore mimics the underlying API for each cloud provider closely, so the interface for each cloud provider is different. See each provider's help command for the available actions.

Note: when uploading to some cloud providers (e.g. gce), the image may need to be packaged with a different --format (e.g. --format=gce) when running image_to_vm.sh.

plume

Plume is the Container Linux release utility. Releases are done in two stages, each with their own command: pre-release and release. Both of these commands are idempotent.

plume pre-release

The pre-release command does as much of the release process as possible without making anything public. This includes uploading images to cloud providers (except those like gce which don't allow us to upload images without making them public).

plume release

Publish a new Container Linux release. This makes the images uploaded by pre-release public and uploads images that pre-release could not. It copies the release artifacts to public storage buckets and updates the directory index.

plume index

Generate and upload index.html objects to turn a Google Cloud Storage bucket into a publicly browsable file tree. Useful if you want something like Apache's directory index for your software download repository. Plume release handles this as well, so it does not need to be run as part of the release process.

Platform Credentials

Each platform reads the credentials it uses from different files. The aws, azure, do, esx and packet platforms support selecting from multiple configured credentials, called "profiles". The examples below are for the "default" profile, but other profiles can be specified in the credentials files and selected via the --<platform-name>-profile flag:

kola spawn -p aws --aws-profile other_profile

aws

aws reads the ~/.aws/credentials file used by Amazon's aws command-line tool. It can be created using the aws command:

$ aws configure

To configure a different profile, use the --profile flag:

$ aws configure --profile other_profile

The ~/.aws/credentials file can also be populated manually:

[default]
aws_access_key_id = ACCESS_KEY_ID_HERE
aws_secret_access_key = SECRET_ACCESS_KEY_HERE

To install the aws command in the SDK, run:

sudo emerge --ask awscli

azure

azure uses ~/.azure/azureProfile.json. This can be created using the az command:

$ az login

It also requires that the environment variable AZURE_AUTH_LOCATION points to a JSON file (this can also be set via the --azure-auth parameter). The JSON file requires an Azure Active Directory service principal account to be created.

Service principal accounts can be created via the az command (the output will contain an appId field which is used as the clientId variable in the AZURE_AUTH_LOCATION JSON):

az ad sp create-for-rbac

The client secret can be created in the Azure portal by opening the service principal account under the Azure Active Directory service, on the App registrations tab.

You can find your subscriptionId & tenantId in the ~/.azure/azureProfile.json via:

jq '.subscriptions[] | {subscriptionId: .id, tenantId: .tenantId}' ~/.azure/azureProfile.json

The JSON file exported to the variable AZURE_AUTH_LOCATION should be generated by hand and have the following contents:

{
  "clientId": "<service principal id>",
  "clientSecret": "<service principal secret>",
  "subscriptionId": "<subscription id>", 
  "tenantId": "<tenant id>", 
  "activeDirectoryEndpointUrl": "https://login.microsoftonline.com", 
  "resourceManagerEndpointUrl": "https://management.azure.com/", 
  "activeDirectoryGraphResourceId": "https://graph.windows.net/", 
  "sqlManagementEndpointUrl": "https://management.core.windows.net:8443/", 
  "galleryEndpointUrl": "https://gallery.azure.com/", 
  "managementEndpointUrl": "https://management.core.windows.net/"
}
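Writing this file by hand is error-prone, so one option is to generate it. The sketch below fills the four account-specific values into the layout shown above and prints the result. The type and function names are hypothetical, the endpoint URLs are copied from the example above, and the sample values are placeholders, not real credentials.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// authFile mirrors the JSON layout of the AZURE_AUTH_LOCATION file shown above.
type authFile struct {
	ClientID                       string `json:"clientId"`
	ClientSecret                   string `json:"clientSecret"`
	SubscriptionID                 string `json:"subscriptionId"`
	TenantID                       string `json:"tenantId"`
	ActiveDirectoryEndpointURL     string `json:"activeDirectoryEndpointUrl"`
	ResourceManagerEndpointURL     string `json:"resourceManagerEndpointUrl"`
	ActiveDirectoryGraphResourceID string `json:"activeDirectoryGraphResourceId"`
	SQLManagementEndpointURL       string `json:"sqlManagementEndpointUrl"`
	GalleryEndpointURL             string `json:"galleryEndpointUrl"`
	ManagementEndpointURL          string `json:"managementEndpointUrl"`
}

// newAuthFile combines account-specific values with the fixed endpoints.
func newAuthFile(clientID, clientSecret, subscriptionID, tenantID string) authFile {
	return authFile{
		ClientID:                       clientID,
		ClientSecret:                   clientSecret,
		SubscriptionID:                 subscriptionID,
		TenantID:                       tenantID,
		ActiveDirectoryEndpointURL:     "https://login.microsoftonline.com",
		ResourceManagerEndpointURL:     "https://management.azure.com/",
		ActiveDirectoryGraphResourceID: "https://graph.windows.net/",
		SQLManagementEndpointURL:       "https://management.core.windows.net:8443/",
		GalleryEndpointURL:             "https://gallery.azure.com/",
		ManagementEndpointURL:          "https://management.core.windows.net/",
	}
}

func main() {
	// Placeholder values; substitute the real IDs and secret.
	a := newAuthFile("example-client-id", "example-secret", "example-subscription", "example-tenant")
	out, err := json.MarshalIndent(a, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The output can be redirected to a file, and AZURE_AUTH_LOCATION set to that file's path. Since the file contains a secret, restricting its permissions to the owner is advisable.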

do

do uses ~/.config/digitalocean.json. This can be configured manually:

{
    "default": {
        "token": "token goes here"
    }
}

esx

esx uses ~/.config/esx.json. This can be configured manually:

{
    "default": {
        "server": "server.address.goes.here",
        "user": "user.goes.here",
        "password": "password.goes.here"
    }
}

gce

gce uses the ~/.boto file. When the gce platform is first used, it will print a link that can be used to log into your account with gce and get a verification code you can paste in. This will populate the .boto file.

See Google Cloud Platform's Documentation for more information about the .boto file.

openstack

openstack uses ~/.config/openstack.json. This can be configured manually:

{
    "default": {
        "auth_url": "auth url here",
        "tenant_id": "tenant id here",
        "tenant_name": "tenant name here",
        "username": "username here",
        "password": "password here",
        "user_domain": "domain id here",
        "floating_ip_pool": "floating ip pool here",
        "region_name": "region here"
    }
}

user_domain is required on some newer versions of OpenStack using Keystone V3 but is optional on older versions. floating_ip_pool and region_name can be optionally specified here to be used as a default if not specified on the command line.

packet

packet uses ~/.config/packet.json. This can be configured manually:

{
	"default": {
		"api_key": "your api key here",
		"project": "project id here"
	}
}
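The profile-keyed layout used by these JSON credential files maps naturally onto a Go map. The following sketch shows one way a named profile could be looked up, with "default" as the fallback. It is an assumption about the general approach, not mantle's actual loader; packetProfile and loadProfile are hypothetical names, and the keys and project IDs are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// packetProfile holds the two fields shown in the packet.json example.
type packetProfile struct {
	APIKey  string `json:"api_key"`
	Project string `json:"project"`
}

// loadProfile decodes a profiles document and returns the named profile,
// falling back to "default" when name is empty.
func loadProfile(data []byte, name string) (packetProfile, error) {
	profiles := map[string]packetProfile{}
	if err := json.Unmarshal(data, &profiles); err != nil {
		return packetProfile{}, err
	}
	if name == "" {
		name = "default"
	}
	p, ok := profiles[name]
	if !ok {
		return packetProfile{}, fmt.Errorf("no profile named %q", name)
	}
	return p, nil
}

func main() {
	// Two profiles in one file; values are placeholders.
	data := []byte(`{
		"default":       {"api_key": "key-a", "project": "proj-a"},
		"other_profile": {"api_key": "key-b", "project": "proj-b"}
	}`)
	p, err := loadProfile(data, "other_profile")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Project) // prints proj-b
}
```

With a file like this, the second profile would be the one selected by passing --packet-profile other_profile.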

qemu

qemu is run locally and needs no credentials, but does need to be run as root.

qemu-unpriv

qemu-unpriv is run locally and needs no credentials. It has a restricted set of functionality compared to the qemu platform, such as:

  • Single node only, no machine to machine networking
  • DHCP provides no data (forces several tests to be disabled)
  • No Local cluster

mantle's People

Contributors

ajeddeloh, anakaiti, arithx, ashcrow, barthy1, bcwaldon, bgilbert, brianredbeard, cgwalters, crawford, dm0-, dustymabe, ethernetdan, euank, glevand, jlebon, joeatwork, jonboulle, lucab, lupan2005, marineam, miabbott, mike-nguyen, mischief, oremj, philips, sayanchowdhury, yichengq, yuqi-zhang, zbwright

mantle's Issues

kola: etcdctl

Need to test etcdctl since we removed its use as part of the basic etcd discovery tests. This depends on etcdctl returning to using non-zero exit codes to facilitate scripting with it.

rfc: test naming scheme

our test names are super disorganized, and have no consistency. having a proper naming scheme that is globbable by category would be nice.

something like:

base/adduser
fleet/submitunit
etcd/discovery
etcd/atomicswap
docker/push
docker/pull
systemd/journald/remote
systemd/nspawn
net/nfs/v3
net/nfs/v4
ext/deis
ext/kubernetes

then i can tell kola run "docker/*", for example.
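One consequence of a slash-separated scheme is worth noting: with Go's standard glob matching, * does not cross a / boundary, so a category glob selects only names exactly one level deep. The sketch below uses path.Match to show this. The filter helper is hypothetical, and whether kola would use this matcher for such names is an assumption.

```go
package main

import (
	"fmt"
	"path"
)

// filter returns the names that match the slash-aware glob pattern.
func filter(names []string, pattern string) ([]string, error) {
	var out []string
	for _, n := range names {
		ok, err := path.Match(pattern, n)
		if err != nil {
			return nil, err // malformed pattern
		}
		if ok {
			out = append(out, n)
		}
	}
	return out, nil
}

func main() {
	names := []string{"docker/push", "docker/pull", "systemd/nspawn", "systemd/journald/remote"}
	for _, pat := range []string{"docker/*", "systemd/*", "systemd/*/*"} {
		got, err := filter(names, pat)
		if err != nil {
			panic(err)
		}
		fmt.Println(pat, got)
	}
	// prints:
	// docker/* [docker/push docker/pull]
	// systemd/* [systemd/nspawn]
	// systemd/*/* [systemd/journald/remote]
}
```

So selecting every systemd test under this scheme would take more than one pattern, or a matcher that treats the category as a prefix.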

kola: systemd.journal.remote is broken with systemd v229

the remote journal stuff now generates a journal file with no port, e.g. /var/log/journal/remote/remote-10.0.0.2.journal instead of /var/log/journal/remote/remote-10.0.0.2:19531.journal. the test needs to be fixed to handle this for v229.

mantle: clean-up manual instance spawning

ore and kola now both share the ability to manually spawn a VM. plume shares code with ore and the copied code is out of sync. The kola spawn command probably belongs in ore and we should try to factor common code out among all three binaries and clean up the user interfaces. See: #160

Limit Kola tests to applicable architectures

The kola test docker.oldclient will only work for amd64 hosts. As we develop more support for arm devices, we'll need some mechanism to tag certain tests as arm or x86 only.

kola: cannot reboot machines

if we reboot a machine during a test, it breaks our ssh client. this limits our ability to test CoreOS beyond the first boot.

Split --gce-project

The flag --gce-project is used as both the gce-image-project and the gce-project. So if you want to use an image from a different project than the project on which machines are spawned, it can't be done. Gcloud differentiates these projects and we should too.

kola: check coreos semver

kola tests should be able to specify which CoreOS versions they can execute on, and if the remote machine does not have the appropriate version, the tests should be skipped.

i'm not sure what this looks like yet, since neither the actual test functions nor the platform receive information (*kola.Test) about the currently running test.

plume: build aws images

Plume should build AWS images free of java and python dependencies. This will unblock doing automated tests of AWS images.

update Google Cloud API client import paths and more

The Google Cloud API client libraries for Go are making some breaking changes:

  • The import paths are changing from google.golang.org/cloud/... to
    cloud.google.com/go/.... For example, if your code imports the BigQuery client
    it currently reads
    import "google.golang.org/cloud/bigquery"
    It should be changed to
    import "cloud.google.com/go/bigquery"
  • Client options are also moving, from google.golang.org/cloud to
    google.golang.org/api/option. Two have also been renamed:
    • WithBaseGRPC is now WithGRPCConn
    • WithBaseHTTP is now WithHTTPClient
  • The cloud.WithContext and cloud.NewContext methods are gone, as are the
    deprecated pubsub and container functions that required them. Use the Client
    methods of these packages instead.

You should make these changes before September 12, 2016, when the packages at
google.golang.org/cloud will go away.

coreos.filesystem.writabledirs is flaky

find: `/etc/gshadow.lock': No such file or directory
find: `/etc/shadow.lock': No such file or directory
find: `/etc/passwd.lock': No such file or directory
find: `/etc/group.lock': No such file or directory
...
2016-08-23T03:27:09Z kola: --- FAIL: coreos.filesystem.writabledirs on gce (34.599s)
2016-08-23T03:27:09Z kola:         Failed to run find: output [], status: Process exited with: 1. Reason was:  ()

it looks like there's a race of find reading the direntries of /etc. possibly racing against the gce agent adding users.

what's a sane way to fix this race? stop google-accounts-manager.service?

kola: run external Docker unit/functional tests

Docker has its own set of unit/functional tests. Let's investigate making these run on kola without having to import and manually update this test code. Ideally, this test runs the set of Docker tests that most closely match the version of Docker in the CoreOS image being run.

kola: selinux tests

right now there's some outstanding quirks with selinux on coreos. kola should have tests that set up selinux in enforcing mode and do some basic sanity checks.

kola: manual test jenkins job against AWS

To have parity with the release tests that we are running today we need to run against a set of hosts on AWS. Steps:

  • Set up a "manual AWS jenkins job" that takes an existing AMI as a parameter, write docs on how to do this
  • Set up a job that can look at AMI ids for a release on the release mirror and run the AWS tests
  • Set up a job that can look at AMI ids for a release on a private release URL and run the AWS tests

kola: comprehensive flannel tests

we need to have tests for flannel. currently, we cannot test flannel on our qemu target because it has no internet connectivity and thus cannot pull the docker image.

however, we can write tests for gce and aws, and test flannel's gce and aws-vpc backends.

Publish Azure

This is going to require either a client on the Windows machine or reverse engineering their utilities.

kola: etcd1 tests fail

etcd1 tests fail because etcdctl now tries to reach a uri which 404s with etcd1. we either need to rewrite the tests to use e.g. curl for etcd1, or nuke the tests.

kola: separate testing of kola from testing of OS using kola

This is a summary of @marineam's suggestion on having stable kola releases:

Since it's undesirable to have our release builds break because kola happened to break, we want the release tests to use a known-good version of kola. Testing kola itself will continue to happen by triggering test runs from a PR in mantle and always using the latest commit to master.

Bumping the ebuild in the SDK can be the definitive process for cutting a release of kola. Doing this means you've tested that the latest kola commit works fine against the current SDK builds. This also supports developers working in the SDK so they can locally run and test their latest image builds using a stable kola commit.

To automatically propagate ebuild bumps (new kola releases) to be immediately used in our release tests, we will have to upload the latest kola builds alongside the latest OS image builds. The release tests can then just use that version of kola rather than compiling from master.

kola: test kubernetes using CoreOS docs

Currently, a kubernetes multi-node smoke test exists that uses a fairly direct translation of some upstream community docs. This test should have its cloud-configs replaced to use our own docs which include TLS and using the built-in CoreOS kubelet service file. This is important to test our built in kubelet binary (soon to be an aci).

kola: figure out why ssh fails

almost every test run in qemu we see:

2015-12-09T22:00:26Z kola: Cluster failed starting machines: ssh unreachable: dial tcp 10.0.0.2:22: getsockopt: connection refused

causing tests to fail.

GitHub bug wrangler

We need a new tool to help manage GitHub bugs, possible features include:

  • View/sort bugs across repos
  • View related bugs across repos (bug report and PR are often in different places)
  • Update bugs as a fix rolls through the release process so people can track when it hits alpha, beta, stable.
  • Help establish a pattern we can use to organize and prioritize bugs.

In short, we are terrible at tracking bugs right now. We need to fix that.

kola: test ignition

kola currently tests coreos-cloudinit pretty thoroughly, although indirectly.

kola should also test ignition, with a good base set of ignition configurations that meet common use cases.

kola: list tests by available platforms

UI bug essentially. If you do kola list and see all the tests and then try to run just that test on a platform for which it is not available, 0 tests will be listed. This is confusing, so just make it clear in kola list which platforms the tests are available on.

Update CoreOS docs

Similar to updating the releases page, this will update the docs in the rolling fashion discussed in person.

lack of internet access prevents testing flannel

flannel is pulled from quay.io, and since kola does not bridge to the wan under qemu, it cannot be tested.

would it be possible to connect the bridge kola creates to a nic connected to the wan via a flag?

serializing cluster state

i'd like to be able to serialize cluster state and reconstitute it later. this means saving the platform, and the instance IDs into a json file.

however, it's currently not possible to save the SSH private key, because the ssh agent used in the platform code doesn't expose private keys.

a simple fix is to expose the private key in network.SSHAgent.

for this to work, these things need to be serializable and able to be created from serialized form -

  • platform.Cluster
  • platform.Machine
  • network.SSHAgent
