
the-process's Introduction

Introduction

This repository contains the source code for the docker images that are used in Citus testing. The images are pushed to Docker Hub. There is no automated hook between this repository and the Docker Hub account; the repository is used purely to store the source code of our images.

1. Makefile

The creation of the images is driven by the Makefile. The Makefile holds the list of pinned postgres versions we build against. Images that are specific to a postgres version have targets to build and push the image for one postgres version, or for all pinned versions at once. In addition, all images can be built with the build-all target and pushed with push-all.

During development and maintenance of the images you can freely call make with the desired targets. The images will be tagged with a -devYYYYmmddHHMM suffix to indicate these are development images. Since the minute is included in the tag, most often this will create new tags for every run. A new tag doesn't mean new images. The normal docker caching system is active. When a layer does not change it will be reused in a new tagged artifact.
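
As a rough illustration of the tagging scheme described above, the following Python sketch derives a tag from a timestamp. The function name `image_tag`, its parameters, and the notion of a base tag are assumptions made for illustration; the real logic lives in the Makefile and may differ.

```python
from datetime import datetime


def image_tag(base_tag: str, release: bool, now: datetime) -> str:
    """Return the tag for an image build (hypothetical helper, not the Makefile's code)."""
    if release:
        # Assumption: release builds use the plain tag without a suffix.
        return base_tag
    # Development builds get a -devYYYYmmddHHMM suffix, precise to the minute.
    return f"{base_tag}-dev{now.strftime('%Y%m%d%H%M')}"
```

Two development builds started within the same minute would get the same tag, which matches the remark that a new tag is created for most runs rather than for every run.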

When ready to release, run make with the RELEASE variable set to 1.

$ RELEASE=1 make push-all

This will push all images, building all layers that might have changed since the last run of build. Make sure you have tested the images before pushing a release. CI might start using the newly pushed images directly, depending on the availability of a cache and how it is invalidated.

Before you can push to the docker registry, your CLI needs to be authenticated to Docker Hub and you need sufficient privileges to push to the registry.

If you don't have access, or want to push the images to a private repo, the repo can be changed at runtime with the DOCKER_REPO variable like:

$ DOCKER_REPO=private-repo make push-all

2. Images

Details on the images. Mostly uninteresting for users. Please refer to the Makefile section above.

extbuilder

The extbuilder image is the first image that other jobs depend on in our tests.

This image contains all the artifacts required to produce a build of the citus binaries for exactly 1 postgres version. This image is built for every supported Postgres version. Any scripts driving the build are contained in the citus repository.

The postgres version is installed from the pgdg apt archive. This allows us to install older versions, and therefore keep the versions of postgres pinned during normal release cycles. To bump the version of Postgres to build against, change the version pinned in the Makefile.

exttester

Very comparable to the extbuilder (todo: merge the images together - yes they are that similar). This image however is slightly optimized for actually running the tests of citus against 1 postgres version.

failtester

This image is functionally a specialization of the exttester image. It has extra tools for running the failure tests of Citus. Due to how the image is structured, the two have very little in common. This image starts from a Python base image and adds the postgres versions on top. Finally, it includes all the Python libraries.

pgupgradetester

This image is also a specialization of the exttester on a functional level, and has many overlaps with failtester, so much so that I also feel we can merge these together at some point in the future.

citusupgradetester

This container is a special beast. Besides having the testing dependencies installed like pgupgradetester and failtester, it also contains the binaries of older citus versions. During the test run, the testing harness can install them at will to simulate a citus upgrade.

stylechecker

This image is a small alpine image that contains necessary tools to run various scripts in our CI environment. It is not based on any of the images listed here.

the-process's People

Contributors

aykut-bozkurt, gledis69, gurkanindibay, hanefi, jasonmp85, jeltef, naisila, onurctirtir, rajeshkt78, saittalhanisanci, thanodnl

the-process's Issues

Initial status value

Currently we have a structure like this in our scripts:

status = 0
# run the test and assign the exit code to status
exit(status)

Wouldn't it be better to start with a nonzero status code, so that the script returns a nonzero code if the tests never run? Currently, if the tests are not run, the initial status of 0 is returned.

An example can be seen here:
https://github.com/citusdata/the-process/blob/upgradeTester/circleci/images/exttester/files/sbin/install-and-test-ext#L15
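
A minimal Python sketch of the proposed change might look like the following. The function name `run_and_report` and its argument are hypothetical, and this is not the code of the linked script; it only shows the idea of a pessimistic default that a real test run has to overwrite.

```python
import subprocess
from typing import List, Optional


def run_and_report(test_command: Optional[List[str]]) -> int:
    """Return the exit status to report, defaulting to failure (illustrative only)."""
    status = 1  # pessimistic default: nothing has passed yet
    if test_command:
        # Only a real test run is allowed to overwrite the default.
        status = subprocess.run(test_command).returncode
    return status
```

With this shape, a code path that skips the tests by accident reports failure instead of silently reporting success.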

Set up repository

Wanted to start with a clean slate to document my ideas for improving Citus quality, processes, and productivity.

  • Pick a fun name
  • Pick Kanban implementation (using GitHub Projects and waffle.io in tandem)
  • Set up GitHub label taxonomy (OCD, I know)

Missing Readme

We should add a readme to explain how the images in this repository are used for testing citus so that it is easier for someone new to add/change images.

Postgres upgrade image

We should include our postgres upgrade tester image here so that if someone wants to make a change they can check it out from here.

Minor improvements to Issues UX

I put a lot of time into making the Label icons and colors nice, but threw in the towel after a few colors, and am still not sure about a few of the labels…

  • Decide whether the Effort icons are what we want (some form of progressing smileys could be a reasonable alternative)
  • Finalize background colors for Needs, Area, and State categories. Use this site to determine good candidates for background colors (use prominent icon colors and see what it suggests)
  • Where does testing go as a category? Continuous integration? Tools, Automation, and Docker also seem to have some overlap

Can we use CircleCI's Codecov orb?

We could reimplement the logic ourselves, but I'd worry about the semantics of stitching together coverage reports from the parallel test jobs in our workflow.

Instead, we should determine the suitability of the official Codecov orb.

Add back Uncrustify

This shouldn't be terribly difficult…

  • Write a Dockerfile with the needed uncrustify version
  • Write a script that runs the style check and reports failures to developers
  • Push the image to Docker Hub
  • Add the test to Citus' config (it can run in parallel with builds in its own container)

Add back failure testing

Short on time, I omitted the failure testing suite from the PoC. We need to add it back.

This will require:

  • Determining relevant versions of Python and dependencies
  • Writing a new Dockerfile to build a postgres-based image for failure tests
  • Getting things going locally (possibly with the CircleCI CLI)
  • Pushing the Docker image to Docker Hub
  • Adding the failure tests back into Citus' CircleCI configuration

Automate image creation

We should have a script that builds all the images and pushes them to docker hub automatically.

We can then run this as a CI job. However, we should be careful not to push images if anything fails; otherwise broken images would block development.

So maybe we can push with a dev tag and create a PR where we use dev tagged images to manually check if all tests pass.

For now the script part is more important than the periodic runs.
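
The "do not push if anything fails" rule could be sketched roughly as below. The `build` and `push` callables are placeholders for whatever actually builds and pushes an image (for example a wrapper around make or docker); their names and signatures are assumptions, not part of this repository.

```python
from typing import Callable, Iterable, List


def build_then_push(
    images: Iterable[str],
    build: Callable[[str], bool],
    push: Callable[[str], None],
) -> List[str]:
    """Push images only if every build succeeded; return the pushed images."""
    images = list(images)
    # Build everything first; all() stops at the first failed build.
    if not all(build(image) for image in images):
        return []  # nothing is pushed when any build fails
    for image in images:
        push(image)
    return images
```

Separating the build phase from the push phase means a late build failure cannot leave a half-updated set of images in the registry.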

Add launcher to Citus regression options

In order to instrument test run times, we need to ensure pg_regress calls our special test-timer–wrapper rather than calling psql directly. Basically, we need a way to pass in an extra option in order to ensure something like this happens.
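
To make the idea of a timing wrapper concrete, here is a small Python sketch. It is not the test-timer wrapper mentioned above; `timed_run` is a hypothetical name, and the sketch only shows how a launcher could run a command on behalf of the caller while recording how long it took.

```python
import subprocess
import time
from typing import List, Tuple


def timed_run(command: List[str]) -> Tuple[int, float]:
    """Run a command and return (exit status, elapsed seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(command)
    elapsed = time.perf_counter() - start
    # The exit status is passed through so the caller still sees test failures.
    return completed.returncode, elapsed
```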

Improve test report details

It sure would be nice to have things like error logs, differences in test output, etc., right in our UI. I think CircleCI should be able to do this. Their documentation on Collecting Test Metadata gives a high-level overview, but their reliance on the JUnit format (which is documented only in a few ancient XSD files floating around the web) means we'll need to use other implementations as inspiration.

Fortunately, CircleCI themselves forked and assumed maintainership of minitest, so we can look at precisely how they generate test metadata after a run.

The RSpec JUnit formatter's xml_dump method could be another good place to look.

Basically, look at what they do and do something similar for the properties we care about in our tests.

Switch to quilt for PGDG package modifications

build-pgdebs makes several modifications to the base PGDG PostgreSQL packages in order to ensure the isolation tester, vanilla tests, regress.so, etc. are all in place at test-time. While it's OK to script changes to files within the debian subdirectory, edits of the source code itself really should be performed with quilt.

The following PostgreSQL source modifications should be performed using quilt patches:

The vpath issue is actually a bug upstream and should be reported.

Should we estimate cards or use labels?

Citus has previously used GitHub labels to size tasks, but we can estimate tasks directly on our Waffle board. I'm leaning toward that (as it would help keep burndown obvious, etc.), but am not very strongly opinionated on this point.

Use tags for different postgres versions of an image

We currently generate one image per postgres version. For example, if we are supporting 3 postgres major versions, then we have exttester-11, exttester-12, exttester-13. Instead it could be exttester:11.8 etc. That way we have fewer images and we are more explicit about the exact version.

When we do this, we should update our test images in community and enterprise.
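
The difference between the two naming schemes can be sketched in Python as follows. Both helper names are hypothetical, and the major-version extraction assumes postgres 10 or later, where the major version is the part before the first dot.

```python
def per_version_name(image: str, pg_version: str) -> str:
    """Current scheme: one image name per major version, e.g. exttester-11."""
    # Assumption: postgres >= 10, so the major version precedes the first dot.
    major = pg_version.split(".")[0]
    return f"{image}-{major}"


def tagged_name(image: str, pg_version: str) -> str:
    """Proposed scheme: one image name, with the version as the tag."""
    return f"{image}:{pg_version}"
```

For instance, `per_version_name("exttester", "11.8")` gives `exttester-11`, while `tagged_name("exttester", "11.8")` gives `exttester:11.8`.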

Ensure failed tests produce valid report

At the moment, format-results cannot process a test run in which there have been failures, which is obviously an oversight. Add that capability! Though the tests fail, we should still upload a report in order to ensure continuity of test data across failures (if a test suite fails fast, tests which are never run should be marked skipped, etc.)
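
One possible shape for such a report, sketched in Python with the standard library, is shown below. This is not how format-results works; `build_report`, the outcome strings, and the attribute set are assumptions that follow the commonly used JUnit XML conventions (`testsuite`, `testcase`, `failure`, `skipped`).

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_report(suite_name: str, schedule: List[str], outcomes: Dict[str, str]) -> str:
    """Render a JUnit-style report; scheduled tests without an outcome count as skipped."""
    results = [(name, outcomes.get(name, "skipped")) for name in schedule]
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(sum(1 for _, outcome in results if outcome == "failed")),
        skipped=str(sum(1 for _, outcome in results if outcome == "skipped")),
    )
    for name, outcome in results:
        case = ET.SubElement(suite, "testcase", name=name)
        if outcome == "failed":
            ET.SubElement(case, "failure", message="test failed")
        elif outcome == "skipped":
            ET.SubElement(case, "skipped")
        # Passed tests are represented by an empty testcase element.
    return ET.tostring(suite, encoding="unicode")
```

Driving the report from the full schedule rather than from the observed results is what lets a fail-fast run still list the tests that never started.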

Address compiler warnings on some packaging platforms

I believe these are from an RPM build; they've been sitting on my desktop for quite a while. At minimum, we should address them in the codebase. Ideally, we'd add the flags to our own builds to ensure we don't regress.

Output from a sort | count run

   3 /7.6.0.citus~git.20181101.3616939/src/backend/distributed/commands/multi_copy.c:1291:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   3 /7.6.0.citus~git.20181101.3616939/src/backend/distributed/utils/citus_clauses.c:77:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   1 /7.6.0.citus~git.20181101.3616939/src/backend/distributed/utils/ruleutils_10.c:4487:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
   1 /7.6.0.citus~git.20181101.3616939/src/backend/distributed/utils/ruleutils_11.c:4489:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
   3 /usr/include/postgresql/10/server/utils/elog.h:108:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   3 /usr/include/postgresql/11/server/utils/elog.h:108:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   3 /usr/include/postgresql/9.6/server/utils/elog.h:108:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   1 W: postgresql-9.6-citus-dbgsym: package-has-long-file-name 72 (81) > 80
  24 configure: WARNING: unrecognized options: --disable-dependency-tracking
   5 dpkg-buildpackage: warning: using a gain-root-command while being root
  12 planner/query_colocation_checker.c:63: warning: 'colocatedJoinChecker.anchorAttributeEquivalences' may be used uninitialized in this function
  12 planner/query_colocation_checker.c:63: warning: 'colocatedJoinChecker.subquery' may be used uninitialized in this function
  12 planner/query_colocation_checker.c:63: warning: 'colocatedJoinChecker.subqueryPlannerRestriction' may be used uninitialized in this function
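
A tally with the same shape as the listing above can be produced with a few lines of Python. This is only a sketch of the counting step, assuming one warning per input line; `count_unique_lines` is a hypothetical name and the original command is not known beyond the description given.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_unique_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Count identical lines and return (count, line) pairs sorted by line."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return [(count, line) for line, count in sorted(counts.items())]
```

Re-running such a tally after each fix gives a quick check that a warning class has actually disappeared from the build log.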

Failtester dockerfiles

For the failtester image, we have 2 dockerfiles, one for postgres 10 and one for postgres 11. Each of them embeds a copy of the postgres dockerfile, in which the postgres version is hardcoded.

Instead of this, we can have a single dockerfile and take postgres version as an argument. We can use a postgres base image with that argument.
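
As a sketch of how a single parameterized dockerfile could be driven, the Python helper below assembles a `docker build` command line for a given postgres version. The build argument name `PG_MAJOR`, the repository name `example-repo`, and the image naming are all assumptions for illustration; the dockerfile itself would need a matching `ARG` declaration.

```python
from typing import List


def failtester_build_command(pg_major: str, repo: str = "example-repo") -> List[str]:
    """Assemble a docker build command for one postgres version (illustrative only)."""
    return [
        "docker", "build",
        # Hypothetical build argument consumed by the dockerfile.
        "--build-arg", f"PG_MAJOR={pg_major}",
        "--tag", f"{repo}/failtester-{pg_major}",
        ".",
    ]
```

Looping over the supported versions and passing each command to `subprocess.run` would then replace the two near-identical dockerfiles with one.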

Devise process for Docker Hub automated builds

We've got three different images in this repository…

  • How often are they built?
  • How do we tag them?
  • Can we optimize layers in Docker Hub?
  • Are secrets needed at build-time?

Basically, I never want developers running docker build locally: they should always have something available from Docker Hub they can Just Use™.

Teach Citus to understand our custom PostgreSQL package

The ability for Citus to find the vanilla test and isolation tester was added in citusdata/citus@7996a956b3893fc141be1d5f0202d0b01fdea4ac, but that commit essentially hardcodes the knowledge. To permit Travis and developers to continue uninterrupted (for a bit, at least), we should hide this logic behind a flag of some sort.

Should we optimize the CircleCI Docker images further?

It's unclear to me whether it's worth the readability tradeoff to inline our shell scripts directly into our Dockerfiles… on one hand, having fewer layers will shrink them somewhat, but perhaps it might be sufficient to use something like the new official --squash flag (still experimental, I believe), first discussed in this blog post from around two years ago.

Whoever performs this task should also poke around the layer diffs to see whether we're missing any low-hanging fruit we can knock off at the same time.
