
packit / deployment


Ansible playbooks and scripts for deploying packit-service to OpenShift

License: MIT License

Makefile 1.82% Python 23.78% Dockerfile 0.31% Shell 7.90% Jinja 66.19%
Topics: ansible, hacktoberfest

deployment's Introduction

Packit


Packit is a CLI tool that helps developers auto-package upstream projects for the Fedora operating system.

You can use packit to continuously build your upstream project in Fedora.

With packit you can create SRPMs, open pull requests in dist-git, submit koji builds and even create bodhi updates, effectively replacing the whole Fedora packaging workflow.


To start using Packit

See our documentation

To start developing Packit

The Contributing Guidelines host all the information you need to contribute to the code and documentation, run the tests, and set up additional configuration.

Workflows covered by packit

This list contains the workflows covered by the packit tool, with links to the documentation.

Requirements

Packit is written in Python 3 and supports Python 3.9 or later.

Installation

For complete information on how to start using packit, please click here.

User configuration file

User configuration file for packit is described here.

Who is interested

For the up to date list of projects which are using packit, click here.

Logo design

Created by Marián Mrva - @surfer19

deployment's People

Contributors

dhodovsk, icewreck, jpopelka, lachmanfrantisek, lbarcziova, majamassarini, mfocko, mmuzila, nforro, nikromen, pre-commit-ci[bot], rishavanand, rpitonak, sakalosj, shreyaspapi, softwarefactory-project-zuul[bot], subpop, tomastomecek


deployment's Issues

Rename stream to [centos-]stream-source-git

For the Fedora source-git bot we have the fedora-source-git OpenShift project and DNS domain.
But the OpenShift projects and DNS domains for the CentOS Stream source-git bot contain only "stream", i.e. stream-[prod|stg] projects in OpenShift and [prod|stg].stream.packit.dev DNS domains.

Some time ago we discussed with @csomh that the CentOS Stream Source-git bot projects and DNS domain names should also include the "source-git".

Would be nice to do this together with #380

Start new rollouts when secrets or configuration changes

With Ansible it should be possible to register when a secret or configuration was changed in the cluster. Register these changes when they happen and force a new rollout of the deployments which depend on them.

This would save us from starting rollouts manually when all that a make deploy does is update some secrets or configs.
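One way to implement this, sketched below under the assumption that the playbooks have the secret/config files available locally at deploy time (the file names are hypothetical), is to hash the content and attach the digest as a pod-template annotation; the deployment then rolls out whenever the digest changes:

```python
import hashlib
from pathlib import Path

def config_checksum(paths):
    """Stable digest over file names and contents; any change produces a new
    digest, which can be attached as a pod-template annotation to force a rollout."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        data = Path(path).read_bytes()
        digest.update(path.encode())
        digest.update(len(data).to_bytes(8, "big"))  # length prefix avoids ambiguity
        digest.update(data)
    return digest.hexdigest()
```

In an Ansible task this digest would be set as something like a checksum/config annotation on the pod template, so the cluster itself performs the rollout when the annotation changes.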

Configuring Secrets while deploying packit service for development

I see there is a requirement of the following files:

private-key.pem - mentioned in the secrets list in

https://github.com/packit-service/deployment/blob/master/secrets/README.md

This might be a mundane question, but the documentation asks me to generate a private key using a GitHub app.
https://developer.github.com/apps/building-github-apps/authenticating-with-github-apps/#authenticating-as-a-github-app
Do I need to create a new GitHub app for this? How exactly should I proceed with creating the private-key PEM?

UX of moving stable branches

Planned:

  • Print warning if init has not been run and exit
  • Add a Makefile target that could run init if necessary and then the script (#162)
  • (refactor) clean up the subprocess calls

Trailing newlines stripped from secret files

When a file which we put into a k8s Secret contains a newline at the end, the newline is removed somewhere in the process.
This can cause problems, for example with ssh keys.
This occurs when running make deploy (if you remove the comment suffix from the secrets files).
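The issue doesn't pinpoint where the newline is lost; a common culprit is a strip()/trim step (for example a template filter) applied before the content is base64-encoded into the Secret. A minimal illustration of lossy vs. safe handling:

```python
import base64

# Hypothetical illustration: a k8s Secret stores base64 of the *raw* bytes,
# so any strip()/trim applied before encoding silently drops the final newline.
key = b"-----BEGIN OPENSSH PRIVATE KEY-----\n...\n-----END OPENSSH PRIVATE KEY-----\n"

lossy = base64.b64encode(key.strip()).decode()  # e.g. a template filter trimming whitespace
safe = base64.b64encode(key).decode()           # encode the bytes exactly as read

print(base64.b64decode(lossy).endswith(b"\n"))  # False: ssh rejects such a key
print(base64.b64decode(safe).endswith(b"\n"))   # True
```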

validation tests: make the time constraints more generous

Last two days, two validation checks failed even though all GitHub check runs were green after some time:

https://github.com/packit/hello-world/runs/6587387284
https://sentry.io/share/issue/3d1ddfb488cd40eaae3550af18f15d00/
https://sentry.io/organizations/red-hat-0p/issues/3297006723/?referrer=alert_email&alert_type=email&alert_timestamp=1653371582691&alert_rule_id=855946&environment=production

Let's make the time limits more generous and aligned with our actual OKRs.

Example: the check "Basic test case: copr build and test" (https://github.com/packit/hello-world/pull/648) failed with "These check runs were not completed"; 20 minutes is too strict.
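A sketch of what a more generous wait could look like in the validation script (the 40-minute default and the check-run shape are assumptions, not the script's actual code):

```python
import time

def wait_for_checks(fetch_runs, timeout=40 * 60, interval=30):
    """Poll GitHub check runs until all are completed, with a generous
    deadline aligned with our OKRs rather than a hard 20 minutes."""
    deadline = time.monotonic() + timeout
    while True:
        runs = fetch_runs()
        if runs and all(run["status"] == "completed" for run in runs):
            return runs
        if time.monotonic() >= deadline:
            raise TimeoutError(f"check runs not completed within {timeout}s")
        time.sleep(interval)
```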

Move staging projects from auto-stage to auto-prod

Per Hubert's e-mail from 9/29/22, the auto-stage cluster will be de-provisioned after we move our staging projects to the auto-prod cluster.

  1. Create packit-stg, stream-stg and fedora-source-git-stg projects at auto-prod cluster.
  2. Switch the DNS. We probably want to regenerate the TLS certs but I'm not sure.

(For each project)

  1. Update the host in vars/*/stg[_template].yaml
  2. First, deploy only postgres (DEPLOYMENT=stg SERVICE=xyz make deploy TAGS=postgres); copy db data
  3. Deploy the rest of the project (DEPLOYMENT=stg SERVICE=xyz make deploy)
  4. Add the team members to the project. (Developer -> Project -> Project Access)
  5. Scale down the old project

Don't deploy on Saturday

I was just checking production deployment and realized it was deployed on Saturday - can we change the auto-deployment thingy to run on Sunday night and not on Saturday?

Build and push the cron-job images

Create a GitHub workflow which, when a change in cron-jobs/*/ is merged to the main branch, builds the corresponding image and pushes it to Quay.io (delete-pvcs, import-images, packit-service-validation).

Additionally: check whether jobs pull a new image from Quay.io when they are started, so that we know whether a new deployment needs to be triggered in addition to rebuilding the image.
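The workflow's path filter could be backed by a small mapping from changed files to images; a sketch, assuming the cron-jobs/<name>/ layout mentioned above:

```python
from pathlib import PurePosixPath

# The three cron-job images named in the issue.
CRON_JOB_IMAGES = {"delete-pvcs", "import-images", "packit-service-validation"}

def images_to_rebuild(changed_files):
    """Map files changed in a push to main to the cron-job images needing a rebuild."""
    jobs = set()
    for changed in changed_files:
        parts = PurePosixPath(changed).parts
        if len(parts) > 1 and parts[0] == "cron-jobs" and parts[1] in CRON_JOB_IMAGES:
            jobs.add(parts[1])
    return sorted(jobs)
```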

Fill the repository-cache

This issue just tracks the work and does not require any work on this repository.

  • Locally, prepare (=git clone) a set of repositories we want to include in the cache.
    • kernel
    • + one of our projects to test this workflow
    • + some other big project(s): let's start slowly and verify on our projects first.
  • Use oc rsync to sync the content to all of the volumes in both of our deployments.
    • For kernel, this is the only solution. For other projects, you can clone the repo from the pod shell.
  • Document the workflow in this repository.
  • Think about possible automation and tooling when you are working on this. #234
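The oc rsync step above could be scripted roughly like this; the pod names and the /repository-cache mount path are hypothetical, and the commands are only built here, not executed (run them with subprocess once logged in to the right project):

```python
def rsync_commands(pods, local_cache="repository-cache/", remote_dir="/repository-cache"):
    """Build one `oc rsync` invocation per pod that should receive the
    pre-cloned repositories on its mounted cache volume."""
    return [["oc", "rsync", local_cache, f"{pod}:{remote_dir}"] for pod in pods]
```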

[Spike-ish] Use kubernetes.core.k8s apply=true by default?

The apply parameter docs says:
"apply compares the desired resource definition with the previously supplied resource definition, ignoring properties that are automatically generated. apply works better with Services than force=yes."

I don't quite understand how it's actually different from the default behavior (what are the "properties that are automatically generated"?), but @csomh says it helps when there are properties being removed in the object.

Investigate:

Automate moving of the stable branches

We'd like to automate moving the stable branches and creating a blog post so that the scripts don't need to be run locally.

Definition of done:

  • moving of the stable branches and creation of the weekly packit.dev blog posts are run in a Github action
    • the work that is currently done by running a script locally will be moved to a GitHub action that can be triggered manually in Github UI (=> the script can run non-interactively)
    • the PR with the blog post will be opened automatically by running the action and the content can be modified by a person
  • instructions for the Service Guru role are updated

When this is done, our team will benefit from the elimination of human error during the process.

Failed to patch object: sts.packit-worker, updates to statefulset spec are forbidden

From time to time make deploy fails with

fatal: [localhost]: FAILED! => {"changed": false, "error": 422, "msg": "Failed to patch object: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"StatefulSet.apps \\"packit-worker\\" is invalid: spec: Forbidden: updates to statefulset spec for fields other than \'replicas\', \'template\', and \'updateStrategy\' are forbidden.","reason":"Invalid","details":{"name":"packit-worker","group":"apps","kind":"StatefulSet","causes":[{"reason":"FieldValueForbidden","message":"Forbidden: updates to statefulset spec for fields other than \'replicas\', \'template\', and \'updateStrategy\' are forbidden.","field":"spec"}]},"code":422}\n'", "reason": "Unprocessable Entity", "status": 422}

One has to oc delete sts/packit-worker and re-run the deployment.
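The 422 comes from Kubernetes itself: StatefulSet spec fields other than replicas, template and updateStrategy are immutable. A hypothetical helper the playbook could use to detect up front that delete-and-recreate is needed, instead of failing mid-deploy:

```python
# Kubernetes only allows in-place updates to these StatefulSet spec fields;
# touching anything else (e.g. volumeClaimTemplates) returns the 422 above.
MUTABLE_STS_FIELDS = {"replicas", "template", "updateStrategy"}

def needs_recreate(old_spec, new_spec):
    """True when the desired spec differs in fields that cannot be patched,
    i.e. when `oc delete sts/...` plus a re-deploy is the only way forward."""
    changed = {
        key
        for key in old_spec.keys() | new_spec.keys()
        if old_spec.get(key) != new_spec.get(key)
    }
    return bool(changed - MUTABLE_STS_FIELDS)
```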

Add redis & postgres image streams

On one of the recent architecture meetings, we (cc @praiskup) realized that we don't update our redis & postgres, i.e. we still use the same versions of container images even though they've been meanwhile updated in the external container registry.

TODO:

  • Add image streams for redis & postgres; they will (like the already existing ones) have scheduled: {{ auto_import_images }}, so for stg they'd be automatically synchronized with the external image registry and for prod they'd be synchronized manually/periodically by our import-images cron job. Use those image streams in the DeploymentConfigs.
  • Add those image streams to the import-images cron job, then rebuild and redeploy it.
  • Bonus point: since this means redeploying redis & postgres anyway, we can at the same time change them from DeploymentConfig to Deployment, which is the successor of DeploymentConfig and generally preferred in OCP 4.

packit-service-validation: check both stg and prod

With the more granular split of prod and stg, we can now better pinpoint jobs and /packit[-stg] comments to specific deployments.

At the same time, the validation script did not account for the stg deployment.

This issue tracks work so that the validation script can check both prod and stg (design and implementation are open).

This will give us a daily update on the status of the stg deployment, so we can become aware of problems in stg faster.

We should be able to define which test cases are triggered against both deployments to not overload the infra.

Please comment if you disagree or have ideas about stg validation.

Refinement notes:

  • create a new cron job
  • make the cron job configurable

Automate usage of `changelog.py`

Our release process is already quite automatic but we can take it even one step further!

The changelog script could be automatically run based on some kind of indication that we want to create a release and a PR for the release could be created automatically so that we would just have to review and merge. Perhaps Github actions could be used for this?

We also need to keep in mind that the generated output is not always perfect and may need some manual love, so the approach must support this flow.

CentOS Stream 9 based base image

The base image we use for building our images is Fedora-based.

Tomáš has suggested using CentOS Stream 9 instead so that we don't have to bump the base image (to newer Fedora) so often.

I checked what base image Software Collections use (we use their c9s-based redis & postgres) and, if I'm checking the right place, it's quay.io/centos/centos:stream9.

The aim of this spike is to check whether all the dependencies installed in our (above-mentioned) images are available for c9s as well and what's missing.

Fedora-36/37 and Centos-7 compatible deployment

We need our deployment to be able to run on Fedora 36/37 and CentOS 7 (Zuul).
Figure out how to do that, options:

  • have ansible 2 and 5 compatible playbooks
  • install ansible-5 from PyPI in Zuul
  • try ansible-navigator (it runs in a container environment)

To be done before #384

Run validation job on GitLab

Improve the validation script to run on GitLab as well; namely, we can use the https://gitlab.com/packit-service/hello-world repo.


Part of packit/packit-service#1821 epic.

Provide a script for manipulating with the repository-cache

  • placed in the deployment repository
  • can add a new repository to all of the volumes (for a start, we can somehow specify a list of pods/volumes/...)
    • rsync mode for bigger repositories
    • clone in pod mode (optional)

Implementation notes:

  • For workers, the volumes are mounted and accessible from the worker pod.
  • For sandcastle pods, we need to create a new pod that will mount the volume => it blocks the deployment.
  • => We can use a different approach for both or create a new pod each time and scale down the workers.

Keep in mind that this will be extended to support updates and will run as a cron job in the future.

Make it easier to create secrets for local deployment

Documentation for how to create secrets for a local deployment is missing.

We know that we want steps similar to the ones taken when setting up an env in CI (thanks @TomasTomecek!).

Ideally, we should factor out those tasks into a playbook in this repo and create a make target which would run this playbook and create all the secrets needed for a local deployment.

Also update the README documenting the above.

CentOS Stream based sclorg postgres & redis images

More aggressive delete-pvcs cron job

Currently the cron job runs every hour and deletes PVCs older than 1 day.
Recently we hit a noncompute/storage quota, and when checking PVCs in packit-prod-sandbox there were about a dozen PVCs no older than 2 hours.
So we should either delete PVCs older than 1-2 hours or, even better, somehow check whether a PVC is used by any running pod and, if not, delete it.
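The "delete only unused PVCs" variant could look roughly like this; the PVC and pod shapes are simplified stand-ins for what oc get -o json returns:

```python
from datetime import datetime, timedelta, timezone

def pvcs_to_delete(pvcs, pods, max_age=timedelta(hours=2)):
    """PVCs that are both unmounted by any running pod and older than max_age."""
    in_use = {claim for pod in pods for claim in pod["claims"]}
    now = datetime.now(timezone.utc)
    return [
        pvc["name"]
        for pvc in pvcs
        if pvc["name"] not in in_use and now - pvc["created"] > max_age
    ]
```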

Unify the OpenShift cron-job deployment approach

Current state:

  • delete-pvcs and import-images use their own Makefile
  • validation is deployed via a separate target in the root Makefile

The separate target in the root Makefile is preferred.

Consider incorporating into the global deployment process (make deploy).

Refine alerting setup

We're getting too many alerts from the alert manager and successfully ignoring most of them. Not cool.

Let's refine the existing setup and improve it:

  • send less emails
  • separately configure SLO-related alerts
  • think about configuration for failing tasks
    • especially document how these are meant to be processed

Store the non-secret Packit service configuration options in vars/

Currently, the Packit service configuration contains both secrets and real configuration options. As a result, each time we want to update the real configuration options, we need to interact with Bitwarden. To avoid this, store these options publicly in this repo (probably in vars/) and inject the secrets into the config during the deployment.
TODO:

  • move the real configuration options from Bitwarden to vars in this repo
  • inject the secret values into the Packit service config during the deployment
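The injection step could be as simple as overlaying the secret values on the public options; a sketch with hypothetical option names:

```python
def render_service_config(public_options, secrets):
    """Combine the public options kept in vars/ with secret values injected at
    deploy time; refuse silent shadowing so a vars/ change can't be masked by a secret."""
    overlap = public_options.keys() & secrets.keys()
    if overlap:
        raise ValueError(f"keys defined both publicly and as secrets: {sorted(overlap)}")
    return {**public_options, **secrets}
```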

Remove centosmsg

tl;dr: revert #71

centosmsg deployment was added when we started working on CentOS Stream.
Back then, the repositories were in git.centos.org Pagure, which exposes events to mqtt.git.centos.org, which centosmsg listens to.
Since then, the repositories have been moved to GitLab and we don't serve the git.centos.org repos anymore.

Note, we still use the centosmsg in dist2src service, but it has its own deployment.

Bitwarden CLI broken on Fedora Linux 36

I noticed only today, that the bw get item secrets-tls-certs is not working on Fedora Linux 36. This makes download_secrets.sh unusable.

$ bw --version
1.22.1

The above seems to be happening b/c bw fails to decrypt the name and notes fields of secrets shared within an organization (this does not happen to my own secrets). When I get a secret by ID, I see lines like:

...
  "name": "[error: cannot decrypt]",
  "notes": "[error: cannot decrypt]",
...

Indeed, I can run bw get item '[error: cannot decrypt]', which will complain that there is more than one item with this name:

$ bw get item '[error: cannot decrypt]'
More than one result was found. Try getting a specific object by `id` instead. The following objects were found:
...

bitwarden/clients#2726 is just about this issue, and I made a note that it's also happening on Fedora Linux 36.

The issue might be that these newer distros switched to OpenSSL 3, but I don't know enough about crypto to figure out a solution for this. Theoretically, OpenSSL 1.1 is still present in Fedora as a compatibility layer, but for some reason it doesn't seem to work.

Until then, the workaround is to run deployments from a container environment running Fedora Linux 35.

Provide a clear list of tools, services and processes for investigation

You're a Chief of Monitors. Something goes wrong. Panic 😨😱. "How do I find out what's wrong?"

TODO:

  • Create a new document with a complete list of the tools, services, and people that can help you with an investigation of alerts and Sentry events

It's fine to provide only links to other documents or documentation sites.

SPIKE: Auto-scaling of workers

Take a look at possible ways to scale our workers automatically when needed, and provide the findings as a document in the research repo.

TODO:

  • describe the possible triggers for the scaling (and how to get such info)
  • propose options we have for the implementation
  • discuss the findings with the team and agree on a solution we will implement
  • create follow-up cards (=issues) to implement it
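As a starting point for the discussion, one possible trigger is the task-queue depth, turned into a replica count with configured bounds; all numbers below are made up:

```python
def desired_workers(queue_length, tasks_per_worker=20, min_workers=1, max_workers=8):
    """One possible scaling rule: size the worker StatefulSet from the
    queue depth, clamped between configured bounds."""
    needed = -(-queue_length // tasks_per_worker)  # ceiling division
    return max(min_workers, min(max_workers, needed))
```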

changelog generation: markdown is lost

It is a bit frustrating that GitHub strips markdown formatting from the commit metadata in a merge commit.

We should instead parse the description of the PR to get the changelog entry, rather than using the merge commit.
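A sketch of the proposed approach: extract the PR number from the merge-commit subject and fetch the PR description, where the markdown is intact (fetch_pr_description is a stand-in for a GitHub API call):

```python
import re

MERGE_COMMIT_RE = re.compile(r"^Merge pull request #(\d+)")

def changelog_entry(merge_commit_msg, fetch_pr_description):
    """Prefer the PR description (markdown intact) over the merge-commit body,
    where GitHub has already flattened the formatting."""
    match = MERGE_COMMIT_RE.match(merge_commit_msg)
    if match:
        description = fetch_pr_description(int(match.group(1)))
        if description:
            return description
    # fall back to whatever the merge commit carries
    return merge_commit_msg.split("\n", 1)[-1].strip()
```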

make `move-stable.py` and `changelog.py` friends

I know that you cannot make friends forcefully, but with scripts... we can! :)

Seriously now: let's integrate changelog.py script with move-stable so that moving stable branches produces a blog-post-ready changelog for you.

move-stable should still print git-log of the changes that are being pushed to stable. The ask here is to produce a complete changelog after all repos are updated.

Research deployment methodologies

Research different ways we could deploy our projects. Focus on time savings, manual steps, rolling back to previous versions, and the time-to-deploy for a change.

Talk to other teams in our organization and check how they are doing, esp. those working with the SRE team.

Definition of done: we have data to make a decision about the way we deploy our services and can make a commitment to change it.

Fix the deployment tests

Deploying on the cluster started failing, b/c images couldn't be pulled from the image streams. No clue why.

Until fixed 71f67c0 disabled the tests.

Figure out why the failure occurred, fix it, and re-enable the deployment test.
