docker-yum-mirror

Builds yum package mirrors for you in a container, driven by a YAML-formatted config file.

Why?

In a few environments I've worked in, we needed local yum mirrors that were not only up to date, but also available as point-in-time snapshots of some or all of the repos (HPC systems, kernel-bound filesystems, etc.). Basically, we needed to be able to bring up 1000s of systems "updated" to the exact same version of all RPMs.

So, this container + script helps do that. It takes a config file and a directory mounted into it, and supports mirroring via rsync or reposync (and thus anything yum itself can use).

Features

  • Supports rsync or reposync
  • Can create a large 'all' repo (all packages from all repos, smushed together + generated repodata)
  • This 'all' repo can be snapshotted, to remain frozen in time.
  • Likewise, individual repos can be snapshotted, freezing them in time.
  • These 'snaps' are datestamped, and created (optionally) via hardlinks, to be space efficient.
  • hardlink (the program) is also used to file-level de-dupe your repos.
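
As an illustration of why hardlinked snapshots are space efficient, here is a minimal sketch of a datestamped hardlink tree. It is not taken from the project's script: `hardlink_snapshot` is a hypothetical helper, and the repo and file names are made up for the demo.

```ruby
require "fileutils"
require "find"
require "tmpdir"
require "date"

# Hypothetical helper (not from yum_mirror.rb): recreate src's directory
# layout under dest and hardlink every regular file into it.
# Assumes dest does not exist yet and is on the same filesystem as src;
# symlinks inside src are not treated specially.
def hardlink_snapshot(src, dest)
  Find.find(src) do |path|
    rel = path.delete_prefix(src).delete_prefix("/")
    target = rel.empty? ? dest : File.join(dest, rel)
    if File.directory?(path)
      FileUtils.mkdir_p(target)
    elsif File.file?(path)
      File.link(path, target) # raises Errno::EEXIST if target already exists
    end
  end
  dest
end

# Demo in a throwaway directory with a fake package file.
Dir.mktmpdir do |root|
  repo = File.join(root, "extras")
  FileUtils.mkdir_p(File.join(repo, "Packages"))
  File.write(File.join(repo, "Packages", "example.rpm"), "not a real rpm")

  snap = hardlink_snapshot(repo, "#{repo}.#{Date.today.strftime('%Y-%m-%d')}")

  orig = File.stat(File.join(repo, "Packages", "example.rpm"))
  copy = File.stat(File.join(snap, "Packages", "example.rpm"))
  puts "same inode: #{orig.ino == copy.ino}, link count: #{orig.nlink}"
end
```

Both paths point at the same inode, so the snapshot costs directory entries rather than a second copy of each package.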

A resulting directory structure may look like:

- /mirror/
|__ centos-7-x86_64/
  |__all
  |__all.2016-06-14
  |__os
  |__updates
|__ centos-6-x86_64/
  ...

all would be all the rpms from os + updates, as hardlinks. all.2016-06-14 would be what all looked like on that date, all hardlinks to save space. os and updates would be mirrors of the upstream repos.
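
A rough sketch of how such an 'all' directory could be assembled from hardlinks follows. This is an assumption about the approach, not the project's code: `build_all_repo` is a hypothetical helper, it assumes a flat layout inside all, and it does not generate repodata (that would still have to be done separately, for example with createrepo).

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper (not from yum_mirror.rb): hardlink every .rpm found
# under the given repo directories into one flat directory.
# Assumes a package file name identifies the package, so a name that is
# already present in all_dir is skipped.
def build_all_repo(repo_dirs, all_dir)
  FileUtils.mkdir_p(all_dir)
  repo_dirs.each do |repo|
    Dir.glob(File.join(repo, "**", "*.rpm")).sort.each do |rpm|
      target = File.join(all_dir, File.basename(rpm))
      File.link(rpm, target) unless File.exist?(target)
    end
  end
  Dir.children(all_dir).sort
end

# Demo with two fake repos under a throwaway dist directory.
Dir.mktmpdir do |root|
  dist = File.join(root, "centos-7-x86_64")
  %w[os updates].each do |name|
    FileUtils.mkdir_p(File.join(dist, name, "Packages"))
    File.write(File.join(dist, name, "Packages", "#{name}-pkg-1.0.rpm"), name)
  end
  repos = [File.join(dist, "os"), File.join(dist, "updates")]
  puts build_all_repo(repos, File.join(dist, "all"))
end
```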

In production I would, most likely, enable all + datestamp_all, do a sync, then disable them until I wanted to make another snapshot. Subsequent runs would sync upstream, but leave the datestamped 'all' directory as is.

Usage

docker run -v /path/to/storage:/mirror -v /path/to/config.yaml:/config.yaml sjoeboo/docker-yum-mirror:latest

config example:

---
:hardlink: true
:hardlink_dir: '/mirror'
:all: true
:all_name: 'all'
:datestamp_all: true
:mirror_base: '/mirror'
mirrors:
  os:
    :dist: 'centos-7-x86_64'
    :type: 'rsync'
    :url: 'rsync://mirrors.kernel.org/centos/7/os/x86_64/'
  extras:
    :dist: 'centos-7-x86_64'
    :type: 'rsync'
    :url: 'rsync://mirrors.kernel.org/centos/7/extras/x86_64/'
    :datestamp: true
    :hardlink_datestamp: true
  plus:
    :dist: 'centos-7-x86_64'
    :type: 'reposync'
    :url: 'http://mirrors.kernel.org/centos/7/centosplus/x86_64/'
    :dest: '/some/other/location/'
    :datestamp: true
    :link_datestamp: true

Let's dive into the above config a little. It would:

  • Create mirrors in /mirror/centos-7-x86_64 (since centos-7-x86_64 is the dist for the mirrors listed)
  • Except for plus, which would be created elsewhere (its dist is then ignored)
  • plus would, after being synced, be moved to plus.YYYY-MM-DD, with plus becoming a symbolic link to plus.YYYY-MM-DD
  • extras would, after being synced, have extras.YYYY-MM-DD created as a tree of hardlinks back to extras. You could do this on multiple days to keep multiple space-efficient snapshots of extras while still tracking upstream.
  • Additionally, an all repo, named all, would be created, containing all of the rpms from each dist listed. It would also have a hardlink tree created at all.YYYY-MM-DD
  • Finally, the hardlink program would be run on /mirror to file-level de-dupe the rpms.
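
The walkthrough above can be restated as a small planning sketch. Everything here is illustrative rather than the script's real logic: `destination_for` and `plan_for` are hypothetical helpers, the hash is the shape the sample config would plausibly have once parsed by Ruby, and two things are assumptions: that `:dest` is used verbatim, and that `:link_datestamp` takes precedence if both snapshot styles are set.

```ruby
require "date"

# Hypothetical helper: where a mirror is written.
# Assumption: :dest, when present, is used verbatim and :dist is ignored.
def destination_for(name, mirror, mirror_base)
  return mirror[:dest] if mirror[:dest]
  File.join(mirror_base, mirror[:dist], name)
end

# Hypothetical helper: describe the steps for one mirror as plain strings.
# Assumption: :link_datestamp wins if both snapshot styles are requested.
def plan_for(name, mirror, mirror_base, date)
  stamped = "#{name}.#{date.strftime('%Y-%m-%d')}"
  steps = ["sync via #{mirror[:type]} into #{destination_for(name, mirror, mirror_base)}"]
  return steps unless mirror[:datestamp]

  steps << if mirror[:link_datestamp]
             "move #{name} to #{stamped}, then symlink #{name} -> #{stamped}"
           elsif mirror[:hardlink_datestamp]
             "create #{stamped} as a tree of hardlinks back to #{name}"
           else
             "copy #{name} to #{stamped}"
           end
  steps
end

# The mirrors from the sample config, reduced to the keys used here.
mirrors = {
  "os"     => { dist: "centos-7-x86_64", type: "rsync" },
  "extras" => { dist: "centos-7-x86_64", type: "rsync",
                datestamp: true, hardlink_datestamp: true },
  "plus"   => { dist: "centos-7-x86_64", type: "reposync",
                dest: "/some/other/location/",
                datestamp: true, link_datestamp: true }
}

mirrors.each do |name, mirror|
  puts "#{name}:"
  plan_for(name, mirror, "/mirror", Date.new(2016, 6, 14)).each { |step| puts "  #{step}" }
end
```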

The above config is... silly. I can't think of why one would want to datestamp all AND individual repos, but you could. I would either datestamp all, or datestamp individual repos and not create an all repo whatsoever.

Options

  • :hardlink: Boolean. Should we run hardlink at the end to try to find duplicates. Default: true
  • :hardlink_dir: String. What directory to run hardlink on. Default: /mirror
  • :all: Boolean. Create an 'all' repo. Default: true
  • :all_name: String. Name for the 'all' repo. Default: all
  • :datestamp_all: Boolean. Make a hardlinked, datestamped copy of 'all'. Default: true
  • :mirror_base: String. Base directory to use to create destinations; dist will be appended. Default: /mirror
  • :mirrors: Hash. List of mirrors to create. Format/parameters are:

name: (name of repo, will be appended to mirror_base + dist unless dest is specified)
  :dist: (distribution, whatever you want, basically, a grouping of repos. Appended to mirror_base unless dest is specified. Also how 'all' is created/grouped.)
  :type: (rsync or reposync)
  :url: (url to sync from. rsync:// for rsync, something yum supports for reposync)
  :datestamp: (Boolean, should a datestamped copy be made)
  :hardlink_datestamp: (Boolean, should that copy be made of hardlinks)
  :link_datestamp: (Boolean, should the original repo be turned into a link to the most recent datestamped version)
  :dest: (optional destination directory, used instead of mirror_base + dist)
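
To make the documented defaults concrete, here is a minimal sketch of loading a config and filling in missing top-level options. `load_config` and `DEFAULTS` are hypothetical names, not taken from the script; the default values are the ones listed above. The sketch keeps the key style of the config example, so options parse as symbols while `mirrors` and the mirror names parse as strings.

```ruby
require "yaml"

# Defaults as listed in the options above.
DEFAULTS = {
  hardlink: true,
  hardlink_dir: "/mirror",
  all: true,
  all_name: "all",
  datestamp_all: true,
  mirror_base: "/mirror"
}.freeze

# Hypothetical helper (not from yum_mirror.rb): parse the YAML text and
# fill in any top-level option the user left out.
# Symbol is permitted explicitly because keys such as ":all:" parse as
# Ruby symbols.
def load_config(yaml_text)
  parsed = YAML.safe_load(yaml_text, permitted_classes: [Symbol]) || {}
  DEFAULTS.merge(parsed)
end

config = load_config(<<~CONFIG)
  ---
  :all: false
  mirrors:
    os:
      :dist: 'centos-7-x86_64'
      :type: 'rsync'
      :url: 'rsync://mirrors.kernel.org/centos/7/os/x86_64/'
CONFIG

puts "all: #{config[:all]}"
puts "mirror_base: #{config[:mirror_base]}"
puts "mirrors: #{config['mirrors'].keys.inspect}"
```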

docker-yum-mirror's Issues

Ruby problem

I have a problem when trying to build the Dockerfile:

ERROR:  Error installing bundler:
	bundler requires Ruby version >= 2.3.0.

More error logging

Hi,
I'm attempting to run this and it seems there is a problem with the rsync command. Perhaps it's a permissions problem with the destination? Not sure, still digging in. However, some more debug messages would be helpful:

[root@1d1ee6e37f3d /]# ruby yum_mirror.rb
Now syncing os
Setting destination to /mirror/centos-7-x86_64/os
Error starting client-server protocol
Now syncing updates
Setting destination to /mirror/centos-7-x86_64/updates
Error starting client-server protocol
Now syncing epel
Setting destination to /mirror/centos-7-x86_64/epel
Syncing done!
Running hardlink on /mirror
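
One way the extra debug output requested above could look, as a sketch only: `run_logged` is a hypothetical helper and nothing here is taken from yum_mirror.rb. It runs a command without a shell and reports the exit status and stderr on failure. For reference, rsync's documented exit code 5 is "Error starting client-server protocol", which matches the message in the log above. The demo uses a Ruby one-liner as a stand-in for a failing sync so that it runs without rsync or network access.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper (not from yum_mirror.rb): run a command without a
# shell, and on failure print its exit status and stderr to stderr.
# Returns true on success, false otherwise.
def run_logged(name, *cmd)
  _stdout, stderr, status = Open3.capture3(*cmd)
  return true if status.success?

  warn "#{name}: `#{cmd.join(' ')}` exited with status #{status.exitstatus}"
  stderr.each_line { |line| warn "#{name}:   #{line.chomp}" }
  false
rescue SystemCallError => e
  # Raised when the command cannot be started at all (e.g. not installed).
  warn "#{name}: could not start #{cmd.first}: #{e.message}"
  false
end

# Stand-in for a failing sync.
run_logged("os", RbConfig.ruby, "-e",
           "warn 'Error starting client-server protocol'; exit 5")

# A real call might look like this (flags and paths are illustrative):
# run_logged("os", "rsync", "-a",
#            "rsync://mirrors.kernel.org/centos/7/os/x86_64/",
#            "/mirror/centos-7-x86_64/os")
```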
