
gargravarr2112 / mirror-rsync


A simple APT archive mirroring script using rsync, configurable to specific releases


mirror-rsync's Introduction

mirror-rsync


Requirements

The only significant requirement is rsync. Standard *nix tools cover all the rest:

  • awk
  • sed
  • gunzip or xzcat

Everything else is pure bash.

Installing

  • Clone this repository to a folder
  • Create a folder /etc/mirror-rsync.d
  • Create one or more files in that folder, each named for the URL of your desired rsync mirror and containing the following lines in bash syntax (no spaces around the =, see example):
    • name=string the name of the APT repository (e.g. ubuntu)
    • releases=(array) the releases under dists/ to sync packages from (e.g. jammy jammy-updates jammy-backports)
    • repositories=(array) the repositories under each release to sync from (e.g. main restricted universe)
    • architectures=(array) the CPU architectures to use (e.g. i386 amd64)
  • Edit the top lines of the mirror-rsync.sh script to specify where on disk to store the repository (and, if /etc/ is not suitable, the location of the mirror-rsync.d folder)
  • Run ./mirror-rsync.sh without arguments, either manually or via cron.
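
As a minimal sketch of the configuration step above, the following writes a sample config file and reads it back. The mirror hostname mirror.example.org, the temporary directory standing in for /etc/mirror-rsync.d, and the use of source to load the file are assumptions for illustration; the values are the examples from the list.

```shell
#!/usr/bin/env bash
# Sketch only: a temporary directory stands in for /etc/mirror-rsync.d.
configDir=$(mktemp -d)

# The file is named after the rsync mirror (hypothetical hostname).
cat > "$configDir/mirror.example.org" <<'EOF'
name=ubuntu
releases=(jammy jammy-updates jammy-backports)
repositories=(main restricted universe)
architectures=(i386 amd64)
EOF

# Assumption: the file is plain bash, so sourcing it defines the four variables.
source "$configDir/mirror.example.org"

echo "Mirror ${name}: ${#releases[@]} releases, ${#repositories[@]} repositories, ${#architectures[@]} architectures"
```

For unattended runs, a crontab entry along the lines of `0 3 * * * /opt/mirror-rsync/mirror-rsync.sh` would start the script nightly at 03:00; the install path there is hypothetical.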

Rationale

I've previously tried apt-mirror and debmirror with varying degrees of success. With Ubuntu, I had regular problems with apt-mirror creating the dep11 folder trees. With debmirror over HTTP, the process is quite slow because each file is its own HTTP request. rsync is designed for this purpose and is much faster, but used on its own against an APT archive it has the unwanted side effect of downloading the entire remote repository, which may include releases you don't use and don't have the space for. This script downloads only the releases you want, quickly and efficiently.
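
The selective download can be sketched roughly as follows: pull the Filename: entries out of a release's Packages index, then hand that list to rsync. The awk line matches the one quoted in the issues further down; the sample index, the temporary paths, and the rsync --files-from invocation in the closing comment are assumptions about one way to do it, not necessarily what the script itself does.

```shell
#!/usr/bin/env bash
workDir=$(mktemp -d)

# A tiny stand-in for dists/<release>/<repo>/binary-<arch>/Packages (made-up entries).
cat > "$workDir/Packages" <<'EOF'
Package: hello
Version: 2.10-2
Filename: pool/main/h/hello/hello_2.10-2_amd64.deb

Package: sed
Version: 4.8-1
Filename: pool/main/s/sed/sed_4.8-1_amd64.deb
EOF

# Collect the pool path of every package listed in the index.
awk '/^Filename: / { print $2; }' "$workDir/Packages" > "$workDir/filelist"

cat "$workDir/filelist"

# The list could then drive a transfer limited to those files, e.g. (not run here,
# hostname and destination are hypothetical):
#   rsync --files-from="$workDir/filelist" rsync://mirror.example.org/ubuntu/ /srv/mirror/ubuntu/
```

Because only the paths named in the indexes of the chosen releases end up in the list, packages belonging to other releases are never requested.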

Enhancements

This was written while I worked for a startup and is more than a little hacky. Things that should be configurable (e.g. sources, architectures, branches) weren't at the time. I have now rewritten it to support syncing from multiple repositories and specifying the actual contents of the remote repositories to get.

License

For now, consider it licensed under the WTFPL - you can do whatever you like with this script. No warranty is included or implied. It should do what you expect, but the author is not responsible for loss of data, excessive usage bills, WWIII or any other issues that may arise from use (proper or improper) of this script.

mirror-rsync's People

Contributors

gargravarr2112, sanbrother


mirror-rsync's Issues

No package list if uncompressed file not found

I tried to run the script to mirror the parrot-os repo. I think that if an uncompressed file is found, the script doesn't export the package list to a file.
Here is how I corrected it:


for release in "${releases[@]}"; do
	for repo in "${repositories[@]}"; do
		for arch in "${architectures[@]}"; do
			if [[ ! -f "$localPackageStore/dists/$release/$repo/binary-$arch/Packages" ]]; then #Uncompressed file not found
				if command -v gunzip >/dev/null; then #See issue #5 - some distros don't provide gunzip by default but have xz
					if [[ -f "$localPackageStore/dists/$release/$repo/binary-$arch/Packages.gz" ]]; then
						packageArchive="$localPackageStore/dists/$release/$repo/binary-$arch/Packages.gz";
						echo "$(date +%T) Extracting $release $repo $arch Packages file from archive $packageArchive";
						if [[ -L "$packageArchive" ]]; then #Some distros (e.g. Debian) make Packages.gz a symlink to a hashed filename. NB. it is relative to the binary-$arch folder
							echo "$(date +%T) Archive is a symlink, resolving";
							#Join the link's folder with its relative target (quoted, so unusual paths are safe)
							packageArchive="$(dirname "$packageArchive")/$(readlink "$packageArchive")";
						fi
						gunzip <"$packageArchive" >"$localPackageStore/dists/$release/$repo/binary-$arch/Packages";
					fi
				elif command -v xzcat >/dev/null; then
					if [[ -f "$localPackageStore/dists/$release/$repo/binary-$arch/Packages.xz" ]]; then
						packageArchive="$localPackageStore/dists/$release/$repo/binary-$arch/Packages.xz";
						echo "$(date +%T) Extracting $release $repo $arch Packages file from archive $packageArchive";
						if [[ -L "$packageArchive" ]]; then #Same as above
							echo "$(date +%T) Archive is a symlink, resolving";
							packageArchive="$(dirname "$packageArchive")/$(readlink "$packageArchive")";
						fi
						xzcat <"$packageArchive" >"$localPackageStore/dists/$release/$repo/binary-$arch/Packages";
					fi
				else
					echo "$(date +%T) Error: uncompressed package list not found in remote repo and decompression tools for .gz or .xz files not found on this system, aborting. Please install either gunzip or xzcat to use this script." 1>&2;
					exit 1;
				fi
			fi
			#Runs whether the Packages file was already present or has just been extracted
			echo "$(date +%T) Extracting packages from $release $repo $arch";
			if [[ -s "$localPackageStore/dists/$release/$repo/binary-$arch/Packages" ]]; then #Have experienced zero filesizes for certain repos
				awk '/^Filename: / { print $2; }' "$localPackageStore/dists/$release/$repo/binary-$arch/Packages" >> "/tmp/$filename";
			else
				echo "$(date +%T) Package list is empty, skipping";
			fi
		done
	done
done
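
The symlink step in the snippet above can be tried in isolation. The sketch below builds a throwaway binary-amd64 folder with a Packages.gz symlink pointing at a hashed file, then resolves the link by joining its folder with its relative target. The folder layout, hash name, and package entry are all made up for illustration.

```shell
#!/usr/bin/env bash
binaryDir="$(mktemp -d)/dists/stable/main/binary-amd64"
mkdir -p "$binaryDir/by-hash/SHA256"

# Made-up hashed archive plus a relative symlink to it, as some distros publish.
printf 'Filename: pool/main/h/hello/hello_2.10-2_amd64.deb\n' | gzip > "$binaryDir/by-hash/SHA256/0123abcd"
ln -s "by-hash/SHA256/0123abcd" "$binaryDir/Packages.gz"

packageArchive="$binaryDir/Packages.gz"
if [[ -L "$packageArchive" ]]; then
	# The link target is relative to the folder that holds the link.
	packageArchive="$(dirname "$packageArchive")/$(readlink "$packageArchive")"
fi

echo "Resolved archive: $packageArchive"

# Decompress the resolved archive into the plain Packages file.
gunzip < "$packageArchive" > "$binaryDir/Packages"
cat "$binaryDir/Packages"
```

This only covers relative link targets, which is the case the comment in the snippet describes; an absolute target would need separate handling.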

Rpm mirrors

Hello there, are there any plans to add support for RPM repos?

error unknown module ubuntu

Hi,

I was wondering if I could get your advice on why I'm getting the following error.
I am trying to collect Ubuntu repo packages using RHEL.

I have copied your package and placed it in a folder.
I created the /etc/mirror-rsync.d folder.
In this directory, I added a file "sg.archive.ubuntu.com".
sg.archive.ubuntu.com has the content focal.

In mirror-rsync.sh, I modified masterSource to 'sg.archive.ubuntu.com'.

When executing ./mirror-rsync.sh, I observed the following error:

@error: Unknown module 'ubuntu'
rsync error: error starting client-server protocol (code 5) at main.c(1661) [Receiver=3.1.3]

I'd appreciate your advice, thanks.

The logging feature doesn't

Hi Rob,
After I ran the script, there was no log output, and although I saw the "exporting to /tmp/" message, the file is not there...
Could you add a logging function that shows the synchronized content and the percentage of the entire synchronization completed?
Thank you!

