The CI transactional mirrors
The CI transactional mirrors were created to protect CI jobs from failures that have to do with external package repositories.
There are 3 types of failures that can be caused by using external package repositories in CI:
- Failures due to connectivity and communication issues with the external repository
- Failures due to the external repository being updated while the CI job is still running
- Failures due to packages in the external repository being incompatible with the code being tested in CI
The transactional mirrors were designed to prevent failures of types 1 and 2 and also provide means to mitigate failures of type 3.
To prevent failures of type 1 it may be enough to have a local mirror of remote package repositories or even just a caching proxy server.
To prevent failures of type 2, the mirror needs to be transactional which means it needs to provide a mechanism by which a job can have a fixed-state view of the mirror that shows how it looked in a certain point in time (Typically when the CI job started) while the mirror itself can continue to be updated.
That mechanism is achieved in the CI mirrors via the use of snapshots. When a mirror is updated, a snapshot of the updated state is created. From that point on, the snapshot is guaranteed to never be changed. When a job starts, it checks which are the latests snapshots of the repositories it needs to use, and sets things up so that the snapshots are used instead of the external repositories. This process is called "the mirror injection process".
The mirror injection process
The mirror injection process is the process by which the external package repository URLs a job is made to use are replaced dynamically by the CI system to poing to the CI mirror snapshots instead.
The mirror injection itself is done by the "
To know which mirror snapshot URLs to inject to which repos, the repos are
identified by their 'repo id', which is the identifier placed in square brackets
yum configuration files. For example, for the following configuration:
[centos-base-el7] name=CentOS-7 - Base baseurl=http://mirror.centos.org/centos/7/os/x86_64/ gpgcheck=1 gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
The 'repo id' would be
centos-base-el7. Given that we have a mirror that is
called "centos-base-el7" (That mirrors the CentOS 7 base repository, as
expected), and that its latest snapshot was created at Dec 13th, 2016, when the
mirror injection process runs and finds the above configuration, it will replace
the repo URL, The resulting configuration would be:
[centos-base-el7] name = CentOS-7 - Base baseurl = http://mirrors.phx.ovirt.org/repos/yum/centos-base-el7/2016-12-13-13-30 gpgcheck = 1 gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 proxy = _none_
Note that the mirror injection process also puts settings in place to ensure the mirrors are accessed directly and not via the proxy server.
When the injection process comes across a 'repo id' for which no mirror exists, it leaves the configuration for it in place as-is. This means that repo ids need to be properly specified for the mirrors to be in use.
Using mirrors for standard-CI jobs
The CI mirrors were designed to be used primarily by "Standard-CI" jobs.
Standard CI jobs gain access to repositories in two ways. Some repositories are
pre-configured in the environment by default. Such repositories include the OS
"base" and "updates" repos and some commonly used repos such as "EPEL". For
other repositories, projects can specify "
*.repos" files in their
For pre-configured repositories, mirror injection happens automatically.
For repositories defines in "
*.repos" files, one needs to carefully specify
repo ids for then so that they will be recognised by the mirror injection
process. To do that, one need to prepend the repo id followed by a comma to the
repo url in the "
*.repos" file. For example, here is how to include the
GlusterFS CentOS SIG repo:
To get a list of possible repo ids to use, please see the List of CI mirrors.
The mirror sync process
The CI mirrors are synchronised with their equivalent upstream repositories every 8 hours.
Each mirror has its own set of synchronization jobs, one per supported architecture (Currently only x86_64 mirrors are supported), that is called something like:
All the synchronization jobs run directly on the "
server. The core job functionality is implemented by the "
Each mirror has a "base" directory that is synchronised with the upstream
repository by using the "
reposync" tool. If the synchronization job detects
that changes were synchronized from the upstream, it create a snapshot
directory, named after the time in which is was created, that contains a copy of
yum" metadata in the "base" directory. This way, when the "base"
directory changes, the snapshot keeps pointing to the older package versions.
Each mirror has a "
latest.txt" file that contains the name of the latest
snapshot. It is updated when a new snapshot is created. This makes it easy to
tell which snapshot is the latest with a tool like "
If the mirror was updated, the synchronization job triggers another job that is
system-mk_mirrors_index". This jobs also runs the "
script but with different parameters. This job scans all the mirrors, and
build a mapping data structure mapping the name of each mirror to the URL of its
latest mirror. This data is then saved in the JSON, YAML and Python structure
source data formats in the "
all_latest.py" files respectively.
The "all_latest" files make it possible to obtain all the information the mirror injection process needs with a single HTTP request to the mirrors server.
Rolling mirrors back to previous snapshot
Since we have a snapshot creation process in place for the mirrors, it is easy to make clients use an older snapshot in case the most recent snapshot becomes unusable for some reason.
Here is the process to effectively roll back a mirror to a previous snapshot:
- Log in to the mirrors server.
- Edit the mirror's "
latest.txt" file. For a mirror named "
foo" it would be at "
- Replace the name of the latest snapshot in the file to the name of an older
snapshot. Be careful to use a name of an existing snapshot. You can see
the available snapshots for "
foo" by listing the "
/var/www/html/repos/yum/foo" directory. The snapshots would be directories with dates for names.
- Run the "
system-mk_mirrors_index" job to update the "
all_latest.*" files with the new desired snapshot state.