curl | rpm -i from a random URL.
If you’ve ever configured yum or dnf, you’ve probably edited a .repo file. That’s the starting point.
## What’s in a .repo file
A typical entry in /etc/yum.repos.d/ looks like this:
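A minimal sketch of such an entry (the repo id, name, and URL here are made-up placeholders):

```ini
[internal]
name=Internal packages
baseurl=https://packages.example.com/el9/x86_64/
enabled=1
gpgcheck=1
gpgkey=https://packages.example.com/RPM-GPG-KEY-internal
```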
`baseurl` points to the root of the repository, and everything (package lists, metadata, signatures) lives under that path in a predictable layout. No nested concepts like distributions or components to worry about.
## How dnf fetches packages
When dnf (or yum, same idea) processes a repository, it does roughly this:
- It fetches `$baseurl/repodata/repomd.xml`. This is the entry point: it contains references to all the metadata files and their checksums.
- From `repomd.xml`, it learns the paths to `primary.xml.gz`, `filelists.xml.gz`, and `other.xml.gz`. The important one is `primary.xml.gz`, which contains the actual list of packages.
- If GPG checking is enabled, it verifies `repomd.xml` against `repomd.xml.asc`.
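The first step above can be sketched in a few lines: given a `repomd.xml`, pull out where each metadata file lives relative to `baseurl`. The XML below is a hypothetical, heavily abridged `repomd.xml` (real files carry timestamps, sizes, and checksum-prefixed filenames); the namespace URI is the one real repos use.

```python
import xml.etree.ElementTree as ET

# Abridged, hypothetical repomd.xml, as fetched from
# $baseurl/repodata/repomd.xml. Checksums truncated for readability.
REPOMD = """\
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <checksum type="sha256">0f2c...</checksum>
    <location href="repodata/primary.xml.gz"/>
  </data>
  <data type="filelists">
    <checksum type="sha256">9ab1...</checksum>
    <location href="repodata/filelists.xml.gz"/>
  </data>
</repomd>
"""

NS = {"repo": "http://linux.duke.edu/metadata/repo"}

def metadata_locations(repomd_xml: str) -> dict:
    """Map each metadata type to its path relative to baseurl."""
    root = ET.fromstring(repomd_xml)
    return {
        data.get("type"): data.find("repo:location", NS).get("href")
        for data in root.findall("repo:data", NS)
    }

print(metadata_locations(REPOMD)["primary"])  # repodata/primary.xml.gz
```

A real client also checks each file's checksum against the value in `repomd.xml` before trusting it; that verification step is omitted here.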
The `primary.xml.gz` file contains entries like this:
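A sketch of a single package entry, with fields abridged and the package name invented for illustration:

```xml
<package type="rpm">
  <name>mytool</name>
  <version epoch="0" ver="1.4.2" rel="1"/>
  <checksum type="sha256" pkgid="YES">3b7e...</checksum>
  <location href="packages/mytool-1.4.2-1.x86_64.rpm"/>
</package>
```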
Every checksum in that chain traces back to `repomd.xml`, and the `location` field points to the actual `.rpm` file relative to `baseurl`. The result is a signed index that references checksummed packages.
## Generating the metadata
The standard tool for this is `createrepo_c`. You point it at a directory full of `.rpm` files:
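Something like the following, with an example path standing in for wherever your packages live:

```shell
# Generate (or regenerate) repository metadata for everything under this path
createrepo_c /srv/repo/el9/x86_64
```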
It produces a `repodata/` directory with all the XML metadata, checksums, and (if configured) GPG signatures. One command, everything generated.
The resulting folder structure:
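Roughly like this (in real repos the metadata filenames are prefixed with their checksums, and the package path is whatever layout you chose; `mytool` is a made-up example):

```
el9/x86_64/
├── repodata/
│   ├── repomd.xml
│   ├── <sha256>-primary.xml.gz
│   ├── <sha256>-filelists.xml.gz
│   └── <sha256>-other.xml.gz
└── packages/
    └── mytool-1.4.2-1.x86_64.rpm
```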
Then upload it with `aws s3 sync . s3://...`. Done.
(Side note: we ended up building repogen, a CLI tool that handles this metadata generation for RPM, Debian, Pacman, etc. repositories all at once. But createrepo_c works perfectly fine on its own if RPM is all you need.)
… well, almost.
## Authentication
These are internal repositories, so public S3 buckets are not an option. The yum/dnf ecosystem handles this with plugins. Which one you use depends on your package manager:
- `yum-s3-iam` for yum-based systems (RHEL 7 and older). It’s yum-only and doesn’t work with dnf.
- `dnf-plugin-s3transport` for dnf-based systems (RHEL 8+, Fedora). If you’re setting this up today, this is most likely the one you want.
With the plugin installed, the `.repo` file becomes:
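A sketch of what that might look like. The bucket URL is a placeholder, and the exact option names vary between plugins (`s3_enabled` follows the yum-s3-iam convention; check your plugin's README for the options it actually reads):

```ini
[internal]
name=Internal packages
baseurl=https://my-bucket.s3.us-east-1.amazonaws.com/rpm/stable/
# Tells the plugin to sign requests for this repo with IAM credentials.
# Option name assumed from yum-s3-iam; verify against your plugin's docs.
s3_enabled=1
enabled=1
gpgcheck=1
```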
## Versioning with separate repositories
If you want different streams of packages (per Git tag, per release channel), you create separate repositories. Each one is its own `baseurl`, its own `repodata/`, its own S3 prefix.
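For example, a bucket might carry one fully independent repository per tag or channel (bucket name and tags invented here):

```
s3://my-bucket/rpm/v1.2.0/repodata/repomd.xml
s3://my-bucket/rpm/v1.3.0/repodata/repomd.xml
s3://my-bucket/rpm/stable/repodata/repomd.xml
```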
Point the `.repo` file at the one you want, run `dnf update`, and you get exactly the packages from that tag. Upgrades are deterministic, and rollbacks are a config change.
If you’re running a fleet of RHEL-based systems and you’ve been thinking about hosting your own RPM repository, consider skipping the server entirely. S3 (or blob storage in general) does the job,
createrepo_c generates the metadata, and an IAM plugin handles auth. The whole thing is a CI pipeline and a bucket.