> ## Documentation Index
> Fetch the complete documentation index at: https://developer.upsun.com/llms.txt
> Use this file to discover all available pages before exploring further.

# High-performance backups: How Ceph snapshots enable incremental full backups

> Learn how Upsun leverages Ceph's RBD export-diff feature to implement high-performance incremental backups that maintain full restore capabilities for large-scale container storage.

export const PostMeta = ({data = {}}) => {
  const {author, date, image} = data;
  const authors = Array.isArray(author) ? author : author ? [author] : [];
  const resolveAuthor = slug => {
    const entry = AUTHOR_MAP[slug] || ({});
    const name = entry.name || slug;
    const github = entry.github || null;
    const linkedin = entry.linkedin || null;
    const url = github ? `https://github.com/${github}` : linkedin || null;
    const avatarUrl = github ? `https://github.com/${github}.png?size=64` : null;
    return {
      name,
      url,
      avatarUrl
    };
  };
  const formattedDate = date ? new Date(date).toLocaleDateString('en-US', {
    year: 'numeric',
    month: 'long',
    day: 'numeric'
  }) : null;
  if (!image && authors.length === 0 && !formattedDate) return null;
  const AUTHOR_MAP = {
    "aaron-collier": {
      "name": "Aaron Collier"
    },
    "aaron-dudenhofer": {
      "name": "Aaron Dudenhofer"
    },
    "aaron-porter": {
      "name": "Aaron Porter"
    },
    "adriaan-odendaal": {
      "name": "Adriaan Odendaal"
    },
    "ajmal": {
      "name": "Ajmal Siddiqui"
    },
    "akalipetis": {
      "name": "Antonis Kalipetis"
    },
    "alexander-varwijk": {
      "name": "Alexander Varwijk"
    },
    "alicia-bevilacqua": {
      "name": "Alicia Bevilacqua"
    },
    "amelie-deguerry": {
      "name": "Amelie Deguerry"
    },
    "anacidre": {
      "name": "Ana Cidre",
      "linkedin": "https://www.linkedin.com/in/ana-cidre"
    },
    "andoni": {
      "name": "Andoni Auzmendi"
    },
    "andrei-taranu": {
      "name": "Andrei (Alex) Taranu",
      "linkedin": "https://www.linkedin.com/in/andrei-alex-taranu/"
    },
    "andrew-baxter": {
      "name": "Andrew Baxter"
    },
    "andrew-melck": {
      "name": "Andrew Melck"
    },
    "antoine-crochet-damais": {
      "name": "Antoine Crochet Damais"
    },
    "augustin-delaporte": {
      "name": "Augustin Delaporte",
      "linkedin": "https://www.linkedin.com/in/augustindelaporte/"
    },
    "branislav-bujisic": {
      "name": "Branislav Bujisic"
    },
    "carl-smith": {
      "name": "Carl Smith"
    },
    "caroline-leroy": {
      "name": "Caroline Leroy"
    },
    "cati-mayer": {
      "name": "Cati Mayer"
    },
    "catplat": {
      "name": "C Trinkwon"
    },
    "ceelolulu": {
      "name": "Celeste van der Watt"
    },
    "chadwcarlson": {
      "name": "Chad Carlson",
      "github": "chadwcarlson",
      "linkedin": "https://www.linkedin.com/in/chadwcarlson"
    },
    "chris-ward": {
      "name": "Chris Ward"
    },
    "chris-yates": {
      "name": "Chris Yates"
    },
    "christian-sieber": {
      "name": "Christian Sieber"
    },
    "christopher-lockheardt": {
      "name": "Christopher Lockheardt"
    },
    "christopher-skene": {
      "name": "Christopher Skene"
    },
    "chuck-morgan": {
      "name": "Chuck Morgan"
    },
    "corey-dockendorf": {
      "name": "Corey Dockendorf"
    },
    "crell": {
      "name": "Crell"
    },
    "damz": {
      "name": "Damz"
    },
    "dan-morrison": {
      "name": "Dan Morrison"
    },
    "davidbonachera": {
      "name": "David Bonachera",
      "github": "davidbonachera",
      "linkedin": "https://www.linkedin.com/in/davidbonachera"
    },
    "dereliahmet1": {
      "name": "Ahmet Faruk Dereli"
    },
    "devicezero": {
      "name": "Jonas Kröger",
      "github": "devicezero",
      "linkedin": "https://www.linkedin.com/in/jonaskroeger/"
    },
    "doug-goldberg": {
      "name": "Doug Goldberg"
    },
    "duncan-naves": {
      "name": "Duncan Naves",
      "github": "duncannaves",
      "linkedin": "https://www.linkedin.com/in/duncan-naves-a94423aa"
    },
    "erika-bustamante": {
      "name": "Erika Bustamante"
    },
    "fabpot": {
      "name": "Fabien Potencier"
    },
    "flovntp": {
      "name": "Florent Huck",
      "github": "flovntp",
      "linkedin": "https://www.linkedin.com/in/florenthuck"
    },
    "fred-plais": {
      "name": "Fred Plais"
    },
    "gauthier-garnier": {
      "name": "Gauthier Garnier"
    },
    "gilzow": {
      "name": "Paul Gilzow"
    },
    "gmoigneu": {
      "name": "Guillaume Moigneu",
      "github": "gmoigneu",
      "linkedin": "https://www.linkedin.com/in/guillaumemoigneu/"
    },
    "gregqualls": {
      "name": "Greg Qualls"
    },
    "guguss": {
      "name": "Augustin Delaporte"
    },
    "haylee-millar": {
      "name": "Haylee Millar"
    },
    "ivana-kotur": {
      "name": "Ivana Kotur"
    },
    "jackrabbithanna": {
      "name": "Mark Hanna"
    },
    "jared-wright": {
      "name": "Jared Wright",
      "github": "jww-sh",
      "linkedin": "https://www.linkedin.com/in/jaredwaynewright"
    },
    "jessica-orozco": {
      "name": "Jessica Orozco"
    },
    "joey-stanford": {
      "name": "Joey Stanford"
    },
    "john-grubb": {
      "name": "John Grubb"
    },
    "jonas-kruger": {
      "name": "Jonas Kruger"
    },
    "kathryn-frazer": {
      "name": "Kathryn Frazer"
    },
    "kemiojo": {
      "name": "Kemi Elizabeth Ojogbede"
    },
    "kieronsambrook-smith": {
      "name": "Kieronsambrook Smith"
    },
    "laurent-arnoud": {
      "name": "Laurent Arnoud"
    },
    "letoya-boyne": {
      "name": "Letoya Boyne"
    },
    "lolautruche": {
      "name": "Jérôme Vieilledent"
    },
    "lyly-lepinay": {
      "name": "Lyly Lepinay"
    },
    "manauwar-alam": {
      "name": "Manauwar Alam"
    },
    "marc-antoine-porri": {
      "name": "Marc Antoine Porri"
    },
    "maria-antinkaapo": {
      "name": "Maria Antinkaapo"
    },
    "maria-de-anton": {
      "name": "Maria De Anton"
    },
    "mark-dorison": {
      "name": "Mark Dorison"
    },
    "markus-hausammann": {
      "name": "Markus Hausammann"
    },
    "mary-thomas": {
      "name": "Mary Thomas"
    },
    "mathias-bolt-lesniak": {
      "name": "Mathias Bolt Lesniak"
    },
    "mathieu-strauch": {
      "name": "Mathieu Strauch"
    },
    "matthias-van-woensel": {
      "name": "Matthias Van Woensel",
      "linkedin": "https://www.linkedin.com/in/matthias-van-woensel-267a069"
    },
    "michael-sharp": {
      "name": "Michael Sharp"
    },
    "mupsi": {
      "name": "Marine Gandy"
    },
    "natalie-harper": {
      "name": "Natalie Harper"
    },
    "ngommenginger": {
      "name": "Nicolas Gommenginger",
      "linkedin": "https://www.linkedin.com/in/nicolas-gommenginger"
    },
    "nicholas-bennison": {
      "name": "Nicholas Bennison"
    },
    "nicholas-vahalik": {
      "name": "Nicholas Vahalik"
    },
    "nick-hardiman": {
      "name": "Nick Hardiman"
    },
    "nickanderegg": {
      "name": "Nickanderegg"
    },
    "nicolas-grekas": {
      "name": "Nicolas Grekas",
      "github": "nicolas-grekas",
      "linkedin": "https://www.linkedin.com/in/nicolasgrekas/"
    },
    "niti-malwade": {
      "name": "Niti Malwade"
    },
    "opensocialteam": {
      "name": "Opensocialteam"
    },
    "ori-pekelman": {
      "name": "Ori Pekelman"
    },
    "otavio-santana": {
      "name": "Otavio Santana"
    },
    "palwandi": {
      "name": "Pawan Alwandi",
      "github": "pawpy",
      "linkedin": "https://www.linkedin.com/in/pawanalwandi"
    },
    "patrick-boest": {
      "name": "Patrick Boest"
    },
    "patrick-dawkins": {
      "name": "Patrick Dawkins",
      "github": "pjcdawkins",
      "linkedin": "https://www.linkedin.com/in/patrickdawkins"
    },
    "patrick-klima": {
      "name": "Patrick Klima"
    },
    "pjcdawkins": {
      "name": "Pjcdawkins"
    },
    "prineet-kaurbhurji": {
      "name": "Prineet Kaurbhurji"
    },
    "quentin-sinig": {
      "name": "Quentin Sinig"
    },
    "ralt": {
      "name": "Florian Margaine",
      "github": "ralt",
      "linkedin": "https://www.linkedin.com/in/florian-margaine-43971136"
    },
    "ramanathanramakrishnamurthy": {
      "name": "Ramanathanramakrishnamurthy"
    },
    "remi-lejeune": {
      "name": "Rémi Lejeune"
    },
    "ribel": {
      "name": "Taras Kruts"
    },
    "robert-douglass": {
      "name": "Robert Douglass"
    },
    "rudy-weber": {
      "name": "Rudy Weber"
    },
    "ryan-hicks": {
      "name": "Ryan Hicks"
    },
    "sabri-helal": {
      "name": "Sabri Helal"
    },
    "savannah-bergeron": {
      "name": "Savannah Bergeron"
    },
    "shannon-vettes": {
      "name": "Shannon Vettes"
    },
    "shawn-ogasawara": {
      "name": "Shawn Ogasawara",
      "linkedin": "https://www.linkedin.com/in/shawn-ogasawara-83a9a0/"
    },
    "shawna-spoor": {
      "name": "Shawna Spoor"
    },
    "shedrack-akintayo": {
      "name": "Shedrack Akintayo"
    },
    "simon-ruggier": {
      "name": "Simon Ruggier"
    },
    "sophie-van-der-kindere": {
      "name": "Sophie Van Der Kindere"
    },
    "stefanos-thampis": {
      "name": "Stefanos Thampis"
    },
    "stephen-weinberg": {
      "name": "Stephen Weinberg"
    },
    "sukhman-virk": {
      "name": "Sukhman Virk"
    },
    "sumaira-nazir": {
      "name": "Sumaira Nazir"
    },
    "sumer": {
      "name": "Sümer Cip"
    },
    "syed-raza": {
      "name": "Syed Raza"
    },
    "tamara-bacchia": {
      "name": "Tamara Bacchia"
    },
    "tara-arnold": {
      "name": "Tara Arnold"
    },
    "theosakamg": {
      "name": "Mickael Gaillard",
      "github": "theosakamg"
    },
    "thomasdiluccio": {
      "name": "Thomas di Luccio"
    },
    "tim-anderson": {
      "name": "Tim Anderson"
    },
    "tom-helmer-hansen": {
      "name": "Tom Helmer Hansen"
    },
    "tylermills": {
      "name": "Tyler Mills"
    },
    "upsun": {
      "name": "Upsun"
    },
    "veronika-tolkachova": {
      "name": "Veronika Tolkachova",
      "linkedin": "https://www.linkedin.com/in/veronika-tolkachova-169167a2"
    },
    "vince-parker": {
      "name": "Vince Parker"
    },
    "vinnie-russo": {
      "name": "Vincenzo Russo"
    },
    "vrobert78": {
      "name": "Vincent Robert",
      "github": "vrobert78",
      "linkedin": "https://www.linkedin.com/in/vincent-robert-498a883"
    },
    "yuriy-babenko": {
      "name": "Yuriy Babenko"
    },
    "yuriy-gerasimov": {
      "name": "Yuriy Gerasimov"
    }
  };
  return <div className="post-meta">
      {(authors.length > 0 || formattedDate) && <div className="post-meta-info">
          {authors.length > 0 && <div className="post-meta-authors">
              {authors.map(slug => {
    const {name, url, avatarUrl} = resolveAuthor(slug);
    const inner = <>
                    {avatarUrl && <img src={avatarUrl} alt={name} className="post-meta-avatar" />}
                    <span className="post-meta-author-name">{name}</span>
                  </>;
    return url ? <a key={slug} href={url} target="_blank" rel="noopener noreferrer" className="post-meta-author">
                    {inner}
                  </a> : <span key={slug} className="post-meta-author">{inner}</span>;
  })}
            </div>}
          {authors.length > 0 && formattedDate && <span className="post-meta-separator" aria-hidden="true">·</span>}
          {formattedDate && <span className="post-meta-date">{formattedDate}</span>}
        </div>}
      {image && <img src={image} alt="" className="post-meta-image" aria-hidden="true" />}
    </div>;
};

<PostMeta data={{ author: ["ralt"], date: "2025-08-25T00:00:00+00:00", image: "/images/posts/how-it-works/how-do-we-do-incremental-but-really-full-backups-on-top-of-ceph/ceph-backup-blob-storage.webp" }} />

## The backup challenge at scale

When you're managing container storage at scale, backup performance becomes critical. Large volumes—those reaching hundreds of gigabytes—can take hours to back up using traditional full-volume streaming methods. This creates operational challenges: extended backup windows, increased resource consumption, and potential data loss exposure during long-running operations.

At Upsun, we face this challenge daily with our Ceph-based storage infrastructure. While Ceph's copy-on-write (CoW) snapshots provide instant cloning capabilities for our containers, off-site backups require a different approach. We need a solution that combines the speed of incremental backups with the reliability of full restores.

## Why Ceph RBD over traditional file-based approaches

Our storage architecture leverages Ceph's RADOS Block Device (RBD) feature rather than CephFS for container storage. This choice provides several advantages:

* **Simplified data management**: Working with block devices means handling "bags of bytes" rather than complex file systems
* **Better performance**: Block-level operations eliminate file system overhead
* **Seamless failover**: Containers can migrate across our VM grid without complex file system considerations
* **Snapshot efficiency**: RBD snapshots are instant and space-efficient

While a file-based approach like `rsync` might seem intuitive—comparing file lists and transferring only changed files—it doesn't align with our block-level storage philosophy.

## The Ceph RBD export-diff solution

Ceph provides an elegant solution through [`rbd export-diff`](https://ceph.io/en/news/blog/2013/incremental-snapshots-with-rbd/), which extracts only the changes between two snapshots at the block level. This feature becomes the foundation for our incremental backup strategy.

Here's how the basic process works:

1. Create a new RBD snapshot
2. Use `rbd export-diff` to identify changed blocks between snapshots
3. Export only the differential data

However, implementing production-ready backups requires additional considerations beyond the basic diff export.

## Building full restore capability from incremental data

To maintain the ability to perform complete volume restores from blob storage, we developed a chunked metadata system:

### Chunk-based storage architecture

* **4MB chunks**: Each volume is divided into 4MB blocks for optimal transfer and deduplication
* **Hash-based deduplication**: Chunk keys are generated from content hashes, eliminating duplicate data across the entire system
* **Project-level isolation**: Each project maintains its own chunk catalog to prevent cross-customer data leakage

### Metadata file structure

Each backup generates a metadata file containing:

* Complete list of chunks required for full volume restoration
* Chunk offsets and positions within the volume

This approach ensures that every backup point can restore a complete volume, even though we're only transferring changed data.

## The backup workflow in practice

Here's the complete backup process:

1. **Snapshot creation**: Generate a new RBD snapshot of the volume
2. **Differential analysis**: Run `rbd export-diff` between the current and previous snapshots
3. **Chunk processing**: Break the differential data into 4MB chunks and generate hashes
4. **Selective upload**: Upload only chunks that don't already exist in blob storage
5. **Metadata generation**: Create a new metadata file referencing all chunks (new and existing) required for full restoration

This workflow ensures that backup time scales with the amount of changed data rather than total volume size.

## Optimized restore performance

The hash-based chunk system also accelerates restoration:

1. **Local verification**: Before downloading chunks from blob storage, verify if they already exist locally by comparing hashes
2. **Selective download**: Only retrieve chunks that have changed or are missing locally
3. **Parallel processing**: Multiple chunks can be processed simultaneously

While restore operations still require reading the full volume to verify local chunks, the selective download significantly reduces network transfer time.

## Performance benefits and trade-offs

This architecture delivers substantial improvements:

* **Backup speed**: Scales with data change rate rather than volume size
* **Storage efficiency**: Deduplication reduces storage across all backups
* **Network optimization**: Minimal data transfer for routine backups
* **Restore flexibility**: Any backup point can restore a complete volume

The primary trade-off is complexity—managing chunk metadata and ensuring referential integrity requires more sophisticated backup orchestration than simple volume dumps.

Ceph can enable enterprise-grade backup performance without sacrificing restore capabilities. By leveraging block-level snapshots and implementing intelligent chunking strategies, you can achieve backup speeds that scale with your actual data change patterns.

Ready to experience high-performance, scalable container storage? [Start your free trial](https://upsun.com) and see how Upsun's infrastructure handles your most demanding applications.
