AWS Glacier Deep Archive: Ultimate Cold Storage Backup Solution for My Homelab

HomeLab

April, 2025

10 minutes


In my ongoing homelab journey, I've shared various setups from hardware configurations to self-hosted applications. Today, I want to focus on what I consider the most critical yet often overlooked aspect of any homelab: a robust backup strategy, specifically how I implemented AWS Glacier Deep Archive as my offsite backup solution.

Why Offsite Backups Are Non-Negotiable


The 3-2-1 backup rule has been my guiding principle for data protection:

  • 3 copies of your data
  • on 2 different types of media
  • with 1 copy stored offsite

While my local NAS and secondary drives handle the first two requirements, the offsite component presented challenges. Commercial backup solutions became prohibitively expensive as my data grew beyond a few terabytes. That's when I discovered AWS Glacier Deep Archive.

Why AWS Glacier Deep Archive?

AWS Glacier Deep Archive Pricing Comparison

  • AWS Glacier Deep Archive: $0.00099/GB/month
  • AWS Glacier Flexible Retrieval: $0.0036/GB/month
  • AWS S3 Standard-IA: $0.0125/GB/month
  • AWS S3 Standard: $0.023/GB/month

*Prices shown for US East (N. Virginia) region

At just $0.99 per terabyte per month, Glacier Deep Archive is by far the most cost-effective cloud storage solution for true cold storage. It's specifically designed for data that:

  • Needs to be retained for compliance or disaster recovery
  • Is accessed infrequently (less than once per year)
  • Can tolerate retrieval times of 12+ hours

This makes it perfect for my worst-case scenario backups—if my house burns down or my local backups fail catastrophically, I have an encrypted copy of everything safely stored in AWS.

The Costly Mistake Many Glacier Users Make

Early Deletion Fees Comparison (1TB stored)

  • Deleting after 30 days: $0.99 paid in storage + $4.95 early deletion fee = $5.94 total
  • Deleting after 90 days: $2.97 storage + $2.97 fee = $5.94 total
  • Deleting after 150 days: $4.95 storage + $0.99 fee = $5.94 total
  • Deleting after 180+ days: $5.94 storage, no fee

*Example based on 1TB at $0.99/TB/month; objects removed early are billed for the storage remaining in the 180-day minimum, so you always pay for at least six months

I learned an expensive lesson when I first implemented Glacier: the 180-day minimum storage commitment. AWS charges early deletion fees for objects removed before 180 days, and these fees can quickly exceed what you'd save by deleting the data.

After tracking my expenses over several months, I discovered I was paying significantly more in early deletion fees than in actual storage costs! My monthly rotation policy was costing me dearly.

The solution was simple but required adjusting my mindset: with storage this inexpensive, it makes more sense to keep backups longer rather than delete and replace them frequently.

My Glacier Backup Implementation


For my implementation, I chose rclone—a command-line program to manage files on cloud storage. Here's how my setup works:

Key Components:

  1. Encrypted Backups: Everything is encrypted client-side before upload
  2. Scheduled Monthly Backups: Automated process requiring no manual intervention
  3. Smart Retention Policy: Only delete backups older than 180 days
  4. Notification System: Alerts me when backups start, complete, or encounter issues

Setting Up Rclone for Glacier

The heart of my backup system is a properly configured rclone setup. Here's a simplified version of my configuration for AWS Glacier Deep Archive with encryption:

# S3 Glacier configuration
[glacier-backup]
type = s3
provider = AWS
env_auth = false
access_key_id = YOUR_ACCESS_KEY_ID
secret_access_key = YOUR_SECRET_ACCESS_KEY
region = us-east-1
acl = private
storage_class = DEEP_ARCHIVE
bucket_acl = private

# Encryption layer on top of S3
[glacier-backup-crypt]
type = crypt
remote = glacier-backup:your-bucket-name
filename_encryption = off
directory_name_encryption = false
password = YOUR_PASSWORD_1
password2 = YOUR_PASSWORD_2

This configuration creates two rclone remotes:

  • glacier-backup: Connects to AWS S3 with the Deep Archive storage class
  • glacier-backup-crypt: Encrypts all data before uploading to the glacier-backup remote
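Before trusting the setup, it's worth confirming that both remotes actually resolve. One detail worth noting: the password and password2 values in the config file must be the obscured form that rclone generates, not the raw passphrases. A couple of quick checks, assuming the config above is saved at /path/to/rclone/rclone.conf:

# Generate the obscured form of a passphrase for the crypt section
rclone obscure "your-strong-passphrase"

# Sanity-check that the S3 remote and the crypt layer both work
rclone lsd "glacier-backup:your-bucket-name" --config=/path/to/rclone/rclone.conf
rclone lsf "glacier-backup-crypt:" --config=/path/to/rclone/rclone.conf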

The Backup Process:

Each month, my automation:

  1. Identifies directories requiring backup
  2. Creates compressed archives of each directory
  3. Encrypts the archives using rclone's crypt backend
  4. Uploads them to AWS using Deep Archive storage class
  5. Sends notifications at key points in the process
  6. Only removes backups that have aged beyond the 180-day minimum

Example Backup Command

Here's an example of how I use Docker to run rclone for backing up a directory:

# Create a compressed archive and pipe it directly to rclone
tar czf - -C /path/to/data important-folder | \
docker run --rm -i \
  --name glacier-backup \
  -v /path/to/rclone/config:/config/rclone \
  rclone/rclone \
  rcat \
  "glacier-backup-crypt:backups/important-folder-$(date +%Y%m%d).tar.gz" \
  --config=/config/rclone/rclone.conf \
  --progress \
  --s3-chunk-size 1000M

This one-liner:

  1. Creates a compressed tar archive of the directory
  2. Pipes it directly to rclone running in Docker
  3. Encrypts the data using my configuration
  4. Uploads it to AWS Glacier Deep Archive
  5. Uses a date-stamped filename to identify the backup
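To give a sense of how the pieces fit together, here is a trimmed-down sketch of what the monthly wrapper script can look like. The directory list, paths, and notification endpoint below are placeholders rather than my actual values, and the script is scheduled from cron with something like 0 3 1 * * /usr/local/bin/monthly-glacier-backup.sh:

#!/usr/bin/env bash
# monthly-glacier-backup.sh -- simplified sketch of the monthly automation.
# Paths, the directory list, and NOTIFY_URL are placeholders.
set -euo pipefail

NOTIFY_URL="https://example.com/notify"        # hypothetical webhook for alerts
RCLONE_CONF="/path/to/rclone/config"
STAMP="$(date +%Y%m%d)"
DIRS=("important-folder" "photos" "documents")

notify() { curl -fsS -d "$1" "$NOTIFY_URL" >/dev/null || true; }

notify "Glacier backup started ($STAMP)"

for dir in "${DIRS[@]}"; do
  # Compress, encrypt (via the crypt remote) and upload in one stream
  if tar czf - -C /path/to/data "$dir" | \
    docker run --rm -i \
      -v "$RCLONE_CONF:/config/rclone" \
      rclone/rclone \
      rcat "glacier-backup-crypt:backups/${dir}-${STAMP}.tar.gz" \
      --config=/config/rclone/rclone.conf \
      --s3-chunk-size 1000M; then
    notify "Backup of $dir completed"
  else
    notify "Backup of $dir FAILED"
  fi
done

notify "Glacier backup finished ($STAMP)"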

Implementing the 180-Day Retention Policy

The critical part of my backup solution is the retention policy that respects Glacier's 180-day minimum. Here's how I safely clean up old backups:

docker run --rm \
  --name glacier-cleanup \
  -v /path/to/rclone/config:/config/rclone \
  rclone/rclone \
  delete \
  "glacier-backup-crypt:" \
  --min-age 180d \
  --config=/config/rclone/rclone.conf \
  --include "backups/*"

This ensures I only remove backups that have been stored for at least 180 days, avoiding early deletion fees entirely.
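Because a mistake here is billed immediately, I recommend previewing the deletion first. Adding rclone's --dry-run and -v flags to the same command prints what would be removed without touching anything:

docker run --rm \
  --name glacier-cleanup \
  -v /path/to/rclone/config:/config/rclone \
  rclone/rclone \
  delete \
  "glacier-backup-crypt:" \
  --min-age 180d \
  --config=/config/rclone/rclone.conf \
  --include "backups/*" \
  --dry-run -v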

Cost Analysis: Glacier vs. Other Solutions

Monthly Cost Comparison (5TB Backup)

  • AWS Glacier Deep Archive: $4.95/month
  • Google Cloud Archive: $15.00/month
  • Backblaze B2: $25.00/month
  • Typical Cloud Backup Service: $50.00/month
  • Enterprise Backup Solution: $100.00/month

*Costs are approximate and may vary based on region and specific service tiers

After correcting my deletion policy, the cost savings became dramatic. Here's how Glacier Deep Archive compares to alternatives for my 5TB of critical data:

  • AWS Glacier Deep Archive: ~$5/month
  • Backblaze B2: ~$25/month
  • Google Cloud Archive: ~$15/month
  • Commercial backup services: $50-100/month

For a homelab enthusiast on a budget, this difference is substantial over time. The tradeoff is retrieval time, but that's acceptable for my disaster recovery scenario.

Setting Up Your Own Glacier Backup System

If you're considering implementing a similar solution, here are the key steps:

  1. Set up an AWS account and create an S3 bucket
  2. Create IAM credentials with minimal permissions for security
  3. Install and configure rclone with the S3 and crypt backends
  4. Create a backup script that handles compression, encryption, and upload
  5. Schedule regular backups using cron or your preferred scheduler
  6. Implement a smart retention policy respecting the 180-day minimum
  7. Test your recovery process to ensure backups are viable
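For step 2, the sketch below is a reasonable starting point for a least-privilege IAM policy. The bucket name is a placeholder, and depending on how you use rclone you may need to add or remove actions (for example, bucket creation rights if you let rclone create the bucket for you):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::your-bucket-name"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:RestoreObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}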

Testing Retrieval

It's essential to periodically test your restore process. Because objects sit in the Deep Archive tier, retrieval is a two-step affair: you first ask AWS to restore the objects to an accessible state, then download them once that completes. Here's how I kick off the restore request using rclone's backend restore command for S3:

# Ask AWS to make the archived objects retrievable again. Bulk restores from
# Deep Archive complete within about 48 hours; Standard within about 12 hours.
docker run --rm \
  --name glacier-restore \
  -v /path/to/rclone/config:/config/rclone \
  rclone/rclone \
  backend restore \
  "glacier-backup:your-bucket-name/backups/" \
  -o priority=Bulk \
  -o lifetime=5 \
  --config=/config/rclone/rclone.conf

This submits the restore request against the underlying S3 objects; note that it targets the plain glacier-backup remote rather than the crypt layer, since the restore operation belongs to the S3 backend. Once the objects become available after the retrieval period, you can download and decrypt them with a standard rclone copy through the glacier-backup-crypt remote.
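Once the restore window opens, the download itself is an ordinary copy through the crypt remote. A sketch, reusing the same container setup (note the extra volume mount so the restored file actually lands on the host):

docker run --rm \
  -v /path/to/rclone/config:/config/rclone \
  -v /local/restore/path:/restore \
  rclone/rclone \
  copy \
  "glacier-backup-crypt:backups/important-file.tar.gz" \
  /restore/ \
  --config=/config/rclone/rclone.conf \
  --progress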

Important Considerations

Security

All data should be encrypted before leaving your network. Rclone's crypt backend handles this seamlessly, ensuring that even if someone gained access to your AWS account, they couldn't read your data without your encryption keys.

Bandwidth Requirements

Deep Archive is best suited for environments with sufficient upload bandwidth. As a rough yardstick, pushing 1TB at a sustained 100 Mbps takes close to a full day. My fiber connection makes this feasible, but those on slower connections might need to adjust their backup schedule or be more selective about what gets backed up.

Retrieval Costs

While storage costs are minimal, retrieval can be expensive: on top of per-GB restore fees, data transfer out of AWS to the internet runs roughly $0.09/GB, so pulling 5TB back down costs on the order of $450. This is truly designed for "break glass in case of emergency" scenarios—not for regular access to your files.

Conclusion

AWS Glacier Deep Archive has transformed my homelab backup strategy, providing peace of mind at a fraction of the cost of other solutions. The key lessons I've learned:

  1. The 180-day retention policy is non-negotiable if you want to avoid excessive fees
  2. Client-side encryption is essential for cloud storage
  3. Automation makes the process manageable and reliable
  4. For true cold storage, AWS Glacier Deep Archive is unbeatable in terms of cost

With this setup, I can rest easy knowing that even in the worst-case scenario, my irreplaceable data—family photos, financial records, personal projects—is safely stored in multiple geographic locations, encrypted, and available when needed.

For homelab enthusiasts looking to implement a proper 3-2-1 backup strategy without breaking the bank, AWS Glacier Deep Archive combined with rclone provides an excellent solution that scales with your data needs.