# Optimize backup performance for large PostgreSQL databases

## Summary

Improves overall backup performance and reliability for larger PostgreSQL databases while keeping the solution fully compatible with Railway environments.

## Changes

- Uses PostgreSQL custom-format backups (`pg_dump -Fc`) for efficient storage
- Enables multipart, threaded uploads to Cloudflare R2 for faster, more reliable transfers
- Keeps backups restore-friendly, supporting parallel restores via `pg_restore --jobs`
- Updates documentation to accurately reflect performance behavior

## Notes on Parallelism

- Parallel dumping (`pg_dump --jobs`) is intentionally not used: it requires the directory output format, which is not suitable for Railway containers
- Parallelism is instead supported at restore time, using `pg_restore --jobs` with `.dump` files

## Testing

- Tested on Railway using the `deps-update` branch
- Verified backup creation, encryption, upload, and retention cleanup
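The restore-time parallelism noted above can be invoked along these lines. This is a hedged sketch in the style of main.py's subprocess usage: the connection URL, file name, and job count are placeholders, not values from this repository.

```python
import subprocess

# Hypothetical restore sketch: URL, file name, and job count are placeholders.
restore_cmd = [
    "pg_restore",
    "--jobs=4",          # number of parallel restore workers
    "--no-owner",        # skip ownership commands from the dump
    "--clean",           # drop objects before recreating them
    "--dbname=postgresql://user:pass@host:5432/mydb",
    "backup_20240101_000000.dump",
]
# Requires a reachable PostgreSQL server; uncomment to run:
# subprocess.run(restore_cmd, check=True)
```

`--jobs` only works with custom-format (`-Fc`) or directory-format dumps, which is why the commit keeps `.dump` files restore-friendly.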
Commit: `fa422f0980`

**README.md**
# Postgres-to-R2 Backup

A lightweight automation service that creates scheduled PostgreSQL backups and securely uploads them to **Cloudflare R2 object storage**.

Designed specifically as a **Railway deployment template**, with built-in support for Docker and cron scheduling.

---

## ✨ Features
- 📦 **Automated Backups** — scheduled daily or hourly PostgreSQL backups
- 🔐 **Optional Encryption** — gzip compression or 7z encryption with password
- ☁️ **Cloudflare R2 Integration** — seamless S3-compatible uploads
- 🧹 **Retention Policy** — automatically delete old backups
- 🔗 **Flexible Database URLs** — supports private and public PostgreSQL URLs
- ⚡ **Optimized Performance** — efficient custom-format dumps and multipart R2 uploads
- 🐳 **Docker Ready** — portable, lightweight container
- 🚀 **Railway Template First** — no fork required for normal usage
---

## 🚀 Deployment on Railway (Recommended)
### Quick Deploy

1. Click the **Deploy on Railway** button below
2. Railway will create a new project using the latest version of this repository
3. Add the required environment variables in the Railway dashboard
4. (Optional) Configure a cron job for your desired backup schedule

[Deploy on Railway](https://railway.app/template/e-ywUS?referralCode=nIQTyp&utm_medium=integration&utm_source=template&utm_campaign=generic)
---

## 🔧 Environment Variables

```env
DATABASE_URL=            # PostgreSQL database URL (private)
DATABASE_PUBLIC_URL=     # Public PostgreSQL URL (optional)
USE_PUBLIC_URL=false     # Set true to use DATABASE_PUBLIC_URL

DUMP_FORMAT=dump         # sql | plain | dump | custom | tar
FILENAME_PREFIX=backup   # Backup filename prefix
MAX_BACKUPS=7            # Number of backups to retain

R2_ACCESS_KEY=           # Cloudflare R2 access key
R2_SECRET_KEY=           # Cloudflare R2 secret key
R2_BUCKET_NAME=          # R2 bucket name
R2_ENDPOINT=             # R2 endpoint URL

BACKUP_PASSWORD=         # Optional: enables 7z encryption
BACKUP_TIME=00:00        # Daily backup time (UTC, HH:MM)
```
---

## ⏰ Railway Cron Jobs

You can configure the backup schedule using **Railway Cron Jobs**:

1. Open your Railway project
2. Go to **Deployments → Cron**
3. Add a cron job targeting this service

### Common Cron Expressions
| Schedule | Cron Expression | Description |
|----------|-----------------|-------------|
| Hourly | `0 * * * *` | Every hour |
| Daily | `0 0 * * *` | Once per day (UTC midnight) |
| Twice Daily | `0 */12 * * *` | Every 12 hours |
| Weekly | `0 0 * * 0` | Every Sunday |
| Monthly | `0 0 1 * *` | First day of the month |
**Tips**

- All cron times are **UTC**
- Use [crontab.guru](https://crontab.guru) to validate expressions
- Adjust `MAX_BACKUPS` to match your schedule
---
## 🛠 Development & Contributions

Fork this repository **only if you plan to**:

- Modify the backup logic
- Add features or integrations
- Submit pull requests
- Run locally for development

For normal usage, deploying via the **Railway template** is recommended.
---

## 📜 License

This project is open source under the **MIT License**.
You are free to use, modify, and distribute it with attribution.
**main.py**
`@@ -12,7 +12,7 @@ import shutil`

```python
load_dotenv()

## ENV
DATABASE_URL = os.environ.get("DATABASE_URL")
DATABASE_PUBLIC_URL = os.environ.get("DATABASE_PUBLIC_URL")
```
`@@ -31,17 +31,16 @@ BACKUP_TIME = os.environ.get("BACKUP_TIME", "00:00")`

```python
def log(msg):
    print(msg, flush=True)

## Validate BACKUP_TIME
try:
    hour, minute = BACKUP_TIME.split(":")
    if not (0 <= int(hour) <= 23 and 0 <= int(minute) <= 59):
        raise ValueError
except ValueError:
    log("[WARNING] Invalid BACKUP_TIME format. Using default: 00:00")
    BACKUP_TIME = "00:00"

def get_database_url():
    """Get the appropriate database URL based on configuration"""
    if USE_PUBLIC_URL:
        if not DATABASE_PUBLIC_URL:
            raise ValueError("[ERROR] DATABASE_PUBLIC_URL not set but USE_PUBLIC_URL=true!")
```
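The `BACKUP_TIME` validation above can be exercised in isolation. A sketch mirroring the same checks as a pure function; `normalize_backup_time` is a hypothetical name, not a function in this file:

```python
def normalize_backup_time(value: str, default: str = "00:00") -> str:
    """Return value if it is a valid HH:MM time string, else the default."""
    try:
        hour, minute = value.split(":")
        if not (0 <= int(hour) <= 23 and 0 <= int(minute) <= 59):
            raise ValueError
    except ValueError:
        # Covers bad split counts, non-numeric parts, and out-of-range times.
        return default
    return value
```

Note that a malformed string like `"garbage"` fails at the unpacking step, while `"24:00"` fails the range check; both fall through to the same `except ValueError` handler, which is the same flow main.py relies on.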
`@@ -52,15 +51,11 @@ def get_database_url():`

```python
    return DATABASE_URL

def run_backup():
    """Main backup function that handles the entire backup process"""
    if shutil.which("pg_dump") is None:
        log("[ERROR] pg_dump not found. Install postgresql-client.")
        return

    database_url = get_database_url()
    url = urlparse(database_url)
    db_name = url.path[1:]

    log(f"[INFO] Using {'public' if USE_PUBLIC_URL else 'private'} database URL")

    format_map = {
```
`@@ -72,27 +67,33 @@ def run_backup():`

```python
    }
    pg_format, ext = format_map.get(DUMP_FORMAT.lower(), ("c", "dump"))

    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    backup_file = f"{FILENAME_PREFIX}_{timestamp}.{ext}"

    compressed_file = (
        f"{backup_file}.7z" if BACKUP_PASSWORD else f"{backup_file}.gz"
    )

    compressed_file_r2 = f"{BACKUP_PREFIX}{compressed_file}"

    ## Create backup
    try:
        log(f"[INFO] Creating backup {backup_file}")

        dump_cmd = [
            "pg_dump",
            f"--dbname={database_url}",
            "-F", pg_format,
            "--no-owner",
            "--no-acl",
            "-f", backup_file
        ]

        subprocess.run(dump_cmd, check=True)

        if BACKUP_PASSWORD:
            log("[INFO] Encrypting backup with 7z...")
            with py7zr.SevenZipFile(compressed_file, "w", password=BACKUP_PASSWORD) as archive:
                archive.write(backup_file)
            log("[SUCCESS] Backup encrypted successfully")
        else:
```
|
@ -103,9 +104,6 @@ def run_backup():
|
|||
except subprocess.CalledProcessError as e:
|
||||
log(f"[ERROR] Backup creation failed: {e}")
|
||||
return
|
||||
except Exception as e:
|
||||
log(f"[ERROR] Compression/encryption failed: {e}")
|
||||
return
|
||||
finally:
|
||||
if os.path.exists(backup_file):
|
||||
os.remove(backup_file)
|
||||
|
|
@ -113,11 +111,11 @@ def run_backup():
|
|||
## Upload to R2
|
||||
if os.path.exists(compressed_file):
|
||||
size = os.path.getsize(compressed_file)
|
||||
log(f"[INFO] Final backup size: {size/1024/1024:.2f} MB")
|
||||
log(f"[INFO] Final backup size: {size / 1024 / 1024:.2f} MB")
|
||||
|
||||
try:
|
||||
client = boto3.client(
|
||||
's3',
|
||||
"s3",
|
||||
endpoint_url=R2_ENDPOINT,
|
||||
aws_access_key_id=R2_ACCESS_KEY,
|
||||
aws_secret_access_key=R2_SECRET_KEY
|
||||
|
|
`@@ -139,16 +137,27 @@ def run_backup():`

```python
            log(f"[SUCCESS] Backup uploaded: {compressed_file_r2}")

            objects = client.list_objects_v2(
                Bucket=R2_BUCKET_NAME,
                Prefix=BACKUP_PREFIX
            )

            if "Contents" in objects:
                backups = sorted(
                    objects["Contents"],
                    key=lambda x: x["LastModified"],
                    reverse=True
                )

                for obj in backups[MAX_BACKUPS:]:
                    client.delete_object(
                        Bucket=R2_BUCKET_NAME,
                        Key=obj["Key"]
                    )
                    log(f"[INFO] Deleted old backup: {obj['Key']}")

        except Exception as e:
            log(f"[ERROR] R2 operation failed: {e}")
            return
        finally:
            if os.path.exists(compressed_file):
                os.remove(compressed_file)
```
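The retention pass keeps the newest `MAX_BACKUPS` objects and deletes the rest. The selection logic can be sketched as a pure function over S3-style object records; `select_expired` is a hypothetical name for illustration:

```python
from datetime import datetime, timedelta, timezone

def select_expired(contents, max_backups):
    """Given S3/R2 object records, return those beyond the newest max_backups."""
    newest_first = sorted(contents, key=lambda o: o["LastModified"], reverse=True)
    return newest_first[max_backups:]

# Example: three backups, one per day, with a retention of two.
now = datetime.now(timezone.utc)
objs = [
    {"Key": f"backup_{i}", "LastModified": now - timedelta(days=i)}
    for i in range(3)
]
```

Sorting on `LastModified` rather than parsing timestamps out of key names keeps the cleanup robust if `FILENAME_PREFIX` ever changes.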