docs: sync README and CONTRIBUTING

KakiFilem Team 2026-02-02 19:34:41 +08:00
![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.12-blue)
![Storage](https://img.shields.io/badge/storage-S3--compatible-orange)
![Database](https://img.shields.io/badge/database-PostgreSQL-336791)
![Deploy](https://img.shields.io/badge/deploy-Railway-purple)
![Docker](https://img.shields.io/badge/docker-supported-blue)

# Postgres-to-R2 Backup (S3-Compatible)

A lightweight automation service that creates scheduled PostgreSQL backups and securely uploads them to **S3-compatible object storage** such as **Cloudflare R2, AWS S3, Wasabi, Backblaze B2, or MinIO**.

Designed specifically as a **Railway deployment template**, with built-in support for Docker and cron scheduling.

---
## ✨ Features

- 📦 **Automated Backups** — scheduled daily or hourly PostgreSQL backups
- 🔐 **Optional Encryption** — gzip compression or 7z encryption with password
- ☁️ **Cloudflare R2 Integration** — seamless S3-compatible storage support
- 🧹 **Retention Policy** — automatically delete old backups
- 🔗 **Flexible Database URLs** — supports private and public PostgreSQL connection URLs
- ⚡ **Optimized Performance** — parallel pg_dump and multipart S3 uploads
- 🐳 **Docker Ready** — portable, lightweight container
- 🚀 **Railway Template First** — no fork required for normal usage
- 🪣 **S3-Compatible Storage** — works with R2, AWS S3, Wasabi, B2, MinIO

---
## 🚀 Deployment on Railway

1. Click the **Deploy on Railway** button below
2. Railway will create a new project using the latest version of this repository
3. Add the required environment variables in the Railway dashboard
4. (Optional) Configure a cron job for your desired backup schedule

[![Deploy on Railway](https://railway.com/button.svg)](https://railway.com/deploy/postgres-to-r2-backup?referralCode=nIQTyp&utm_medium=integration&utm_source=template&utm_campaign=generic)

---
## 🔧 Environment Variables (S3-Compatible)

```env
DATABASE_URL=            # PostgreSQL database URL (private)
DATABASE_PUBLIC_URL=     # Public PostgreSQL URL (optional)
USE_PUBLIC_URL=false     # Set true to use DATABASE_PUBLIC_URL
DUMP_FORMAT=dump         # sql | plain | dump | custom | tar
FILENAME_PREFIX=backup   # Backup filename prefix
MAX_BACKUPS=7            # Number of backups to retain
R2_ENDPOINT=             # S3 endpoint URL
R2_BUCKET_NAME=          # Bucket name
R2_ACCESS_KEY=           # Access key
R2_SECRET_KEY=           # Secret key
S3_REGION=us-east-1      # Required for AWS S3 (ignored by R2/MinIO)
BACKUP_PASSWORD=         # Optional: enables 7z encryption
BACKUP_TIME=00:00        # Daily backup time (UTC, HH:MM)
```

> Variable names use `R2_*` for historical reasons, but **any S3-compatible provider** can be used by changing the endpoint and credentials.
> For AWS S3 users: ensure `S3_REGION` matches your bucket's region.
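As a rough illustration of how a service like this might consume the variables above, here is a minimal parsing sketch with the documented defaults (the function name is illustrative, not the project's actual API):

```python
import os

def load_backup_config(env=os.environ):
    """Parse the backup-related variables, applying the documented defaults."""
    use_public = env.get("USE_PUBLIC_URL", "false").lower() == "true"
    db_url = env.get("DATABASE_PUBLIC_URL") if use_public else env.get("DATABASE_URL")
    return {
        "database_url": db_url,
        "dump_format": env.get("DUMP_FORMAT", "dump"),
        "filename_prefix": env.get("FILENAME_PREFIX", "backup"),
        "max_backups": int(env.get("MAX_BACKUPS", "7")),
        "backup_time": env.get("BACKUP_TIME", "00:00"),
        "s3_region": env.get("S3_REGION", "us-east-1"),
    }
```

Note how `USE_PUBLIC_URL` only switches which URL is read; both can be set at once.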
---
## ☁️ Supported S3-Compatible Providers

This project uses the **standard AWS S3 API via boto3**, and works with:

- Cloudflare R2 (recommended)
- AWS S3
- Wasabi
- Backblaze B2 (S3 API)
- MinIO (self-hosted)

### Example Endpoints

| Provider | Endpoint Example |
|----------|------------------|
| Cloudflare R2 | `https://<accountid>.r2.cloudflarestorage.com` |
| AWS S3 | `https://s3.amazonaws.com` |
| Wasabi | `https://s3.wasabisys.com` |
| Backblaze B2 | `https://s3.us-west-004.backblazeb2.com` |
| MinIO | `http://localhost:9000` |
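Because every provider above speaks the same S3 API, switching providers amounts to changing the keyword arguments passed to `boto3.client("s3", ...)`. A sketch of assembling those arguments from the `R2_*` variables (the helper name is illustrative; the keyword names are boto3's real client parameters):

```python
def s3_client_kwargs(endpoint, access_key, secret_key, region="us-east-1"):
    """Build the arguments for boto3.client('s3', ...) against any S3-compatible endpoint."""
    return {
        "endpoint_url": endpoint,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,
    }

# e.g. boto3.client("s3", **s3_client_kwargs("http://localhost:9000", key, secret))
# would target a local MinIO instance; swap the endpoint for R2, Wasabi, or B2.
```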
---
## ⏰ Railway Cron Jobs

You can configure the backup schedule using **Railway Cron Jobs**:

1. Open your Railway project
2. Go to **Deployments → Cron**
3. Add a cron job targeting this service

### Common Cron Expressions

| Schedule | Cron Expression | Description |
|----------|-----------------|-------------|
| Hourly | `0 * * * *` | Every hour |
| Daily | `0 0 * * *` | Once per day (UTC midnight) |
| Twice Daily | `0 */12 * * *` | Every 12 hours |
| Weekly | `0 0 * * 0` | Every Sunday |
| Monthly | `0 0 1 * *` | First day of the month |

**Tips**

- All cron times are **UTC**
- Use https://crontab.guru to validate expressions
- Adjust `MAX_BACKUPS` to match your schedule

> If you use Railway Cron Jobs, the service will start once per execution.
> In this case, the internal scheduler is ignored after startup.
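To see why `MAX_BACKUPS` should track your schedule: the retention step conceptually keeps only the newest `MAX_BACKUPS` objects and deletes the rest, so hourly backups with `MAX_BACKUPS=7` retain only seven hours of history. A minimal sketch of that selection logic (function name illustrative, assuming timestamped filenames that sort chronologically):

```python
def keys_to_delete(keys, max_backups):
    """Given backup object keys whose names sort chronologically,
    return the oldest keys beyond the retention limit."""
    newest_first = sorted(keys, reverse=True)
    return newest_first[max_backups:]
```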
---
## 🖥️ Running Locally or on Other Platforms

The service can run on **any platform** that supports:

- Python 3.9+
- `pg_dump` (PostgreSQL client tools)
- Environment variables
- Long-running background processes or cron

> Docker images use **Python 3.12** by default.
> Local execution supports **Python 3.9+**.

### Supported Environments

- Local machine (Linux / macOS / Windows*)
- VPS (Netcup, Hetzner, DigitalOcean, etc.)
- Docker containers
- Other PaaS providers (Heroku, Fly.io, Render, etc.)

> *Windows is supported when `pg_dump` is installed and available in PATH.*
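A quick preflight check for the PATH requirement might look like this on any platform (the function name is illustrative; `shutil.which` is the standard-library lookup):

```python
import shutil

def tool_available(name):
    """Return True if the given executable can be resolved on PATH."""
    return shutil.which(name) is not None

# Before any backup can run, tool_available("pg_dump") must be True;
# on Windows this is what installing the PostgreSQL client tools provides.
```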
### Local Requirements

- Python 3.9+
- PostgreSQL client tools (`pg_dump`)
- pip
### Run Manually (Local)

```bash
pip install -r requirements.txt
python main.py
```
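Internally, `DUMP_FORMAT` corresponds to `pg_dump`'s real `--format` option, which accepts `plain`, `custom`, and `tar`. A hedged sketch of how the command line might be assembled (the mapping and helper are illustrative, not the project's exact code):

```python
# Illustrative mapping: "sql" is treated as an alias for plain-text output,
# "dump" as an alias for pg_dump's compressed custom format.
FORMAT_MAP = {"sql": "plain", "plain": "plain", "dump": "custom",
              "custom": "custom", "tar": "tar"}

def build_pg_dump_cmd(database_url, dump_format, out_file):
    """Assemble a pg_dump invocation for the configured format."""
    fmt = FORMAT_MAP.get(dump_format, "custom")
    return ["pg_dump", f"--format={fmt}", f"--file={out_file}", database_url]
```

The resulting list would be passed to something like `subprocess.run`, keeping the database URL out of any shell string.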
### Run with Docker (Optional)

Build and run the image locally:

```bash
docker build -t postgres-to-r2-backup .
docker run --env-file .env postgres-to-r2-backup
```

> Ensure the container is allowed to run continuously when not using an external cron scheduler.
> All scheduling uses **UTC** by default (e.g. Malaysia UTC+8 → set `BACKUP_TIME=16:00` for midnight).
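The UTC conversion in the note above is just the local hour minus the UTC offset, modulo 24. A small helper for computing `BACKUP_TIME` (illustrative, assuming whole-hour offsets):

```python
def local_to_utc(hhmm, utc_offset_hours):
    """Convert a local HH:MM backup time to the UTC value for BACKUP_TIME."""
    hour, minute = map(int, hhmm.split(":"))
    return f"{(hour - utc_offset_hours) % 24:02d}:{minute:02d}"

# Malaysia (UTC+8), midnight local: local_to_utc("00:00", 8) gives "16:00"
```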
### Run from Prebuilt Docker Image

If you downloaded a prebuilt Docker image archive (`.tar` or `.tar.gz`), you can run it without building locally:

```bash
# Extract the archive (if compressed)
tar -xzf postgres-to-r2-backup_v1.0.4.tar.gz

# Load the image into Docker
docker load -i postgres-to-r2-backup_v1.0.4.tar

# Run the container
docker run --env-file .env postgres-to-r2-backup:v1.0.4
```

> Prebuilt images are architecture-specific (amd64 / arm64).

---
## 🧰 Using the CLI (Global Installation)

This project can also be used as a standalone CLI tool, installable via pip, in addition to running as a Railway or Docker service.

### Install via pip

```bash
pip install pg-r2-backup
```

### Requirements

- Python 3.9+
- PostgreSQL client tools (`pg_dump`) installed and available in PATH

### Quick Start (CLI)

```bash
mkdir backups
cd backups

pg-r2-backup init     # creates .env from .env.example
pg-r2-backup doctor   # checks environment and dependencies
pg-r2-backup run      # runs a backup immediately
```

### CLI Commands

```bash
pg-r2-backup run           # Run backup immediately
pg-r2-backup doctor        # Check environment & dependencies
pg-r2-backup config show   # Show current configuration
pg-r2-backup init          # Create .env from .env.example
pg-r2-backup schedule      # Show scheduling examples
pg-r2-backup --version
```
### Environment Variable Resolution (CLI)

When running via the CLI, environment variables are resolved in the following order:

1. A `.env` file in the current working directory (or a parent directory)
2. System environment variables

This allows different folders to maintain separate backup configurations.
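The parent-directory lookup described above can be implemented as a short upward walk. A sketch under those assumptions (not necessarily the CLI's exact logic):

```python
from pathlib import Path

def find_dotenv(start_dir):
    """Walk from start_dir up to the filesystem root and return
    the first .env file found, or None if there is none."""
    for directory in [Path(start_dir), *Path(start_dir).parents]:
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None
```

This is what makes per-folder configurations work: the nearest `.env` wins, and system environment variables fill in anything it does not set.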
### Scheduling Backups (CLI)

The CLI does not run a background scheduler. Use your operating system or platform scheduler instead.

**Linux / macOS (cron)**

```bash
0 0 * * * pg-r2-backup run
```

**Windows (Task Scheduler)**

- Program: `pg-r2-backup`
- Arguments: `run`
- Start in: folder containing `.env` (working directory)

**Railway / Docker**

Use the platform's built-in scheduler (recommended).

💡 **Tip**
Run `pg-r2-backup schedule` at any time to see scheduling examples.
---
## 🔐 Security

- **Do not expose PostgreSQL directly to the public internet.**
  If your database is not on a private network, use a secure tunnel instead.
- **Recommended: Cloudflare Tunnel**
  When using a public database URL, it is strongly recommended to connect via a secure tunnel such as **Cloudflare Tunnel** rather than opening database ports.
- **Protect credentials**
  Store all secrets (database URLs, R2 keys, encryption passwords) using environment variables.
  Never commit `.env` files to version control.
- **Encrypted backups (optional)**
  Set `BACKUP_PASSWORD` to enable encrypted backups using 7z before uploading to S3-compatible storage.
- **Least privilege access**
  Use a PostgreSQL user with read-only access where possible, and restrict R2 credentials to the required bucket only.
---

## 🛠 Development & Contributions

Fork this repository **only if you plan to**:

- Modify the backup logic
- Add features or integrations
- Submit pull requests
- Run locally for development
---

## ❓ FAQ

**Why only DATABASE_URL?**
This matches how most modern platforms expose PostgreSQL credentials.
Support for separate DB variables may be added if there is demand.

---

## 📜 License

This project is open source under the **MIT License**.
You are free to use, modify, and distribute it with attribution.