![License](https://img.shields.io/badge/license-MIT-blue.svg) ![Python](https://img.shields.io/badge/python-3.13-blue) ![Framework](https://img.shields.io/badge/FastAPI-green) ![Docker](https://img.shields.io/badge/docker-supported-blue) # YouTube Transcript API A lightweight **FastAPI** service that extracts **YouTube video captions (no speech-to-text)**. No video or audio downloads โ€” just clean, structured captions returned as JSON. Built to be simple, stateless, and easy to deploy anywhere. --- ## โœจ Features * Extract **human or auto-generated captions** * No media downloads (captions only) * Clean JSON output with timestamps * Accepts normal, playlist, and radio-style YouTube URLs (single video only) * Docker friendly * Built-in Swagger UI at `/docs` --- ## ๐Ÿงช API Usage ### Endpoint ``` POST /transcript ``` ### Query Parameters | Name | Type | Required | Description | | ----- | ------ | -------- | ----------------- | | `url` | string | โœ… | YouTube video URL | Supported URLs: * `https://www.youtube.com/watch?v=VIDEO_ID` * `https://www.youtube.com/watch?v=VIDEO_ID&list=RD...` * `https://youtu.be/VIDEO_ID` --- ### Example (curl) ```bash curl -X POST "http://localhost:8000/transcript?url=https://www.youtube.com/watch?v=PY9DcIMGxMs" ``` --- ### Example Response ```json { "video": { "id": "PY9DcIMGxMs", "title": "Everything you think you know about addiction is wrong | TED", "channel": "TED", "duration": 882, "url": "https://www.youtube.com/watch?v=PY9DcIMGxMs" }, "captions": [ { "start": 12.597, "end": 14.338, "text": "One of my earliest memories" } ], "language": "auto", "source": "human" } ``` --- ## ๐Ÿ“„ API Docs Once running, open: ``` /docs ``` Swagger UI is enabled by default. --- ## ๐Ÿณ Run Locally with Docker ### Build ```bash docker build -t youtube-transcript-api . ``` ### Run ```bash docker run -p 8000:8000 youtube-transcript-api ``` Then open: ``` http://localhost:8000/docs ``` --- ## โš™๏ธ Environment Variables (Optional) No environment variables are required. | Variable | Default | Description | | ----------------- | ------- | ---------------------------------- | | `PORT` | `8000` | Port to bind | | `REQUEST_TIMEOUT` | `25` | yt-dlp execution timeout (seconds) | --- ## ๐Ÿง  Design Notes * Uses `yt-dlp` **only for metadata and captions** * No Redis, database, or background workers * Fully stateless and container-friendly * Designed to fail safely with clear error responses --- ## โš ๏ธ Notes on Reliability This project depends on **YouTube availability and yt-dlp behavior**. On cloud platforms, requests may occasionally fail due to: * IP-based rate limiting * YouTube bot detection * regional consent or throttling When this happens, the API returns a structured error instead of crashing. --- ## โš ๏ธ Limitations * Does **not** download audio or video * Does **not** perform speech-to-text * Captions must already exist on YouTube * Shorts and embedded players are not a primary target --- ## ๐Ÿ“œ License MIT License --- ## ๐Ÿ™Œ Credits * FastAPI โ€” [https://fastapi.tiangolo.com/](https://fastapi.tiangolo.com/) * yt-dlp โ€” [https://github.com/yt-dlp/yt-dlp](https://github.com/yt-dlp/yt-dlp) --- ### โœ… Status * Docker tested * Real-world URLs tested * Cloud-friendly * Ready for open-source use