Initial release: YouTube Transcript API v1.0.0

BigDaddyAman 2026-02-02 01:52:38 +08:00
commit 3adbc16dfb
20 changed files with 589 additions and 0 deletions

5
.gitignore vendored Normal file

@ -0,0 +1,5 @@
.venv/
__pycache__/
*.pyc
.env
.DS_Store

76
CONTRIBUTING Normal file

@ -0,0 +1,76 @@
# Contributing
Thanks for your interest in contributing! 🎉
This project aims to stay **simple, stable, and template-friendly**, so please read this first.
---
## 🧭 Project Principles
- **Caption-only** (no audio/video downloads)
- **Stateless** (no database required)
- **Railway & Docker friendly**
- **Minimal dependencies**
- **Clear API contracts**
Changes that break these principles are unlikely to be accepted.
---
## 🐛 Reporting Bugs
Please include:
- The YouTube URL used
- Expected vs actual behavior
- Logs or error messages
- Whether captions were human or auto-generated
Open an issue with a clear title and reproduction steps.
---
## ✨ Feature Requests
Good feature requests:
- Improve caption parsing / cleanup
- Better validation or error messages
- Performance or stability improvements
- Optional flags that do NOT break defaults
Please avoid:
- Adding mandatory databases
- Downloading media files
- Authentication requirements
---
## 🧪 Development Setup
```bash
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn app.main:app --reload
```
Open:
```
http://localhost:8000/docs
```
---
## 🧹 Code Style
- Python 3.12+ compatible
- Use type hints
- Keep functions small and readable
- Avoid over-engineering
---
## 📜 License
By contributing, you agree that your contributions will be licensed under the **MIT License**.
Thank you for helping improve the project 🙌

19
Dockerfile Normal file

@ -0,0 +1,19 @@
FROM python:3.13-slim
ENV PYTHONUNBUFFERED=1
ENV PIP_NO_CACHE_DIR=1
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["sh", "-c", "uvicorn app.main:app --host 0.0.0.0 --port ${PORT:-8000}"]

21
LICENSE Normal file

@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026 Aman
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

176
README.md Normal file

@ -0,0 +1,176 @@
![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.13-blue)
![Framework](https://img.shields.io/badge/FastAPI-green)
![Docker](https://img.shields.io/badge/docker-supported-blue)
# YouTube Transcript API
A lightweight **FastAPI** service that extracts **YouTube video captions (no speech-to-text)**.
No video or audio downloads — just clean, structured captions returned as JSON.
Built to be simple, stateless, and easy to deploy anywhere.
---
## ✨ Features
* Extract **human or auto-generated captions**
* No media downloads (captions only)
* Clean JSON output with timestamps
* Accepts standard, playlist, and radio-style YouTube URLs (only the single video named in the URL is processed)
* Docker friendly
* Built-in Swagger UI at `/docs`
---
## 🧪 API Usage
### Endpoint
```
POST /transcript
```
### Query Parameters
| Name | Type | Required | Description |
| ----- | ------ | -------- | ----------------- |
| `url` | string | ✅ | YouTube video URL |
Supported URLs:
* `https://www.youtube.com/watch?v=VIDEO_ID`
* `https://www.youtube.com/watch?v=VIDEO_ID&list=RD...`
* `https://youtu.be/VIDEO_ID`
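
The URL shapes listed above all reduce to a single video ID. The following is a minimal sketch of that reduction using only the standard library. `extract_video_id` is a hypothetical helper written for illustration and is not a function from this repository; the host list is an assumption taken from the supported URLs above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: the hosts implied by the "Supported URLs" list.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "youtu.be"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for a supported URL, or None (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.netloc not in YOUTUBE_HOSTS:
        return None
    if parsed.netloc == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.strip("/").split("/")[0] or None
    if parsed.path == "/watch":
        # Extra parameters such as `list=RD...` are ignored.
        values = parse_qs(parsed.query).get("v", [])
        return values[0] if values else None
    return None


print(extract_video_id("https://www.youtube.com/watch?v=VIDEO_ID&list=RD123"))
```

Playlist and radio parameters are dropped here on purpose, which matches the single-video behavior described in the features list.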
---
### Example (curl)
```bash
curl -X POST "http://localhost:8000/transcript?url=https://www.youtube.com/watch?v=PY9DcIMGxMs"
```
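
Because `url` is sent as a query parameter, a video URL that itself contains `&` (for example `&list=RD...`) should be percent-encoded, otherwise the server would see `list` as a separate parameter. The sketch below shows one way to do that from Python with the standard library. `build_transcript_request` and `fetch_transcript` are hypothetical names, and `BASE_URL` assumes the service is running locally on port 8000.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the service is running locally on the default port.
BASE_URL = "http://localhost:8000"


def build_transcript_request(video_url: str, base_url: str = BASE_URL) -> Request:
    """Build a POST request with the video URL percent-encoded in the query."""
    query = urlencode({"url": video_url})
    return Request(f"{base_url}/transcript?{query}", method="POST")


def fetch_transcript(video_url: str) -> dict:
    """Send the request and decode the JSON body (requires a running server)."""
    with urlopen(build_transcript_request(video_url), timeout=60) as resp:
        return json.load(resp)


req = build_transcript_request("https://www.youtube.com/watch?v=PY9DcIMGxMs&list=RD1")
print(req.get_method(), req.full_url)
```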
---
### Example Response
```json
{
  "video": {
    "id": "PY9DcIMGxMs",
    "title": "Everything you think you know about addiction is wrong | TED",
    "channel": "TED",
    "duration": 882,
    "url": "https://www.youtube.com/watch?v=PY9DcIMGxMs"
  },
  "captions": [
    {
      "start": 12.597,
      "end": 14.338,
      "text": "One of my earliest memories"
    }
  ],
  "language": "auto",
  "source": "human"
}
```
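
A response in this shape can be flattened into a plain-text transcript on the client side. The sketch below is one possible approach; `format_timestamp` and `to_plain_text` are hypothetical helpers and are not part of the API.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS, dropping the fractional part."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def to_plain_text(response: dict) -> str:
    """Join caption segments into one timestamped line per segment."""
    return "\n".join(
        f"[{format_timestamp(seg['start'])}] {seg['text']}"
        for seg in response["captions"]
    )


example = {
    "captions": [
        {"start": 12.597, "end": 14.338, "text": "One of my earliest memories"}
    ]
}
print(to_plain_text(example))
```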
---
## 📄 API Docs
Once running, open:
```
/docs
```
Swagger UI is enabled by default.
---
## 🐳 Run Locally with Docker
### Build
```bash
docker build -t youtube-transcript-api .
```
### Run
```bash
docker run -p 8000:8000 youtube-transcript-api
```
Then open:
```
http://localhost:8000/docs
```
---
## ⚙️ Environment Variables (Optional)
No environment variables are required. The following are optional:
| Variable | Default | Description |
| ----------------- | ------- | ---------------------------------- |
| `PORT` | `8000` | Port to bind |
| `REQUEST_TIMEOUT` | `25` | yt-dlp execution timeout (seconds) |
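
To illustrate what `REQUEST_TIMEOUT` controls, here is a rough standard-library sketch of reading the value and applying it to a child process. The project itself loads settings through `pydantic-settings`; `read_timeout` and `run_with_timeout` are hypothetical helpers used only for this example, and the child command is a stand-in for the real `yt-dlp` call.

```python
import os
import subprocess
import sys
from typing import List, Mapping, Optional


def read_timeout(env: Optional[Mapping[str, str]] = None, default: int = 25) -> int:
    """Read REQUEST_TIMEOUT (seconds), falling back to the documented default."""
    env = os.environ if env is None else env
    return int(env.get("REQUEST_TIMEOUT", default))


def run_with_timeout(cmd: List[str], timeout: float) -> Optional[str]:
    """Run a command and return its stdout, or None if the timeout expires."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, check=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout


# Stand-in command; the real service would invoke yt-dlp here.
print(run_with_timeout([sys.executable, "-c", "print('ok')"], read_timeout()))
```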
---
## 🧠 Design Notes
* Uses `yt-dlp` **only for metadata and captions**
* No Redis, database, or background workers
* Fully stateless and container-friendly
* Designed to fail safely with clear error responses
---
## ⚠️ Notes on Reliability
This project depends on **YouTube availability and yt-dlp behavior**.
On cloud platforms, requests may occasionally fail due to:
* IP-based rate limiting
* YouTube bot detection
* Regional consent pages or throttling
When this happens, the API returns a structured error instead of crashing.
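
Clients may want to retry transient failures. The sketch below assumes the error body has the shape produced by `core/errors.py` in this commit, which FastAPI wraps as `{"detail": {"error": ..., "code": ...}}`. The retry policy, the `TranscriptError` class, and `call_with_retries` are illustrative assumptions, not part of the project.

```python
import time
from typing import Any, Callable

# Assumption: codes raised by services/ytdlp.py that may be worth retrying.
RETRYABLE_CODES = {"TIMEOUT", "YTDLP_ERROR"}


class TranscriptError(Exception):
    """Client-side wrapper for a structured API error (hypothetical)."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code


def parse_error(status: int, body: dict) -> TranscriptError:
    """Convert a FastAPI error body into a TranscriptError."""
    detail = body.get("detail")
    if isinstance(detail, dict):
        return TranscriptError(
            status, detail.get("code", "UNKNOWN"), detail.get("error", "")
        )
    # Validation errors and plain strings do not carry a code.
    return TranscriptError(status, "UNKNOWN", str(detail))


def call_with_retries(
    fetch: Callable[[], Any],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch(), retrying retryable errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fetch()
        except TranscriptError as exc:
            if exc.code not in RETRYABLE_CODES or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Errors such as `NO_CAPTIONS` or `INVALID_URL` are raised immediately in this sketch, since repeating the request would not change the outcome.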
---
## ⚠️ Limitations
* Does **not** download audio or video
* Does **not** perform speech-to-text
* Captions must already exist on YouTube
* Shorts and embedded-player URLs are not a primary target; the URL validator currently accepts only `/watch` and `youtu.be` links
---
## 📜 License
MIT License
---
## 🙌 Credits
* FastAPI — [https://fastapi.tiangolo.com/](https://fastapi.tiangolo.com/)
* yt-dlp — [https://github.com/yt-dlp/yt-dlp](https://github.com/yt-dlp/yt-dlp)
---
### ✅ Status
* Docker tested
* Real-world URLs tested
* Cloud-friendly
* Ready for open-source use

0
__init__.py Normal file

0
app/__init__.py Normal file

19
app/main.py Normal file

@ -0,0 +1,19 @@
from fastapi import FastAPI

from routes.health import router as health_router
from routes.transcript import router as transcript_router

app = FastAPI(
    title="YouTube Transcript API",
    description="Caption-only YouTube transcript extraction (no downloads)",
    version="1.0.0",
)


@app.get("/")
def root():
    return {
        "name": "YouTube Transcript API",
        "docs": "/docs"
    }


app.include_router(health_router)
app.include_router(transcript_router)

19
core/config.py Normal file

@ -0,0 +1,19 @@
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # pydantic v2 configuration; replaces the deprecated inner `class Config`.
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    app_env: str = "production"
    request_timeout: int = 25  # seconds; set via REQUEST_TIMEOUT
    max_video_duration: int = 7200

    enable_redis: bool = False
    enable_postgres: bool = False
    enable_rate_limit: bool = False


settings = Settings()

21
core/errors.py Normal file

@ -0,0 +1,21 @@
from typing import NoReturn

from fastapi import HTTPException


def bad_request(message: str, code: str = "BAD_REQUEST") -> NoReturn:
    # Always raises; FastAPI serializes the dict under the "detail" key.
    raise HTTPException(
        status_code=400,
        detail={
            "error": message,
            "code": code,
        },
    )


def not_found(message: str, code: str = "NOT_FOUND") -> NoReturn:
    raise HTTPException(
        status_code=404,
        detail={
            "error": message,
            "code": code,
        },
    )

6
requirements.txt Normal file

@ -0,0 +1,6 @@
fastapi==0.128.0
uvicorn==0.40.0
pydantic==2.12.5
pydantic-settings==2.12.0
webvtt-py==0.5.1
yt-dlp==2026.1.31

8
routes/health.py Normal file

@ -0,0 +1,8 @@
from fastapi import APIRouter

router = APIRouter()


@router.get("/health")
def health():
    return {"status": "ok"}

45
routes/transcript.py Normal file

@ -0,0 +1,45 @@
from fastapi import APIRouter, Query

from utils.validators import validate_youtube_url
from utils.filesystem import temp_dir
from core.errors import not_found
from services.ytdlp import extract_metadata_and_captions
from services.captions import parse_vtt
from services.metadata import normalize_metadata
from schemas.transcript import TranscriptResponse

router = APIRouter()


@router.post("/transcript", response_model=TranscriptResponse)
def transcript(
    url: str = Query(..., description="YouTube video URL"),
):
    validate_youtube_url(url)

    with temp_dir() as tmp:
        metadata, caption_files = extract_metadata_and_captions(url, tmp)

        if not caption_files:
            not_found("No captions available for this video", "NO_CAPTIONS")

        # Prefer human-written captions; fall back to auto-generated ones.
        human = [p for p in caption_files if "auto" not in p.name.lower()]
        auto = [p for p in caption_files if "auto" in p.name.lower()]

        if human:
            caption_path = human[0]
            source = "human"
        elif auto:
            caption_path = auto[0]
            source = "auto"
        else:
            not_found("No captions available", "NO_CAPTIONS")

        # Parse while the temporary directory still exists.
        captions = parse_vtt(str(caption_path))
        video = normalize_metadata(metadata)

    return TranscriptResponse(
        video=video,
        captions=captions,
        language="auto",
        source=source,
    )

6
schemas/error.py Normal file

@ -0,0 +1,6 @@
from pydantic import BaseModel


class ErrorResponse(BaseModel):
    error: str
    code: str

23
schemas/transcript.py Normal file
View File

@ -0,0 +1,23 @@
from pydantic import BaseModel
from typing import List, Literal


class CaptionSegment(BaseModel):
    start: float
    end: float
    text: str


class VideoMetadata(BaseModel):
    id: str
    title: str
    channel: str
    duration: int
    url: str


class TranscriptResponse(BaseModel):
    video: VideoMetadata
    captions: List[CaptionSegment]
    language: str
    source: Literal["human", "auto"]

51
services/captions.py Normal file

@ -0,0 +1,51 @@
import webvtt
from schemas.transcript import CaptionSegment
from typing import List


def parse_vtt(path: str) -> List[CaptionSegment]:
    segments = []

    for caption in webvtt.read(path):
        segments.append(
            CaptionSegment(
                start=_to_seconds(caption.start),
                end=_to_seconds(caption.end),
                text=caption.text.strip(),
            )
        )

    return dedupe_segments(segments)


def _to_seconds(ts: str) -> float:
    # Expects an "HH:MM:SS.mmm" timestamp.
    h, m, rest = ts.split(":")
    s, ms = rest.split(".")
    return (
        int(h) * 3600
        + int(m) * 60
        + int(s)
        + int(ms) / 1000
    )


def dedupe_segments(segments: List[CaptionSegment]) -> List[CaptionSegment]:
    cleaned = []

    for seg in segments:
        text = seg.text.strip()
        if not text:
            continue

        if cleaned:
            prev = cleaned[-1]
            prev_text = prev.text.strip()

            # Rolling captions repeat the previous text; keep the longer segment.
            if prev_text and prev_text in text:
                cleaned[-1] = seg
                continue

        cleaned.append(seg)

    return cleaned

11
services/metadata.py Normal file

@ -0,0 +1,11 @@
from schemas.transcript import VideoMetadata


def normalize_metadata(raw: dict) -> VideoMetadata:
    return VideoMetadata(
        id=raw["id"],
        title=raw["title"],
        # `or` also covers keys that are present but set to None.
        channel=raw.get("uploader") or "",
        duration=int(raw.get("duration") or 0),
        url=raw["webpage_url"],
    )

50
services/ytdlp.py Normal file

@ -0,0 +1,50 @@
import json
import subprocess
from pathlib import Path
from typing import Tuple, List

from core.errors import bad_request
from core.config import settings


def extract_metadata_and_captions(
    url: str,
    workdir: str,
) -> Tuple[dict, List[Path]]:
    # Fetch metadata and subtitle files only; no media is downloaded.
    cmd = [
        "yt-dlp",
        "--skip-download",
        "--write-subs",
        "--write-auto-subs",
        "--sub-format", "vtt",
        "--no-playlist",
        "--print-json",
        "-o", f"{workdir}/%(id)s",
        url,
    ]

    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            check=True,
            timeout=settings.request_timeout,
        )
    except subprocess.TimeoutExpired:
        bad_request("yt-dlp timed out", "TIMEOUT")
    except subprocess.CalledProcessError:
        bad_request("Failed to extract video data", "YTDLP_ERROR")

    lines = result.stdout.splitlines()
    if not lines:
        bad_request("No metadata returned from yt-dlp", "EMPTY_RESPONSE")

    # The first stdout line is the JSON metadata document.
    try:
        metadata = json.loads(lines[0])
    except json.JSONDecodeError:
        bad_request("Invalid metadata returned from yt-dlp", "INVALID_METADATA")

    subtitle_files = list(Path(workdir).glob("*.vtt"))

    return metadata, subtitle_files

5
utils/filesystem.py Normal file

@ -0,0 +1,5 @@
from tempfile import TemporaryDirectory


def temp_dir():
    return TemporaryDirectory()

28
utils/validators.py Normal file

@ -0,0 +1,28 @@
from urllib.parse import urlparse, parse_qs
from core.errors import bad_request

YOUTUBE_DOMAINS = ("youtube.com", "www.youtube.com", "youtu.be")


def validate_youtube_url(url: str):
    try:
        parsed = urlparse(url)
    except Exception:
        bad_request("Invalid YouTube URL", "INVALID_URL")

    if parsed.netloc not in YOUTUBE_DOMAINS:
        bad_request("Invalid YouTube URL", "INVALID_URL")

    if parsed.netloc == "youtu.be":
        if not parsed.path.strip("/"):
            bad_request("Invalid YouTube video URL", "INVALID_URL")
        return

    if parsed.path == "/watch":
        qs = parse_qs(parsed.query)
        if "v" not in qs or not qs["v"][0]:
            bad_request("Invalid YouTube video URL", "INVALID_URL")
        return

    bad_request("Invalid YouTube video URL", "INVALID_URL")