Transcripts¶
The YouTube Toolkit provides two transcript extraction flows: a synchronous single-video endpoint and an async batch endpoint for processing multiple URLs in one job.
Transcript endpoints¶
| Endpoint | Purpose |
|---|---|
| GET /v1/youtube/transcript | Extract captions for a single YouTube video |
| POST /v1/youtube/transcript/batch | Submit up to 20 URLs for async batch extraction |
| GET /v1/youtube/transcript/batch/{job_id} | Poll the status and results of a batch job |
REST API examples¶
Get a transcript¶
curl -G "https://youtube.toolkitapi.io/v1/youtube/transcript" \
  --data-urlencode "url=https://www.youtube.com/watch?v=dQw4w9WgXcQ" \
  -H "X-API-Key: YOUR_KEY"
const params = new URLSearchParams({ url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ" });
const resp = await fetch(`https://youtube.toolkitapi.io/v1/youtube/transcript?${params}`, {
  headers: { "X-API-Key": "YOUR_KEY" },
});
const data = await resp.json();
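// The documented response carries segments in the content array; join their text for a preview.
console.log(data.content.map((segment) => segment.text).join(" ").substring(0, 200));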
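Because the video URL itself contains `?` and `=`, it is safest to percent-encode it before placing it in the query string. The following is a minimal Python sketch using only the standard library; the helper name `build_transcript_url` is hypothetical and not part of any SDK.

```python
from urllib.parse import urlencode

# Endpoint taken from the endpoint table above.
TRANSCRIPT_ENDPOINT = "https://youtube.toolkitapi.io/v1/youtube/transcript"


def build_transcript_url(video_url: str) -> str:
    """Return the GET URL with the video URL percent-encoded (hypothetical helper)."""
    return f"{TRANSCRIPT_ENDPOINT}?{urlencode({'url': video_url})}"


request_url = build_transcript_url("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
print(request_url)
```

The resulting string can be passed to any HTTP client together with the `X-API-Key` header shown in the examples above.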
Batch transcript request¶
curl -X POST "https://youtube.toolkitapi.io/v1/youtube/transcript/batch" \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ", "https://www.youtube.com/watch?v=abc123"]}'
Python SDK examples¶
Single transcript¶
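from toolkitapi import Media

with Media(api_key="tk_...") as media:
    # Fetch the transcript for a single video.
    result = media.youtube_transcript("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
    # Print the returned language code and the number of segments.
    print(result["lang"], len(result["content"]), "segments")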
Request a specific language¶
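from toolkitapi import Media

with Media(api_key="tk_...") as media:
    # lang is a preference; check result["lang"] for the language actually returned.
    result = media.youtube_transcript(
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
        lang="es",
    )
    # Print each segment with its start offset in milliseconds.
    for segment in result["content"]:
        print(f"[{segment['offset']}ms] {segment['text']}")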
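Since `lang` is a preference rather than a guarantee, it can help to compare the returned `lang` against the one requested. The sketch below assumes only the documented `lang` and `availableLangs` response fields; the helper `check_language` and the sample dictionary are hypothetical and hand-written for illustration.

```python
def check_language(result: dict, requested: str) -> bool:
    """Return True if the transcript came back in the requested language."""
    returned = result.get("lang")
    if returned == requested:
        return True
    # Report what was returned and which languages the video offers.
    available = result.get("availableLangs", [])
    print(f"Requested {requested!r}, got {returned!r}; available: {available}")
    return False


# Hand-written sample shaped like the documented transcript response.
sample = {"lang": "en", "availableLangs": ["en", "de"], "content": []}
matched = check_language(sample, "es")
```

A caller could use the return value to decide whether to accept the fallback transcript or skip the video.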
Submit a batch job¶
from toolkitapi import Media

# Up to 20 URLs are accepted per job.
urls = [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=9bZkp7q19f0",
    "https://youtu.be/kffacxfA7G4",
]

with Media(api_key="tk_...") as media:
    job = media.youtube_transcript_batch({"urls": urls, "lang": "en"})
    # Keep the job ID; it is needed to poll for results.
    print(job["jobId"])
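The batch endpoint accepts at most 20 URLs per job, so longer lists need to be split before submission. One way to do that is sketched below; `split_into_jobs` and `MAX_URLS_PER_JOB` are hypothetical names, and the URLs are placeholders.

```python
MAX_URLS_PER_JOB = 20  # per-job limit stated for the batch endpoint


def split_into_jobs(urls: list[str], size: int = MAX_URLS_PER_JOB) -> list[list[str]]:
    """Split a URL list into consecutive groups of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


# Placeholder URLs for illustration only.
all_urls = [f"https://www.youtube.com/watch?v=video{i}" for i in range(45)]
jobs = split_into_jobs(all_urls)
print([len(group) for group in jobs])  # [20, 20, 5]
```

Each group can then be passed as the `urls` value of its own `youtube_transcript_batch` call.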
Poll batch job status¶
import time

from toolkitapi import Media

with Media(api_key="tk_...") as media:
    job_id = "abc-123"

    # Poll every 2 seconds until the job reaches a terminal state.
    while True:
        status = media.youtube_transcript_batch_status(job_id)
        if status["status"] in ("completed", "failed"):
            break
        time.sleep(2)

    # Report the outcome for each submitted URL.
    for item in status["results"]:
        if item["status"] == "succeeded":
            print(f"{item['id']}: {len(item['content'])} segments")
        else:
            print(f"{item['id']}: FAILED: {item['error']}")
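The loop above polls indefinitely. A bounded variant might look like the sketch below. The name `wait_for_job`, its `timeout` and `interval` defaults, and the injected `fetch_status` callable are all assumptions for illustration, not SDK features. The stand-in at the bottom replaces the real status call so the sketch runs without network access.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    timeout: float = 120.0,   # assumed default, tune to your workload
    interval: float = 2.0,    # assumed default, matches the loop above
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_status(job_id) until the job is terminal or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        if status["status"] in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} still {status['status']!r} after {timeout}s")
        sleep(interval)


# Stand-in for media.youtube_transcript_batch_status, so the sketch runs offline.
_responses = iter([
    {"status": "pending"},
    {"status": "active"},
    {"status": "completed", "results": []},
])
final = wait_for_job(lambda _job_id: next(_responses), "abc-123", sleep=lambda _s: None)
print(final["status"])
```

With the real SDK, `fetch_status` would presumably be `media.youtube_transcript_batch_status`, called inside the `with Media(...)` block.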
Transcript response fields¶
| Field | Type | Description |
|---|---|---|
| id | string | YouTube video ID |
| url | string | Canonical video URL |
| lang | string | Language code of the returned transcript |
| availableLangs | string[] | All language codes available for this video |
| content | array | Transcript segments (see below) |
content[] segment fields¶
| Field | Type | Description |
|---|---|---|
| text | string | Segment text |
| offset | integer | Start time in milliseconds |
| duration | integer | Duration in milliseconds |
| lang | string | Segment language code |
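Because `offset` is expressed in milliseconds, a small conversion is usually needed before showing timestamps to readers. The sketch below assumes only the documented `text` and `offset` fields; `format_timestamp`, `to_lines`, and the sample segments are hypothetical.

```python
def format_timestamp(offset_ms: int) -> str:
    """Convert a millisecond offset to MM:SS, or H:MM:SS from one hour onward."""
    total_seconds = offset_ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes:02d}:{seconds:02d}"


def to_lines(content: list[dict]) -> list[str]:
    """Render each segment as '[timestamp] text'."""
    return [f"[{format_timestamp(seg['offset'])}] {seg['text']}" for seg in content]


# Hand-written segments shaped like the content[] fields in the table above.
sample_content = [
    {"text": "Hello", "offset": 0, "duration": 1500, "lang": "en"},
    {"text": "and welcome", "offset": 61500, "duration": 2000, "lang": "en"},
]
lines = to_lines(sample_content)
print("\n".join(lines))
```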
Batch job fields¶
Submit response (POST /v1/youtube/transcript/batch)¶
| Field | Type | Description |
|---|---|---|
| jobId | string | Unique job identifier for polling |
Status response (GET /v1/youtube/transcript/batch/{job_id})¶
| Field | Type | Description |
|---|---|---|
| jobId | string | Job identifier |
| status | string | pending, active, or completed |
| results | array | Per-URL results (see below) |
| stats | object | total, succeeded, failed counts |
| completedAt | string | ISO 8601 completion timestamp (nullable) |
results[] item fields¶
| Field | Type | Description |
|---|---|---|
| id | string | Video ID |
| url | string | Video URL |
| status | string | succeeded or failed |
| content | array | Transcript segments (when succeeded) |
| lang | string | Detected language (when succeeded) |
| error | string | Error message (when failed) |
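A finished job can mix succeeded and failed items, so it is often useful to tally them and collect the error messages in one place. The sketch below relies only on the `results[]` fields listed above; `summarize_results` is a hypothetical helper, and the sample items, including the error text, are made up for illustration.

```python
def summarize_results(results: list[dict]) -> dict:
    """Count succeeded and failed items and map failed video IDs to their errors."""
    failed = [item for item in results if item["status"] != "succeeded"]
    return {
        "total": len(results),
        "succeeded": len(results) - len(failed),
        "failed": len(failed),
        "errors": {item["id"]: item.get("error") for item in failed},
    }


# Made-up items shaped like the results[] fields; the error text is illustrative.
sample_results = [
    {"id": "dQw4w9WgXcQ", "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
     "status": "succeeded", "content": [], "lang": "en"},
    {"id": "abc123", "url": "https://www.youtube.com/watch?v=abc123",
     "status": "failed", "error": "example error message"},
]
summary = summarize_results(sample_results)
print(summary)
```

The counts can be cross-checked against the `stats` object in the status response.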
Tips¶
- The lang parameter is a preference: if the requested language is unavailable, the API returns the closest available language. Check the lang field in the response to confirm what was actually returned.
- availableLangs lists all auto-generated and manually added captions for the video.
- Batch jobs expire after a short window, so poll promptly after submission and persist results on your end.
- For AI/RAG pipelines, the content array gives you clean, timestamped segments ready for chunking and embedding.
- The batch endpoint accepts up to 20 URLs per job. For larger sets, split into multiple jobs and submit concurrently.
- https://youtube.toolkitapi.io/tools/youtube/youtube-transcript/
- https://youtube.toolkitapi.io/tools/youtube/youtube-transcript-batch/
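For the AI/RAG tip above, one simple chunking approach is to group consecutive segments until a character budget is reached, keeping the start and end times of each chunk. This is a sketch under assumptions, not a recommended pipeline: `chunk_segments`, the `max_chars` budget, and the output keys `start_ms` and `end_ms` are all hypothetical.

```python
def chunk_segments(content: list[dict], max_chars: int = 500) -> list[dict]:
    """Group consecutive segments into chunks of at most max_chars characters.

    A single segment longer than max_chars becomes its own chunk.
    """
    chunks: list[dict] = []
    texts: list[str] = []
    start = None
    end = 0
    length = 0
    for seg in content:
        # Account for the joining space when the chunk already has text.
        added = len(seg["text"]) + (1 if texts else 0)
        if texts and length + added > max_chars:
            chunks.append({"text": " ".join(texts), "start_ms": start, "end_ms": end})
            texts, start, length = [], None, 0
            added = len(seg["text"])
        if start is None:
            start = seg["offset"]
        texts.append(seg["text"])
        length += added
        end = seg["offset"] + seg["duration"]
    if texts:
        chunks.append({"text": " ".join(texts), "start_ms": start, "end_ms": end})
    return chunks


# Hand-written segments shaped like the documented content[] items.
segments = [
    {"text": "aaaa", "offset": 0, "duration": 1000},
    {"text": "bbbb", "offset": 1000, "duration": 1000},
    {"text": "cccc", "offset": 2000, "duration": 1000},
]
chunks = chunk_segments(segments, max_chars=9)
print(chunks)
```

Keeping `start_ms` and `end_ms` on each chunk lets a retrieval result link back to the matching point in the video.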