Subtitles & Transcription
There are three paths to add subtitles to a rendered video. Choose based on whether you already have captions and your preferred workflow.
The Three Paths
| Path | Method | When to Use |
|---|---|---|
| Path 1 | Auto-transcribe at render time | Simplest. No pre-work. Transcription happens inside the render job. |
| Path 2 | Standalone transcription (reuse captions) | Transcribe once, reuse captions across multiple renders. More efficient if rendering same video multiple ways. |
| Path 3 | Manual SRT upload & parse | You already have an SRT/VTT/ASS/LRC file. Parse it and use the captions. |
Path 1 (auto-transcribe) is the easiest and recommended default for most use cases.
Path 1: Auto-Transcribe at Render Time (Recommended)
The easiest path. Include auto_transcribe: true in your subtitle block and point it at a video/audio block. The render worker handles everything.
Step-by-Step
Step 1 — Upload your video:
curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-F "file=@my-video.mp4" \
https://peako.shin0x.space/api/assets/upload
# Response:
# { "url": "https://peako.shin0x.space/assets/video-uuid.mp4" }
Step 2 — Submit render with auto-transcribe:
curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"blocks": {
"hero-video": {
"upload_url": "https://peako.shin0x.space/assets/video-uuid.mp4"
},
"subtitle-main": {
"auto_transcribe": true,
"transcribe_from": "hero-video"
}
},
"outputFormat": "mp4",
"delivery": "async"
}' \
https://peako.shin0x.space/api/templates/TEMPLATE_ID/render
# Response:
# { "jobId": "job-uuid", "status": "queued" }
Step 3 — Poll until done:
while true; do
RESULT=$(curl -s -H "X-API-Key: YOUR_KEY" \
https://peako.shin0x.space/api/jobs/job-uuid)
STATUS=$(echo "$RESULT" | jq -r '.status')
echo "Status: $STATUS"
[ "$STATUS" = "done" ] && break
[ "$STATUS" = "failed" ] && exit 1
sleep 5
done
OUTPUT=$(echo "$RESULT" | jq -r '.outputUrl')
echo "Output: $OUTPUT"
Step 4 — Download the result:
curl -o video-with-captions.mp4 "$OUTPUT"
What Happens
- The render worker downloads the source video
- Extracts the audio track
- Runs faster-whisper (speech-to-text) on the audio
- Generates timed captions automatically
- Burns the captions into the rendered video
- Returns the final output
Duration: Add ~15-30 seconds to render time for transcription.
Path 2: Standalone Transcription (For Reuse)
Use this when you want to transcribe a video once and reuse the captions across multiple renders. More efficient than Path 1 if you're rendering the same content multiple times.
Step-by-Step
Step 1 — Upload your video:
VIDEO_URL=$(curl -s -X POST \
-H "X-API-Key: YOUR_KEY" \
-F "file=@my-video.mp4" \
https://peako.shin0x.space/api/assets/upload | jq -r '.url')
echo "Video: $VIDEO_URL"
Step 2 — Submit a transcription job:
TRANSCRIBE_RESULT=$(curl -s -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d "{\"assetUrl\": \"$VIDEO_URL\"}" \
https://peako.shin0x.space/api/transcribe)
TRANSCRIBE_JOB=$(echo "$TRANSCRIBE_RESULT" | jq -r '.jobId')
echo "Transcription job: $TRANSCRIBE_JOB"
Step 3 — Poll transcription until done:
while true; do
JOB=$(curl -s -H "X-API-Key: YOUR_KEY" \
https://peako.shin0x.space/api/jobs/$TRANSCRIBE_JOB)
STATUS=$(echo "$JOB" | jq -r '.status')
if [ "$STATUS" = "done" ]; then
CAPTIONS=$(echo "$JOB" | jq '.captions')
break
fi
[ "$STATUS" = "failed" ] && echo "Transcription failed" && exit 1
sleep 5
done
echo "Captions ready:"
echo "$CAPTIONS" | jq .
Captions response structure:
{
"jobId": "transcribe-uuid",
"status": "done",
"captions": [
{ "id": "1", "from": 0, "to": 2300, "text": "Hey everyone, welcome back." },
{ "id": "2", "from": 2300, "to": 5100, "text": "Today we're diving into the API." },
{ "id": "3", "from": 5100, "to": 7800, "text": "Let's start with authentication." }
]
}
Step 4 — Use captions in a render (or multiple renders!):
CAPTIONS_JSON=$(echo "$CAPTIONS" | jq -c '.')
# Render 1
curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"captions\": $CAPTIONS_JSON,
\"style\": {
\"fontFamily\": \"Arial\",
\"fontSize\": 28,
\"color\": \"#FFFFFF\",
\"align\": \"center\"
}
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/TEMPLATE_ID_1/render
# Render 2 (same captions, different template)
curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"captions\": $CAPTIONS_JSON,
\"style\": {
\"fontFamily\": \"Inter\",
\"fontSize\": 32,
\"color\": \"#FFD700\",
\"align\": \"center\"
}
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/TEMPLATE_ID_2/render
When to Use Path 2
✅ You'll render the same video in multiple templates
✅ You want to separate transcription from rendering
✅ You need precise control over caption timing
✅ You want to modify/edit captions before rendering
Advantages
- Reusable: Same captions in multiple renders
- Faster renders: No re-transcription, just use cached captions
- Flexible: Edit or filter captions before rendering
Path 3: Manual SRT Upload & Parse
Use this when you already have an SRT/VTT/ASS/LRC file (from a video editor, external transcription service, etc.).
Step-by-Step
Step 1 — Parse your subtitle file:
PARSE_RESULT=$(curl -s -X POST \
-H "X-API-Key: YOUR_KEY" \
-F "file=@subtitles.srt" \
https://peako.shin0x.space/api/subtitles/parse)
echo "$PARSE_RESULT"
# {
# "captions": [
# { "id": "1", "from": 0, "to": 2000, "text": "First subtitle" },
# { "id": "2", "from": 2000, "to": 4500, "text": "Second subtitle" }
# ]
# }
Step 2 — Use parsed captions in render (same as Path 2, step 4).
Supported Subtitle Formats
| Format | Extension | Example |
|---|---|---|
| SubRip | .srt | Standard subtitle format (most common) |
| WebVTT | .vtt | Web video text tracks format |
| ASS/SSA | .ass, .ssa | Advanced SubStation Alpha (supports styling) |
| LRC | .lrc | Lyric format (timing only, no styling) |
Parse Subtitle File
POST /api/subtitles/parse
Parse a subtitle file into a captions array that can be used in render requests.
Headers:
X-API-Key: <your-api-key>(required)
Request Format: multipart/form-data
Fields:
file(required) — subtitle file (max 1 MB)
Supported Format Details
SRT (SubRip) Format
Most common subtitle format. Simple text-based.
Example:
1
00:00:00,000 --> 00:00:02,500
First subtitle line
2
00:00:02,500 --> 00:00:05,000
Second subtitle line
3
00:00:05,000 --> 00:00:08,200
Third subtitle line
Can be multi-line
Timing format: HH:MM:SS,mmm (hours, minutes, seconds, milliseconds)
WebVTT (Web Video Text Tracks)
Similar to SRT, designed for web video.
Example:
WEBVTT
00:00:00.000 --> 00:00:02.500
First subtitle
00:00:02.500 --> 00:00:05.000
Second subtitle
Timing format: HH:MM:SS.mmm (note: period instead of comma)
ASS/SSA (Advanced SubStation Alpha)
Advanced format that supports styling (font, color, positioning). Only timing and text are extracted.
Example:
[Script Info]
Title: My Subtitles
[V4+ Styles]
Format: Name, ...
Style: Default,...
[Events]
Format: Layer, Start, End, Style, ...
Dialogue: 0,0:00:00.00,0:00:02.00,Default,,0,0,0,,First subtitle
Dialogue: 0,0:00:02.00,0:00:05.00,Default,,0,0,0,,Second subtitle
Timing format: H:MM:SS.CC (hours, minutes, seconds, centiseconds)
LRC (Lyric Format)
Simple timing format used for song lyrics.
Example:
[00:00.00]First subtitle
[00:02.50]Second subtitle
[00:05.00]Third subtitle
Timing format: [MM:SS.mmm] (minutes, seconds, milliseconds)
Response
{
"captions": [
{
"id": "1",
"from": 0,
"to": 2000,
"text": "First subtitle"
},
{
"id": "2",
"from": 2000,
"to": 4500,
"text": "Second subtitle"
},
{
"id": "3",
"from": 4500,
"to": 8200,
"text": "Third subtitle"
}
]
}
Response Fields:
| Field | Type | Description |
|---|---|---|
captions | array | Array of caption objects |
captions[].id | string | Unique ID for this caption (auto-generated) |
captions[].from | number | Start time in milliseconds |
captions[].to | number | End time in milliseconds |
captions[].text | string | Caption text |
All times in response are in milliseconds, regardless of source format:
- SRT
HH:MM:SS,mmm→ ms - WebVTT
HH:MM:SS.mmm→ ms - ASS
H:MM:SS.CC(centiseconds) → ms - LRC
[MM:SS.mmm]→ ms
File Size Limit
Maximum 1 MB per subtitle file.
Error Codes
200— Success400— Invalid format, unsupported extension, or parse error401— Missing or invalid API key413— File exceeds 1 MB
Example:
curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-F "file=@my-subtitles.srt" \
https://peako.shin0x.space/api/subtitles/parse
# Response:
# {
# "captions": [
# { "id": "1", "from": 0, "to": 2000, "text": "Hello world" },
# { "id": "2", "from": 2000, "to": 5000, "text": "Welcome back" }
# ]
# }
Transcribe Asset
Start a standalone transcription job for a video or audio file. Returns a jobId that you can poll.
POST /api/transcribe
Headers:
X-API-Key: <your-api-key>(required)Content-Type: application/json
Request Body:
{
"assetUrl": "https://peako.shin0x.space/assets/video-uuid.mp4"
}
| Field | Type | Required | Description |
|---|---|---|---|
assetUrl | string | ✓ | URL of the video or audio file. Must be a Peako CDN URL (peako.shin0x.space/assets/...). No external URLs allowed. |
Response
{
"jobId": "550e8400-e29b-41d4-a716-446655440000"
}
Status: 202 Accepted (job queued, not yet complete)
Poll the Job
curl -X GET \
-H "X-API-Key: YOUR_KEY" \
https://peako.shin0x.space/api/jobs/550e8400-e29b-41d4-a716-446655440000
Response (while active):
{
"jobId": "550e8400-e29b-41d4-a716-446655440000",
"status": "active",
"progress": 0.45
}
Response (complete):
{
"jobId": "550e8400-e29b-41d4-a716-446655440000",
"status": "done",
"captions": [
{ "id": "1", "from": 0, "to": 2000, "text": "Audio content here" },
{ "id": "2", "from": 2000, "to": 4500, "text": "More audio content" }
]
}
Error Codes
202— Job queued400—assetUrlis not a valid Peako CDN URL401— Missing or invalid API key
Security Note
Only Peako CDN URLs are accepted (starting with peako.shin0x.space/assets/). External URLs are rejected to prevent SSRF attacks.
Example:
# Start transcription
TRANSCRIBE_JOB=$(curl -s -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"assetUrl": "https://peako.shin0x.space/assets/my-video-uuid.mp4"
}' \
https://peako.shin0x.space/api/transcribe | jq -r '.jobId')
echo "Transcription job: $TRANSCRIBE_JOB"
# Poll status
while true; do
JOB=$(curl -s -H "X-API-Key: YOUR_KEY" \
https://peako.shin0x.space/api/jobs/$TRANSCRIBE_JOB)
STATUS=$(echo "$JOB" | jq -r '.status')
if [ "$STATUS" = "done" ]; then
echo "$JOB" | jq '.captions'
break
fi
sleep 5
done
Transcription Technology
Peako uses faster-whisper (a CTranslate2 reimplementation of OpenAI Whisper) running locally on the server.
Characteristics
| Property | Details |
|---|---|
| Engine | faster-whisper (CTranslate2) |
| Language | Auto-detected. Supports 100+ languages. |
| Speed | Approximately 2–4x real-time on CPU. A 60-second video takes ~15–30 seconds to transcribe. |
| Accuracy | High on clear speech. Degrades with background noise, heavy accents, poor audio quality. |
| Concurrency | Max 2 concurrent transcription jobs per server |
| Max file size | 2 GB (same as upload limit) |
Performance Tips
- Optimize audio quality — Reduce background noise before transcribing for better accuracy
- Use Path 1 (auto-transcribe) for simple workflows — No extra steps
- Use Path 2 for batch workflows — Transcribe once, reuse captions
- Pre-transcribe during off-peak hours — Transcription jobs may queue during peak usage
Complete Example: Full Workflow
Here's a complete bash script combining all steps:
#!/bin/bash
set -e
API_KEY="your-api-key"
TEMPLATE_ID="your-template-id"
VIDEO_FILE="my-video.mp4"
echo "=== Peako Subtitle Workflow Example ==="
# Option 1: Path 1 (Auto-Transcribe at Render Time)
echo ""
echo "Option 1: Auto-transcribe at render time"
echo "==========================================="
VIDEO_URL=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-F "file=@$VIDEO_FILE" \
https://peako.shin0x.space/api/assets/upload | jq -r '.url')
echo "Video uploaded: $VIDEO_URL"
JOB_ID=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"auto_transcribe\": true,
\"transcribe_from\": \"hero-video\"
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/$TEMPLATE_ID/render | jq -r '.jobId')
echo "Render job: $JOB_ID"
echo "Polling status..."
while true; do
RESULT=$(curl -s -H "X-API-Key: $API_KEY" \
https://peako.shin0x.space/api/jobs/$JOB_ID)
STATUS=$(echo "$RESULT" | jq -r '.status')
echo "Status: $STATUS"
if [ "$STATUS" = "done" ]; then
OUTPUT=$(echo "$RESULT" | jq -r '.outputUrl')
curl -o output-path1.mp4 "$OUTPUT"
echo "✓ Path 1 complete: output-path1.mp4"
break
fi
[ "$STATUS" = "failed" ] && echo "✗ Failed" && exit 1
sleep 5
done
# Option 2: Path 2 (Standalone Transcription)
echo ""
echo "Option 2: Standalone transcription (reuse captions)"
echo "=================================================="
TRANSCRIBE_JOB=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{\"assetUrl\": \"$VIDEO_URL\"}" \
https://peako.shin0x.space/api/transcribe | jq -r '.jobId')
echo "Transcription job: $TRANSCRIBE_JOB"
while true; do
JOB=$(curl -s -H "X-API-Key: $API_KEY" \
https://peako.shin0x.space/api/jobs/$TRANSCRIBE_JOB)
STATUS=$(echo "$JOB" | jq -r '.status')
if [ "$STATUS" = "done" ]; then
CAPTIONS=$(echo "$JOB" | jq '.captions')
break
fi
sleep 5
done
CAPTIONS_JSON=$(echo "$CAPTIONS" | jq -c '.')
RENDER_JOB=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"captions\": $CAPTIONS_JSON,
\"style\": {
\"fontFamily\": \"Inter\",
\"fontSize\": 32,
\"color\": \"#FFFFFF\"
}
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/$TEMPLATE_ID/render | jq -r '.jobId')
while true; do
RESULT=$(curl -s -H "X-API-Key: $API_KEY" \
https://peako.shin0x.space/api/jobs/$RENDER_JOB)
STATUS=$(echo "$RESULT" | jq -r '.status')
if [ "$STATUS" = "done" ]; then
OUTPUT=$(echo "$RESULT" | jq -r '.outputUrl')
curl -o output-path2.mp4 "$OUTPUT"
echo "✓ Path 2 complete: output-path2.mp4"
break
fi
sleep 5
done
# Option 3: Path 3 (SRT Upload)
echo ""
echo "Option 3: SRT file upload & parse"
echo "=================================="
PARSE_RESULT=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-F "file=@subtitles.srt" \
https://peako.shin0x.space/api/subtitles/parse)
CAPTIONS=$(echo "$PARSE_RESULT" | jq '.captions')
CAPTIONS_JSON=$(echo "$CAPTIONS" | jq -c '.')
RENDER_JOB=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"captions\": $CAPTIONS_JSON
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/$TEMPLATE_ID/render | jq -r '.jobId')
while true; do
RESULT=$(curl -s -H "X-API-Key: $API_KEY" \
https://peako.shin0x.space/api/jobs/$RENDER_JOB)
STATUS=$(echo "$RESULT" | jq -r '.status')
if [ "$STATUS" = "done" ]; then
OUTPUT=$(echo "$RESULT" | jq -r '.outputUrl')
curl -o output-path3.mp4 "$OUTPUT"
echo "✓ Path 3 complete: output-path3.mp4"
break
fi
sleep 5
done
echo ""
echo "=== All paths complete! ==="
Next: Jobs