
Subtitles & Transcription

There are three paths to add subtitles to a rendered video. Choose based on whether you already have captions and your preferred workflow.

The Three Paths

Path | Method | When to Use
Path 1 | Auto-transcribe at render time | Simplest. No pre-work. Transcription happens inside the render job.
Path 2 | Standalone transcription (reuse captions) | Transcribe once, reuse captions across multiple renders. More efficient if you render the same video in multiple ways.
Path 3 | Manual SRT upload & parse | You already have an SRT/VTT/ASS/LRC file. Parse it and use the captions.
Tip: Path 1 (auto-transcribe) is the easiest and recommended default for most use cases.

Path 1: Auto-Transcribe at Render Time

The easiest path. Set auto_transcribe: true in your subtitle block and set transcribe_from to the name of the video or audio block to transcribe. The render worker handles everything.

Step-by-Step

Step 1 — Upload your video:

curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-F "file=@my-video.mp4" \
https://peako.shin0x.space/api/assets/upload

# Response:
# { "url": "https://peako.shin0x.space/assets/video-uuid.mp4" }

Step 2 — Submit render with auto-transcribe:

curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"blocks": {
"hero-video": {
"upload_url": "https://peako.shin0x.space/assets/video-uuid.mp4"
},
"subtitle-main": {
"auto_transcribe": true,
"transcribe_from": "hero-video"
}
},
"outputFormat": "mp4",
"delivery": "async"
}' \
https://peako.shin0x.space/api/templates/TEMPLATE_ID/render

# Response:
# { "jobId": "job-uuid", "status": "queued" }

Step 3 — Poll until done:

while true; do
  RESULT=$(curl -s -H "X-API-Key: YOUR_KEY" \
    https://peako.shin0x.space/api/jobs/job-uuid)
  STATUS=$(echo "$RESULT" | jq -r '.status')
  echo "Status: $STATUS"

  [ "$STATUS" = "done" ] && break
  [ "$STATUS" = "failed" ] && exit 1
  sleep 5
done

OUTPUT=$(echo "$RESULT" | jq -r '.outputUrl')
echo "Output: $OUTPUT"

Step 4 — Download the result:

curl -o video-with-captions.mp4 "$OUTPUT"

What Happens

  • The render worker downloads the source video
  • Extracts the audio track
  • Runs faster-whisper (speech-to-text) on the audio
  • Generates timed captions automatically
  • Burns the captions into the rendered video
  • Returns the final output

Duration: Transcription adds roughly 15-30 seconds to the render for each minute of source audio.


Path 2: Standalone Transcription (For Reuse)

Use this when you want to transcribe a video once and reuse the captions across multiple renders. More efficient than Path 1 if you're rendering the same content multiple times.

Step-by-Step

Step 1 — Upload your video:

VIDEO_URL=$(curl -s -X POST \
-H "X-API-Key: YOUR_KEY" \
-F "file=@my-video.mp4" \
https://peako.shin0x.space/api/assets/upload | jq -r '.url')

echo "Video: $VIDEO_URL"

Step 2 — Submit a transcription job:

TRANSCRIBE_RESULT=$(curl -s -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d "{\"assetUrl\": \"$VIDEO_URL\"}" \
https://peako.shin0x.space/api/transcribe)

TRANSCRIBE_JOB=$(echo "$TRANSCRIBE_RESULT" | jq -r '.jobId')
echo "Transcription job: $TRANSCRIBE_JOB"

Step 3 — Poll transcription until done:

while true; do
  JOB=$(curl -s -H "X-API-Key: YOUR_KEY" \
    https://peako.shin0x.space/api/jobs/$TRANSCRIBE_JOB)
  STATUS=$(echo "$JOB" | jq -r '.status')

  if [ "$STATUS" = "done" ]; then
    CAPTIONS=$(echo "$JOB" | jq '.captions')
    break
  fi

  [ "$STATUS" = "failed" ] && echo "Transcription failed" && exit 1
  sleep 5
done

echo "Captions ready:"
echo "$CAPTIONS" | jq .

Captions response structure:

{
"jobId": "transcribe-uuid",
"status": "done",
"captions": [
{ "id": "1", "from": 0, "to": 2300, "text": "Hey everyone, welcome back." },
{ "id": "2", "from": 2300, "to": 5100, "text": "Today we're diving into the API." },
{ "id": "3", "from": 5100, "to": 7800, "text": "Let's start with authentication." }
]
}
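
Because from and to are plain millisecond offsets, they can be hard to read at a glance. As a minimal sketch (the helper name ms_to_timestamp is hypothetical and not part of the Peako API), this shell function renders a millisecond value as an SRT-style HH:MM:SS,mmm timestamp:

```shell
# Hypothetical helper: format a millisecond offset as HH:MM:SS,mmm.
ms_to_timestamp() {
  ms=$1
  printf '%02d:%02d:%02d,%03d\n' \
    $((ms / 3600000)) \
    $((ms / 60000 % 60)) \
    $((ms / 1000 % 60)) \
    $((ms % 1000))
}

ms_to_timestamp 2300   # prints 00:00:02,300
```

This is only a readability aid for inspecting caption timing; the render API still expects the numeric millisecond values.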

Step 4 — Use captions in a render (or multiple renders!):

CAPTIONS_JSON=$(echo "$CAPTIONS" | jq -c '.')

# Render 1
curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"captions\": $CAPTIONS_JSON,
\"style\": {
\"fontFamily\": \"Arial\",
\"fontSize\": 28,
\"color\": \"#FFFFFF\",
\"align\": \"center\"
}
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/TEMPLATE_ID_1/render

# Render 2 (same captions, different template)
curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"captions\": $CAPTIONS_JSON,
\"style\": {
\"fontFamily\": \"Inter\",
\"fontSize\": 32,
\"color\": \"#FFD700\",
\"align\": \"center\"
}
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/TEMPLATE_ID_2/render
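
The requests above build the JSON body by interpolating shell variables into an escaped string, which is easy to get wrong. One possible alternative, sketched here under the assumption that jq is installed (it is already used throughout this page), is to let jq assemble the body. The function name build_render_body is hypothetical; --arg and --argjson are standard jq options.

```shell
# Hypothetical helper: build a render request body with jq instead of
# hand-escaped string interpolation.
# $1: uploaded video URL, $2: captions array as JSON text
build_render_body() {
  jq -n --arg url "$1" --argjson captions "$2" '{
    blocks: {
      "hero-video": { upload_url: $url },
      "subtitle-main": { captions: $captions }
    },
    outputFormat: "mp4",
    delivery: "async"
  }'
}

# Usage sketch (not executed here):
# curl -X POST -H "X-API-Key: YOUR_KEY" -H "Content-Type: application/json" \
#   -d "$(build_render_body "$VIDEO_URL" "$CAPTIONS_JSON")" \
#   https://peako.shin0x.space/api/templates/TEMPLATE_ID_1/render
```

Because jq handles quoting, caption text containing quotes or backslashes does not need manual escaping. A style object could be added to the subtitle-main block in the same way.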

When to Use Path 2

✅ You'll render the same video in multiple templates
✅ You want to separate transcription from rendering
✅ You need precise control over caption timing
✅ You want to modify/edit captions before rendering

Advantages

  • Reusable: Same captions in multiple renders
  • Faster renders: No re-transcription, just use cached captions
  • Flexible: Edit or filter captions before rendering
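
As an illustration of editing captions before rendering, the sketch below shifts every caption by a fixed offset and drops captions with empty text. It assumes jq is available and that captions have the from/to/text shape shown earlier; the function name shift_captions is hypothetical.

```shell
# Hypothetical helper: shift caption timing and drop empty captions.
# $1: captions array as JSON text, $2: offset in milliseconds (may be negative)
shift_captions() {
  printf '%s' "$1" | jq -c --argjson offset "$2" '
    map(select(.text != "") | .from += $offset | .to += $offset)'
}

# Usage sketch: delay all captions by half a second before rendering.
# CAPTIONS_JSON=$(shift_captions "$CAPTIONS" 500)
```

The result is still a captions array, so it can be passed to the render request in place of the original.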

Path 3: Manual SRT Upload & Parse

Use this when you already have an SRT/VTT/ASS/LRC file (from a video editor, external transcription service, etc.).

Step-by-Step

Step 1 — Parse your subtitle file:

PARSE_RESULT=$(curl -s -X POST \
-H "X-API-Key: YOUR_KEY" \
-F "file=@subtitles.srt" \
https://peako.shin0x.space/api/subtitles/parse)

echo "$PARSE_RESULT"
# {
# "captions": [
# { "id": "1", "from": 0, "to": 2000, "text": "First subtitle" },
# { "id": "2", "from": 2000, "to": 4500, "text": "Second subtitle" }
# ]
# }

Step 2 — Use parsed captions in render (same as Path 2, step 4).

Supported Subtitle Formats

Format | Extension | Description
SubRip | .srt | Standard subtitle format (most common)
WebVTT | .vtt | Web Video Text Tracks format
ASS/SSA | .ass, .ssa | Advanced SubStation Alpha (supports styling)
LRC | .lrc | Lyric format (timing only, no styling)

Parse Subtitle File

POST /api/subtitles/parse

Parse a subtitle file into a captions array that can be used in render requests.

Headers:

  • X-API-Key: <your-api-key> (required)

Request Format: multipart/form-data

Fields:

  • file (required) — subtitle file (max 1 MB)

Supported Format Details

SRT (SubRip) Format

Most common subtitle format. Simple text-based.

Example:

1
00:00:00,000 --> 00:00:02,500
First subtitle line

2
00:00:02,500 --> 00:00:05,000
Second subtitle line

3
00:00:05,000 --> 00:00:08,200
Third subtitle line
Can be multi-line

Timing format: HH:MM:SS,mmm (hours, minutes, seconds, milliseconds)

WebVTT (Web Video Text Tracks)

Similar to SRT, designed for web video.

Example:

WEBVTT

00:00:00.000 --> 00:00:02.500
First subtitle

00:00:02.500 --> 00:00:05.000
Second subtitle

Timing format: HH:MM:SS.mmm (note: period instead of comma)

ASS/SSA (Advanced SubStation Alpha)

Advanced format that supports styling (font, color, positioning). Only timing and text are extracted.

Example:

[Script Info]
Title: My Subtitles

[V4+ Styles]
Format: Name, ...
Style: Default,...

[Events]
Format: Layer, Start, End, Style, ...
Dialogue: 0,0:00:00.00,0:00:02.00,Default,,0,0,0,,First subtitle
Dialogue: 0,0:00:02.00,0:00:05.00,Default,,0,0,0,,Second subtitle

Timing format: H:MM:SS.CC (hours, minutes, seconds, centiseconds)

LRC (Lyric Format)

Simple timing format used for song lyrics.

Example:

[00:00.00]First subtitle
[00:02.50]Second subtitle
[00:05.00]Third subtitle

Timing format: [MM:SS.xx] (minutes, seconds, and a fractional part; the example above uses two-digit hundredths of a second)

Response

{
"captions": [
{
"id": "1",
"from": 0,
"to": 2000,
"text": "First subtitle"
},
{
"id": "2",
"from": 2000,
"to": 4500,
"text": "Second subtitle"
},
{
"id": "3",
"from": 4500,
"to": 8200,
"text": "Third subtitle"
}
]
}

Response Fields:

Field | Type | Description
captions | array | Array of caption objects
captions[].id | string | Unique ID for this caption (auto-generated)
captions[].from | number | Start time in milliseconds
captions[].to | number | End time in milliseconds
captions[].text | string | Caption text

Note: All times in the response are in milliseconds, regardless of source format:

  • SRT HH:MM:SS,mmm → ms
  • WebVTT HH:MM:SS.mmm → ms
  • ASS H:MM:SS.CC (centiseconds) → ms
  • LRC [MM:SS.xx] (two-digit fractions are hundredths of a second) → ms
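
The conversions above all reduce to the same arithmetic. The following is a rough sketch of that arithmetic, not Peako's actual parser: it splits a timestamp on ':', ',' and '.', treats the last field as a decimal fraction of a second (so two-digit and three-digit fractions both work), and folds the remaining fields into seconds. The function name ts_to_ms is hypothetical.

```shell
# Hypothetical helper: convert an SRT, WebVTT, ASS, or LRC timestamp to ms.
ts_to_ms() {
  t=${1#\[}   # drop a leading "[" (LRC)
  t=${t%\]}   # drop a trailing "]" (LRC)
  printf '%s\n' "$t" | awk -F'[:,.]' '{
    frac = $NF
    # Scale the fraction by its digit count: "50" -> 500 ms, "500" -> 500 ms
    ms = int(frac * 1000 / (10 ^ length(frac)) + 0.5)
    secs = 0
    # Remaining fields are [hours:]minutes:seconds
    for (i = 1; i < NF; i++) secs = secs * 60 + $i
    print secs * 1000 + ms
  }'
}

ts_to_ms '00:00:02,500'   # SRT    -> 2500
ts_to_ms '00:00:05.000'   # WebVTT -> 5000
ts_to_ms '0:00:02.00'     # ASS    -> 2000
ts_to_ms '[00:02.50]'     # LRC    -> 2500
```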

File Size Limit

Maximum 1 MB per subtitle file.

Error Codes

  • 200 — Success
  • 400 — Invalid format, unsupported extension, or parse error
  • 401 — Missing or invalid API key
  • 413 — File exceeds 1 MB
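
The 400 (unsupported extension) and 413 (file too large) cases can often be caught locally before uploading. The sketch below is one way to do that. It assumes "1 MB" means 1,048,576 bytes and that extensions are matched in lowercase; both are assumptions, and the server's own validation remains authoritative. The function name check_subtitle_file is hypothetical.

```shell
# Hypothetical pre-flight check for /api/subtitles/parse uploads.
check_subtitle_file() {
  file=$1
  max_bytes=1048576   # assumption: 1 MB = 1024 * 1024 bytes

  # Extensions listed under "Supported Subtitle Formats"
  case "$file" in
    *.srt|*.vtt|*.ass|*.ssa|*.lrc) ;;
    *) echo "unsupported extension: $file" >&2; return 1 ;;
  esac

  [ -f "$file" ] || { echo "not a file: $file" >&2; return 1; }

  # $(( )) normalizes any whitespace that wc may print around the number
  size=$(($(wc -c < "$file")))
  if [ "$size" -gt "$max_bytes" ]; then
    echo "file too large: $size bytes" >&2
    return 1
  fi
}

# Usage sketch:
# check_subtitle_file subtitles.srt && curl -X POST ... -F "file=@subtitles.srt" ...
```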

Example:

curl -X POST \
-H "X-API-Key: YOUR_KEY" \
-F "file=@my-subtitles.srt" \
https://peako.shin0x.space/api/subtitles/parse

# Response:
# {
# "captions": [
# { "id": "1", "from": 0, "to": 2000, "text": "Hello world" },
# { "id": "2", "from": 2000, "to": 5000, "text": "Welcome back" }
# ]
# }

Transcribe Asset

Start a standalone transcription job for a video or audio file. Returns a jobId that you can poll.

POST /api/transcribe

Headers:

  • X-API-Key: <your-api-key> (required)
  • Content-Type: application/json

Request Body:

{
"assetUrl": "https://peako.shin0x.space/assets/video-uuid.mp4"
}

Field | Type | Required | Description
assetUrl | string | Yes | URL of the video or audio file. Must be a Peako CDN URL (peako.shin0x.space/assets/...). External URLs are not allowed.

Response

{
"jobId": "550e8400-e29b-41d4-a716-446655440000"
}

Status: 202 Accepted (job queued, not yet complete)

Poll the Job

curl -X GET \
-H "X-API-Key: YOUR_KEY" \
https://peako.shin0x.space/api/jobs/550e8400-e29b-41d4-a716-446655440000

Response (while active):

{
"jobId": "550e8400-e29b-41d4-a716-446655440000",
"status": "active",
"progress": 0.45
}

Response (complete):

{
"jobId": "550e8400-e29b-41d4-a716-446655440000",
"status": "done",
"captions": [
{ "id": "1", "from": 0, "to": 2000, "text": "Audio content here" },
{ "id": "2", "from": 2000, "to": 4500, "text": "More audio content" }
]
}
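
A polling loop with no upper bound can hang if a job never reaches a terminal state. As a sketch of one way to bound it, the helper below takes the name of a command that prints the current status, a maximum number of attempts, and a sleep interval. The function name poll_until_done and its exit codes (0 done, 1 failed, 2 timed out) are illustrative choices, not part of the API.

```shell
# Hypothetical helper: poll until "done" or "failed", or give up.
# $1: command that prints the job status, $2: max attempts, $3: seconds between attempts
poll_until_done() {
  fetch_status=$1
  max_attempts=$2
  interval=$3
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$("$fetch_status")
    case "$status" in
      done)   return 0 ;;
      failed) return 1 ;;
    esac
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  return 2   # still queued or active after max_attempts
}

# Example fetcher for real use (requires curl and jq, as elsewhere on this page):
# job_status() {
#   curl -s -H "X-API-Key: $API_KEY" \
#     "https://peako.shin0x.space/api/jobs/$JOB_ID" | jq -r '.status'
# }
# poll_until_done job_status 60 5
```

Passing the fetcher as a command keeps the loop itself free of network calls, which makes it easy to exercise with a stub.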

Error Codes

  • 202 — Job queued
  • 400 — assetUrl is not a valid Peako CDN URL
  • 401 — Missing or invalid API key

Security Note

Only Peako CDN URLs are accepted (starting with peako.shin0x.space/assets/). External URLs are rejected to prevent SSRF attacks.
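
A client can apply the same prefix rule before calling the endpoint to avoid a round trip that would end in a 400. This is only a convenience check, sketched under the assumption that the https scheme is required (all examples on this page use it); the server performs the real validation. The function name is_peako_asset_url is hypothetical.

```shell
# Hypothetical client-side check mirroring the CDN URL rule.
is_peako_asset_url() {
  case "$1" in
    https://peako.shin0x.space/assets/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage sketch:
# is_peako_asset_url "$VIDEO_URL" || { echo "Upload the file to Peako first" >&2; exit 1; }
```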

Example:

# Start transcription
TRANSCRIBE_JOB=$(curl -s -X POST \
-H "X-API-Key: YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"assetUrl": "https://peako.shin0x.space/assets/my-video-uuid.mp4"
}' \
https://peako.shin0x.space/api/transcribe | jq -r '.jobId')

echo "Transcription job: $TRANSCRIBE_JOB"

# Poll status
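while true; do
  JOB=$(curl -s -H "X-API-Key: YOUR_KEY" \
    https://peako.shin0x.space/api/jobs/$TRANSCRIBE_JOB)
  STATUS=$(echo "$JOB" | jq -r '.status')

  if [ "$STATUS" = "done" ]; then
    echo "$JOB" | jq '.captions'
    break
  fi

  # Stop instead of looping forever if the job fails
  [ "$STATUS" = "failed" ] && echo "Transcription failed" && exit 1
  sleep 5
done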

Transcription Technology

Peako uses faster-whisper (a CTranslate2 reimplementation of OpenAI Whisper) running locally on the server.

Characteristics

Property | Details
Engine | faster-whisper (CTranslate2)
Language | Auto-detected. Supports 100+ languages.
Speed | Approximately 2–4x faster than real time on CPU. A 60-second video takes ~15–30 seconds to transcribe.
Accuracy | High on clear speech. Degrades with background noise, heavy accents, or poor audio quality.
Concurrency | Max 2 concurrent transcription jobs per server
Max file size | 2 GB (same as upload limit)
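
The speed figure translates into a simple estimate: at 2–4x faster than real time, transcription takes between one quarter and one half of the media duration. The sketch below applies that arithmetic with integer rounding up. It ignores queueing (at most 2 concurrent jobs per server) and audio quality, so treat the result as a rough range only. The function name estimate_transcribe_range is hypothetical.

```shell
# Hypothetical helper: rough transcription time range in seconds.
# $1: media duration in whole seconds
estimate_transcribe_range() {
  duration=$1
  best=$(( (duration + 3) / 4 ))    # 4x faster than real time, rounded up
  worst=$(( (duration + 1) / 2 ))   # 2x faster than real time, rounded up
  echo "${best}-${worst}"
}

estimate_transcribe_range 60    # prints 15-30
estimate_transcribe_range 600   # prints 150-300
```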

Performance Tips

  1. Optimize audio quality — Reduce background noise before transcribing for better accuracy
  2. Use Path 1 (auto-transcribe) for simple workflows — No extra steps
  3. Use Path 2 for batch workflows — Transcribe once, reuse captions
  4. Pre-transcribe during off-peak hours — Transcription jobs may queue during peak usage

Complete Example: Full Workflow

Here's a complete bash script combining all steps:

#!/bin/bash

set -e

API_KEY="your-api-key"
TEMPLATE_ID="your-template-id"
VIDEO_FILE="my-video.mp4"

echo "=== Peako Subtitle Workflow Example ==="

# Option 1: Path 1 (Auto-Transcribe at Render Time)
echo ""
echo "Option 1: Auto-transcribe at render time"
echo "==========================================="

VIDEO_URL=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-F "file=@$VIDEO_FILE" \
https://peako.shin0x.space/api/assets/upload | jq -r '.url')

echo "Video uploaded: $VIDEO_URL"

JOB_ID=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"auto_transcribe\": true,
\"transcribe_from\": \"hero-video\"
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/$TEMPLATE_ID/render | jq -r '.jobId')

echo "Render job: $JOB_ID"
echo "Polling status..."

while true; do
  RESULT=$(curl -s -H "X-API-Key: $API_KEY" \
    https://peako.shin0x.space/api/jobs/$JOB_ID)
  STATUS=$(echo "$RESULT" | jq -r '.status')

  echo "Status: $STATUS"

  if [ "$STATUS" = "done" ]; then
    OUTPUT=$(echo "$RESULT" | jq -r '.outputUrl')
    curl -o output-path1.mp4 "$OUTPUT"
    echo "✓ Path 1 complete: output-path1.mp4"
    break
  fi

  [ "$STATUS" = "failed" ] && echo "✗ Failed" && exit 1
  sleep 5
done

# Option 2: Path 2 (Standalone Transcription)
echo ""
echo "Option 2: Standalone transcription (reuse captions)"
echo "=================================================="

TRANSCRIBE_JOB=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{\"assetUrl\": \"$VIDEO_URL\"}" \
https://peako.shin0x.space/api/transcribe | jq -r '.jobId')

echo "Transcription job: $TRANSCRIBE_JOB"

while true; do
  JOB=$(curl -s -H "X-API-Key: $API_KEY" \
    https://peako.shin0x.space/api/jobs/$TRANSCRIBE_JOB)
  STATUS=$(echo "$JOB" | jq -r '.status')

  if [ "$STATUS" = "done" ]; then
    CAPTIONS=$(echo "$JOB" | jq '.captions')
    break
  fi

  # Stop instead of looping forever if the job fails
  [ "$STATUS" = "failed" ] && echo "✗ Transcription failed" && exit 1
  sleep 5
done

CAPTIONS_JSON=$(echo "$CAPTIONS" | jq -c '.')

RENDER_JOB=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"captions\": $CAPTIONS_JSON,
\"style\": {
\"fontFamily\": \"Inter\",
\"fontSize\": 32,
\"color\": \"#FFFFFF\"
}
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/$TEMPLATE_ID/render | jq -r '.jobId')

while true; do
  RESULT=$(curl -s -H "X-API-Key: $API_KEY" \
    https://peako.shin0x.space/api/jobs/$RENDER_JOB)
  STATUS=$(echo "$RESULT" | jq -r '.status')

  if [ "$STATUS" = "done" ]; then
    OUTPUT=$(echo "$RESULT" | jq -r '.outputUrl')
    curl -o output-path2.mp4 "$OUTPUT"
    echo "✓ Path 2 complete: output-path2.mp4"
    break
  fi

  # Stop instead of looping forever if the job fails
  [ "$STATUS" = "failed" ] && echo "✗ Failed" && exit 1
  sleep 5
done

# Option 3: Path 3 (SRT Upload)
echo ""
echo "Option 3: SRT file upload & parse"
echo "=================================="

PARSE_RESULT=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-F "file=@subtitles.srt" \
https://peako.shin0x.space/api/subtitles/parse)

CAPTIONS=$(echo "$PARSE_RESULT" | jq '.captions')
CAPTIONS_JSON=$(echo "$CAPTIONS" | jq -c '.')

RENDER_JOB=$(curl -s -X POST \
-H "X-API-Key: $API_KEY" \
-H "Content-Type: application/json" \
-d "{
\"blocks\": {
\"hero-video\": { \"upload_url\": \"$VIDEO_URL\" },
\"subtitle-main\": {
\"captions\": $CAPTIONS_JSON
}
},
\"outputFormat\": \"mp4\",
\"delivery\": \"async\"
}" \
https://peako.shin0x.space/api/templates/$TEMPLATE_ID/render | jq -r '.jobId')

while true; do
  RESULT=$(curl -s -H "X-API-Key: $API_KEY" \
    https://peako.shin0x.space/api/jobs/$RENDER_JOB)
  STATUS=$(echo "$RESULT" | jq -r '.status')

  if [ "$STATUS" = "done" ]; then
    OUTPUT=$(echo "$RESULT" | jq -r '.outputUrl')
    curl -o output-path3.mp4 "$OUTPUT"
    echo "✓ Path 3 complete: output-path3.mp4"
    break
  fi

  # Stop instead of looping forever if the job fails
  [ "$STATUS" = "failed" ] && echo "✗ Failed" && exit 1
  sleep 5
done

echo ""
echo "=== All paths complete! ==="
