AI video glossary: faceless YouTube and AI video terms explained

The plain-English vocabulary of AI video and faceless YouTube, in one place. The terms that trip people up most are RPM vs CPM (what you keep vs what advertisers pay), character drift (a character changing between scenes) and the choice of video engine (Kling, Veo, Hailuo). Each definition links to a deeper guide where there is one.

AI video and faceless YouTube terms

Faceless YouTube channel

A channel that publishes videos with no on-camera presenter. The content is narration or music over visuals (animation, generated scenes, stock footage or b-roll), which makes it easy to produce at scale. How to make faceless videos.

Text-to-video

Turning written text (a script, a story or lyrics) into a finished video automatically: the text drives the voice or song, the storyboard, the generated scenes and the final edit. Story to video.

RPM (revenue per mille)

What a creator actually keeps per 1,000 video views, after ad fill rate, non-monetized views and YouTube's revenue share. RPM is almost always far lower than CPM and is the number that decides real income. RPM by country.

CPM (cost per mille)

What advertisers pay per 1,000 ad impressions. It is the big headline number, but it is not what the creator keeps. The gap between CPM and RPM comes from fill rate, non-monetized views and the platform's cut. CPM vs RPM.
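
A rough worked example of how the gap opens up, as a minimal sketch rather than YouTube's own formula. The function name and every number below are illustrative assumptions: a 10.00 CPM, 60% of views eligible for ads, an 80% fill rate on those views, and a 55% creator share.

```python
def estimate_rpm(cpm, monetized_share, fill_rate, creator_share):
    """Estimate creator revenue per 1,000 total views from an advertiser CPM.

    All inputs are assumptions for illustration:
      cpm             - what advertisers pay per 1,000 ad impressions
      monetized_share - fraction of views that are eligible to show an ad
      fill_rate       - fraction of eligible views where an ad is actually served
      creator_share   - fraction of ad revenue left after the platform's cut
    """
    # Ad impressions produced by 1,000 total views (one ad per served view assumed).
    impressions = 1000 * monetized_share * fill_rate
    # Gross advertiser spend on those impressions.
    gross = cpm * impressions / 1000
    # What the creator keeps per 1,000 views.
    return gross * creator_share


rpm = estimate_rpm(cpm=10.0, monetized_share=0.6, fill_rate=0.8, creator_share=0.55)
print(f"CPM 10.00 -> estimated RPM {rpm:.2f}")
```

Under these assumed inputs, a 10.00 CPM shrinks to roughly 2.64 per 1,000 views, which is why RPM is the number to plan income around.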

Character drift (identity drift)

When a generated character's face, clothing, proportions or art style change from one scene to the next, so a multi-scene video looks like it stars several different people. It is the most common giveaway of low-effort AI video. How to fix character drift.

Storyboard

The plan that breaks a script or song into a sequence of scenes, each lasting a few seconds, so the visuals can be generated and timed to match what is being said or sung at that moment.
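
One way to picture a storyboard is as a list of timed scenes. The sketch below is a hypothetical illustration, not how any particular tool does it: the `Scene` type, the `build_storyboard` function and the 5-second cap are all assumptions. It takes script lines that already have start and end times and cuts them into scenes of a few seconds each.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    index: int
    start: float  # seconds from the start of the audio
    end: float
    text: str     # what is being said or sung during this scene


def build_storyboard(lines, max_scene_seconds=5.0):
    """Cut timed script lines into scenes no longer than max_scene_seconds.

    `lines` is a list of (text, start, end) tuples; the timings are assumed
    to come from the narration or song audio.
    """
    if max_scene_seconds <= 0:
        raise ValueError("max_scene_seconds must be positive")
    scenes = []
    for text, start, end in lines:
        cursor = start
        # A long line is split into several scenes that share its text.
        while cursor < end:
            scene_end = min(cursor + max_scene_seconds, end)
            scenes.append(Scene(len(scenes), cursor, scene_end, text))
            cursor = scene_end
    return scenes


lines = [
    ("Once upon a time", 0.0, 4.0),
    ("a robot learned to paint", 4.0, 12.0),
]
storyboard = build_storyboard(lines)
for scene in storyboard:
    print(scene.index, scene.start, scene.end, scene.text)
```

Here the 8-second second line becomes two scenes, so each visual stays tied to the words heard at that moment.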

Scene

One short shot in an AI video, typically a few seconds long, made of a generated image that is then animated. A multi-minute video is built from many scenes stitched together.

Reference image

An image fed to the generator so it keeps a character, world or prop consistent across scenes. Pinning a few reference images is one of the strongest ways to prevent character drift. Consistency methods.

Video engine

The AI model that animates a still image into motion, such as Kling, Google Veo or Hailuo. Engines differ in resolution, scene length and cost, which is why the engine choice changes both quality and price. Kling vs Veo vs Hailuo.

TTS (text-to-speech)

AI narration: a voice model reads a script aloud. Modern TTS services (ElevenLabs, for example) offer many voices that can be filtered by language, gender and age and previewed before use.

Dubbing

Re-voicing a finished video into another language while keeping its tone and timing. Dubbing one video into several languages lets a single piece of content reach several ad markets at once.

Lyric video / music video

A video built around a song rather than narration. The audio is analysed so scene changes land on the song's phrasing, and the visuals are generated to match the mood of the track. AI music video generator.

Watch time

The total minutes people spend watching a channel's videos. YouTube weighs watch time heavily for reach and monetization, often more than raw view count.

Inauthentic content (YouTube policy)

YouTube's July 2025 policy (renamed from 'repetitious content') that makes mass-produced, template-like, easily replicable uploads ineligible for monetization. It targets low-effort spam, not AI itself, and AI-assisted channels remain eligible.

Credits

TubeTube's usage-based currency. A video reserves an estimated number of credits up front and the unused part of the hold is refunded, so you pay for what is actually produced.
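
The reserve-then-refund idea can be sketched in a few lines. This is a hypothetical illustration only: `settle_hold` is not a TubeTube function, the credit amounts are made up, and capping the charge at the reserved amount is an assumption about what happens if usage exceeds the estimate.

```python
def settle_hold(reserved, used):
    """Return (charged, refunded) once a video has finished rendering.

    reserved - credits held up front, based on the estimate
    used     - credits the finished video actually consumed
    """
    if reserved < 0 or used < 0:
        raise ValueError("credit amounts cannot be negative")
    # Assumption: the charge never exceeds the amount that was held.
    charged = min(used, reserved)
    # The unused part of the hold goes back to the balance.
    refunded = reserved - charged
    return charged, refunded


charged, refunded = settle_hold(reserved=120, used=85)
print(f"charged {charged} credits, refunded {refunded}")
```

With a hold of 120 credits and 85 actually consumed, 35 credits return to the balance.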
