Last updated: 8 May 2026 · By Luke Lv, Director, Lumira Studio
A training video script is the structured plan for what each person on screen will say, what each shot will show, and what each section will teach. Done well, it makes the difference between a training video that learners actually finish and one that gets paused and abandoned, and quietly fails. Most weak training videos are not the result of weak production. They are weak scripts that production tried to rescue.
What a training video script needs to do
A training video does three jobs at once: deliver information, hold attention long enough for the information to land, and leave the learner able to do something afterwards. The script is what makes all three possible. A good script answers, in order:
- What does the learner need to be able to do at the end?
- What does the learner already know, and what do they not know?
- What is the shortest path from current knowledge to required capability?
- How do we hold attention through that path?
Skip any of those and the script either teaches things the learner already knows, glosses over things they do not, or loses them in between.
Pre-planning: defining goals and learning objectives
Before any words go on the page, the script is shaped by the brief. The two questions that shape everything else:
- What learning outcome are we measuring? “Understand health and safety” is a topic, not an outcome. “Identify the three highest-risk failure modes on the rig and run the correct shutdown sequence” is an outcome.
- What does competence look like? If the learner could do one observable thing differently after watching, what would it be? That single behaviour is the script’s target.
Choose the right format for the content
Not every learning outcome suits the same video format. Match the format to the content:
| Format | Best for | Typical length |
|---|---|---|
| Talking head | Concepts, principles, context-setting | 2-5 min |
| Screen recording with voiceover | Software, tools, digital workflows | 3-8 min |
| Demonstration | Physical procedures, equipment, hands-on tasks | 3-10 min |
| Microlearning module | Single specific concepts, refresher content | 60-180s |
| Scenario / dramatisation | Soft skills, customer interaction, judgement calls | 4-8 min |
| Animated explainer | Abstract systems, processes too dangerous to film | 2-4 min |
Most training programmes use two or three of these formats together rather than picking one.
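As a planning aid, the typical lengths in the table can be turned into a quick check on a planned running time. This is a minimal sketch: the short format keys and the `check_length` helper are illustrative names, and the ranges are simply the table's figures converted to seconds.

```python
# Typical lengths from the table above, as (shortest, longest) in seconds.
TYPICAL_LENGTH_S = {
    "talking head": (120, 300),
    "screen recording": (180, 480),
    "demonstration": (180, 600),
    "microlearning": (60, 180),
    "scenario": (240, 480),
    "animated explainer": (120, 240),
}


def check_length(video_format: str, planned_s: int) -> str:
    """Say whether planned_s falls inside the typical range for the format."""
    low, high = TYPICAL_LENGTH_S[video_format]
    if planned_s < low:
        return "shorter than typical"
    if planned_s > high:
        return "longer than typical"
    return "within typical range"


print(check_length("microlearning", 90))
print(check_length("talking head", 420))
```

A result of "longer than typical" is a prompt to ask whether the content should be split or moved to another format, not a hard rule.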
A six-step framework for structuring a training video script
The structure that holds learner attention while delivering content effectively:
1. Hook (0-15 seconds)
Open with the problem the learner is trying to solve, or the consequence of getting it wrong. Not the topic. Not “in this video we will cover…”. The hook is what earns the next 30 seconds.
2. Context (15-30 seconds)
One sentence on why this matters now, who this is for, and what they will be able to do at the end.
3. Core content (60-80% of total length)
The actual teaching. Broken into 2-4 distinct sections, each with a single learning point. Each section should answer: what is this, why does it matter, how do you do it, what does it look like when done correctly, and what does failure look like.
4. Demonstration or example
For procedural content: walk through the actual task on screen. For conceptual content: a worked example that the learner can map to their own context.
5. Recap (last 30-45 seconds)
The three or four things the learner should remember. Clear, specific, framed as actions: “Before you do X, always check Y. If Z happens, the response is…”
6. Call to action
What the learner does next. Quiz? Practical exercise? Read a follow-up document? Apply it on shift tomorrow? One specific next step.
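To see how the six steps share out a given running time, the timings above can be worked through as a rough budget. The sketch below makes two assumptions that are not stated in the framework: the hook and context each take the full 15 seconds, and whatever time is left after hook, context, core content, and recap is shared by the demonstration and the call to action. `section_budget` is an illustrative name.

```python
HOOK_S = 15                # hook runs from 0 to 15 seconds
CONTEXT_S = 15             # context runs from 15 to 30 seconds
RECAP_S = (30, 45)         # recap takes the last 30-45 seconds
CORE_SHARE = (0.60, 0.80)  # core content is 60-80% of total length


def section_budget(total_s: int) -> dict[str, tuple[int, int]]:
    """Return (shortest, longest) seconds per section for a video of total_s seconds."""
    core = (round(total_s * CORE_SHARE[0]), round(total_s * CORE_SHARE[1]))
    fixed_min = HOOK_S + CONTEXT_S + RECAP_S[0]
    fixed_max = HOOK_S + CONTEXT_S + RECAP_S[1]
    # Assumption: the remainder is shared by the demonstration and call to action.
    leftover = (
        max(total_s - fixed_max - core[1], 0),
        max(total_s - fixed_min - core[0], 0),
    )
    return {
        "hook": (HOOK_S, HOOK_S),
        "context": (CONTEXT_S, CONTEXT_S),
        "core content": core,
        "demonstration + call to action": leftover,
        "recap": RECAP_S,
    }


for section, (low, high) in section_budget(300).items():
    print(f"{section}: {low}-{high} s")
```

For a 5-minute video (300 seconds) this gives 180-240 seconds of core content and between 0 and 60 seconds for the demonstration and call to action. If the core content and recap both sit at the top of their ranges, nothing is left over, so one of them has to give.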
Writing for the ear, not the page
Training scripts are spoken, not read. Three habits that compound:
- Read every line aloud before locking it. Anything that sounds awkward will sound twice as awkward on camera.
- Short sentences over long ones. A sentence of 12-15 words is about as long as spoken content comfortably allows.
- Active voice, second person. “You check the gauge” beats “the gauge should be checked”.
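The sentence-length habit is easy to check mechanically before a read-through. The following is a rough sketch, not a replacement for reading aloud: it splits a draft on sentence-ending punctuation and flags anything over 15 words, the top of the range above. The `long_sentences` helper and the naive splitting rule are assumptions made for illustration, and abbreviations such as "e.g." will confuse the split.

```python
import re

MAX_WORDS = 15  # upper end of the 12-15 word range above


def long_sentences(script: str, max_words: int = MAX_WORDS) -> list[tuple[int, str]]:
    """Return (word_count, sentence) for every sentence longer than max_words."""
    # Naive split: a full stop, exclamation mark or question mark followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    flagged = []
    for sentence in sentences:
        count = len(sentence.split())
        if count > max_words:
            flagged.append((count, sentence))
    return flagged


draft = (
    "You check the gauge. "
    "Before the shutdown sequence is started the pressure gauge on the left "
    "side of the rig should always be checked by the operator on shift."
)
for count, sentence in long_sentences(draft):
    print(f"{count} words: {sentence}")
```

Flagged lines are candidates for splitting in two, and they are usually the same lines that sound awkward when read aloud.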
Common mistakes in training video scripts
- Trying to teach everything in one video. Three short videos on three specific outcomes outperform one long video that covers everything.
- Reading from a slide deck. If the script is just spoken bullet points, the learner is better off with a document.
- No clear behaviour change as the goal. “Be aware of compliance requirements” is not a behaviour. “Run the standard pre-flight checklist before any external visitor enters the workshop” is.
- Voiceover that talks over essential visual information. When the screen is showing the critical step, the voiceover should describe it, not introduce a new concept.
- No room for the learner to think. Pauses matter. The script should breathe.
How long should a training video be?
Engagement data is consistent: HubSpot’s 2026 marketing statistics show videos under one minute average a 50% engagement rate, dropping to 17% above 60 minutes. For training specifically, the optimal length depends on format: microlearning modules at 60-180 seconds, talking heads at 2-5 minutes, demonstrations and scenarios at 3-10 minutes. Anything above 15 minutes typically benefits from being broken into a series.
Frequently asked questions
How do I start writing a training video script?
Start with the learning outcome, not the topic. Define what the learner should be able to do differently after watching. Then work backwards: what do they need to know to do that, what do they already know, and what is the shortest path from current knowledge to required capability?
How long should a training video script be?
For a 5-minute video, the script will typically be 700-800 spoken words, because voiceover content runs at roughly 150 words per minute when paced for learning. Leave room for pauses and on-screen demonstration time: either trim the word count or allow a slightly longer running time.
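The arithmetic behind that range is simple enough to sketch. The example below uses the 150 words per minute figure quoted above. The `silent_seconds` parameter is a hypothetical allowance for pauses and demonstration time with no voiceover, added here for illustration.

```python
WORDS_PER_MINUTE = 150  # learning-paced voiceover rate quoted above


def estimate_word_count(minutes: float, silent_seconds: float = 0.0) -> int:
    """Estimate spoken words for a video of the given length.

    silent_seconds is an assumed allowance for pauses and
    demonstration time that carries no voiceover.
    """
    speaking_minutes = max(minutes - silent_seconds / 60, 0)
    return round(speaking_minutes * WORDS_PER_MINUTE)


print(estimate_word_count(5))       # 750
print(estimate_word_count(5, 20))   # 700
```

A 5-minute video with wall-to-wall voiceover comes out at 750 words, and 20 seconds of silence brings it down to 700, which is why the practical range sits around 700-800.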
Should training video scripts include the camera direction?
For internal documents and approval cycles, no: keep the script readable. For the production team, yes: provide a separate shot list or annotated script that maps each section to camera setups, B-roll, and graphics. Two documents, same source.
What is the most important part of a training video script?
The first 15 seconds. If the learner has not understood why this matters by 15 seconds in, attention drops sharply. The hook is the highest-leverage part of the script.
How many learning points per training video?
For a 3-5 minute video: one clear learning point with 2-4 supporting sub-points. For a longer 8-15 minute video: 3-4 distinct points with their own internal structure. More than that and a series usually performs better than a single longer video.
Do you produce training videos at Lumira Studio?
Yes. Training and educational videos are one of our core service categories, alongside corporate video, testimonials, and post-production. We work with brands, universities, and B2B businesses on training content that needs to actually change behaviour.




