# Video Generation Prompt Guide

**Table of Contents**

1. [Quickstart](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#id-1-quickstart)
2. [Break the Prompt Into Clear Elements](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#id-2-break-the-prompt-into-clear-elements)
   1. [Cinematography](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#cinematography)
   2. [Subject](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#subject)
   3. [Action](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#action)
   4. [Context](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#context)
   5. [Style & Ambiance](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#style-and-ambiance)
3. [Be Specific and Concrete](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#id-3-general-prompting-tips-all-models)
4. [Keep the Prompt Organized](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#id-4-troubleshooting)
5. [Define Constraints Early](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#id-4-troubleshooting-1)
6. [Write Negative Prompts the Right Way](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#id-4-troubleshooting-2)
7. [Think in Story Beats](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#id-4-troubleshooting-3)
8. [Use Reference Images for Consistency](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#id-4-troubleshooting-4)
9. [Veo 3.1 Prompting Guide](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#gpt-4o-tips-and-prompt-library)
   1. [Tips](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#prompt-samples)
   2. [Audio Prompting](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#prompt-samples-1)
   3. [Timestamp Prompting](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#prompt-samples-2)
   4. [Video Generation Workflows](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#prompt-samples-3)
   5. [Prompt Templates](https://app.gitbook.com/o/wESD8a1fR7H7ddwkYO61/s/IabFtGTeQzwfWCzp8vd6/~/edit/~/changes/458/for-users/prompt-writing-guide/video-generation-prompt-guide#nano-banana-tips-and-prompt-library)

***

### 1) Quickstart

#### Basic Prompting Sandwich <a href="#basic-prompt-sandwich" id="basic-prompt-sandwich"></a>

A simple way to structure a strong video prompt is:

**\[Cinematography] + \[Subject] + \[Action] + \[Context] + \[Style & Ambiance]**

1. **Cinematography:** Define the camera work and shot composition.
2. **Subject:** Identify the main character or focal point.
3. **Action:** Describe what the subject is doing.
4. **Context:** Detail the environment and background elements.
5. **Style & Ambiance:** Specify the overall aesthetic, mood, and lighting.

**Example**

<table data-header-hidden><thead><tr><th width="161.02081298828125"></th><th></th></tr></thead><tbody><tr><td><a href="https://youtu.be/U4B5H079dJ8?si=rRvNOMLmGzqwrIGj">Video Example</a></td><td>Medium shot, a tired corporate worker, rubbing his temples in exhaustion, in front of a bulky 1980s computer in a cluttered office late at night. The scene is lit by the harsh fluorescent overhead lights and the green glow of the monochrome monitor. Retro aesthetic, shot as if on 1980s color film, slightly grainy.</td></tr></tbody></table>
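
When you need many variations of the same shot, the five-part structure can be assembled in code. The following is a minimal Python sketch, not part of any Veo API: the `ShotPrompt` class, its field names, and the joining rules are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """Hypothetical container for the five prompt elements."""
    cinematography: str
    subject: str
    action: str
    context: str
    style: str

    def render(self) -> str:
        # Join the first four elements into one sentence, then append the style.
        scene = ", ".join(
            [self.cinematography, self.subject, self.action, self.context]
        )
        return f"{scene}. {self.style}"


prompt = ShotPrompt(
    cinematography="Medium shot",
    subject="a tired corporate worker",
    action="rubbing his temples in exhaustion",
    context="in front of a bulky 1980s computer in a cluttered office late at night",
    style="Retro aesthetic, shot as if on 1980s color film, slightly grainy.",
)
print(prompt.render())
```

Swapping a single field, such as `cinematography`, gives a quick way to compare how one element changes the result while the rest of the prompt stays fixed.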

***

### 2) Break the Prompt Into Clear Elements

#### Cinematography

Describe how the scene should be filmed.

**You can specify:**

* **Shot Type**: wide establishing, medium, close-up, extreme close-up, macro, over-the-shoulder, long shot, POV
* **Camera Angle**: eye-level, low angle, high angle, top-down, worm's-eye, Dutch angle
* **Camera Movement**: slow dolly-in, pan, tilt, locked-off tripod, handheld tracking, orbit, crane shot, static shot (or fixed), truck, pedestal, zoom, drone, whip pan, arc shot
* **Lens & Focus**: 35mm lens, shallow depth of field, soft bokeh, anamorphic flares, wide angle, telephoto, deep depth of field, lens flare, rack focus, fisheye, vertigo effect (dolly zoom)

**Example:**

<pre><code><strong>Crane shot starting low on a lone hiker and ascending high above, 
</strong>revealing they are standing on the edge of a colossal, mist-filled 
canyon at sunrise, epic fantasy style, awe-inspiring, soft morning light.
</code></pre>

#### Subject

State the main focus of the shot. Be specific. Include defining details like color, material, age, species, clothing, or expression where helpful.

**Examples:**

* **People**: generic descriptors ("man", "woman", "elderly person"), specific professions, historical figures, mythical beings ("a mischievous fairy", "a stoic knight")
* **Animals or creatures**: specific animal breeds, fantastical creatures ("a miniature dragon with iridescent scales", "a wise, ancient talking tree")
* **Objects**: everyday items, vehicles, abstract shapes ("glowing orbs", "crystalline structures")

#### Action

Describe what the subject is doing. The clearer the action, the more predictable the animation.

**You can specify:**

* **Basic movements**: walking, running, jumping, flying, swimming, dancing, spinning, falling, standing still, sitting
* **Interactions**: talking, laughing, arguing, hugging, fighting, playing a game, cooking, building, writing, reading, observing
* **Emotional expressions**: smiling, frowning, looking surprised, concentrating deeply, appearing thoughtful, showing excitement, crying
* **Subtle actions**: a gentle breeze ruffling hair, leaves rustling, a subtle nod, fingers tapping impatiently, eyes blinking slowly
* **Transformations or processes**: a flower blooming in fast-motion, ice melting, a city skyline developing over time (however, keep clip length in mind for events that occur over a longer period)

#### Context

Describe the setting or situation around the subject. Context grounds the scene and prevents generic outputs.

**You can specify:**

* **Location (interior)**: a cozy living room with a crackling fireplace, a sterile futuristic laboratory, a cluttered artist's studio, a grand ballroom, a dusty attic
* **Location (exterior)**: a sun-drenched tropical beach, a misty ancient forest, a bustling futuristic cityscape at night, a serene mountain peak at dawn, a desolate alien planet
* **Time of day**: golden hour, midday sun, twilight, deep night, pre-dawn
* **Weather**: clear blue sky, overcast and gloomy, light drizzle, heavy thunderstorm with visible lightning, gentle snowfall, swirling fog
* **Historical or fantastical period**: a medieval castle courtyard, a roaring 1920s jazz club, a cyberpunk alleyway, an enchanted forest glade
* **Atmospheric details**: floating dust motes in a sunbeam, shimmering heat haze, reflections on wet pavement, leaves scattered by the wind

#### Style & Ambiance

Describe the look and mood of the video. Instead of writing something broad like “epic vibe,” define the mood through lighting, palette, and texture.

**You can specify:**

* **Style**: cinematic realism, stop-motion feel, hand-drawn animation, film noir, retro VHS, photorealistic, cinematic, animation, art movements/artists, specific looks
* **Lighting**: soft key light, warm tungsten practicals, cool moonlight rim light, natural light, artificial light, cinematic lighting ("Rembrandt lighting on a portrait"), specific effects ("volumetric lighting creating visible light rays")
* **Tone or Mood**: happy/joyful, sad/melancholy, suspenseful/tense, peaceful/serene, epic/grandiose, futuristic/sci-fi, vintage/retro, romantic, horror
* **Ambiance**: color palettes, atmospheric effects, textural qualities

{% hint style="info" %}
For an in-depth guide to video generation elements, see the [Veo Prompt Guide](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/video/video-gen-prompt-guide#temporal-elements).
{% endhint %}

***

### 3) Be Specific and Concrete <a href="#id-3-general-prompting-tips-all-models" id="id-3-general-prompting-tips-all-models"></a>

Replace vague language with visual direction.

Specific prompts usually produce more controllable outputs because the model has less room to guess. Google’s prompt guidance also emphasizes using detailed, explicit instructions rather than abstract descriptors.

**Weaker:**

```
A heartwarming cooking scene.
```

**Better:**

```
Slow dolly-in on a chef plating noodles; 
steam rising; warm rim light; shallow depth of field.
```

***

### 4) Keep the Prompt Organized <a href="#id-4-troubleshooting" id="id-4-troubleshooting"></a>

Write in short, readable lines or short grouped sections.

A clean structure often works better than one long paragraph. For example:

**Format**\
16:9, 6 seconds, cinematic realism.

**Subject + Action**\
A matte-black skincare bottle rotates slowly on a wet stone slab.

**Cinematography**\
Locked-off tripod. 50mm lens. Shallow depth of field.

**Style & Ambiance**\
Soft cool backlight, subtle bloom, low-contrast filmic grade.

This kind of structure reduces ambiguity and makes it easier to revise later.

***

### 5) Define Constraints Early <a href="#id-4-troubleshooting" id="id-4-troubleshooting"></a>

Put core technical constraints near the top of the prompt:

* aspect ratio
* clip length
* frame style or format
* resolution if supported
* general visual mode

**Example:**

```
16:9, 6 seconds, cinematic realism.
```
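
If prompts are generated by a script, the leading constraints line can be built and sanity-checked before the rest of the prompt is added. This is a small Python sketch under stated assumptions: `format_line` is a hypothetical helper, and the checks only validate the shape of the values, not which aspect ratios or durations a given model actually supports.

```python
import re


def format_line(aspect_ratio: str, seconds: int, visual_mode: str) -> str:
    """Build the leading constraints line, e.g. '16:9, 6 seconds, cinematic realism.'"""
    # Only the W:H shape is checked; supported ratios vary by model.
    if not re.fullmatch(r"\d+:\d+", aspect_ratio):
        raise ValueError(f"Expected an aspect ratio like '16:9', got {aspect_ratio!r}")
    if seconds <= 0:
        raise ValueError("Clip length must be a positive number of seconds")
    return f"{aspect_ratio}, {seconds} seconds, {visual_mode}."


print(format_line("16:9", 6, "cinematic realism"))
```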

***

### 6) Write Negative Prompts the Right Way <a href="#id-4-troubleshooting" id="id-4-troubleshooting"></a>

When excluding elements, avoid instructive phrasing such as:

* “don’t show buildings”
* “no scary mood”
* “don’t make it dark”

Instead, describe the unwanted elements themselves and list them as a negative prompt.

**Example:**

```
With negative prompt: urban background, man-made structures, dark 
stormy atmosphere, extra props, text overlays.
```
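
The same rule can be enforced when negative prompts are assembled in code. The sketch below is illustrative only: `negative_prompt` and the word list in `INSTRUCTIVE` are assumptions made for this example, and the output simply mirrors the phrasing of the example above.

```python
import re

# Instructive words to keep out of negative prompt items (an assumed, non-exhaustive list).
INSTRUCTIVE = re.compile(r"\b(no|not|don't|don’t|do not|never|without)\b", re.IGNORECASE)


def negative_prompt(unwanted: list[str]) -> str:
    """Format unwanted elements as a plain descriptive list."""
    for item in unwanted:
        if INSTRUCTIVE.search(item):
            # Reject items phrased as instructions instead of descriptions.
            raise ValueError(f"Describe the element itself, not an instruction: {item!r}")
    return "With negative prompt: " + ", ".join(unwanted) + "."


print(negative_prompt(["urban background", "man-made structures", "text overlays"]))
```

Passing `"no scary mood"` raises a `ValueError`, which prompts a rewrite to a description such as `"dark stormy atmosphere"`.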

***

### 7) Think in Story Beats <a href="#id-4-troubleshooting" id="id-4-troubleshooting"></a>

Even short clips work better when they have internal structure.

For a 6–8 second clip, think in three simple beats:

* **Beat 1:** establish subject and setting
* **Beat 2:** introduce motion or action
* **Beat 3:** end on a reveal, expression, or hold

***

### 8) Use Reference Images for Consistency <a href="#id-4-troubleshooting" id="id-4-troubleshooting"></a>

Use reference inputs when you want:

* the same character across multiple shots
* the same product design across variations
* the same environment or art direction
* more visual consistency from scene to scene

***

### 9) Veo 3.1 Prompting Guide <a href="#gpt-4o-tips-and-prompt-library" id="gpt-4o-tips-and-prompt-library"></a>

#### Tips <a href="#prompt-samples" id="prompt-samples"></a>

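1. **Native Audio Prefixes**: Veo 3.1 is one of the few models that generates audio synchronized with the video directly from the prompt. Put each audio cue on its own labeled line, for example a `Voice:` prefix for dialogue, as shown under Audio Prompting.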
2. **300-Character Ceiling**: Veo 3.1 is highly sensitive to prompt length. Aim for 150–300 characters. Prompts exceeding 400 characters often lead to "prompt leakage," where the model ignores the latter half of your instructions.
3. **Timestamp-Based Prompting**: You can direct specific pacing by adding time markers.
4. **Multi-Reference "Ingredients"**: Instead of just one "Image-to-Video" reference, Veo 3.1 supports up to three image inputs. Use this to maintain consistency for a specific product, a specific character, and a specific background simultaneously.
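
The length guidance in tip 2 is easy to check automatically before submitting a prompt. The following Python sketch uses the thresholds stated above; the function name and the returned labels are assumptions chosen for illustration.

```python
TARGET_MIN = 150   # lower end of the range suggested above
TARGET_MAX = 300   # upper end of the range suggested above
HARD_LIMIT = 400   # beyond this, later instructions are often ignored


def check_prompt_length(prompt: str) -> str:
    """Classify a prompt by character count against the thresholds above."""
    n = len(prompt)
    if n > HARD_LIMIT:
        return "too long"
    if n > TARGET_MAX:
        return "above target"
    if n < TARGET_MIN:
        return "below target"
    return "ok"


print(check_prompt_length("Slow dolly-in on a chef plating noodles; steam rising."))
```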

#### Audio Prompting <a href="#prompt-samples" id="prompt-samples"></a>

For Veo 3.1, audio should be treated as one of the main parts of the prompt when relevant. Google’s latest Veo guide explicitly highlights synchronized audio, including dialogue, ambient sound, and sound effects.

**Useful audio elements include:**

* dialogue
* ambient room tone
* environmental sounds
* sound effects
* music or no music
* voice tone, pace, style, or accent

A good rule is to describe audio on separate lines.

**No Dialogue Example**

```
Ambient room tone with soft water drips. No music. No dialogue.
```

**With Dialogue Example**

```
Voice: calm adult male, neutral accent, steady pace.
Line 1: “Welcome to Day 06.”
Line 2: “Smart skincare starts here.”
```
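
For scripts that generate several dialogue variations, the voice description and numbered lines can be rendered from data. This is a minimal sketch: `dialogue_block` is a hypothetical helper, and the `Voice:` / `Line N:` layout follows the example above as a convention of this guide, not a documented Veo syntax.

```python
def dialogue_block(voice: str, lines: list[str]) -> str:
    """Render a voice description followed by numbered dialogue lines."""
    out = [f"Voice: {voice}."]
    for i, line in enumerate(lines, start=1):
        # Wrap each spoken line in curly quotes, matching the example above.
        out.append(f"Line {i}: “{line}”")
    return "\n".join(out)


print(dialogue_block(
    "calm adult male, neutral accent, steady pace",
    ["Welcome to Day 06.", "Smart skincare starts here."],
))
```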

#### Timestamp Prompting <a href="#prompt-samples" id="prompt-samples"></a>

Beyond story beats, Veo 3.1 supports timestamp-based prompting for more explicit scene pacing. Google’s latest guidance includes timestamped shot descriptions as a way to control the sequence of events in a short clip.

**Example:**

```
[00:00-00:02]
Wide establishing shot of the room.

[00:02-00:05]
Camera slowly pushes in as the subject turns and speaks.

[00:05-00:08]
Hold on the final expression with soft room tone.
```
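
Timestamps like these can be derived from a list of beats and their durations, which keeps the ranges contiguous when a beat is lengthened or shortened. The sketch below is illustrative: `timestamped_prompt` is a hypothetical helper, and the `[MM:SS-MM:SS]` layout copies the example above.

```python
def timestamped_prompt(beats: list[tuple[int, str]]) -> str:
    """Render (duration_seconds, description) beats as timestamped blocks."""
    blocks = []
    start = 0
    for duration, description in beats:
        end = start + duration
        # Format each range as [MM:SS-MM:SS].
        label = f"[{start // 60:02d}:{start % 60:02d}-{end // 60:02d}:{end % 60:02d}]"
        blocks.append(f"{label}\n{description}")
        start = end  # the next beat begins where this one ends
    return "\n\n".join(blocks)


print(timestamped_prompt([
    (2, "Wide establishing shot of the room."),
    (3, "Camera slowly pushes in as the subject turns and speaks."),
    (3, "Hold on the final expression with soft room tone."),
]))
```

With the durations 2, 3, and 3 seconds, this reproduces the three ranges shown in the example above.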

#### Video Generation Workflows <a href="#prompt-samples" id="prompt-samples"></a>

#### Ingredients to Video

Prepare visual ingredients, such as characters, props, or settings, then animate them into a video scene. Google specifically highlights this for building multi-shot scenes with consistent characters and dialogue.

**How it works:**

1. **Generate your "ingredients":** Create or collect reference images of the characters, props, or settings you want to keep consistent.
2. **Compose the scene:** Use the Ingredients to Video feature with the relevant reference images.

**Use this when:**

* you need repeatable character design
* you want a product to remain consistent
* you want a scene to feel like the same world across shots
* you are building a dialogue scene with recurring subjects

#### First and Last Frame

You provide a starting image and an ending image, then prompt the model to generate the transition between them. Google’s Veo 3.1 guide explicitly describes this feature and recommends describing both the motion and the audio for the in-between sequence.

**Use this when you want:**

* a controlled transformation between two states
* a reveal from one composition to another
* a precise transition arc
* a stylized before/after movement

**Example**

```
Start from the front-facing portrait of the singer, 
then arc smoothly around her until the shot ends from behind the performer
looking out onto the audience. Include live stage ambience and sung dialogue.
```

{% hint style="info" %}
For an in-depth guide to video generation workflows, see the [Veo Prompt Guide](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/video/video-gen-prompt-guide#temporal-elements).
{% endhint %}

#### Prompt Templates <a href="#nano-banana-tips-and-prompt-library" id="nano-banana-tips-and-prompt-library"></a>

1. **General Video Generation Template**

```
[FORMAT]
Aspect ratio, duration, overall visual mode.

[CINEMATOGRAPHY]
Shot type, movement, lens, depth of field.

[SUBJECT]
Main character, object, or product.

[ACTION]
What happens over the clip.

[CONTEXT]
Environment and scene setup.

[STYLE & AMBIANCE]
Lighting, grade, texture, mood.

[AUDIO]
Dialogue, ambience, SFX, music, silence instructions.

[REFERENCE INPUTS / WORKFLOW]
Reference image, ingredients to video, first and last frame, or other workflow notes if applicable.

[TIMESTAMPS]
Optional timing breakdown for multi-beat clips.

[NEGATIVE PROMPT]
Elements to exclude.
```

2. **Product Hero with Native Audio**

```
[FORMAT]
16:9, 6 seconds, cinematic realism.

[CINEMATOGRAPHY]
Locked-off tripod. Medium close-up. 50mm lens. Shallow depth of field.

[SUBJECT]
A matte-black skincare serum bottle.

[ACTION]
The bottle rotates slowly while condensation droplets slide down the surface and tiny ripples move across the wet stone beneath it.

[CONTEXT]
Placed on a dark stone slab in a minimalist studio.

[STYLE & AMBIANCE]
Soft cool rim light, subtle bloom, low-contrast filmic grade, premium editorial look.

[AUDIO]
Ambient room tone with soft water drips. No music. No dialogue.

[NEGATIVE PROMPT]
Text overlays, label distortion, logo changes, extra props, cluttered background.
```

3. **Character with Dialogue**

```
[FORMAT]
1:1, 8 seconds, stylized animation.

[CINEMATOGRAPHY]
Medium shot to close-up. Slow dolly-in. 35mm equivalent lens. Shallow depth of field.

[SUBJECT]
A chibi dog mascot sitting on a stool.

[ACTION]
The mascot waves, points toward the camera, and smiles.

[CONTEXT]
Simple pastel studio backdrop.

[STYLE & AMBIANCE]
Hand-drawn linework, flat pastel palette, soft paper texture, cheerful and playful mood.

[AUDIO - DIALOGUE]
Voice: bright, friendly, childlike tone.
Line 1: “Hi! Ready for clean teeth and happy tails?”
Line 2: “Niblets, your daily dental treat!”
End with a tiny “ding” sound effect.

[NEGATIVE PROMPT]
On-screen text, rapid cuts, background crowd, style drift.
```

4. **First-Frame / Image-to-Video Variation**

```
[REFERENCE INPUT]
Use the provided product photo as the opening frame.

[FORMAT]
16:9, 5 seconds, realistic.

[CINEMATOGRAPHY]
Slow right-to-left parallax. Soft rack focus.

[ACTION]
Start from the exact composition of the source image, then shift focus from the front label to the bottle cap.

[STYLE & AMBIANCE]
Clean product-commercial feel. Soft studio reflections. Controlled premium lighting.

[AUDIO]
Subtle room tone only. No dialogue.

[NEGATIVE PROMPT]
Logo changes, label morphing, extra reflections, added props.
```
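
The bracketed templates above can also be rendered from structured data, which makes it easier to reuse sections across prompts. The following is a minimal Python sketch, assuming the section names of the general template; `render_template` and `SECTION_ORDER` are hypothetical names, and empty sections are simply skipped.

```python
# Section names and order taken from the general template above.
SECTION_ORDER = [
    "FORMAT", "CINEMATOGRAPHY", "SUBJECT", "ACTION", "CONTEXT",
    "STYLE & AMBIANCE", "AUDIO", "REFERENCE INPUTS / WORKFLOW",
    "TIMESTAMPS", "NEGATIVE PROMPT",
]


def render_template(sections: dict[str, str]) -> str:
    """Render filled-in sections in template order, skipping empty ones."""
    unknown = set(sections) - set(SECTION_ORDER)
    if unknown:
        raise KeyError(f"Unknown sections: {sorted(unknown)}")
    blocks = [
        f"[{name}]\n{sections[name]}"
        for name in SECTION_ORDER
        if sections.get(name)
    ]
    return "\n\n".join(blocks)


print(render_template({
    "SUBJECT": "A matte-black skincare serum bottle.",
    "FORMAT": "16:9, 6 seconds, cinematic realism.",
    "AUDIO": "Ambient room tone with soft water drips. No music. No dialogue.",
}))
```

Because the output follows `SECTION_ORDER` regardless of the order of the input keys, the format constraints always land at the top of the prompt.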
