Turn reference images into coherent AI video clips with Veo 3.1 and Wan 2.6
Reference to Video is built for visual consistency: upload multiple references, describe motion clearly, and quickly generate 8-second clips suitable for social, concept, and creative workflows.
Use one to three images as visual anchors so subjects, style, and key composition cues remain more consistent across the generated clip.
Runs on the Veo 3.1 fast generation profile, striking a practical balance between turnaround speed and stable motion quality in everyday production.
Choose Auto, 16:9, or 9:16 to match your distribution target, from landscape explainers to vertical short-form social formats.
Generated videos carry background audio as defined by the upstream model, so clips arrive ready for review without a separate audio pass.
Requests are submitted with translation enabled to improve prompt interpretation reliability for multilingual workflows and global teams.
A fixed 8-second output keeps timing predictable for iteration, storyboard tests, and quick side-by-side model comparisons.
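To make these settings concrete, here is a minimal TypeScript sketch of how a request using them might be shaped. The interface, field names, and endpoint-facing values are illustrative assumptions for this page, not the product's actual API.

```typescript
// Hypothetical request shape for this page's generation settings.
// Every name below is an assumption for illustration, not the real API.
interface ReferenceToVideoRequest {
  prompt: string;                         // motion and scene description
  referenceImages: string[];              // 1 to 3 uploaded image IDs or URLs
  aspectRatio: "auto" | "16:9" | "9:16";  // Auto, landscape, or vertical
  translate: boolean;                     // translation is on by default here
  durationSeconds: 8;                     // output length is fixed at 8 seconds
}

const exampleRequest: ReferenceToVideoRequest = {
  prompt: "Slow dolly-in on the subject as neon signs flicker to life.",
  referenceImages: ["ref-portrait.png", "ref-style.png"],
  aspectRatio: "9:16",
  translate: true,
  durationSeconds: 8,
};
```

Typing durationSeconds as the literal 8 mirrors the fixed duration: the compiler rejects any other value.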
Upload references, describe motion, and create a ready-to-review clip in minutes.
Add 1 to 3 reference images that define the subject, mood, and visual direction for your generated video.
Describe camera movement, subject behavior, and transition intent. Set the aspect ratio, then submit the generation.
Preview the 8-second result, inspect details in history, and download the clip for editing or publishing.
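As a rough illustration of that flow, the sketch below submits a request and polls until the clip is ready. The endpoint paths, job states, and response fields are assumptions, and it reuses the hypothetical ReferenceToVideoRequest shape from the earlier sketch.

```typescript
// Illustrative submit-and-poll flow; endpoints and fields are assumed.
async function generateClip(request: ReferenceToVideoRequest): Promise<string> {
  const submit = await fetch("/api/reference-to-video", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  const { jobId } = await submit.json();

  // Poll until the fixed 8-second clip is ready for review.
  while (true) {
    const status = await (await fetch(`/api/jobs/${jobId}`)).json();
    if (status.state === "done") return status.videoUrl;   // ready to download
    if (status.state === "failed") throw new Error(status.reason);
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}
```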
You can upload 1 to 3 reference images. At least one image is required, and uploads of more than three images are rejected by validation.
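A client-side check mirroring this rule might look like the following sketch; the function name and message strings are illustrative.

```typescript
// Enforces the stated rule: at least one reference image, at most three.
function validateReferences(images: File[]): string | null {
  if (images.length < 1) return "At least one reference image is required.";
  if (images.length > 3) return "No more than three reference images are allowed.";
  return null; // within the 1-3 range, the upload is accepted
}
```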
This page uses Veo generation type REFERENCE_2_VIDEO with the fast Veo 3.1 model profile, optimized for guided reference-based motion generation.
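In code, the page might pin those two values like this; the field names and the fast-profile identifier are assumptions, while REFERENCE_2_VIDEO comes from the page itself.

```typescript
// Only the REFERENCE_2_VIDEO value is documented above; the field names
// and the "veo-3.1-fast" identifier are illustrative assumptions.
const PAGE_GENERATION_CONFIG = {
  generationType: "REFERENCE_2_VIDEO",
  modelProfile: "veo-3.1-fast",
} as const;
```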
Clip duration is not adjustable in this model setup: it is fixed at 8 seconds for predictable iteration speed and stable workflow behavior across repeated runs.
You can choose Auto, 16:9, or 9:16. Auto is convenient for quick tests, while explicit aspect ratios are better for production delivery targets.
Translation is enabled by default: requests are sent with translation on to keep prompt interpretation consistent when prompts are not originally written in English.
The model supports audio-capable output from the upstream pipeline. In rare sensitive scenarios, audio may still be suppressed by provider policy.
This model does not appear in other model lists. It is scoped to the dedicated Reference to Video page, so existing Text to Video and Image to Video model lists remain unchanged.
Use clear, high-quality references and precise motion prompts. Explicit camera verbs and scene intent usually improve consistency and reduce random drift.
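For instance, a motion prompt with explicit camera verbs and clear scene intent might read like this purely illustrative example:

```typescript
// Illustrative motion prompt: explicit camera verbs plus scene intent.
const motionPrompt =
  "Slow dolly-in on the potter's hands shaping wet clay, " +
  "then tilt up to her face as warm window light brightens; " +
  "hold the final framing for a beat before a gentle fade.";
```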