

Gemini Omni Flash Preview is Google's multimodal video generation and editing model, supporting text-to-video, image-to-video, reference-to-video, and edit workflows.
What is the Gemini Omni Flash Preview API?
Gemini Omni Flash Preview is Google's multimodal video generation model. It produces video from text prompts, reference images, and existing video clips, and it supports four task modes: text-to-video, image-to-video, reference-to-video, and video editing.
Output videos are delivered in 16:9 or 9:16 aspect ratios at durations from 3 to 10 seconds. Multi-turn editing is available via the Gemini Interactions API, enabling conversational refinement of generated content.
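The output limits above can be checked on the client before a request is sent. The following is a minimal sketch in Python, not part of any official SDK: the class name `VideoSettings`, its field names, and the constants are hypothetical, and it assumes the 3 to 10 second range is inclusive at both ends.

```python
from dataclasses import dataclass

# Limits taken from the description above; the names are hypothetical.
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16")
MIN_DURATION_SECONDS = 3
MAX_DURATION_SECONDS = 10


@dataclass(frozen=True)
class VideoSettings:
    """Hypothetical container for output settings, validated on creation."""

    aspect_ratio: str
    duration_seconds: float

    def __post_init__(self) -> None:
        # Only the two documented aspect ratios are accepted.
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(
                f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}, "
                f"got {self.aspect_ratio!r}"
            )
        # Assumption: both ends of the 3 to 10 second range are allowed.
        if not MIN_DURATION_SECONDS <= self.duration_seconds <= MAX_DURATION_SECONDS:
            raise ValueError(
                f"duration_seconds must be between {MIN_DURATION_SECONDS} "
                f"and {MAX_DURATION_SECONDS}, got {self.duration_seconds}"
            )


if __name__ == "__main__":
    settings = VideoSettings(aspect_ratio="9:16", duration_seconds=8)
    print(settings)
```

Validating locally this way catches an unsupported ratio such as 4:3 or a 12-second duration before any request is made. The field names the real API expects may differ.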
[Model Specifications embed]
[Performance Benchmarks embed]
API Pricing
Where to Use Gemini Omni
Text-to-video content creation
Describe a scene and receive a short video clip. This suits social content, marketing materials, and product demos where written briefs need to become motion assets.
Image-to-video animation
Animate a still image into a video sequence. Useful for product photography, character animation, and giving motion to static visuals.
Reference-guided generation
Bind reference images to specific roles in the prompt using tags like <IMAGE_REF_0> for precise control over character appearance, environment, or style.
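As an illustration of the tagging idea, the sketch below builds a prompt in which each named role is replaced by a positional reference tag. It is an assumption-laden example rather than the API's own client code: the helper names `image_ref_tag` and `bind_references` are hypothetical, only `<IMAGE_REF_0>` appears in the description above, and the pattern of numbering further tags from 0 upward in the order the images are supplied is assumed.

```python
from typing import Sequence


def image_ref_tag(index: int) -> str:
    """Return a reference tag such as <IMAGE_REF_0> for a zero-based index."""
    if index < 0:
        raise ValueError("index must be non-negative")
    return f"<IMAGE_REF_{index}>"


def bind_references(template: str, roles: Sequence[str]) -> str:
    """Replace {role} placeholders with tags numbered by position in `roles`.

    Assumption: the n-th role corresponds to the n-th reference image
    attached to the request.
    """
    mapping = {role: image_ref_tag(i) for i, role in enumerate(roles)}
    return template.format(**mapping)


if __name__ == "__main__":
    prompt = bind_references(
        "{hero} walks through {scene} at dusk",
        roles=["hero", "scene"],
    )
    print(prompt)  # <IMAGE_REF_0> walks through <IMAGE_REF_1> at dusk
```

Keeping the role names in the template and the tags in one helper makes it harder to mismatch a tag with the wrong image when the reference list changes. How the images themselves are attached to a request is not covered here.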
Video editing workflows
Pass an existing video clip and prompt the model to modify it, for example by changing the background, adjusting motion, or transforming the style.
Gemini Omni vs. the Alternatives