Grok Imagine 1.5 Preview

Image-to-video generation with synced audio and expressive motion.

Input

Upload Image *

Image: JPG / PNG / WEBP, ≤10.0 MB, 1 image maximum, width and height ≥300 px, aspect ratio between 1:2.5 and 2.5:1

Prompt

Up to 5,000 characters

Duration (s)

1 to 15

Resolution

480p
720p

Aspect Ratio

  • auto
  • 16:9
  • 9:16
  • 1:1
  • 3:2
  • 2:3
  • 3:4
  • 4:3
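
The input limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not the platform's own validation: `GenerationInput` and `validate` are hypothetical names, "10.0MB" is assumed to mean 10 MiB, and `jpeg` is accepted as an alias for JPG.

```python
from dataclasses import dataclass

# Limits copied from the input form; the constant names are illustrative.
ALLOWED_FORMATS = {"jpg", "jpeg", "png", "webp"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10.0MB" means 10 MiB
MIN_SIDE_PX = 300
MAX_PROMPT_CHARS = 5000
RESOLUTIONS = {"480p", "720p"}
ASPECT_RATIOS = {"auto", "16:9", "9:16", "1:1", "3:2", "2:3", "3:4", "4:3"}


@dataclass
class GenerationInput:
    """Hypothetical container for one image-to-video request."""
    image_format: str
    image_bytes: int
    width: int
    height: int
    prompt: str
    duration_s: int
    resolution: str
    aspect_ratio: str = "auto"


def validate(inp: GenerationInput) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if inp.image_format.lower() not in ALLOWED_FORMATS:
        errors.append("image must be JPG, PNG, or WEBP")
    if inp.image_bytes > MAX_IMAGE_BYTES:
        errors.append("image must be 10 MB or smaller")
    if min(inp.width, inp.height) < MIN_SIDE_PX:
        errors.append("image width and height must be at least 300 px")
    # 1:2.5 <= w/h <= 2.5:1, written with integers (2.5 == 5/2) to avoid
    # floating-point edge cases at the boundaries.
    if inp.width > 0 and inp.height > 0:
        if not (5 * inp.width >= 2 * inp.height and 2 * inp.width <= 5 * inp.height):
            errors.append("image aspect ratio must be between 1:2.5 and 2.5:1")
    if len(inp.prompt) > MAX_PROMPT_CHARS:
        errors.append("prompt must be 5000 characters or fewer")
    if not 1 <= inp.duration_s <= 15:
        errors.append("duration must be between 1 and 15 seconds")
    if inp.resolution not in RESOLUTIONS:
        errors.append("resolution must be 480p or 720p")
    if inp.aspect_ratio not in ASPECT_RATIOS:
        errors.append("unsupported output aspect ratio")
    return errors


if __name__ == "__main__":
    request = GenerationInput("png", 2_000_000, 1280, 720,
                              "A rocket lifts off", 10, "720p", "16:9")
    print(validate(request) or "input looks valid")
```

Returning every problem at once, rather than raising on the first one, lets a form show all failed checks in a single pass.
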

| Model & Modality | Credits / Gen | Our Price (USD) | Official Price (USD) | Discount |
|---|---|---|---|---|
| grok-imagine-video-1.5-preview, i2v, 480p | 14.5 per second | $0.0647 | $0.08 | 19% |
| grok-imagine, i2v, 720p | 25 per second | $0.1116 | $0.14 | 20% |
| grok-imagine, i2v, input image | 2 per image | $0.0089 | $0.01 | 11% |

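
As a rough illustration of how the listed prices combine, the sketch below estimates the cost of one clip. It assumes the per-image fee is charged once per generation on top of the per-second rate; the page does not state how the two line items combine, and `estimate_cost_usd` is a hypothetical helper, not part of any published API.

```python
from decimal import Decimal

# "Our Price (USD)" values from the pricing table above.
PER_SECOND_USD = {"480p": Decimal("0.0647"), "720p": Decimal("0.1116")}
INPUT_IMAGE_USD = Decimal("0.0089")  # assumption: billed once per generation


def estimate_cost_usd(duration_s: int, resolution: str) -> Decimal:
    """Estimate the price of one image-to-video generation."""
    if resolution not in PER_SECOND_USD:
        raise ValueError("resolution must be '480p' or '720p'")
    if not 1 <= duration_s <= 15:
        raise ValueError("duration must be between 1 and 15 seconds")
    return PER_SECOND_USD[resolution] * duration_s + INPUT_IMAGE_USD


if __name__ == "__main__":
    # A 10-second clip at 720p: 10 * 0.1116 + 0.0089
    print(estimate_cost_usd(10, "720p"))  # 1.1249
```

`Decimal` is used instead of `float` so that the four-decimal prices add up exactly.
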
Native Multimodal Audio

Grok Imagine Video 1.5 API

Animate still images into short videos with synchronized audio using xAI’s Grok Imagine Video 1.5 preview model.

Max duration: 15s
Frame rate: 24 fps
Resolution: 720p

Prompt:

A massive rocket launching from a modern space center, engines igniting with intense flames and smoke, powerful liftoff, cinematic camera angle, dramatic lighting, realistic physics, clear blue sky, ultra detailed, high energy, 4K quality.

Grok Imagine 1.5 API Core Features

Bring Any Image to Life, With Sound

Image-to-Video Generation

Turn static images into dynamic videos while preserving subject identity, composition, and visual style.

Native Audio Generation

Create synchronized dialogue, sound effects, ambient audio, and background music in a single generation.

Video Extension

Extend videos seamlessly from the last frame while maintaining motion, lighting, and scene continuity.

Reference Consistency

Maintain character appearance, visual style, and scene aesthetics across multiple video generations.

Prompt-Based Video Editing

Edit and refine videos using natural language instructions without complex workflows.

Fast Cinematic Rendering

Generate high-quality videos with realistic motion and smooth camera movements, rendered quickly.

What You Can Build with Grok Imagine Video 1.5

Grok Imagine Video 1.5 turns a static image into a dynamic video with realistic motion, natural interactions, and automatically generated sound. Upload a portrait, product photo, or illustration, and it transforms into a cinematic video with synced background music, sound effects, and ambient audio that match the visuals.

Unified Audio-Visual Generation Capability

Grok Imagine Video 1.5 generates video and audio together in a single pass. The system automatically produces context-aware sound, including synchronized action effects (e.g., blade swishes, footsteps), ambient audio (e.g., room tone, spatial reverb), background music, and dialogue with natural lip-sync alignment. From one image and a prompt, it can generate a video with fully integrated sound, reducing the need for external post-production audio tools.

Realistic Motion, Physics Simulation, and Detail Fidelity

The model can expand a single image into a fully animated scene with improved motion consistency, physical realism, and fine-grained detail. It naturally reproduces complex phenomena such as fluid dynamics, rising steam, and translucent materials like glass, while preserving the original visual style. It also closely follows prompt instructions and supports natural-language-based camera control for more flexible scene direction.

End-to-End Creative Workflow

Grok Imagine provides an integrated pipeline covering text-to-image, image editing, image-to-video, video-to-video, and clip extension, with Agent Mode enabling iterative creative refinement. This unified workflow is well-suited for short-form content, concept videos, and rapid prototyping, letting users take an idea to a finished video within a single platform.

Grok Imagine Video 1.5 API vs Seedance 2.0 API

Grok Imagine Video 1.5 Preview recently claimed the #1 spot on the Image-to-Video Arena (720p) leaderboard, achieving a score of 1473 and surpassing Seedance 2.0's 1467. With a 52-point Elo improvement over its predecessor, Grok Imagine Video 1.5 ranks among the top-performing image-to-video models currently available on Crun.
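
To put the rating gaps in perspective, the sketch below converts an Elo difference into an expected head-to-head preference rate. It assumes the leaderboard scores sit on the conventional Elo scale (logistic curve, 400-point divisor); the arena may fit its ratings differently, so treat the result as an approximation.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected share of pairwise wins for A over B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


if __name__ == "__main__":
    # Scores quoted above: 1473 (Grok Imagine Video 1.5) vs 1467 (Seedance 2.0).
    p = elo_expected_score(1473, 1467)
    print(f"{p:.3f}")  # 0.509
```

Under that assumption, a 6-point lead corresponds to being preferred in roughly 51% of pairwise comparisons, so the two models are close to evenly matched on this benchmark.
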

| Model | Grok Imagine Video 1.5 | Seedance 2.0 |
|---|---|---|
| Resolution | 720p | 1080p |
| Video Length | 15s | 15s |
| Frame Rate | 24 fps | 24 fps |
| Audio-Visual Generation | Supported | Supported |
| Reference Video | Not supported | Not supported |
| Text-to-Video | Not supported | Supported |
| Motion Quality | Medium | High |
| Scene Complexity | Simple scenes | Multi-scene capable |
| Character Consistency | Basic | Strong |
| Generation Speed | Fast | Medium |
| Control Level | Low–Medium | High (multi-modal control system) |

FAQ about Grok Imagine Video 1.5

  • What is Grok Imagine Video 1.5?

    Grok Imagine Video 1.5 is xAI’s image-to-video generation model. It accepts a reference image and a text prompt, then produces cinematic video that animates the image with motion and native audio — including dialogue, ambient sound, and effects — all synchronized within a single generation pass.
  • What makes Grok Imagine Video 1.5 stand out?

    It combines high-quality image-to-video generation with native audio synthesis, enabling both visuals and synchronized sound to be produced in one pass. It is also integrated into the broader Grok Imagine creative workflow — including text-to-image, image editing, image-to-video, video-to-video, and clip extension — making it a fast and flexible option for short-form content creation and rapid iteration.
  • How good is the audio in Grok Imagine Video 1.5?

    Audio is generated natively alongside the video, ensuring precise synchronization without the need for post-production. Grok Imagine Video 1.5 produces natural-sounding dialogue with accurate lip-sync, context-aware ambient audio, and well-timed sound effects, resulting in a polished, cinematic output.
  • What resolutions and durations does it support?

    Grok Imagine Video 1.5 supports image-to-video generation at 480p and 720p. Each clip can be up to 15 seconds in duration, with audio generated natively alongside the video.
  • Does Grok Imagine Video 1.5 API support native audio generation?

    Yes. Grok Imagine Video 1.5 supports synchronized audio generation alongside video output, including dialogue, sound effects, ambient audio, and background music, reducing the need for separate audio tools.
  • How does Grok Imagine Video 1.5 compare to Seedance 2.0?

    Both are advanced AI video generation models. Grok Imagine Video 1.5 currently ranks higher on the Image-to-Video Arena leaderboard, while Seedance 2.0 offers more advanced multimodal workflows and strong multi-shot storytelling capabilities. crun.ai provides API access to both models.
  • Can I use Grok Imagine Video 1.5 for commercial projects?

    Yes. Content generated via the crun.ai API can be used for commercial purposes.
© 2026 Crun.ai Inc. All rights reserved.