Kling v2.6 Pro Image-to-Video

Generate videos from images using Kuaishou's Kling v2.6 Pro model with native audio, optional end-frame guidance, and voice control

Overview

Kling v2.6 Pro Image-to-Video turns a start image and motion prompt into a short generated video, with optional native audio, end-frame guidance, and voice control. Use it for controlled motion drafts, short-form creative, and animated campaign concepts from existing visuals.

Use cases

  • Animate a campaign still into a 5- or 10-second video draft.
  • Use a start image and optional end image to guide a product reveal or before-and-after motion concept.
  • Create an audio-enabled short video with up to two referenced voices when the prompt needs spoken moments.

Input tips

  • Provide a public start_image_url and a prompt describing motion, action, camera movement, audio, and scene progression.
  • Add end_image_url when the final frame should land on a specific visual.
  • Choose 5 or 10 seconds; the default is 5 seconds.
  • Leave generate_audio on for native audio; provide up to two voice IDs only when the prompt uses matching voice markers.
  • Use negative_prompt to reduce blur, distortion, low quality, or other unwanted artifacts.
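
The tips above can be sketched as a small helper that assembles and checks the inputs before a run. This is a minimal sketch, not the tool's actual client: `build_inputs` is a hypothetical helper, and the `duration` and `voice_ids` key names are assumptions, since the text above only names `start_image_url`, `end_image_url`, `generate_audio`, and `negative_prompt`.

```python
from typing import Optional, Sequence

ALLOWED_DURATIONS = (5, 10)  # seconds, per the input tips
MAX_VOICE_IDS = 2            # "up to two voice IDs"


def build_inputs(
    start_image_url: str,
    prompt: str,
    end_image_url: Optional[str] = None,
    duration: int = 5,
    generate_audio: bool = True,
    voice_ids: Sequence[str] = (),
    negative_prompt: Optional[str] = None,
) -> dict:
    """Assemble an input dictionary, rejecting values the tips rule out."""
    if not start_image_url or not prompt:
        raise ValueError("start_image_url and prompt are both required")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    if len(voice_ids) > MAX_VOICE_IDS:
        raise ValueError(f"at most {MAX_VOICE_IDS} voice IDs are supported")

    inputs = {
        "start_image_url": start_image_url,
        "prompt": prompt,
        "duration": duration,  # assumed key name
        "generate_audio": generate_audio,
    }
    # Optional fields are only included when they carry a value.
    if end_image_url:
        inputs["end_image_url"] = end_image_url
    if voice_ids:
        inputs["voice_ids"] = list(voice_ids)  # assumed key name
    if negative_prompt:
        inputs["negative_prompt"] = negative_prompt
    return inputs
```

Calling `build_inputs("https://example.com/still.png", "slow push-in on the product")` yields a 5-second, audio-enabled request with no end frame, which matches the defaults described above.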

Expected output

The AI Tool returns one generated video object with a downloadable URL and the duration in seconds. Content type, file name, file size, width, and height are included when available. Cost metadata is returned alongside the video. The shared output template renders the video for review and download.
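
To make the shape of that output concrete, here is a minimal sketch of reading it into a typed record. The key names (`url`, `duration`, `content_type`, `file_name`, `file_size`, `width`, `height`) are assumptions chosen to mirror the fields listed above, and `GeneratedVideo` and `parse_video` are hypothetical names. Cost metadata is left out because its structure is not described here.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class GeneratedVideo:
    url: str                            # downloadable URL
    duration_seconds: float             # duration in seconds
    content_type: Optional[str] = None  # remaining fields appear only when available
    file_name: Optional[str] = None
    file_size: Optional[int] = None
    width: Optional[int] = None
    height: Optional[int] = None


def parse_video(raw: Mapping[str, Any]) -> GeneratedVideo:
    """Read the required fields strictly and the optional ones leniently."""
    return GeneratedVideo(
        url=raw["url"],                           # KeyError if missing
        duration_seconds=float(raw["duration"]),  # KeyError if missing
        content_type=raw.get("content_type"),
        file_name=raw.get("file_name"),
        file_size=raw.get("file_size"),
        width=raw.get("width"),
        height=raw.get("height"),
    )
```

Treating the optional fields as `None` when absent keeps downstream code from assuming that dimensions or file size are always present.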

Caveats

  • Source and end image URLs must be public and reachable.
  • End-frame guidance can constrain the video, but it may not make every transition exact.
  • Native audio and voice control need careful prompt review; unsupported speech details may not land as intended.
  • Generated motion, faces, text, and audio may need human review for realism, brand fit, and policy fit.
  • Longer videos or audio/voice-control runs can take longer than a simple silent draft.
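
The first caveat can be partly checked before submitting a run. The sketch below is a hypothetical pre-flight helper (`looks_public` is not part of the tool) that rejects URLs which are clearly not public, such as local files, `localhost`, or private IP addresses. It performs no network request, so it cannot confirm that an image is actually reachable.

```python
import ipaddress
from urllib.parse import urlparse


def looks_public(url: str) -> bool:
    """Return False for URLs that cannot be public; True otherwise."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False  # e.g. file:// paths or malformed URLs

    host = parsed.hostname  # already lowercased by urlparse
    if host == "localhost" or host.endswith(".local"):
        return False

    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A regular hostname: it cannot be judged without a DNS lookup,
        # so this sketch lets it through.
        return True
    # Literal IP addresses must be globally routable.
    return address.is_global
```

A check like this only catches obvious mistakes, such as pasting a local development URL. An image behind a login or an expired signed link would still pass it and fail at run time.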