Get Started Veo REST

Copyright 2024 Google LLC.
[10]

Get started with Video generation using Veo and REST API

If you're reading this notebook on GitHub, open it in Colab by clicking the button above to see the generated videos.

⚠️ Veo is a paid-only feature. It won't run on the Free Tier (see pricing for more details).

[42]

Why Veo

Veo enables developers to create high-quality videos with incredible detail, minimal artifacts, and extended durations, in resolutions up to 1080p (16:9 only; see the optional parameters below). Veo supports both text-to-video and image-to-video.

With Veo 3, developers can create videos with:

  • Advanced language understanding: Veo deeply understands natural language and visual semantics, capturing the nuance and tone of complex prompts to render intricate details in extended scenes, including cinematic terms like "timelapse" or "aerial shots."
  • Unprecedented creative control: Veo understands prompts for all kinds of cinematic effects, such as timelapses or aerial shots of a landscape.
  • Videos with audio: Veo 3 generates videos with audio automatically, with no additional effort from the developer.
  • More accurate video controls: Veo 3 renders lighting, physics, and camera movements more accurately.

The Veo 3 family of models includes Veo 3 and Veo 3 Fast, a faster and more accessible version of Veo 3. Veo 3 Fast is ideal for backend services that programmatically generate ads, tools for rapid A/B testing of creative concepts, or apps that need to quickly produce social media content.

Setup

[12]

Install jq so you can process the JSON results.

[13]
Reading package lists...
Building dependency tree...
Reading state information...
jq is already the newest version (1.6-2.1ubuntu3.1).
0 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Text-to-video

Veo can generate videos from text prompts (see the code comments for ideas). These examples use the latest Google Veo 3 model, which includes audio in the generated videos.

Prompting Tips for Veo

To get the most out of Veo, consider incorporating specific video terminology into your prompts. Veo understands a wide range of terms related to:

  • Shot composition: Specify the framing and number of subjects in the shot (e.g., "single shot", "two shot", "over-the-shoulder shot").
  • Camera positioning and movement: Control the camera's location and movement using terms like "eye level", "high angle", "worm's eye", "dolly shot", "zoom shot", "pan shot", and "tracking shot".
  • Focus and lens effects: Use terms like "shallow focus", "deep focus", "soft focus", "macro lens", and "wide-angle lens" to achieve specific visual effects.
  • Overall style and subject: Guide Veo's creative direction by specifying styles like "sci-fi", "romantic comedy", "action movie" or "animation". You can also describe the subjects and backgrounds you want, such as "cityscape", "nature", "vehicles", or "animals."

Check the Veo prompt guide for more details and tips.

Optional parameters for Veo 3

The prompt is the only mandatory parameter; all the others are optional.

  • negative_prompt: What you don't want to see in the video.
  • person_generation: Tells the model whether it is allowed to generate adults in the videos. Children are always blocked.
  • number_of_videos: With Veo 3, exactly one video is generated per request.
  • duration_seconds: With Veo 3, generated videos are always 8 seconds long.
  • aspect_ratio: Either 16:9 (landscape) or 9:16 (portrait).
  • resolution: Either 720p or 1080p (1080p only in 16:9).
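As an illustration of how these options might be assembled into a request, here is a minimal Python sketch. The helper name `build_veo_request` is hypothetical, and the camelCase JSON field names (`negativePrompt`, `personGeneration`, `aspectRatio`, `resolution`) and the `instances`/`parameters` layout are assumptions about the REST payload; check them against the API reference before relying on them. The 16:9-only rule for 1080p comes from the list above.

```python
import json


def build_veo_request(prompt, negative_prompt=None, person_generation=None,
                      aspect_ratio="16:9", resolution="720p"):
    """Build a request body for video generation.

    The JSON field names used here are assumptions, not confirmed by this notebook.
    """
    if aspect_ratio not in ("16:9", "9:16"):
        raise ValueError("aspect_ratio must be '16:9' or '9:16'")
    if resolution not in ("720p", "1080p"):
        raise ValueError("resolution must be '720p' or '1080p'")
    if resolution == "1080p" and aspect_ratio != "16:9":
        raise ValueError("1080p is only available in 16:9")

    parameters = {"aspectRatio": aspect_ratio, "resolution": resolution}
    # Optional fields are only sent when the caller provides them.
    if negative_prompt:
        parameters["negativePrompt"] = negative_prompt
    if person_generation:
        parameters["personGeneration"] = person_generation

    # The prompt is the only mandatory input.
    return {"instances": [{"prompt": prompt}], "parameters": parameters}


body = build_veo_request(
    "An aerial shot of a coastline at sunrise, soft focus",
    negative_prompt="text, watermark",
)
print(json.dumps(body, indent=2))
```

number_of_videos and duration_seconds are left out of the sketch because, per the list above, Veo 3 fixes them at one video of 8 seconds.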

Start the generation

Video generation uses the predictLongRunning method:

[43]

Have a look at the output:

[44]
{
  "name": "models/veo-3.0-fast-generate-001/operations/321kvdcepmcg"
}
[45]
models/veo-3.0-fast-generate-001/operations/321kvdcepmcg
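For readers who prefer Python to curl, the start request could be sketched as follows. This is an illustration under assumptions, not the notebook's own cell: the helper names are hypothetical, the base URL is inferred from the host and version that appear in the outputs of this notebook, and the `x-goog-api-key` header and request body layout are assumed. The request is only built here, not sent, because sending requires a paid-tier API key.

```python
import json
import urllib.request

# Host and API version as they appear in this notebook's outputs.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def make_start_request(api_key, body, model="veo-3.0-fast-generate-001"):
    """Build (but do not send) the POST that starts a long-running generation."""
    url = f"{BASE_URL}/models/{model}:predictLongRunning"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-goog-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def operation_name(start_response_text):
    """Pull the operation name out of the JSON returned by the start call."""
    return json.loads(start_response_text)["name"]


req = make_start_request("YOUR_API_KEY", {"instances": [{"prompt": "A calm sea at dawn"}]})
print(req.get_method(), req.full_url)

# To actually send it (needs a paid-tier key):
# with urllib.request.urlopen(req) as resp:
#     op_name = operation_name(resp.read().decode("utf-8"))
```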

Check the status

The operation name tells you where to check the status of the video generation. Initially, the status response contains nothing, or just the operation name.
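A single status check might look like this in Python. It is a sketch under assumptions: `make_status_request` is a hypothetical helper, and the status URL is assumed to be the base URL followed by the operation name, authenticated with an `x-goog-api-key` header.

```python
import urllib.request

# Host and API version as they appear in this notebook's outputs.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def make_status_request(api_key, op_name):
    """Build (but do not send) the GET that reads an operation's status."""
    return urllib.request.Request(
        f"{BASE_URL}/{op_name}",
        headers={"x-goog-api-key": api_key},
    )


status_req = make_status_request(
    "YOUR_API_KEY",
    "models/veo-3.0-fast-generate-001/operations/321kvdcepmcg",
)
print(status_req.get_method(), status_req.full_url)

# To actually send it (needs a paid-tier key):
# import json
# with urllib.request.urlopen(status_req) as resp:
#     status = json.loads(resp.read().decode("utf-8"))
```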

[46]

Wait for generation to finish

When generation has finished, the status will contain a response field.

[47]
Current status: null
Current status: null
Current status: null
Current status: null
Current status: {
  "@type": "type.googleapis.com/google.ai.generativelanguage.v1beta.PredictLongRunningResponse",
  "generateVideoResponse": {
    "generatedSamples": [
      {
        "video": {
          "uri": "https://generativelanguage.googleapis.com/v1beta/files/qu4d28uipuqh:download?alt=media"
        }
      }
    ]
  }
}
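The waiting step can be sketched as a small polling loop. This is an illustration, not the notebook's own cell: `wait_for_response` is a hypothetical helper, the status fetch is passed in as a function so the loop can be tried without network access, and treating an `error` field as a failure is an assumption about the operation payload.

```python
import time


def wait_for_response(fetch_status, interval_s=10.0, max_polls=60, sleep=time.sleep):
    """Poll until the status dict contains a "response" field, then return it."""
    for _ in range(max_polls):
        status = fetch_status() or {}
        if "error" in status:  # assumption: failures are reported under "error"
            raise RuntimeError(f"generation failed: {status['error']}")
        response = status.get("response")
        if response is not None:
            return response
        sleep(interval_s)
    raise TimeoutError(f"no response after {max_polls} polls")


# Offline demonstration with canned statuses instead of real HTTP calls.
canned = iter([
    {"name": "op"},
    {"name": "op"},
    {"name": "op", "response": {"generateVideoResponse": {"generatedSamples": []}}},
])
result = wait_for_response(lambda: next(canned), sleep=lambda seconds: None)
print(result)
```

In real use, `fetch_status` would perform the status GET and return the parsed JSON; capping the number of polls keeps a stuck operation from blocking forever.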

Check the response, then extract the URI.

[48]
{
  "@type": "type.googleapis.com/google.ai.generativelanguage.v1beta.PredictLongRunningResponse",
  "generateVideoResponse": {
    "generatedSamples": [
      {
        "video": {
          "uri": "https://generativelanguage.googleapis.com/v1beta/files/qu4d28uipuqh:download?alt=media"
        }
      }
    ]
  }
}
[49]
https://generativelanguage.googleapis.com/v1beta/files/qu4d28uipuqh:download?alt=media
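The same extraction can be sketched in Python for the response shape shown above. `video_uris` is a hypothetical helper; it assumes every sample follows the `generateVideoResponse.generatedSamples[].video.uri` layout from that output and skips samples without a URI.

```python
def video_uris(response):
    """Collect the download URIs from a finished generation response."""
    samples = response.get("generateVideoResponse", {}).get("generatedSamples", [])
    return [
        sample["video"]["uri"]
        for sample in samples
        if "uri" in sample.get("video", {})
    ]


# The response printed above, as a Python dict.
response = {
    "@type": "type.googleapis.com/google.ai.generativelanguage.v1beta.PredictLongRunningResponse",
    "generateVideoResponse": {
        "generatedSamples": [
            {
                "video": {
                    "uri": "https://generativelanguage.googleapis.com/v1beta/files/qu4d28uipuqh:download?alt=media"
                }
            }
        ]
    },
}
uris = video_uris(response)
print(uris)
```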

And finally download the video(s).
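A download step could be sketched in Python as below. `download_video` is a hypothetical helper, and sending the API key in an `x-goog-api-key` header is an assumption. The function that opens the URL is injectable so the file-writing logic can be exercised without network access.

```python
import shutil
import urllib.request


def download_video(uri, api_key, dest_path, opener=urllib.request.urlopen):
    """Stream the video at `uri` into `dest_path` and return the path."""
    req = urllib.request.Request(uri, headers={"x-goog-api-key": api_key})
    # Copy in chunks so the whole video never has to fit in memory.
    with opener(req) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path


# Example call (needs a paid-tier key and a URI from a finished generation):
# download_video(uri, "YOUR_API_KEY", "video_0.mp4")
```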

[50]

Show the videos (optional - python)

[51]
video_0.mp4
[52]