Vertex Tool Calling For Llama 4

Why function calling?

Imagine asking someone to write down important information without giving them a form or any guidelines on the structure. You might get a beautifully crafted paragraph, but extracting specific details like names, dates, or numbers would be tedious! Similarly, trying to get consistent structured data from a generative text model without function calling can be frustrating. You're stuck explicitly prompting for things like JSON output, often with inconsistent results.

This is where function calling comes in. Instead of hoping for the best in a freeform text response from a generative model, you can define clear functions with specific parameters and data types. These function declarations act as structured guidelines that steer the Llama model toward predictable, usable output. No more parsing text responses for important information!

Think of it like teaching Llama to speak the language of your applications. Need to retrieve information from a database? Define a search_db function with parameters for search terms. Want to integrate with a weather API? Create a get_weather function that takes a location as input. Function calling bridges the gap between human language and the structured data needed to interact with external systems.

Objectives

In this tutorial, you will learn how to use either the OpenAI SDK or the Vertex AI SDK in Python to make function calls with the Llama 4 Maverick fully managed model on Vertex AI. See the Vertex AI documentation for more info on using the OpenAI SDK with Vertex AI, along with recommendations on when to use the OpenAI SDK versus the Vertex AI SDK.

We will use a currency exchange function as an example; you can replace it with another function that fits your use case. This tutorial is based on this Vertex AI codelab: https://codelabs.developers.google.com/codelabs/gemini-function-calling

Setup and Relevant Links

Llama on Vertex AI (fully managed): https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/llama

Official docs from Vertex on tool calling with Llama coming soon.

First, with the OpenAI SDK

Handling imports and setup


Define the function implementation

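The implementation cell is not shown; a sketch of the exchange-rate helper, following the Frankfurter API used in the Vertex AI codelab this tutorial is based on (the parameter names are assumptions):

```python
import requests

def get_exchange_rate(currency_from: str = "USD",
                      currency_to: str = "EUR",
                      currency_date: str = "latest") -> dict:
    """Fetch an exchange rate from the Frankfurter API for a given date.

    currency_date accepts 'latest' or a YYYY-MM-DD string.
    """
    response = requests.get(
        f"https://api.frankfurter.app/{currency_date}",
        params={"from": currency_from, "to": currency_to},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()
```

For example, `get_exchange_rate("USD", "EUR")` returns a JSON object containing the base currency, date, and a `rates` mapping.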

Set up the function declaration for the model

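The declaration cell is not shown; a sketch in the OpenAI function-calling `tools` format, mirroring the hypothetical `get_exchange_rate` helper's parameters:

```python
# Tool declaration: tells the model what the function does and what
# arguments it accepts, so it can emit a structured tool call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_exchange_rate",
            "description": "Get the exchange rate between two currencies on a given date.",
            "parameters": {
                "type": "object",
                "properties": {
                    "currency_from": {
                        "type": "string",
                        "description": "Source currency code, e.g. USD",
                    },
                    "currency_to": {
                        "type": "string",
                        "description": "Target currency code, e.g. EUR",
                    },
                    "currency_date": {
                        "type": "string",
                        "description": "Date in YYYY-MM-DD format, or 'latest'",
                    },
                },
                "required": ["currency_from", "currency_to"],
            },
        },
    }
]
```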

Set up the client, generate a tool call, and execute it. Enter a query into the text field to get started, e.g. "100 usd to eur".

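The final cell is not shown; a sketch of the round trip, which takes the client, the `tools` declaration, and the exchange-rate helper from the earlier cells as parameters. The model ID string is an assumption; check the Llama 4 Maverick model card on Vertex AI for the exact name available in your region.

```python
import json

# Assumed Vertex AI model ID for Llama 4 Maverick (verify against the model card).
MODEL = "meta/llama-4-maverick-17b-128e-instruct-maas"

def run_query(client, tools, get_exchange_rate, query: str) -> str:
    """One function-calling round trip: model -> tool call -> tool result -> answer."""
    messages = [{"role": "user", "content": query}]
    response = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
    message = response.choices[0].message
    if not message.tool_calls:
        return message.content  # the model answered directly, no tool needed
    call = message.tool_calls[0]
    # The model returns the arguments as a JSON string; parse and execute locally.
    args = json.loads(call.function.arguments)
    result = get_exchange_rate(**args)
    messages.append(message)
    messages.append({"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)})
    # Send the tool result back so the model can phrase the final answer.
    final = client.chat.completions.create(model=MODEL, messages=messages, tools=tools)
    return final.choices[0].message.content
```

A usage sketch: `print("Direct response:", run_query(client, tools, get_exchange_rate, input("Enter your currency exchange query: ")))`.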
Enter your currency exchange query: 100 usd to eur
Direct response: The exchange rate for 100 USD to EUR is 88.645 EUR.

Now with the Vertex AI SDK

Handling imports and setup


Define the function implementation

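The implementation can be the same helper as in the OpenAI section, again a sketch against the Frankfurter API from the codelab:

```python
import requests

def get_exchange_rate(currency_from: str = "USD",
                      currency_to: str = "EUR",
                      currency_date: str = "latest") -> dict:
    """Fetch an exchange rate from the Frankfurter API for a given date."""
    response = requests.get(
        f"https://api.frankfurter.app/{currency_date}",
        params={"from": currency_from, "to": currency_to},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()
```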

Set up the function declaration for the model


Set up the client, generate a tool call, and execute it. Enter a query into the text field to get started, e.g. "100 usd to eur".

Enter your currency exchange query: 100 usd to eur
Final response: The exchange rate for 100 USD to EUR on the latest date is 88.645 EUR.

Congrats and conclusion

By leveraging function calling with Llama 4 on Vertex AI, you've successfully built a generative AI pipeline using the OpenAI and/or Vertex AI SDK! Users can ask about exchange rates, and the system fetches the latest data from an external API and responds with an answer.

Given a prompt from an end-user, Llama takes care of selecting the appropriate function, extracting parameters from the prompt, and returning a structured data object for you to make an external API call.

Cleanup

You can perform the following cleanup to avoid incurring charges to your Google Cloud account for the resources used in this codelab:

  • To avoid unnecessary Google Cloud charges, use the Google Cloud console to delete your project if you do not need it.
  • If you want to disable the APIs for Vertex AI, navigate to the Vertex AI API Service Details page and click Disable API and confirm.