Vertex JSON Mode For Llama 4
Why JSON mode?
You can guarantee that a model's generated output always adheres to a specific schema so that you receive consistently formatted responses. For example, you might have an established data schema that you use for other tasks. If you have the model follow the same schema, you can directly extract data from the model's output without any post-processing.
To specify the structure of a model's output, define a response schema, which works like a blueprint for model responses. When you submit a prompt and include the response schema, the model's response always follows your defined schema.
Objectives
In this tutorial, you will learn how to use either OpenAI SDK or Vertex AI SDK in Python to generated structured outputs via the Llama 4 Maverick fully managed model on Vertex AI. See here for more info on using the OpenAI SDK with Vertex, as well as recommendations on when to use OpenAI SDK vs. Vertex AI SDK.
We will use sentiment analysis as an example use case, you can replace it with a different structure that's right for you.
Setup and Relevant Links
Llama on Vertex AI (fully managed): https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/llama. You may also need to accept the EULA to continue.
Official docs from Vertex on structured outputs/JSON mode with Llama coming soon.
**Setup and Relevant Links **
Defining the format we want to return...
Now, generate JSON output
First, with OpenAI's SDK...
{'reviews': [{'text': "Absolutely loved it! Best ice cream I've ever had.", 'rating': 4, 'flavor': 'Strawberry Cheesecake', 'sentiment': 'Positive', 'explanation': "The reviewer uses the phrase 'Absolutely loved it' and states it's the 'Best ice cream I've ever had', indicating a very positive sentiment. The rating of 4 out of a presumed 5 is consistent with this positive sentiment."}, {'text': 'Quite good, but a bit too sweet for my taste.', 'rating': 1, 'flavor': 'Mango Tango', 'sentiment': 'Negative', 'explanation': "Although the reviewer starts with 'Quite good', they follow it with a negative statement 'but a bit too sweet for my taste', indicating a mixed sentiment. However, the rating of 1 suggests a strongly negative sentiment, which is inconsistent with the text. The sentiment classification based on the text would be 'Mixed' or 'Neutral', but given the low rating, it leans more towards being negative overall."}]}
Now with Vertex AI SDK
[[{'explanation': "The reviewer used the phrase 'Absolutely loved it' and stated it was the 'Best ice cream I've ever had', indicating a very positive sentiment despite the rating being less than 5.", 'flavor': 'Strawberry Cheesecake', 'rating': 4, 'sentiment': 'POSITIVE'}, {'explanation': "The reviewer described the product as 'Quite good', but expressed a negative aspect by stating it was 'a bit too sweet', aligning with the low rating given. The negative aspect outweighs the positive, leading to an overall negative sentiment.", 'flavor': 'Mango Tango', 'rating': 1, 'sentiment': 'NEGATIVE'}]]
Congrats and conclusion
You've successfully built a sentiment analyzer leveraging structured outputs via Llama 4 Maverick using the OpenAI and/or Vertex AI SDK!
Cleanup
You can perform the following cleanup to avoid incurring charges to your Google Cloud account for the resources used in this codelab:
- To avoid unnecessary Google Cloud charges, use the Google Cloud console to delete your project if you do not need it.
- If you want to disable the APIs for Vertex AI, navigate to the Vertex AI API Service Details page and click Disable API and confirm.