Build with Llama API
This notebook introduces you to the functionality offered by Llama API, so that you can get up and running with the latest Llama 4 models quickly and efficiently.
Running this notebook
To run this notebook, you'll need to sign up for a Llama API developer account at llama.developer.meta.com and get an API key. You'll also need Python 3.8+ and a package installer such as pip to install the Llama API Python SDK.
Installing the Llama API client for Python
The Llama API client for Python is an open-source client library that provides convenient access to Llama API endpoints through a familiar set of request methods.
Install the SDK using pip.
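For example, assuming the SDK is published on PyPI under the name llama-api-client:

```
pip install llama-api-client
```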
Getting and setting up an API key
Sign up for, or log in to, a Llama API developer account at llama.developer.meta.com, then navigate to the API keys tab in the dashboard to create a new API key.
Assign your API key to the environment variable LLAMA_API_KEY.
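One way to do this from inside the notebook is to set the variable in Python; the key value below is a placeholder you should replace with your own:

```python
import os

# Replace the placeholder with the key created in the developer dashboard.
os.environ["LLAMA_API_KEY"] = "<your-api-key>"
```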
Now you can import the SDK and instantiate it. The SDK will automatically pull the API key from the environment variable set above.
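A minimal setup cell might look like the following, assuming the package exposes a LlamaAPIClient class:

```python
from llama_api_client import LlamaAPIClient

# The client picks up the API key from the LLAMA_API_KEY environment variable.
client = LlamaAPIClient()
```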
Your first API call
With the SDK set up, you're ready to make your first API call.
Start by checking the list of available models:
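A sketch of that call, assuming the SDK exposes a models.list() method returning model objects with an id field:

```python
# List the models available to your account.
for model in client.models.list():
    print(model.id)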
Llama-3.3-70B-Instruct
Llama-3.3-8B-Instruct
Llama-4-Maverick-17B-128E-Instruct-FP8
Llama-4-Scout-17B-16E-Instruct-FP8
The list of models may change in accordance with model releases. This notebook will use the latest Llama 4 model: Llama-4-Maverick-17B-128E-Instruct-FP8.
Chat completion
Chat completion with text
Use the chat completions endpoint for a simple text-based prompt-and-response round trip.
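A single-turn request might look like this sketch (the prompt is illustrative):

```python
response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[
        {"role": "user", "content": "Hello, how are you?"},
    ],
)

# The generated text lives on the completion message.
print(response.completion_message.content.text)
```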
I'm just a language model, so I don't have feelings or emotions like humans do, but I'm functioning properly and ready to help with any questions or tasks you might have! How can I assist you today?
Multi-turn chat completion
The chat completions endpoint supports sending multiple messages in a single API call, so you can use it to continue a conversation between a user and a model.
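For example, a conversation history can be replayed by passing prior user and assistant turns in order (the messages below are illustrative):

```python
response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[
        {"role": "user", "content": "I'd like to learn about marine animals."},
        {"role": "assistant", "content": "Great! Which animal would you like to start with?"},
        {"role": "user", "content": "Tell me a fascinating fact about the octopus."},
    ],
)
print(response.completion_message.content.text)
```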
Here's a fascinating fact about the octopus: Octopuses have **three hearts**! Two of the hearts are branchial hearts, which pump blood to the octopus's gills, while the third is a systemic heart that pumps blood to the rest of its body. Isn't that cool?
Streaming
You can return results from the API to the user more quickly by setting the stream parameter to True. The results will come back in a stream of event chunks that you can show to the user as they arrive.
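A streaming request might look like the sketch below; the chunk attribute path (event.delta.text) reflects the SDK's streaming event objects and is an assumption that may vary by version:

```python
response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[
        {"role": "user", "content": "Write a short story about a mysterious shop."},
    ],
    stream=True,
)

# Print each text delta as it arrives.
for chunk in response:
    print(chunk.event.delta.text, end="", flush=True)
```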
Here is a short story:

The old, mysterious shop had been on the corner of Main Street for as long as anyone could remember. Its windows were always dusty, and the sign above the door creaked in the wind, reading "Curios and Antiques" in faded letters.

One rainy afternoon, a young woman named Lily ducked into the shop to escape the downpour. As she pushed open the door, a bell above it rang out, and the scent of old books and wood polish wafted out. The shop was dimly lit, with rows of shelves packed tightly with strange and exotic items: vintage dolls, taxidermied animals, and peculiar trinkets that seemed to serve no purpose.

Lily wandered the aisles, running her fingers over the intricate carvings on an ancient wooden box, and marveling at a crystal pendant that glowed with an otherworldly light. As she reached the back of the shop, she noticed a small, ornate mirror hanging on the wall. The glass was cloudy, and the frame was adorned with symbols that seemed to shimmer and dance in the dim light.

Without thinking, Lily reached out to touch the mirror's surface. As soon as she made contact with the glass, the room around her began to blur and fade. The mirror's surface rippled, like the surface of a pond, and Lily felt herself being pulled into its depths.

When she opened her eyes again, she found herself standing in a lush, vibrant garden, surrounded by flowers that seemed to glow with an ethereal light. A soft, melodious voice whispered in her ear, "Welcome home, Lily."

Lily looked around, bewildered, and saw that the garden was filled with people she had never met, yet somehow knew intimately. They smiled and beckoned her closer, and Lily felt a deep sense of belonging, as if she had finally found a place she had been searching for her entire life.

As she stood there, the rain outside seemed to fade into the distance, and Lily knew that she would never see the world in the same way again. The mysterious shop, and the enchanted mirror, had unlocked a doorway to a new reality – one that was full of wonder, magic, and possibility.

When Lily finally returned to the shop, the rain had stopped, and the sun was shining brightly outside. The shopkeeper, an old man with kind eyes, smiled at her and said, "I see you've found what you were looking for." Lily smiled back, knowing that she had discovered something far more valuable than any curiosity or antique – she had discovered a piece of herself.
Multi-modal chat completion
The chat completions endpoint also supports image understanding, using URLs to publicly available images, or using local images encoded as Base64.
Here's an example that compares two images which are available at public URLs:
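A sketch of such a request; the image URLs below are placeholders for publicly reachable images:

```python
response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What do these two images have in common?"},
                # Placeholder URLs; substitute publicly available images.
                {"type": "image_url", "image_url": {"url": "https://example.com/llama-desert.jpg"}},
                {"type": "image_url", "image_url": {"url": "https://example.com/llama-hillside.jpg"}},
            ],
        },
    ],
)
print(response.completion_message.content.text)
```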
The two images share a common subject matter, featuring llamas as the primary focus. The first image depicts a brown llama and a gray llama standing together in a desert-like environment with a body of water and mountains in the background. In contrast, the second image shows a group of llamas grazing on a hillside, set against a backdrop of mountains and a lake.

**Common Elements:**

* **Llamas:** Both images feature llamas as the main subjects.
* **Mountainous Background:** Both scenes are set against a mountainous landscape.
* **Natural Environment:** Both images showcase the natural habitats of the llamas, highlighting their adaptation to high-altitude environments.

**Shared Themes:**

* **Wildlife:** The presence of llamas in both images emphasizes their status as wildlife.
* **Natural Beauty:** The mountainous backdrops in both images contribute to the overall theme of natural beauty.
* **Serenity:** The calm demeanor of the llamas in both images creates a sense of serenity and tranquility.

In summary, the two images are connected through their depiction of llamas in natural, mountainous environments, highlighting the beauty and serenity of these animals in their habitats.
And here's another example that encodes a local image to Base64 and sends it to the model:
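A sketch of that flow, assuming a local JPEG at a hypothetical path:

```python
import base64

def encode_image(path: str) -> str:
    """Read a local image file and return its Base64-encoded contents."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

# Hypothetical local file.
image_b64 = encode_image("images/alpaca_costume.jpg")

response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail."},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        },
    ],
)
print(response.completion_message.content.text)
```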
The image features a person dressed as an alpaca, wearing a white jacket with red accents and sunglasses. The individual is positioned centrally in the frame, facing forward.
* **Alpaca Costume:**
* The person is wearing a white alpaca costume that covers their head and body.
* The costume includes two gray horns on top of the headpiece.
* The face of the alpaca is visible through the headpiece, with a neutral expression.
* **Clothing:**
* The person is wearing a white jacket with a fur-lined hood and red accents on the inside of the collar and cuffs.
* The jacket has a zipper closure at the front.
* **Sunglasses:**
* The person is wearing pink sunglasses with dark lenses.
* **Background:**
* The background of the image is a solid pink color.
* **Overall Impression:**
* The image appears to be a playful and humorous depiction of an alpaca, with the person's costume and accessories adding to the comedic effect.
In summary, the image shows a person dressed as an alpaca, wearing a white jacket and sunglasses, set against a pink background.
JSON structured output
You can use the chat completions endpoint with a developer-defined JSON schema, and the model will format its response according to that schema before returning it.
This example defines the schema with a Pydantic model, so you may need to install pydantic to run it.
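A sketch that derives a JSON schema from a Pydantic model and passes it via a response_format parameter; the exact parameter shape is an assumption and may vary by SDK version:

```python
from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip: str

response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[
        {
            "role": "user",
            "content": "Extract the address from this text: the office is at 123 Main St, Anytown, USA.",
        },
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "Address",
            "schema": Address.model_json_schema(),
        },
    },
)
print(response.completion_message.content.text)
```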
{"street": "123 Main St", "city": "Anytown", "state": "USA" , "zip": ""}
Tool calling
Tool calling is supported with the chat completions endpoint. You can define a tool, expose it to the API, and ask the model to form a tool call, then execute the call and use its result as part of a follow-up response.
Note: Llama API does not execute tool calls. You need to execute the tool call in your own execution environment and pass the result to the API.
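A sketch of the full round trip, using a hypothetical get_weather tool: the model forms the tool call, your code executes it locally, and the result is passed back for a final answer. The exact tool definition and follow-up message shapes are assumptions and may vary by SDK version:

```python
import json

def get_weather(location: str) -> str:
    # Hypothetical local tool implementation.
    return f"It's sunny in {location}."

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a given location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city to look up."},
                },
                "required": ["location"],
            },
        },
    },
]

messages = [{"role": "user", "content": "What is the weather in Menlo Park?"}]

# First call: the model responds with a tool call instead of text.
response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=messages,
    tools=tools,
)
print(response)

# Execute the tool call locally.
tool_call = response.completion_message.tool_calls[0]
arguments = json.loads(tool_call.function.arguments)
result = get_weather(**arguments)

# Second call: append the assistant turn and the tool result, then ask again.
# (The exact shape of these follow-up messages is an assumption.)
messages.append(response.completion_message.model_dump())
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

response = client.chat.completions.create(
    model="Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=messages,
    tools=tools,
)
print(response)
```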
CreateChatCompletionResponse(completion_message=CompletionMessage(content=MessageTextContentItem(text='', type='text'), role='assistant', stop_reason='tool_calls', tool_calls=[ToolCall(id='370eaccc-efb3-4bc6-85ed-20a99c165d1f', function=ToolCallFunction(arguments='{"location":"Menlo Park"}', name='get_weather'))]), metrics=[Metric(metric='num_completion_tokens', value=9.0, unit='tokens'), Metric(metric='num_prompt_tokens', value=590.0, unit='tokens'), Metric(metric='num_total_tokens', value=599.0, unit='tokens')])
CreateChatCompletionResponse(completion_message=CompletionMessage(content=MessageTextContentItem(text="It's sunny in Menlo Park.", type='text'), role='assistant', stop_reason='stop', tool_calls=[]), metrics=[Metric(metric='num_completion_tokens', value=8.0, unit='tokens'), Metric(metric='num_prompt_tokens', value=618.0, unit='tokens'), Metric(metric='num_total_tokens', value=626.0, unit='tokens')])
Moderations
The moderations endpoint allows you to check both user prompts and model responses for any problematic content.
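A sketch, assuming the SDK exposes a moderations.create() method that accepts chat-style messages:

```python
# A benign prompt: expected to come back with flagged=False.
response = client.moderations.create(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response)

# A prompt requesting harmful content would instead come back with
# flagged=True and the matching category listed in flagged_categories.
```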
ModerationCreateResponse(model='Llama-Guard', results=[Result(flagged=False, flagged_categories=None)])
ModerationCreateResponse(model='Llama-Guard', results=[Result(flagged=True, flagged_categories=['indiscriminate-weapons'])])
Next steps
Now that you've familiarized yourself with the concepts of Llama API, you can learn more by exploring the API reference docs and deep dive guides at https://llama.developer.meta.com/docs/.