Google I/O 2025 Live Coding
Google I/O 2025 - Live coding experiences with the Gemini API
Welcome to the official Colab notebook from the Google I/O 2025 live coding session on the Gemini API!
This notebook serves as a comprehensive, hands-on guide to exploring the cutting-edge capabilities of the Gemini models, as demonstrated live during the presentation. You will dive deep into how developers can leverage the Gemini API to build powerful, innovative, and highly intelligent AI-powered applications.
Throughout this interactive session, you'll find practical demonstrations covering the latest advancements in Gemini, including:
- Generative Media (GenMedia models): Learn to create stunning images with Imagen 3 and the experimental Gemini 2.0 Flash image generation, and generate dynamic videos with the powerful Veo 2 model.
- Advanced Multimodality: Understand and generate content across various modalities, seamlessly combining text, images, and videos in your prompts and responses.
- Text-to-Speech (TTS): Transform written text into natural-sounding audio, exploring customizable voices, language options, and even multi-speaker dialogues.
- Intelligent Tool Use: Empower Gemini with built-in tools like Code Execution (for solving complex problems in a sandbox), real-time grounding via Google Search, and URL context to interact with external systems and fetch factual information directly from the web.
- Adaptive Thinking & Agentic Solutions: Discover how Gemini models can perform internal reasoning and problem-solving with their thinking capability, and how to build complex, multi-step AI agents using the Google Agent Development Kit (ADK) for advanced use cases.
This notebook is designed to be fully runnable, allowing you to execute the code, experiment with different prompts, and directly experience the versatility and power of the Gemini API. Get ready to unlock new possibilities and code the future!
Setup
Before diving into the Gemini API, we need to set up our environment. This involves installing the necessary SDK and configuring your API key.
Install SDK
The google-genai Python SDK is essential for interacting with the Gemini API. This SDK provides a streamlined way to access different Gemini models and their functionalities.
Install the SDK from PyPI.
Set up your API key
To authenticate your requests with the Gemini API, you need an API key. This key allows you to access Google's powerful generative AI models. It's recommended to store your API key securely, for instance, as a Colab Secret named GOOGLE_API_KEY.
If you don't already have an API key or you aren't sure how to create a Colab Secret, see Authentication for an example.
Initialize SDK client
With the google-genai SDK, initializing the client is straightforward. You pass your API key to genai.Client, and the client handles communication with the Gemini API. Individual model settings are then applied in each API call.
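The setup described above can be sketched as follows. This is a minimal sketch, assuming the key is exposed through a `GOOGLE_API_KEY` environment variable (in Colab you would read the secret through `google.colab.userdata` instead). The helper names `resolve_api_key` and `make_client` are illustrative and not part of the SDK.

```python
import os


def resolve_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    key = env.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; create a key and store it as a secret.")
    return key


def make_client(api_key):
    """Create the SDK client. Requires `pip install google-genai`."""
    # Imported lazily so resolve_api_key() stays usable without the SDK installed.
    from google import genai

    return genai.Client(api_key=api_key)


# Typical use (needs a real key and network access):
# client = make_client(resolve_api_key())
```

Keeping the key lookup separate from client creation makes it easy to swap in a different secret store without touching the rest of the notebook.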
Working with GenMedia models
The Gemini API offers access to various GenMedia models, enabling advanced capabilities like generating images and videos from text prompts. These models push the boundaries of what's possible in creative content generation.
Generating images with Imagen 3
Imagen 3 is one of Google's most capable text-to-image models, producing highly detailed images with rich lighting and fewer artifacts than previous Imagen versions. It suits scenarios where image quality and specific artistic styles are paramount.
| ⚠️ | Imagen is a paid-only feature and won't work if you are on the free tier. |
Select the Imagen 3 model to be used
The imagen-3.0-generate-002 model is specifically designed for high-quality image generation from textual prompts.
Imagen 3 image generation prompt
When generating images with Imagen 3, you provide a descriptive prompt to guide the output. You can also specify parameters like the number of images, person_generation (to allow or disallow generating images of people), and aspect_ratio for different output dimensions.
CPU times: user 82.6 ms, sys: 13 ms, total: 95.6 ms Wall time: 4.59 s
After generation, the result.generated_images object contains the generated images, which can then be displayed.
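The Imagen call described above might look like the sketch below. It assumes an already-initialized `client`; the wrapper name `generate_images`, the default values, and the prompt in the usage comment are illustrative choices, not the notebook's original cell.

```python
def generate_images(client, prompt, model="imagen-3.0-generate-002",
                    number_of_images=1, person_generation="DONT_ALLOW",
                    aspect_ratio="1:1"):
    """Call Imagen and return the list of generated images."""
    result = client.models.generate_images(
        model=model,
        prompt=prompt,
        # A plain dict is passed here in place of types.GenerateImagesConfig.
        config={
            "number_of_images": number_of_images,
            "person_generation": person_generation,
            "aspect_ratio": aspect_ratio,
        },
    )
    return result.generated_images


# Typical use (paid tier, real client):
# for generated in generate_images(client, "A watercolor fox in a snowy forest"):
#     generated.image.show()
```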
Generating images with Gemini 2.0 Flash image out model (experimental)
The gemini-2.0-flash-preview-image-generation model extends Gemini's multimodal capabilities to include conversational image generation and editing. This model can generate images along with text responses, making it highly versatile for mixed-media content creation.
Select the Gemini 2.0 image out model
This model is specifically designed for generating and editing images conversationally.
Gemini 2.0 Flash image generation prompt (with interleaved text)
This example demonstrates how Gemini 2.0 Flash can generate both text and images in an interleaved fashion, providing a rich, conversational output that combines instructions with visual aids. You must explicitly set response_modalities to ['Text', 'Image'] to enable this feature.
CPU times: user 459 ms, sys: 53.7 ms, total: 513 ms Wall time: 19.6 s
The output in response.candidates[0].content.parts can contain both text and inline image data, which are then displayed accordingly.
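Handling the mixed parts can be sketched as below. The helper `split_parts` is a hypothetical name; it assumes each part exposes either a `text` attribute or an `inline_data` object with `mime_type` and `data` attributes, as the SDK response parts do.

```python
def split_parts(parts):
    """Separate text chunks from inline image payloads, keeping order within each list."""
    texts, images = [], []
    for part in parts:
        if getattr(part, "text", None):
            texts.append(part.text)
        elif getattr(part, "inline_data", None) is not None:
            # inline_data carries raw bytes plus a MIME type such as image/png
            images.append((part.inline_data.mime_type, part.inline_data.data))
    return texts, images


# Typical use with a real response (file names are placeholders):
# texts, images = split_parts(response.candidates[0].content.parts)
# for i, (mime, data) in enumerate(images):
#     with open(f"step_{i}.png", "wb") as fh:
#         fh.write(data)
```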
Gemini 2.0 Flash image generation prompt
This example focuses on generating an image based on a descriptive text prompt, similar to Imagen 3, but utilizing the Gemini 2.0 Flash model's capabilities for image generation within a text-based generation flow.
CPU times: user 112 ms, sys: 6.52 ms, total: 118 ms Wall time: 3.5 s
The generated content is then processed to display the image parts.
Saving the generated image
The generated image data can be extracted from the response and saved locally, typically as a PNG file.
Editing images with Gemini 2.0 Flash image out
Gemini 2.0 Flash also supports image editing. You can provide an image as input along with a text prompt describing the desired modifications. This allows for conversational image manipulation.
CPU times: user 678 ms, sys: 15.6 ms, total: 694 ms Wall time: 5.79 s
The edited image and any accompanying text are then displayed.
The edited image is saved, overwriting the previous gemini_imgout.png file (for future usage in this notebook).
Generating videos with the Veo 2 model
Veo 2 is Google's advanced text-to-video and image-to-video model, capable of generating high-quality videos with detailed cinematic and visual styles. It can capture prompt nuances and maintain consistency across frames.
| ⚠️ | Veo is a paid-only feature and won't work if you are on the free tier. |
Select the Veo 2 model to be used
The veo-2.0-generate-001 model is used for video generation tasks.
Run a text-to-video generation prompt
This section demonstrates how to generate videos directly from a text prompt. You can specify various configurations like person_generation, aspect_ratio, number_of_videos, duration_seconds, and a negative_prompt to guide the video generation process. Video generation is an asynchronous operation, so the code includes a loop that waits for the operation to complete.
name='models/veo-2.0-generate-001/operations/zmrlsiqnzaw2' metadata=None done=None error=None response=None result=None
name='models/veo-2.0-generate-001/operations/zmrlsiqnzaw2' metadata=None done=True error=None response=GenerateVideosResponse(generated_videos=[GeneratedVideo(video=Video(uri=https://generativelanguage.googleapis.com/v1beta/files/p0w3cekzjwdc:download?alt=media, video_bytes=None, mime_type=None))], rai_media_filtered_count=None, rai_media_filtered_reasons=None) result=GenerateVideosResponse(generated_videos=[GeneratedVideo(video=Video(uri=https://generativelanguage.googleapis.com/v1beta/files/p0w3cekzjwdc:download?alt=media, video_bytes=None, mime_type=None))], rai_media_filtered_count=None, rai_media_filtered_reasons=None)
[GeneratedVideo(video=Video(uri=https://generativelanguage.googleapis.com/v1beta/files/p0w3cekzjwdc:download?alt=media, video_bytes=None, mime_type=None))]
CPU times: user 188 ms, sys: 37.1 ms, total: 225 ms Wall time: 40.7 s
See the video generation results
Once the video generation operation is complete, the generated video(s) can be downloaded and displayed within the notebook. Generated videos are stored for 2 days on the server, so it's important to save a local copy if needed.
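The wait-then-download flow described above can be sketched as follows. The helper `wait_for_operation`, the 20-second polling interval, the prompt, and the output file names are assumptions for illustration; the `sleep` parameter exists only so the loop can be exercised without real delays.

```python
import time


def wait_for_operation(client, operation, poll_seconds=20, sleep=time.sleep):
    """Poll a long-running operation until it reports done, then return it."""
    while not operation.done:
        sleep(poll_seconds)
        # Refresh the operation status from the API.
        operation = client.operations.get(operation)
    return operation


# Typical use (paid tier, real client):
# operation = client.models.generate_videos(
#     model="veo-2.0-generate-001",
#     prompt="A drone shot over a misty pine forest at sunrise",
#     config={"aspect_ratio": "16:9", "number_of_videos": 1, "duration_seconds": 5},
# )
# operation = wait_for_operation(client, operation)
# for n, generated in enumerate(operation.response.generated_videos):
#     client.files.download(file=generated.video)
#     generated.video.save(f"video_{n}.mp4")  # keep a local copy before it expires
```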
Run an image-to-video generation prompt
Veo 2 can also generate videos from an input image, using the image as the starting frame. This allows you to bring static images to life by adding motion and narrative based on a text prompt.
name='models/veo-2.0-generate-001/operations/7rz7rsx527t2' metadata=None done=None error=None response=None result=None
name='models/veo-2.0-generate-001/operations/7rz7rsx527t2' metadata=None done=True error=None response=GenerateVideosResponse(generated_videos=[GeneratedVideo(video=Video(uri=https://generativelanguage.googleapis.com/v1beta/files/p4yyv2k2oso2:download?alt=media, video_bytes=None, mime_type=None))], rai_media_filtered_count=None, rai_media_filtered_reasons=None) result=GenerateVideosResponse(generated_videos=[GeneratedVideo(video=Video(uri=https://generativelanguage.googleapis.com/v1beta/files/p4yyv2k2oso2:download?alt=media, video_bytes=None, mime_type=None))], rai_media_filtered_count=None, rai_media_filtered_reasons=None)
[GeneratedVideo(video=Video(uri=https://generativelanguage.googleapis.com/v1beta/files/p4yyv2k2oso2:download?alt=media, video_bytes=None, mime_type=None))]
CPU times: user 670 ms, sys: 27.9 ms, total: 698 ms Wall time: 41.4 s
The generated videos are then saved and displayed.
Generating text-to-speech (TTS) with Gemini models
The Gemini API offers native text-to-speech (TTS) capabilities, allowing you to transform text into natural-sounding audio. This feature provides fine-grained control over various aspects of speech, including style, accent, pace, and tone.
Select the TTS model to be used
The gemini-2.5-flash-preview-tts and gemini-2.5-pro-preview-tts models are optimized for low-latency, controllable audio generation, supporting both single-speaker and multi-speaker outputs.
Generating a simple audio output
This example demonstrates the basic text-to-speech functionality, converting a simple text string into an audio output. The response_modalities configuration is set to ['Audio'].
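To listen to the result outside the notebook, the raw audio bytes need a container. The sketch below wraps them in a WAV file using the standard library. It assumes the TTS models return 16-bit mono PCM at 24 kHz; if your response reports a different format, adjust the parameters. The helper name `save_pcm_as_wav` and the file name are illustrative.

```python
import wave


def save_pcm_as_wav(path, pcm_bytes, channels=1, rate=24000, sample_width=2):
    """Wrap raw PCM bytes in a WAV container so standard players can open it."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample (2 = 16-bit)
        wav_file.setframerate(rate)
        wav_file.writeframes(pcm_bytes)


# Typical use with a real TTS response:
# audio = response.candidates[0].content.parts[0].inline_data.data
# save_pcm_as_wav("speech.wav", audio)
```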
Controlling how the model speaks
The Gemini TTS models allow you to control the style, tone, accent, and pace of the generated speech using natural language prompts within the contents and by selecting a specific voice_name from a variety of prebuilt voices.
Changing the audio language
The TTS models can automatically detect the input language and generate speech in that language. They support a wide range of languages.
You can also instruct the model to read text in a particular style or pace, as demonstrated in this disclaimer reading example.
Working with multi-speakers
For conversations or scenarios requiring multiple distinct voices, the Gemini TTS models support multi-speaker audio generation. You define different speakers within the MultiSpeakerVoiceConfig and assign specific voices to them. The model can then process a transcript and assign parts to each speaker. First, a sample transcript is generated by a Gemini text model.
**Speaker A:** Okay, so, Google I/O 2025? Absolute mind-blow. Seriously, my brain’s still buzzing from all the AI announcements!
**Speaker B:** Right?! Gemini 2.5 alone… the *on-device* multimodality was unreal. And the performance gains across the whole stack? Unbelievable! The new agentic capabilities are going to change everything.
**Speaker A:** Totally! But seriously, forget the keynotes for a sec. The *highlight* for me? Hands down, Luciano Martins’ live coding session.
**Speaker B:** YES! That was next level! He just *built* a fully agentic workflow with the new Gemini 2.5 APIs in like, 10 minutes, *live*, with zero hiccups. That's the real magic. Forget the shiny demos; that's the part that truly blew my mind! Best session of the whole conference.
CPU times: user 38 ms, sys: 2.05 ms, total: 40 ms Wall time: 6.46 s
Then, the generated transcript is passed to the TTS model with multi-speaker configuration.
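The multi-speaker configuration can be expressed as a plain dict, as sketched below. The helper `multi_speaker_config` is a hypothetical name, and the voice names in the usage comment ("Kore", "Puck") are examples of prebuilt voices rather than the ones used in the session.

```python
def multi_speaker_config(voices):
    """Build a generate_content config dict from {speaker_name: voice_name}."""
    return {
        "response_modalities": ["AUDIO"],
        "speech_config": {
            "multi_speaker_voice_config": {
                "speaker_voice_configs": [
                    {
                        "speaker": speaker,
                        "voice_config": {
                            "prebuilt_voice_config": {"voice_name": voice}
                        },
                    }
                    for speaker, voice in voices.items()
                ]
            }
        },
    }


# Typical use (real client; speaker names should match the transcript labels):
# config = multi_speaker_config({"Speaker A": "Kore", "Speaker B": "Puck"})
# response = client.models.generate_content(
#     model="gemini-2.5-flash-preview-tts", contents=transcript, config=config
# )
```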
Working with the Gemini 2.5 models
The Gemini 2.5 series models, including Flash and Pro, offer enhanced capabilities such as adaptive thinking, multimodal understanding, and advanced reasoning, making them suitable for a wide range of complex tasks.
Counting tokens
Token counting helps you understand the length of your input and output, which is relevant for managing model context windows and estimating costs. A token is roughly equivalent to 4 characters for Gemini models.
total_tokens=13 cached_content_token_count=None
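A small sketch contrasting the rule of thumb with the exact count returned by the API. Both helper names are illustrative; `estimate_tokens` is only the 4-characters-per-token approximation mentioned above and will differ from the real tokenizer.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Rough offline estimate based on the ~4 characters per token rule of thumb."""
    return math.ceil(len(text) / chars_per_token)


def count_tokens(client, model, text):
    """Ask the API for the exact count and return it as an int."""
    return client.models.count_tokens(model=model, contents=text).total_tokens


# Typical use (real client; the prompt is a placeholder):
# prompt = "Where does Google I/O normally happen?"
# print(estimate_tokens(prompt), count_tokens(client, "gemini-2.5-flash", prompt))
```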
Sending your first prompt
Making a request to a Gemini model is straightforward using the generate_content method. You specify the model and the content (your prompt), and the model returns a text response.
GenerateContentResponseUsageMetadata(cache_tokens_details=None, cached_content_token_count=None, candidates_token_count=22, candidates_tokens_details=None, prompt_token_count=13, prompt_tokens_details=[ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=13)], thoughts_token_count=375, tool_use_prompt_token_count=None, tool_use_prompt_tokens_details=None, total_token_count=410, traffic_type=None)
Your first streaming interaction
For more fluid interactions, especially with longer responses, you can use streaming. The generate_content_stream method allows you to receive parts of the response incrementally as they are generated, rather than waiting for the entire output.
Google I/O normally happens at the **Shoreline Amphitheatre** in **Mountain View, California**.
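A streaming loop along the lines described above might look like this. The generator `stream_text` is a hypothetical wrapper; it yields only the new fragment carried by each chunk, so the caller can print fragments back to back without repeating earlier text.

```python
def stream_text(client, model, prompt):
    """Yield each text fragment as soon as the API delivers it."""
    for chunk in client.models.generate_content_stream(model=model, contents=prompt):
        if chunk.text:  # some chunks may carry no text
            yield chunk.text


# Typical use (real client; the prompt is a placeholder):
# for fragment in stream_text(client, "gemini-2.5-flash", "Where does Google I/O normally happen?"):
#     print(fragment, end="")
```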
Working with multimodal prompts
Gemini models are inherently multimodal, meaning they can process and generate content based on various input types, including text, images, video, and audio. This example demonstrates how to provide an image alongside a text prompt.
The image and a textual prompt are combined to ask the model to describe the image.
Here, the model is asked to generate a blog post based on the provided image, showcasing its creative content generation capabilities from multimodal input.
Video understanding with Gemini models
Gemini models can process and understand video content, enabling a wide range of use cases such as describing scenes, extracting information, answering questions about video content, and referring to specific timestamps.
First, a sample video is downloaded, then uploaded using the client.files.upload method, which is suitable for larger videos or for reusing the video across multiple requests.
Waiting for video to be processed. Waiting for video to be processed. Waiting for video to be processed. Waiting for video to be processed. Video processing complete: https://generativelanguage.googleapis.com/v1beta/files/zyg3jyh4pdm9
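The upload-and-wait step can be sketched as below. The helper `wait_until_active`, the 10-second polling interval, and the file path are assumptions; the state names follow the Files API (`PROCESSING`, `ACTIVE`, `FAILED`).

```python
import time


def wait_until_active(client, uploaded, poll_seconds=10, sleep=time.sleep):
    """Poll the Files API until the upload leaves the PROCESSING state."""
    while uploaded.state.name == "PROCESSING":
        print("Waiting for video to be processed.")
        sleep(poll_seconds)
        uploaded = client.files.get(name=uploaded.name)
    if uploaded.state.name == "FAILED":
        raise ValueError(f"Processing failed for {uploaded.name}")
    return uploaded


# Typical use (real client; "sample.mp4" is a placeholder path):
# video_file = wait_until_active(client, client.files.upload(file="sample.mp4"))
# print("Video processing complete:", video_file.uri)
```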
Do a semantic search in the video
You can prompt the model to analyze video content for specific information, such as organizing scenes, identifying objects, and estimating emotional states. The model processes both the visual frames and the audio track to provide a comprehensive understanding.
CPU times: user 91.1 ms, sys: 14.8 ms, total: 106 ms Wall time: 18 s
Analyze YouTube videos
The Gemini API can also process publicly available YouTube videos directly by providing their URL. This allows for tasks like summarizing video content or finding specific mentions within the video.
The model can find specific instances of words or phrases and provide timestamps and context, as demonstrated by searching for "AI" in Sundar Pichai's keynote from Google I/O 2023.
CPU times: user 251 ms, sys: 43.9 ms, total: 295 ms Wall time: 52.3 s
Analyze specific parts of videos using clipping intervals
For more focused analysis, you can specify video_metadata with start_offset and end_offset to define clipping intervals. This tells the model to analyze only a specific segment of the video.
The model will then summarize only the specified segment.
CPU times: user 282 ms, sys: 39.8 ms, total: 322 ms Wall time: 58.7 s
Customize the number of video frames per second (FPS) analyzed
By default, Gemini models sample videos at 1 frame per second (FPS) for analysis. You can customize this by passing an fps argument to video_metadata. A higher FPS can capture more details in rapidly changing visuals, while a lower FPS is useful for mostly static videos like lectures.
By adjusting the FPS, you can control the granularity of video analysis, which is particularly useful for tasks requiring close attention to visual details.
CPU times: user 90.1 ms, sys: 13.8 ms, total: 104 ms Wall time: 17.8 s
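Both the clipping interval and the FPS override live in the video_metadata attached to the video part. The sketch below builds such a part as a plain dict; `video_part` is a hypothetical helper, the offsets are written as seconds strings such as "60s", and the values in the usage comment are placeholders.

```python
def video_part(file_uri, start_s=None, end_s=None, fps=None):
    """Build a Part dict pointing at a video, optionally clipped or resampled."""
    metadata = {}
    if start_s is not None:
        metadata["start_offset"] = f"{start_s}s"
    if end_s is not None:
        metadata["end_offset"] = f"{end_s}s"
    if fps is not None:
        metadata["fps"] = fps
    part = {"file_data": {"file_uri": file_uri}}
    if metadata:  # omit the key entirely when no override was requested
        part["video_metadata"] = metadata
    return part


# Typical use (real client; youtube_url is a placeholder variable):
# contents = {"parts": [video_part(youtube_url, start_s=60, end_s=120),
#                       {"text": "Summarize this segment."}]}
# response = client.models.generate_content(model="gemini-2.5-flash", contents=contents)
```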
Working with Tools
The Gemini API enables models to interact with external systems and perform specialized tasks through the use of tools. These tools enhance the model's capabilities by allowing it to execute code, search the web, or process information from specific URLs.
Code execution
The code_execution tool allows the Gemini model to generate and execute Python code. This is particularly useful for tasks requiring precise calculations, data manipulation, or algorithmic problem-solving. The model can iteratively learn from the execution results to refine its output.
CPU times: user 50.5 ms, sys: 2.96 ms, total: 53.5 ms Wall time: 6.56 s
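When code execution is enabled, the response interleaves the generated code and its output with ordinary text parts. The sketch below pairs them up; `extract_code_runs` is a hypothetical helper that assumes parts expose `executable_code.code` and `code_execution_result.output`.

```python
def extract_code_runs(parts):
    """Return (code, output) pairs found in a code-execution response."""
    runs, pending_code = [], None
    for part in parts:
        if getattr(part, "executable_code", None) is not None:
            pending_code = part.executable_code.code
        elif getattr(part, "code_execution_result", None) is not None:
            # Pair the result with the most recent code part seen before it.
            runs.append((pending_code, part.code_execution_result.output))
            pending_code = None
    return runs


# Typical use (real client; the prompt is a placeholder):
# response = client.models.generate_content(
#     model="gemini-2.5-flash",
#     contents="What is the sum of the first 50 prime numbers?",
#     config={"tools": [{"code_execution": {}}]},
# )
# for code, output in extract_code_runs(response.candidates[0].content.parts):
#     print(code, "->", output)
```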
Multimodality with code execution
The code execution tool can be combined with multimodal inputs. This example demonstrates how the model can receive an image (related to the Monty Hall problem) and then generate and execute Python code to simulate the problem, providing a programmatic solution to a visually presented challenge.
The model is prompted to run a simulation, demonstrating its ability to reason about a problem and use code to solve it, even with a visual aid as part of the prompt.
CPU times: user 112 ms, sys: 17.7 ms, total: 130 ms Wall time: 14.9 s
Grounding information with Google Search
The google_search tool allows Gemini models to access up-to-date information beyond their training data by querying Google Search. This significantly improves the accuracy and recency of responses, especially for questions about current events or highly specific topics. When enabled, the Gemini API can also return grounding sources and search suggestions.
First, a prompt without Google Search grounding is sent. The model responds based on its internal knowledge, which might be outdated.
Next, the same prompt is sent with the google_search tool enabled. This allows the model to perform a live search to retrieve the most current information.
Search Query: ['latest Brazil vs Argentina football game score', 'Brazil vs Argentina last match date and score']
Search Pages: 365scores.com, 365scores.com, sofascore.com, thehindu.com, skysports.com
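The search queries and source pages shown above come from the grounding metadata attached to the response candidate. A minimal sketch of reading them, assuming the candidate exposes `grounding_metadata.web_search_queries` and `grounding_metadata.grounding_chunks[i].web.title`; the helper name `grounding_summary` is illustrative.

```python
def grounding_summary(candidate):
    """Return (search_queries, source_titles) from a grounded candidate."""
    metadata = getattr(candidate, "grounding_metadata", None)
    if metadata is None:  # the request was not grounded
        return [], []
    queries = list(metadata.web_search_queries or [])
    titles = [
        chunk.web.title
        for chunk in (metadata.grounding_chunks or [])
        if getattr(chunk, "web", None) is not None
    ]
    return queries, titles


# Typical use (real client; the prompt is a placeholder):
# response = client.models.generate_content(
#     model="gemini-2.5-flash",
#     contents="What was the score of the latest Brazil vs Argentina game?",
#     config={"tools": [{"google_search": {}}]},
# )
# queries, pages = grounding_summary(response.candidates[0])
# print("Search Query:", queries)
# print("Search Pages:", ", ".join(pages))
```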
Using Code Execution and Google Search grounding together
The power of Gemini models is further amplified when multiple tools are used in conjunction. Here, both code_execution and google_search tools are enabled. This allows the model to search for information (e.g., about movies) and then use code to process and visualize that data (e.g., generate a chart based on movie durations).
CPU times: user 156 ms, sys: 28.4 ms, total: 184 ms Wall time: 24.1 s
Grounding information with custom links (URL context)
The url_context tool allows you to provide specific URLs as additional context for your prompt. The model can then retrieve content from these URLs and use it to inform its response, enabling tasks like summarizing documents, comparing information across multiple links, or analyzing content for specific purposes.
First, a prompt is sent without the url_context tool. The model provides a general answer based on its training data.
Next, the same prompt is sent, but this time with a specific URL enabled via url_context. This ensures the model's response is grounded in the information provided on that particular web page, leading to a more precise and contextually relevant answer.
This final example demonstrates comparing information from multiple provided URLs, showcasing the url_context tool's ability to synthesize data from several sources.
Using the Gemini models' thinking capability
The Gemini 2.5 series models incorporate an internal "thinking process" that significantly enhances their reasoning and multi-step planning abilities. This makes them highly effective for complex problems, allowing them to break down tasks and arrive at more accurate conclusions.
Select the thinking model you want to use
The gemini-2.5-flash model supports the thinking capability, which is enabled by default for 2.5 series models. For thinking experiments, you can also use the more capable gemini-2.5-pro model.
Note: While gemini-2.5-pro is a thinking capable model too, the thinking_budget parameter is only available for the gemini-2.5-flash for now.
Starting with adaptive thinking
When using a Gemini 2.5 series model, thinking is enabled by default. The model dynamically adjusts its internal reasoning budget based on the complexity of the query, allowing it to solve problems that require multiple steps of thought.
CPU times: user 88.2 ms, sys: 7.7 ms, total: 95.9 ms Wall time: 13 s
The usage_metadata provides insights into the token counts for the prompt, internal thoughts, and the final output, helping to understand the computational effort involved in the model's reasoning process.
Prompt tokens: 59 Thoughts tokens: 1696 Output tokens: 766 Total tokens: 2521
Disabling thinking using the thinking_budget parameter
While thinking is powerful, for straightforward tasks where complex reasoning isn't required, you can disable it by setting thinking_budget to 0. This can potentially reduce latency and cost for simpler queries.
CPU times: user 45.7 ms, sys: 4.32 ms, total: 50 ms Wall time: 7.94 s
Observing the token counts again shows that thoughts_token_count is now None (no thinking tokens were used), indicating that the thinking process was disabled.
Prompt tokens: 59 Thoughts tokens: None Output tokens: 1369 Total tokens: 1428
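A sketch of reading these counts from usage_metadata. The helper `token_report` is a hypothetical name; it treats a missing count (reported as None, as in the output above) as zero so the numbers can be compared across runs.

```python
def token_report(usage):
    """Summarize usage_metadata, treating missing counts (None) as zero."""
    return {
        "prompt": usage.prompt_token_count or 0,
        "thoughts": usage.thoughts_token_count or 0,
        "output": usage.candidates_token_count or 0,
        "total": usage.total_token_count or 0,
    }


# Typical use (real client; `prompt` is a placeholder variable):
# response = client.models.generate_content(
#     model="gemini-2.5-flash",
#     contents=prompt,
#     config={"thinking_config": {"thinking_budget": 0}},  # 0 disables thinking
# )
# print(token_report(response.usage_metadata))
```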
Multimodal interactions with thinking
Gemini's thinking capabilities extend to multimodal inputs. This example demonstrates providing an image along with a complex problem (a riddle about pool balls). The model uses its reasoning abilities, potentially with a higher thinking_budget, to solve the problem presented visually.
The model receives the image and the prompt, then uses its thinking process to attempt to solve the riddle.
Working with thinking process summaries
For complex tasks, understanding the model's internal reasoning process can be crucial for debugging or verifying its approach. The include_thoughts parameter allows you to retrieve thought summaries, which provide insights into the steps the model took to arrive at its answer.
CPU times: user 308 ms, sys: 37.1 ms, total: 345 ms Wall time: 1min
The output will separate the model's internal thoughts (reasoning steps) from its final answer.
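Separating the two can be sketched as follows. The helper `split_thoughts` is illustrative; it assumes that thought-summary parts are flagged with a truthy `thought` attribute while answer parts are not.

```python
def split_thoughts(parts):
    """Return (thought_summary, final_answer) as two strings."""
    thoughts, answer = [], []
    for part in parts:
        if not getattr(part, "text", None):
            continue  # skip non-text parts
        target = thoughts if getattr(part, "thought", False) else answer
        target.append(part.text)
    return "".join(thoughts), "".join(answer)


# Typical use (real client; `prompt` is a placeholder variable):
# response = client.models.generate_content(
#     model="gemini-2.5-flash",
#     contents=prompt,
#     config={"thinking_config": {"include_thoughts": True}},
# )
# summary, answer = split_thoughts(response.candidates[0].content.parts)
```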
Building an agentic solution
This section introduces the concept of building agentic solutions using the Google Agent Development Kit (ADK). An agent is a system that can reason, plan, and use tools to achieve a goal. Here, we define a multi-agent system consisting of a coordinator and specialized sub-agents for academic research.
First, the google-adk library is installed.
Importing required modules
Then, using prompts, the agents are defined with specific roles (System Role), instructions (Instruction), and the tools they can use.
And finally you instantiate your agents using the ADK library. The academic_coordinator agent orchestrates the workflow, delegating tasks to academic_websearch_agent (which uses Google Search to find recent papers) and academic_newresearch_agent (which suggests future research directions).
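The wiring of the three agents might look like the sketch below. It is an assumption-laden outline rather than the session's code: the instruction strings are placeholders, the sub-agents are assumed to be exposed to the coordinator as agent tools, and the ADK classes are passed in as arguments (the function name `build_research_agents` is hypothetical) so the structure can be inspected without the library installed.

```python
def build_research_agents(agent_cls, agent_tool_cls, search_tool,
                          model="gemini-2.5-flash"):
    """Create the two sub-agents and return the coordinator that delegates to them."""
    websearch = agent_cls(
        name="academic_websearch_agent",
        model=model,
        instruction="Find recent papers on the requested topic.",  # placeholder
        tools=[search_tool],
    )
    newresearch = agent_cls(
        name="academic_newresearch_agent",
        model=model,
        instruction="Suggest future research directions.",  # placeholder
    )
    return agent_cls(
        name="academic_coordinator",
        model=model,
        instruction="Coordinate the research workflow.",  # placeholder
        tools=[agent_tool_cls(agent=websearch), agent_tool_cls(agent=newresearch)],
    )


# Typical use with the real library (requires `pip install google-adk`):
# from google.adk.agents import Agent
# from google.adk.tools import google_search
# from google.adk.tools.agent_tool import AgentTool
# academic_coordinator = build_research_agents(Agent, AgentTool, google_search)
```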
To test and validate your agentic solution, you create a local Colab session for it.
Session created: App='academic_research_app', User='user_1', Session='session_001' Runner created for agent 'academic_coordinator'.
You also define a function to interact with your agent directly in Colab. (This is not necessary if you deploy your solution elsewhere, for example on a Kubernetes cluster or on Google Cloud Run.)
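One possible shape for such a function is sketched below. It assumes the runner yields events that expose `is_final_response()` and a `content.parts` list, and it skips parts without text (such as function calls). The helper name `final_response_text` is hypothetical.

```python
def final_response_text(events):
    """Return the text of the first final response that has text, else None."""
    for event in events:
        if event.is_final_response() and event.content and event.content.parts:
            texts = [p.text for p in event.content.parts if getattr(p, "text", None)]
            if texts:
                return "".join(texts)
    return None


# Typical use (real ADK runner and session, created beforehand):
# from google.genai import types
# message = types.Content(role="user", parts=[types.Part(text="what do you do?")])
# events = runner.run(user_id="user_1", session_id="session_001", new_message=message)
# print("<<< Agent Response:", final_response_text(events))
```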
Finally, you can create a simple loop to interact with your agentic solution in a conversational way.
Hi! how can I help you Today? what do you do?
>>> User Query: what do you do?
<<< Agent Response: None
Anything else? I want to know about the latest large language models related papers from the last 2 months
>>> User Query: I want to know about the latest large language models related papers from the last 2 months
WARNING:google_genai.types:Warning: there are non-text parts in the response: ['function_call'], returning concatenated text result from text parts. Check the full candidates.content.parts accessor to get the full model response.
<<< Agent Response: None
Anything else? I want to know more about the "The Clinicians' Guide to Large Language Models: A General Perspective With a Focus on Hallucinations" paper
>>> User Query: I want to know more about the "The Clinicians' Guide to Large Language Models: A General Perspective With a Focus on Hallucinations" paper
<<< Agent Response: None
Anything else? yes, I want you to search more about that
>>> User Query: yes, I want you to search more about that
WARNING:google_genai.types:Warning: there are non-text parts in the response: ['function_call'], returning concatenated text result from text parts. Check the full candidates.content.parts accessor to get the full model response.
<<< Agent Response: None
Anything else? bye
<EOF>