Multi-agent RAG System 🤖🤝🤖
Authored by: Sergio Paniego
🚨 NOTE: This tutorial is advanced. You should have a solid understanding of the concepts discussed in the following cookbooks before diving in:
In this notebook, we will create a multi-agent RAG system: multiple agents working together to retrieve and generate information, combining the strengths of retrieval-based systems and generative models.
What is a Multi-agent RAG System? 🤔
A Multi-agent Retrieval-Augmented Generation (RAG) system consists of multiple agents that collaborate to perform complex tasks. The retrieval agent retrieves relevant documents or information, while the generative agent synthesizes that information to generate meaningful outputs. There is a Manager Agent that orchestrates the system and selects the most appropriate agent for the task based on the user input.
The original idea for this recipe comes from this post. You may find more details about it there.
Below, you can find the architecture that we will build.
1. Install dependencies
First, let's install the dependencies:
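The exact package set can vary with your environment; a minimal install covering the agents framework, the retrievers, and the web tools used below might look like this:

```python
!pip install -q "transformers[agents]" datasets langchain langchain-community sentence-transformers faiss-cpu duckduckgo-search markdownify
```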
Let's log in so we can call the HF Inference API:
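For example, with notebook_login from huggingface_hub:

```python
from huggingface_hub import notebook_login

notebook_login()
```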
2. Let's create our multi-agent RAG system
In this section, we will create each of the agents present in our RAG system.
We will have 3 agents managed by a central one (refer to the image for details):
- 🕵💬 Web search agent: It will include the DuckDuckGoSearchTool tool and the VisitWebpageTool. As you can see, each agent may contain a list of tools.
- 🕵💬 Retriever agent: It will include two tools for retrieving information from two different knowledge bases.
- 🕵💬 Image generation agent: It will include a prompt generator tool in addition to the image generation tool.
💡 In addition to these agents, the central/orchestrator agent will also have access to the code interpreter tool to execute code.
We will use Qwen/Qwen2.5-72B-Instruct, accessed via the Inference API, as the LLM for every component, although a different model could be assigned to each agent.
Note: The Inference API hosts models based on various criteria, and deployed models may be updated or replaced without prior notice. Learn more about it here.
Let's dive into the details of each agent!
2.1 Web search agent 🔍
The Web search agent will utilize the DuckDuckGoSearchTool to search the web and gather relevant information. This tool acts as a search engine, querying for results based on the specified keywords.
To make the search results actionable, we also need the agent to access the web pages retrieved by DuckDuckGo. That can be achieved by using the built-in VisitWebpageTool.
Let’s explore how to set it up and integrate it into our system!
The following code comes from the original Have several agents collaborate in a multi-agent hierarchy 🤖🤝🤖 recipe, so refer to it for more details.
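For reference, here is a sketch of the webpage-visiting tool in the style of that recipe, built with the tool decorator plus requests and markdownify (both assumed to be installed):

```python
import re

import requests
from markdownify import markdownify as md
from requests.exceptions import RequestException
from transformers.agents import tool


@tool
def visit_webpage(url: str) -> str:
    """Visits a webpage at the given URL and returns its content as a markdown string.

    Args:
        url: The URL of the webpage to visit.
    """
    try:
        response = requests.get(url)
        response.raise_for_status()  # Raise an exception for bad status codes
        # Convert the HTML content to markdown and collapse runs of blank lines
        markdown_content = md(response.text).strip()
        markdown_content = re.sub(r"\n{3,}", "\n\n", markdown_content)
        return markdown_content
    except RequestException as e:
        return f"Error fetching the webpage: {str(e)}"
```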
2.1.1 Build our multi-tool web agent 🤖
Now that we've set up the basic search and webpage tools, let's build our multi-tool web agent. This agent will combine several tools to perform more complex tasks, leveraging the capabilities of the ToolCallingAgent.
The ToolCallingAgent is particularly well-suited for web search tasks because its JSON action formulation requires only simple arguments and works seamlessly in sequential chains of single actions. This makes it an excellent choice for scenarios where we need to search the web for relevant information and retrieve detailed content from specific web pages. In contrast, CodeAgent action formulation is better suited for scenarios involving numerous or parallel tool calls.
By integrating multiple tools, we can ensure that our agent interacts with the web in a sophisticated and efficient manner.
Let's dive into how to set this up and integrate it into our system!
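A sketch of the web agent, assuming the transformers.agents API that appears in the run logs below, where HfApiEngine provides the Inference API-backed LLM engine and ReactJsonAgent is the JSON tool-calling agent:

```python
from transformers.agents import DuckDuckGoSearchTool, HfApiEngine, ReactJsonAgent

# Shared LLM engine backed by the Inference API
llm_engine = HfApiEngine(model="Qwen/Qwen2.5-72B-Instruct")

# The web agent combines the search tool and the webpage-visiting tool
web_agent = ReactJsonAgent(
    tools=[DuckDuckGoSearchTool(), visit_webpage],
    llm_engine=llm_engine,
    max_iterations=10,
)
```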
Now that we have our first agent, let's wrap it as a ManagedAgent so the central agent can use it.
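A minimal wrapper, with the name 'search' matching the traces later in this notebook (the description wording is illustrative):

```python
from transformers.agents import ManagedAgent

managed_web_agent = ManagedAgent(
    agent=web_agent,
    name="search",
    description="Runs web searches for you. Give it your query as an argument.",
)
```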
2.2 Retriever agent 🤖🔍
The second agent in our multi-agent system is the Retriever agent. This agent is responsible for gathering relevant information from different sources. To achieve this, it will utilize two tools that retrieve data from two separate knowledge bases.
We will reuse two data sources that were previously used in other RAG recipes, which will allow the retriever to efficiently gather information for further processing.
By leveraging these tools, the Retriever agent can access diverse datasets, ensuring a comprehensive collection of relevant information before passing it on to the next step in the system.
Let's explore how to set up the retriever and integrate it into our multi-agent system!
2.2.1 HF docs retriever tool 📚
The first retriever tool comes from the Agentic RAG: turbocharge your RAG with query reformulation and self-query! 🚀 recipe.
For this retriever, we will use a dataset that contains a compilation of documentation pages for various Hugging Face packages, all stored as markdown files. This dataset serves as the knowledge base for the retriever agent to search and retrieve relevant documentation.
To make this dataset easily accessible for our agent, we will:
- Download the dataset: We will first fetch the markdown documentation.
- Embed the data: We will then convert the documentation into embeddings and store them in a FAISS vector store for efficient similarity search.
By doing this, the retriever tool can quickly access the relevant pieces of documentation based on the search query, enabling the agent to provide accurate and detailed information.
Let’s go ahead and set up the tool to handle the documentation retrieval!
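A sketch of both steps, assuming the same dataset (m-ric/huggingface_doc) and embedding model (thenlper/gte-small) as the agentic RAG recipe:

```python
import datasets
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# 1. Download the dataset of markdown documentation pages
knowledge_base = datasets.load_dataset("m-ric/huggingface_doc", split="train")
source_docs = [
    Document(page_content=doc["text"], metadata={"source": doc["source"].split("/")[1]})
    for doc in knowledge_base
]

# 2. Chunk the documents and embed them into a FAISS vector store
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs_processed = text_splitter.split_documents(source_docs)

embedding_model = HuggingFaceEmbeddings(model_name="thenlper/gte-small")
huggingface_doc_vector_db = FAISS.from_documents(docs_processed, embedding_model)
```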
Now that we have the documentation embedded in FAISS, let's create the RetrieverTool. This tool will query the FAISS vector store to retrieve the most relevant documents based on the user’s query.
This will allow the retriever agent to access and provide relevant documentation when queried.
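A sketch following the RetrieverTool pattern from the agentic RAG recipe (the exact inputs/output_type strings depend on your transformers version, and k=7 is an illustrative choice):

```python
from langchain_core.vectorstores import VectorStore
from transformers.agents import Tool


class RetrieverTool(Tool):
    name = "retriever"
    description = "Retrieves documents from the knowledge base that are most similar to the input query."
    inputs = {
        "query": {
            "type": "string",
            "description": "The query to search for. Use affirmative phrasing rather than a question.",
        }
    }
    output_type = "string"

    def __init__(self, vectordb: VectorStore, **kwargs):
        super().__init__(**kwargs)
        self.vectordb = vectordb

    def forward(self, query: str) -> str:
        assert isinstance(query, str), "Your search query must be a string"
        # Return the top-k most similar chunks from the FAISS store
        docs = self.vectordb.similarity_search(query, k=7)
        return "\nRetrieved documents:\n" + "".join(
            f"===== Document {i} =====\n{doc.page_content}" for i, doc in enumerate(docs)
        )


hf_docs_retriever_tool = RetrieverTool(huggingface_doc_vector_db)
```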
2.2.2 PEFT issues retriever tool
For the second retriever, we will use the PEFT GitHub issues as the data source, as in the Simple RAG for GitHub issues using Hugging Face Zephyr and LangChain recipe.
Again, the following code comes from that recipe so refer to it for more details!
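In outline, that recipe loads the huggingface/peft issues with LangChain's GitHubIssuesLoader (which needs a GitHub personal access token), chunks them, and embeds them into a second FAISS store; here we reuse the same embedding model for simplicity:

```python
from getpass import getpass

from langchain_community.document_loaders import GitHubIssuesLoader

# A GitHub personal access token is needed to read the issues
ACCESS_TOKEN = getpass("GitHub personal access token: ")

loader = GitHubIssuesLoader(
    repo="huggingface/peft",
    access_token=ACCESS_TOKEN,
    include_prs=False,  # Only issues, not pull requests
    state="all",        # Both open and closed issues
)
issues_docs = loader.load()

issues_splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=30)
chunked_issues = issues_splitter.split_documents(issues_docs)

peft_issues_vector_db = FAISS.from_documents(chunked_issues, embedding_model)
```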
Let's now generate the second retriever tool using the same RetrieverTool.
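Assuming the peft_issues_vector_db built above, instantiating the second tool is short; since both tools will live in the same agent, we give this instance a distinct name:

```python
peft_issues_retriever_tool = RetrieverTool(peft_issues_vector_db)
# Rename the second instance so the two tools can coexist in one agent
peft_issues_retriever_tool.name = "peft_issues_retriever"
```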
2.2.3 Build the Retriever agent
Now that we’ve created the two retriever tools, it’s time to build the Retriever agent. This agent will manage both tools and retrieve relevant information based on the user query.
We’ll use the ManagedAgent to integrate these tools and pass the agent to the central agent for coordination.
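A sketch, reusing the shared llm_engine and wrapping the agent under the name 'retriever' seen in the traces (the description is illustrative):

```python
retriever_agent = ReactJsonAgent(
    tools=[hf_docs_retriever_tool, peft_issues_retriever_tool],
    llm_engine=llm_engine,
    max_iterations=4,
)

managed_retriever_agent = ManagedAgent(
    agent=retriever_agent,
    name="retriever",
    description="Retrieves documents from the Hugging Face docs and PEFT GitHub issues knowledge bases. Give it your query as an argument.",
)
```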
2.3 Image generation agent 🎨
The third agent in our system is the Image generation agent. This agent will have two tools: one for refining the user query and another for generating the image based on the query. In this case, we will use the CodeAgent instead of a ReactAgent since the set of actions can be executed in one shot.
You can find more details about the image generation agent in the Agents, supercharged - Multi-agents, External tools, and more documentation.
Let’s dive into how these tools will work together to generate images based on user input!
🖼 Again, we wrap the agent in a ManagedAgent so the central agent can manage it. Additionally, we include an additional_prompting parameter to ensure the agent returns the generated image instead of just a text description.
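A sketch of the whole sub-system: a hypothetical prompt_generator tool that asks the shared LLM engine to enrich the prompt, the m-ric/text-to-image tool from the Hub (as used in the linked documentation), a one-shot CodeAgent combining them, and the ManagedAgent wrapper whose additional_prompting matches the instruction visible in the traces:

```python
from transformers import load_tool
from transformers.agents import CodeAgent, tool


@tool
def prompt_generator(prompt: str) -> str:
    """Improves an image-generation prompt by adding descriptive details.

    Args:
        prompt: The initial image prompt to improve.
    """
    # Hypothetical helper: delegate the rewriting to the shared LLM engine
    return llm_engine([
        {
            "role": "user",
            "content": f"Improve this image generation prompt, returning only the improved prompt: {prompt}",
        }
    ])


# Text-to-image tool from the Hub
image_generation_tool = load_tool("m-ric/text-to-image")

# One-shot CodeAgent: refine the prompt, then generate the image
image_generation_agent = CodeAgent(
    tools=[prompt_generator, image_generation_tool],
    llm_engine=llm_engine,
)

managed_image_generation_agent = ManagedAgent(
    agent=image_generation_agent,
    name="image_generation",
    description="Generates an image from a text prompt. Give it your prompt as an argument.",
    additional_prompting="\n\nYour final answer MUST BE only the generated image location.",
)
```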
3. Let's add the general agent manager to orchestrate the system
The central agent manager will coordinate tasks between the agents. It will:
- Receive user input and decide which agent (Web search, Retriever, Image generation) handles it.
- Delegate tasks to the appropriate agent based on the user's query.
- Collect and synthesize results from the agents.
- Return the final output to the user.
We include all the agents we’ve developed as managed_agents and add any necessary imports for the code executor under additional_authorized_imports.
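A sketch of the manager, assuming ReactCodeAgent as the code-executing orchestrator; 'datetime' is clearly needed (the Stripe example below imports it), while the rest of the import list is illustrative:

```python
from transformers.agents import ReactCodeAgent

manager_agent = ReactCodeAgent(
    tools=[],  # The manager itself only needs the built-in code interpreter
    llm_engine=llm_engine,
    managed_agents=[
        managed_web_agent,
        managed_retriever_agent,
        managed_image_generation_agent,
    ],
    additional_authorized_imports=["time", "datetime", "PIL"],
)
```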
Now that everything is set up, let's test the performance of the multi-agent RAG system!
To do so, we'll provide some example queries and observe how the system delegates tasks between the agents, processes the information, and returns the final results.
This will help us understand the efficiency and effectiveness of our agents working together, and identify areas for optimization if necessary.
Let's go ahead and run some tests!
3.1 Example trying to trigger the search agent
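The query below should route to the web search agent; the trace that follows is the resulting run output:

```python
manager_agent.run("How many years ago was Stripe founded?")
```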
======== New task ======== How many years ago was Stripe founded? === Agent thoughts: Thought: To solve this problem, I need to find out the year Stripe was founded and then calculate the difference between the current year and the year of founding. >>> Agent is executing the code below: founding_year = search("When was Stripe founded?") print(founding_year) ==== ======== New task ======== You're a helpful agent named 'search'. You have been submitted this task by your manager. --- Task: When was Stripe founded? --- You're helping your manager solve a wider task: so make sure to not provide a one-line answer, but give as much information as possible so that they have a clear understanding of the answer. Your final_answer WILL HAVE to contain these parts: ### 1. Task outcome (short version): ### 2. Task outcome (extremely detailed version): ### 3. Additional context (if relevant): Put all these in your final_answer tool, everything that you do not pass as an argument to final_answer will be lost. And even if your task resolution is not successful, please return as much context as possible, so that your manager can act upon this feedback. === Agent thoughts: Thought: I will start by performing a web search to find when Stripe was founded and gather detailed information about the company's founding. >>> Calling tool: 'web_search' with arguments: {'query': 'When was Stripe founded'} === Agent thoughts: Thought: I found several sources indicating when Stripe was founded and additional information about the company’s founding. I’ll gather the relevant details to provide a comprehensive answer. >>> Calling tool: 'final_answer' with arguments: {'answer': "### 1. Task outcome (short version):\nStripe was founded in 2010 by brothers John and Patrick Collison.\n\n### 2. Task outcome (extremely detailed version):\nStripe, Inc. is an Irish-American multinational financial services and software as a service (SaaS) company. It was founded in 2010 by Irish entrepreneur brothers John and Patrick Collison in Palo Alto, California. The Collison brothers started working on Stripe while they were still in high school and college, aiming to create a payment platform that would give developers the tools they needed to create secure and novel buying experiences. Patrick Collison serves as the company's CEO, while John Collison is the president.\n\n### 3. Additional context (if relevant):\nThe Collison brothers moved to the USA from a rural village in Ireland at a young age to pursue their entrepreneurial dreams. They identified a gap in the market for a payment platform that would make it easy for small businesses to accept payments from anywhere in the world. Stripe debuted in 2010 and grew exponentially due to its user-friendly front-end and robust back-end infrastructure. Today, more than $1 trillion in payments pass through Stripe's software on behalf of customers, and the company has achieved a valuation of nearly $100 billion."} Print outputs: ### 1. Task outcome (short version): Stripe was founded in 2010 by brothers John and Patrick Collison. ### 2. Task outcome (extremely detailed version): Stripe, Inc. is an Irish-American multinational financial services and software as a service (SaaS) company. It was founded in 2010 by Irish entrepreneur brothers John and Patrick Collison in Palo Alto, California. 
The Collison brothers started working on Stripe while they were still in high school and college, aiming to create a payment platform that would give developers the tools they needed to create secure and novel buying experiences. Patrick Collison serves as the company's CEO, while John Collison is the president. ### 3. Additional context (if relevant): The Collison brothers moved to the USA from a rural village in Ireland at a young age to pursue their entrepreneurial dreams. They identified a gap in the market for a payment platform that would make it easy for small businesses to accept payments from anywhere in the world. Stripe debuted in 2010 and grew exponentially due to its user-friendly front-end and robust back-end infrastructure. Today, more than $1 trillion in payments pass through Stripe's software on behalf of customers, and the company has achieved a valuation of nearly $100 billion. === Agent thoughts: Thought: I can see that Stripe was founded in 2010. Now, I will calculate the difference between the current year and the founding year to find out how many years ago Stripe was founded. >>> Agent is executing the code below: import datetime founding_year = 2010 # From the search result current_year = datetime.datetime.now().year years_since_founded = current_year - founding_year final_answer(years_since_founded) ==== Print outputs: Last output from code snippet: 14 Final answer: 14
14
3.2 Example trying to trigger the image generator agent
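Extra keyword arguments passed to run are forwarded to the agent as initial arguments, matching the trace below:

```python
manager_agent.run(
    "Improve this prompt, then generate an image of it.",
    prompt="A rabbit wearing a space suit",
)
```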
======== New task ======== Improve this prompt, then generate an image of it. You have been provided with these initial arguments: {'prompt': 'A rabbit wearing a space suit'}. === Agent thoughts: Thought: I will first improve the prompt to make it more detailed and then use the `image_generation` tool to generate an image based on the improved prompt. I will store the improved prompt in a variable and print it for the next step. >>> Agent is executing the code below: improved_prompt = "A rabbit wearing a space suit, jumping in a zero-gravity environment, surrounded by stars and planets." print(improved_prompt) ==== Print outputs: A rabbit wearing a space suit, jumping in a zero-gravity environment, surrounded by stars and planets. === Agent thoughts: Thought: Now that I have the improved prompt, I will use the `image_generation` tool to generate the image. >>> Agent is executing the code below: image_generation(improved_prompt) ==== ======== New task ======== You're a helpful agent named 'image_generation'. You have been submitted this task by your manager. --- Task: A rabbit wearing a space suit, jumping in a zero-gravity environment, surrounded by stars and planets. --- You're helping your manager solve a wider task: so make sure to not provide a one-line answer, but give as much information as possible so that they have a clear understanding of the answer. Your final_answer WILL HAVE to contain these parts: ### 1. Task outcome (short version): ### 2. Task outcome (extremely detailed version): ### 3. Additional context (if relevant): Put all these in your final_answer tool, everything that you do not pass as an argument to final_answer will be lost. And even if your task resolution is not successful, please return as much context as possible, so that your manager can act upon this feedback. Your final answer MUST BE only the generated image location. === Agent thoughts: Thought: I will use the `image_generator` tool to generate an image based on the description provided in the task. I will include additional details to make the image more visually appealing and descriptive. >>> Agent is executing the code below: prompt = "A rabbit wearing a space suit, jumping in a zero-gravity environment, surrounded by stars and planets, high-res, photorealistic, vibrant colors, detailed textures, and a sense of movement." image = image_generator(prompt=prompt) final_answer(image) ==== Print outputs: Last output from code snippet: /tmp/tmpvvp0x99l/f589dffc-4741-4589-8d45-ce1ef3f126c1.png === Agent thoughts: Thought: The image has been generated successfully. I will use the `final_answer` tool to provide the final answer to the task. >>> Agent is executing the code below: final_answer('/tmp/tmpvvp0x99l/f589dffc-4741-4589-8d45-ce1ef3f126c1.png') ==== Print outputs: Last output from code snippet: /tmp/tmpvvp0x99l/f589dffc-4741-4589-8d45-ce1ef3f126c1.png Final answer: /tmp/tmpvvp0x99l/f589dffc-4741-4589-8d45-ce1ef3f126c1.png
3.3 Example trying to trigger the retriever agent for the HF docs knowledge base
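A docs-style question that should route to the HF docs retriever tool:

```python
manager_agent.run("How can I push a model to the Hub?")
```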
======== New task ======== How can I push a model to the Hub? === Agent thoughts: Thought: To provide instructions on how to push a model to the Hugging Face Hub, it would be most effective to retrieve official documentation or a guide that gives detailed steps. I'll use the `retriever` tool to find relevant information in the Hugging Face documentation or the PEFT issues section. >>> Agent is executing the code below: retriever(query="How to push a model to the Hugging Face Hub") ==== Code execution failed due to the following error: ManagedAgent.__call__() missing 1 required positional argument: 'request' Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/transformers/agents/agents.py", line 1135, in step result = self.python_evaluator( File "/usr/local/lib/python3.10/dist-packages/transformers/agents/python_interpreter.py", line 902, in evaluate_python_code result = evaluate_ast(node, state, static_tools, custom_tools, authorized_imports) File "/usr/local/lib/python3.10/dist-packages/transformers/agents/python_interpreter.py", line 782, in evaluate_ast return evaluate_ast(expression.value, state, static_tools, custom_tools) File "/usr/local/lib/python3.10/dist-packages/transformers/agents/python_interpreter.py", line 746, in evaluate_ast return evaluate_call(expression, state, static_tools, custom_tools) File "/usr/local/lib/python3.10/dist-packages/transformers/agents/python_interpreter.py", line 439, in evaluate_call output = func(*args, **kwargs) TypeError: ManagedAgent.__call__() missing 1 required positional argument: 'request' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/transformers/agents/agents.py", line 811, in direct_run step_logs = self.step() File "/usr/local/lib/python3.10/dist-packages/transformers/agents/agents.py", line 1154, in step raise AgentExecutionError(error_msg) transformers.agents.agents.AgentExecutionError: Code execution failed due to the following error: ManagedAgent.__call__() missing 1 required positional argument: 'request' === Agent thoughts: Thought: It appears that I encountered an error because I did not include the request parameter when calling the `retriever` tool. To correct this, I will provide a detailed request as an argument to the `retriever` tool. I will request the retrieval of documents or guides that explain the process of pushing a model to the Hugging Face Hub. >>> Agent is executing the code below: retriever(request="Please retrieve documentation or a guide on how to push a model to the Hugging Face Hub.") ==== ======== New task ======== You're a helpful agent named 'retriever'. You have been submitted this task by your manager. --- Task: Please retrieve documentation or a guide on how to push a model to the Hugging Face Hub. --- You're helping your manager solve a wider task: so make sure to not provide a one-line answer, but give as much information as possible so that they have a clear understanding of the answer. Your final_answer WILL HAVE to contain these parts: ### 1. Task outcome (short version): ### 2. Task outcome (extremely detailed version): ### 3. Additional context (if relevant): Put all these in your final_answer tool, everything that you do not pass as an argument to final_answer will be lost. And even if your task resolution is not successful, please return as much context as possible, so that your manager can act upon this feedback. 
=== Agent thoughts: Thought: To complete this task, I need to retrieve relevant documentation or guides on how to push a model to the Hugging Face Hub. I'll use the `retriever` tool to find the most relevant documents. >>> Calling tool: 'retriever' with arguments: {'query': 'How to push a model to the Hugging Face Hub'} === Agent thoughts: Thought: The retrieved documents provide useful information about pushing a model to the Hugging Face Hub. I will summarize the information and provide a detailed answer. >>> Calling tool: 'final_answer' with arguments: {'answer': "### 1. Task outcome (short version):\n\nThe process of pushing a model to the Hugging Face Hub involves several steps, including preparing your model, installing necessary dependencies, and using the `huggingface_hub` library to upload the model.\n\n### 2. Task outcome (extremely detailed version):\n\n1. **Prepare Your Model**: Ensure your model is trained and saved in a format compatible with the Hugging Face Transformers library. This typically involves saving the model and tokenizer using the `save_pretrained` method.\n\n2. **Install Dependencies**: Install the `huggingface_hub` library if you haven't already. You can do this using pip:\n\n bash\n pip install huggingface_hub\n \n\n3. **Login to Hugging Face**: You need to be logged in to your Hugging Face account. Use the `huggingface-cli` to log in:\n\n bash\n huggingface-cli login\n \n\n This will prompt you to enter your Hugging Face API token, which you can find in your account settings on the Hugging Face website.\n\n4. **Create a Repository**: Create a new repository on the Hugging Face Hub using the `create_repo` command. Replace `<model_name>` with the name you want for your model:\n\n bash\n huggingface-cli repo create <model_name> --organization <your_organization> --private\n \n\n If you are creating a personal model, you can omit the `--organization` flag.\n\n5. **Upload the Model**: Use the `push_to_hub` method to upload your model to the Hugging Face Hub. This method is available in the `transformers` library and can be used as follows:\n\n python\n from transformers import AutoModel, AutoTokenizer\n\n model = AutoModel.from_pretrained('path/to/your/model')\n tokenizer = AutoTokenizer.from_pretrained('path/to/your/tokenizer')\n\n model.push_to_hub('<model_name>')\n tokenizer.push_to_hub('<model_name>')\n \n\n6. **Verify the Upload**: Once the model is uploaded, you can verify it by navigating to your Hugging Face profile or the specified organization's page and checking if the model is listed.\n\n### 3. Additional context (if relevant):\n\n- **Error Handling**: If you encounter errors during the upload process, ensure that your API token is correct and that you have the necessary permissions. Additionally, check that your model and tokenizer are saved correctly.\n- **Private vs. Public Models**: You can choose to make your model private by using the `--private` flag when creating the repository. Private models are only accessible to you and any collaborators you specify.\n- **Using Docker**: If you are working in a Docker environment, you may need to mount your model directory and set environment variables for the Hugging Face API token. An example of this is provided in one of the retrieved documents.\n- **Fine-tuning and Adapters**: If you are working with fine-tuned models or adapters, ensure that you merge the adapter with the base model before pushing it to the Hub. 
This can be done using the `PeftModel` class from the `peft` library.\n- **Dependencies and Requirements**: Ensure that your model's dependencies are included in a `requirements.txt` file. This is especially important if you are using custom layers or components in your model."} Print outputs: Last output from code snippet: ### 1. Task outcome (short version): The process of pushing a model to the Hugging Face Hub involves several steps, including preparing your model, installing necessary dependencies, and using the `huggingface_hub` library to upload the model. ### 2. Task outcome (extremely detailed version): 1. **Prepare Your Model**: Ensure your model is trained and saved in a format compatible with the Hugging Face Transformers library. This typically involves saving the model and tokenizer using the `save_pretrained` method. 2. **Install Dependencies**: Install the `huggingface_hub` library if you haven't already. You can do this using pip: bash pip install huggingface_hub 3. **Login to Hugging Face**: You need to be logged in to your Hugging Face account. Use the `huggingface-cli` to log in: bash huggingface-cli login This will prompt you to enter your Hugging Face API token, which you can find in your account settings on the Hugging Face website. 4. **Create a Repository**: Create a new repository on the Hugging Face Hub using the `create_repo` command. Replace `<model_name>` with the name you want for your model: bash huggingface-cli repo create <model_name> --organization <your_organization> --private If you are creating a personal model, you can omit the `--organization` flag. 5. **Upload the Model**: Use the `push_to_hub` method to upload your model to the Hugging Face Hub. This method is available in the `transformers` library and can be used as follows: python from transformers import AutoModel, AutoTokenizer model = AutoModel.from_pretrained('path/to/your/model') tokenizer = AutoTokenizer.from_pretrained('path/to/your/tokenizer') model.push_to_hub('<model_name>') tokenizer.push_to_hub('<model_name>') 6. **Verify the Upload**: Once the model is uploaded, you can verify it by navigating to your Hugging Face profile or the specified organization's page and checking if the model is listed. ### 3. Additional context (if relevant): - **Error Handling**: If you encounter errors during the upload process, ensure that your API token is correct and that you have the necessary permissions. Additionally, check that your model and tokenizer are saved correctly. - **Private vs. Public Models**: You can choose to make your model private by using the `--private` flag when creating the repository. Private models are only accessible to you and any collaborators you specify. - **Using Docker**: If you are working in a Docker environment, you may need to mount your model directory and set environment variables for the Hugging Face API token. An example of this is provided in one of the retrieved documents. - **Fine-tuning and Adapters**: If you are working with fine-tuned models or adapters, ensure that you merge the adapter with the base model before pushing it to the Hub. This can be done using the `PeftModel` class from the `peft` library. - **Dependencies and Requirements**: Ensure that your model's dependencies are included in a `requirements.txt` file. This is especially important if you are using custom layers or components in your model. === Agent thoughts: Thought: The information provided in the retrieval output is comprehensive and detailed. 
I will format this information into a clear, step-by-step guide and use the `final_answer` tool to provide the final answer to the task. >>> Agent is executing the code below: steps = """ 1. **Prepare Your Model**: Ensure your model is trained and saved in a format compatible with the Hugging Face Transformers library. This typically involves saving the model and tokenizer using the `save_pretrained` method. 2. **Install Dependencies**: Install the `huggingface_hub` library if you haven't already. You can do this using pip: ```bash pip install huggingface_hub ``` 3. **Login to Hugging Face**: You need to be logged in to your Hugging Face account. Use the `huggingface-cli` to log in: ```bash huggingface-cli login ``` This will prompt you to enter your Hugging Face API token, which you can find in your account settings on the Hugging Face website. 4. **Create a Repository**: Create a new repository on the Hugging Face Hub using the `create_repo` command. Replace `<model_name>` with the name you want for your model: ```bash huggingface-cli repo create <model_name> --organization <your_organization> --private ``` If you are creating a personal model, you can omit the `--organization` flag. 5. **Upload the Model**: Use the `push_to_hub` method to upload your model to the Hugging Face Hub. This method is available in the `transformers` library and can be used as follows: ```python from transformers import AutoModel, AutoTokenizer model = AutoModel.from_pretrained('path/to/your/model') tokenizer = AutoTokenizer.from_pretrained('path/to/your/tokenizer') model.push_to_hub('<model_name>') tokenizer.push_to_hub('<model_name>') ``` 6. **Verify the Upload**: Once the model is uploaded, you can verify it by navigating to your Hugging Face profile or the specified organization's page and checking if the model is listed. """ final_answer(steps) ==== Print outputs: Last output from code snippet: 1. **Prepare Your Model**: Ensure your model is trained and saved in a format compatible with the Hugging Face Transformers library. This typically involves saving the model and tokenizer using the `save_pretrained` method. 2. **Install Dependencies**: Install the `huggingface_hub` library if you haven't already. You can do this using pip: ```bash pip install huggingface_hub ``` 3. **Login to Hugging Face**: You need to be logged in to your Hugging Face account. Use the `huggingface-cli` to log in: ```bash huggingface-cli login ``` This will prompt you to enter your Hugging Face API token, which you can find in your account settings on the Hugging Face website. 4. **Create a Repository**: Create a new repository on the Hugging Face Hub using the `create_repo` command. Replace `<model_name>` with the name you want for your model: ```bash huggingface-cli repo create <model_name> --organization <your_organization> --private ``` If you are creating a personal model, you can omit the `--organization` flag. 5. **Upload the Model**: Use the `push_to_hub` method to upload your model to the Hugging Face Hub. This method is available in the `transformers` library and can be used as follows: ```python from transformers import AutoModel, AutoTokenizer model = AutoModel.from_pretrained('path/to/your/model') tokenizer = AutoTokenizer.from_pretrained('path/to/your/tokenizer') model.push_to_hub('<model_name>') tokenizer.push_to_hub('<model_name>') ``` 6. 
**Verify the Upload**: Once the model is uploaded, you can verify it by navigating to your Hugging Face profile or the specified organization's page and checking if the model is listed. Final answer: 1. **Prepare Your Model**: Ensure your model is trained and saved in a format compatible with the Hugging Face Transformers library. This typically involves saving the model and tokenizer using the `save_pretrained` method. 2. **Install Dependencies**: Install the `huggingface_hub` library if you haven't already. You can do this using pip: ```bash pip install huggingface_hub ``` 3. **Login to Hugging Face**: You need to be logged in to your Hugging Face account. Use the `huggingface-cli` to log in: ```bash huggingface-cli login ``` This will prompt you to enter your Hugging Face API token, which you can find in your account settings on the Hugging Face website. 4. **Create a Repository**: Create a new repository on the Hugging Face Hub using the `create_repo` command. Replace `<model_name>` with the name you want for your model: ```bash huggingface-cli repo create <model_name> --organization <your_organization> --private ``` If you are creating a personal model, you can omit the `--organization` flag. 5. **Upload the Model**: Use the `push_to_hub` method to upload your model to the Hugging Face Hub. This method is available in the `transformers` library and can be used as follows: ```python from transformers import AutoModel, AutoTokenizer model = AutoModel.from_pretrained('path/to/your/model') tokenizer = AutoTokenizer.from_pretrained('path/to/your/tokenizer') model.push_to_hub('<model_name>') tokenizer.push_to_hub('<model_name>') ``` 6. **Verify the Upload**: Once the model is uploaded, you can verify it by navigating to your Hugging Face profile or the specified organization's page and checking if the model is listed.
"\n1. **Prepare Your Model**: Ensure your model is trained and saved in a format compatible with the Hugging Face Transformers library. This typically involves saving the model and tokenizer using the `save_pretrained` method.\n\n2. **Install Dependencies**: Install the `huggingface_hub` library if you haven't already. You can do this using pip:\n ```bash\n pip install huggingface_hub\n ```\n\n3. **Login to Hugging Face**: You need to be logged in to your Hugging Face account. Use the `huggingface-cli` to log in:\n ```bash\n huggingface-cli login\n ```\n This will prompt you to enter your Hugging Face API token, which you can find in your account settings on the Hugging Face website.\n\n4. **Create a Repository**: Create a new repository on the Hugging Face Hub using the `create_repo` command. Replace `<model_name>` with the name you want for your model:\n ```bash\n huggingface-cli repo create <model_name> --organization <your_organization> --private\n ```\n If you are creating a personal model, you can omit the `--organization` flag.\n\n5. **Upload the Model**: Use the `push_to_hub` method to upload your model to the Hugging Face Hub. This method is available in the `transformers` library and can be used as follows:\n ```python\n from transformers import AutoModel, AutoTokenizer\n\n model = AutoModel.from_pretrained('path/to/your/model')\n tokenizer = AutoTokenizer.from_pretrained('path/to/your/tokenizer')\n\n model.push_to_hub('<model_name>')\n tokenizer.push_to_hub('<model_name>')\n ```\n\n6. **Verify the Upload**: Once the model is uploaded, you can verify it by navigating to your Hugging Face profile or the specified organization's page and checking if the model is listed.\n" 3.4 Example trying to trigger the retriever agent for the PEFT issues knowledge base
======== New task ======== How do you combine multiple adapters in peft? === Agent thoughts: Thought: To combine multiple adapters in PEFT, I need to understand the mechanism for doing so. I will use the `retriever` tool to search for relevant information in the PEFT documentation or issues. >>> Agent is executing the code below: retriever(query="combine multiple adapters in peft") ==== Code execution failed due to the following error: ManagedAgent.__call__() missing 1 required positional argument: 'request' Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/transformers/agents/agents.py", line 1135, in step result = self.python_evaluator( File "/usr/local/lib/python3.10/dist-packages/transformers/agents/python_interpreter.py", line 902, in evaluate_python_code result = evaluate_ast(node, state, static_tools, custom_tools, authorized_imports) File "/usr/local/lib/python3.10/dist-packages/transformers/agents/python_interpreter.py", line 782, in evaluate_ast return evaluate_ast(expression.value, state, static_tools, custom_tools) File "/usr/local/lib/python3.10/dist-packages/transformers/agents/python_interpreter.py", line 746, in evaluate_ast return evaluate_call(expression, state, static_tools, custom_tools) File "/usr/local/lib/python3.10/dist-packages/transformers/agents/python_interpreter.py", line 439, in evaluate_call output = func(*args, **kwargs) TypeError: ManagedAgent.__call__() missing 1 required positional argument: 'request' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.10/dist-packages/transformers/agents/agents.py", line 811, in direct_run step_logs = self.step() File "/usr/local/lib/python3.10/dist-packages/transformers/agents/agents.py", line 1154, in step raise AgentExecutionError(error_msg) transformers.agents.agents.AgentExecutionError: Code execution failed due to the following error: ManagedAgent.__call__() missing 1 required positional argument: 'request' === Agent thoughts: Thought: I need to provide a request to the `retriever` tool. Let's retry with the correct usage. >>> Agent is executing the code below: retriever(request="combine multiple adapters in peft") ==== ======== New task ======== You're a helpful agent named 'retriever'. You have been submitted this task by your manager. --- Task: combine multiple adapters in peft --- You're helping your manager solve a wider task: so make sure to not provide a one-line answer, but give as much information as possible so that they have a clear understanding of the answer. Your final_answer WILL HAVE to contain these parts: ### 1. Task outcome (short version): ### 2. Task outcome (extremely detailed version): ### 3. Additional context (if relevant): Put all these in your final_answer tool, everything that you do not pass as an argument to final_answer will be lost. And even if your task resolution is not successful, please return as much context as possible, so that your manager can act upon this feedback. === Agent thoughts: Thought: I will first use the retriever tool to find relevant documents that might provide information about combining multiple adapters in PEFT (Parameter-Efficient Fine-Tuning). >>> Calling tool: 'retriever' with arguments: {'query': 'combine multiple adapters in peft'} === Agent thoughts: Thought: I have retrieved documents that provide information about combining multiple adapters in PEFT. I will now summarize the key points and provide a detailed explanation. 
>>> Calling tool: 'final_answer' with arguments: {'answer': '### 1. Task outcome (short version):\nCombining multiple adapters in PEFT is possible but has some limitations. Currently, PEFT allows loading and inferring only adapters of the same type, such as LoRA or IA3. To combine multiple adapters, you can use the `add_weighted_adapter` function to add and merge adapters, but this functionality may not be fully implemented for adapter heads.\n\n### 2. Task outcome (extremely detailed version):\nCombining multiple adapters in PEFT is a crucial feature for parameter-efficient fine-tuning (PEFT) of large language models. Here are the details:\n\n1. **Adapter Types**: PEFT supports different adapter types, such as LoRA (Low-Rank Adaptation) and IA3 (Inducing Attention with Adapter). Each adapter type has its own benefits, such as reduced memory footprint and customizable model tuning.\n\n2. **Current Limitations**:\n - **Single Adapter Type**: Currently, PEFT allows loading and inferring only adapters of the same type. For example, you cannot load both LoRA and LoHa adapters and switch between them during inference. This means if you need to use multiple adapter types, you must unload all adapters of the first type before loading adapters of another type.\n\n3. **Combining Adapters**:\n - **`add_weighted_adapter` Function**: To combine multiple adapters of the same type, you can use the `add_weighted_adapter` function. This function allows you to add and merge adapters with weighted contributions. However, the merging of adapter heads, which are the output layers of the adapters, is not fully implemented in the `add_weighted_adapter` function.\n\n4. **Expected Behavior**:\n - **Merging Adapter Heads**: When merging multiple adapters using `add_weighted_adapter`, it is expected that the corresponding adapter heads should also be merged. However, the current implementation may not handle this merging, and you might need to ensure that the adapter heads are kept in sync manually.\n\n### 3. Additional context (if relevant):\n- **Documentation and References**:\n - [PEFT Documentation](https://huggingface.co/docs/transformers/main/peft)\n - [LoraModel.merge_and_unload](https://huggingface.co/docs/peft/package_reference/tuners#peft.LoraModel.merge_and_unload)\n - [LoraModel.add_weighted_adapter](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/model.py#L520)\n\n- **GitHub Discussions**: There are ongoing discussions and potential solutions for this issue, such as keeping the adapter configurations in sync manually or implementing more robust merging functionality.'} Print outputs: Last output from code snippet: ### 1. Task outcome (short version): Combining multiple adapters in PEFT is possible but has some limitations. Currently, PEFT allows loading and inferring only adapters of the same type, such as LoRA or IA3. To combine multiple adapters, you can use the `add_weighted_adapter` function to add and merge adapters, but this functionality may not be fully implemented for adapter heads. ### 2. Task outcome (extremely detailed version): Combining multiple adapters in PEFT is a crucial feature for parameter-efficient fine-tuning (PEFT) of large language models. Here are the details: 1. **Adapter Types**: PEFT supports different adapter types, such as LoRA (Low-Rank Adaptation) and IA3 (Inducing Attention with Adapter). Each adapter type has its own benefits, such as reduced memory footprint and customizable model tuning. 2. 
**Current Limitations**: - **Single Adapter Type**: Currently, PEFT allows loading and inferring only adapters of the same type. For example, you cannot load both LoRA and LoHa adapters and switch between them during inference. This means if you need to use multiple adapter types, you must unload all adapters of the first type before loading adapters of another type. 3. **Combining Adapters**: - **`add_weighted_adapter` Function**: To combine multiple adapters of the same type, you can use the `add_weighted_adapter` function. This function allows you to add and merge adapters with weighted contributions. However, the merging of adapter heads, which are the output layers of the adapters, is not fully implemented in the `add_weighted_adapter` function. 4. **Expected Behavior**: - **Merging Adapter Heads**: When merging multiple adapters using `add_weighted_adapter`, it is expected that the corresponding adapter heads should also be merged. However, the current implementation may not handle this merging, and you might need to ensure that the adapter heads are kept in sync manually. ### 3. Additional context (if relevant): - **Documentation and References**: - [PEFT Documentation](https://huggingface.co/docs/transformers/main/peft) - [LoraModel.merge_and_unload](https://huggingface.co/docs/peft/package_reference/tuners#peft.LoraModel.merge_and_unload) - [LoraModel.add_weighted_adapter](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/model.py#L520) - **GitHub Discussions**: There are ongoing discussions and potential solutions for this issue, such as keeping the adapter configurations in sync manually or implementing more robust merging functionality. === Agent thoughts: Thought: From the retrieved information, combining multiple adapters in PEFT is possible but has some limitations. I will summarize the key points and provide a high-level explanation of how to combine multiple adapters using the `add_weighted_adapter` function. >>> Agent is executing the code below: # Summary of combining multiple adapters in PEFT summary = """ Combining multiple adapters in PEFT is possible but has some limitations: 1. Only adapters of the same type (e.g., LoRA, IA3) can be loaded and inferred together. 2. Use the `add_weighted_adapter` function to add and merge adapters. 3. Merging adapter heads is not fully implemented and may require manual handling. To combine multiple adapters: 1. Ensure all adapters are of the same type. 2. Use the `add_weighted_adapter` function to merge adapters with weighted contributions. 3. Manually handle the merging of adapter heads if necessary. """ print(summary) ==== Print outputs: Combining multiple adapters in PEFT is possible but has some limitations: 1. Only adapters of the same type (e.g., LoRA, IA3) can be loaded and inferred together. 2. Use the `add_weighted_adapter` function to add and merge adapters. 3. Merging adapter heads is not fully implemented and may require manual handling. To combine multiple adapters: 1. Ensure all adapters are of the same type. 2. Use the `add_weighted_adapter` function to merge adapters with weighted contributions. 3. Manually handle the merging of adapter heads if necessary. === Agent thoughts: Thought: I have summarized the key points on how to combine multiple adapters in PEFT. Now, I will return this summary as the final answer. >>> Agent is executing the code below: final_answer(summary) ==== Print outputs: Last output from code snippet: Combining multiple adapters in PEFT is possible but has some limitations: 1. 
Only adapters of the same type (e.g., LoRA, IA3) can be loaded and inferred together. 2. Use the `add_weighted_adapter` function to add and merge adapters. 3. Merging adapter heads is not fully implemented and may require manual handling. To combine multiple adapters: 1. Ensure all adapters are of the same type. 2. Use the `add_weighted_adapter` function to merge adapters with weighted contributions. 3. Manually handle the merging of adapter heads if necessary. Final answer: Combining multiple adapters in PEFT is possible but has some limitations: 1. Only adapters of the same type (e.g., LoRA, IA3) can be loaded and inferred together. 2. Use the `add_weighted_adapter` function to add and merge adapters. 3. Merging adapter heads is not fully implemented and may require manual handling. To combine multiple adapters: 1. Ensure all adapters are of the same type. 2. Use the `add_weighted_adapter` function to merge adapters with weighted contributions. 3. Manually handle the merging of adapter heads if necessary.
'\nCombining multiple adapters in PEFT is possible but has some limitations:\n1. Only adapters of the same type (e.g., LoRA, IA3) can be loaded and inferred together.\n2. Use the `add_weighted_adapter` function to add and merge adapters.\n3. Merging adapter heads is not fully implemented and may require manual handling.\n\nTo combine multiple adapters:\n1. Ensure all adapters are of the same type.\n2. Use the `add_weighted_adapter` function to merge adapters with weighted contributions.\n3. Manually handle the merging of adapter heads if necessary.\n'
🏁 Final Thoughts
We have successfully built a multi-agent RAG system that integrates Web Search, Document Retrieval, and Image Generation agents, all orchestrated by a central agent manager. This architecture enables seamless task delegation, efficient processing, and the flexibility to handle a wide variety of user queries.
🔍 Explore More