RAG Cohere Weaviate V4 Client
Retrieval-Augmented Generation with Cohere language models on Amazon Bedrock and the Weaviate vector database on AWS Marketplace
The example use case generates targeted advertisements for vacation stay listings based on a target audience. The goal is to use the user query for the target audience (e.g., “family with small children”) to retrieve the most relevant vacation stay listing (e.g., a listing with playgrounds close by) and then to generate an advertisement for the retrieved listing tailored to the target audience.
Note that the following code uses the newer v4 Weaviate Python client, which uses gRPC under the hood and is currently in beta (as of November 2023).
This notebook should work well with the Data Science 3.0 kernel in SageMaker Studio.
Dataset Overview
The dataset is available from Inside Airbnb and is licensed under a Creative Commons Attribution 4.0 International License.
Download the data and save it in a folder called data.
Prerequisites
To be able to follow along and use any AWS services in the following tutorial, please make sure you have an AWS account.
Step 1: Enable components of the AI-native technology stack
To begin, enable the relevant components discussed in the solution overview in your AWS account. Start by enabling access to the Cohere Command and Embed foundation models available on Amazon Bedrock through the AWS Management Console: navigate to the Model access page, click Edit, and select the foundation models of your choice.
Next, set up a Weaviate cluster. First, subscribe to the Weaviate Kubernetes Cluster on AWS Marketplace. Then, launch the software using a CloudFormation template according to your preferred availability zone. The CloudFormation template is pre-filled with default values. To follow along in this guide, edit the following fields:
- Stack name: Enter a stack name
- Authentication: It is recommended to enable authentication by setting helmauthenticationtype to apikey and defining a helmauthenticationapikey.
- Enabled modules: Make sure “text2vec-aws” and “generative-aws” are present in the list of enabled modules within Weaviate.
This template takes about 30 minutes to complete.
Step 2: Connect to Weaviate
On the SageMaker console, navigate to Notebook instances and create a new notebook instance.
Then, install the Weaviate client package with the required dependencies:
Now, you can connect to your Weaviate instance with the following code. You can find the relevant information as follows:
- Weaviate URL: Access Weaviate via the load balancer URL. Go to the Services section of AWS, under EC2 > Load Balancers find the load balancer, and look for the DNS name column.
- Weaviate API Key: This is the key you set earlier in the CloudFormation template (helmauthenticationapikey).
- AWS Access Key: You can retrieve the access keys for your user in the AWS Identity and Access Management (IAM) Console.
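As a minimal sketch, the values listed above can be gathered into one place before they are handed to the Weaviate client. The helper name `weaviate_connection_settings`, the placeholder URL, and the placeholder keys are illustrative assumptions, not part of the original notebook. The sketch also assumes that the secret key belonging to the access key is sent along with it, and that the AWS modules read both from the `X-AWS-Access-Key` and `X-AWS-Secret-Key` request headers.

```python
from urllib.parse import urlparse


def weaviate_connection_settings(weaviate_url, weaviate_api_key,
                                 aws_access_key, aws_secret_key):
    """Collect the connection values described above into one dictionary."""
    parsed = urlparse(weaviate_url)
    secure = parsed.scheme == "https"
    return {
        "host": parsed.hostname,
        # Fall back to the scheme's default port when the URL has none.
        "port": parsed.port or (443 if secure else 80),
        "secure": secure,
        "api_key": weaviate_api_key,
        # Assumption: the AWS modules read the credentials from these headers.
        "headers": {
            "X-AWS-Access-Key": aws_access_key,
            "X-AWS-Secret-Key": aws_secret_key,
        },
    }


settings = weaviate_connection_settings(
    "http://my-weaviate-lb.example.com",  # placeholder load balancer DNS name
    "my-weaviate-api-key",                # placeholder Weaviate API key
    "AWS_ACCESS_KEY_PLACEHOLDER",
    "AWS_SECRET_KEY_PLACEHOLDER",
)
print(settings["host"], settings["port"], settings["secure"])
```

The host, port, API key, and headers would then be passed to whichever connection helper your installed version of the v4 client provides. Because the client was in beta at the time of writing, check its documentation for the exact call.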
Step 3: Configure the Amazon Bedrock module to enable Cohere models
Next, you will define a data collection (i.e., class) called Listings to store the listings’ data objects, which is analogous to creating a table in a relational database. In this step, you will configure the relevant modules to enable the usage of Cohere language models hosted on Amazon Bedrock natively from within the Weaviate vector database. The vectorizer ("text2vec-aws") and generative module ("generative-aws") are specified in the data collection definition. Both of these modules take three parameters:
- "service": "bedrock" for Amazon Bedrock (alternatively, "sagemaker" for Amazon SageMaker JumpStart)
- "region": The region where your model is deployed
- "model": The foundation model’s name
In this step, you will also define the structure of the data collection by configuring its properties. Aside from each property’s name and data type, you can also configure whether the property is only stored with the data object or is also included when the object’s vector embedding is computed. In this example, host_name and property_type are not vectorized.
Run the following code to create the collection in your Weaviate instance.
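The collection definition described in this step can be sketched as a plain dictionary in the shape of Weaviate's schema format. This only illustrates the structure and is not the notebook's original code. The helper names, the region `us-east-1`, the model identifiers `cohere.embed-english-v3` and `cohere.command-text-v14`, and the exact set of property names are assumptions to replace with your own values. How the definition is submitted depends on the client version you use.

```python
def aws_module_settings(region, model, service="bedrock"):
    """Return the three parameters shared by text2vec-aws and generative-aws."""
    return {"service": service, "region": region, "model": model}


def text_property(name, vectorize=True):
    """Describe a text property; optionally exclude it from vectorization."""
    prop = {"name": name, "dataType": ["text"]}
    if not vectorize:
        # Store the value, but skip it when the embedding is computed.
        prop["moduleConfig"] = {"text2vec-aws": {"skip": True}}
    return prop


listings_definition = {
    "class": "Listings",
    "vectorizer": "text2vec-aws",
    "moduleConfig": {
        # Assumed region and model names; use the ones enabled in your account.
        "text2vec-aws": aws_module_settings("us-east-1", "cohere.embed-english-v3"),
        "generative-aws": aws_module_settings("us-east-1", "cohere.command-text-v14"),
    },
    "properties": [
        text_property("description"),
        text_property("neighborhood_overview"),
        text_property("host_name", vectorize=False),
        text_property("property_type", vectorize=False),
    ],
}

for prop in listings_definition["properties"]:
    print(prop["name"], "moduleConfig" not in prop)
```

The loop prints each property name together with a flag that is true when the property takes part in vectorization.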
Step 4: Ingest data into the Weaviate vector database
You can now add objects to Weaviate. You will be using a batch import process for maximum efficiency. Run the code below to import data. During the import, Weaviate will use the defined vectorizer to create a vector embedding for each object. The following code loads the objects, initializes a batch process, and adds the objects to the target collection one by one.
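The load-then-batch pattern can be sketched without a running cluster as follows. The function names, the `add_batch` callback, the batch size, and the two sample rows are made up for illustration. In the notebook, `add_batch` would be replaced by the client's own batch or insert call, and the rows would come from the downloaded CSV file in the data folder.

```python
import csv
import io

# Assumed subset of dataset columns used as object properties.
LISTING_FIELDS = ["host_name", "property_type", "description", "neighborhood_overview"]


def rows_to_objects(csv_file, fields=LISTING_FIELDS):
    """Yield one property dictionary per CSV row, keeping only the given fields."""
    for row in csv.DictReader(csv_file):
        yield {field: (row.get(field) or "") for field in fields}


def import_in_batches(objects, add_batch, batch_size=100):
    """Group objects into batches, pass each batch to add_batch, return the count."""
    batch, total = [], 0
    for obj in objects:
        batch.append(obj)
        if len(batch) == batch_size:
            add_batch(batch)
            total += len(batch)
            batch = []
    if batch:  # send the final, partially filled batch
        add_batch(batch)
        total += len(batch)
    return total


# Made-up sample rows standing in for the downloaded listings file.
sample = io.StringIO(
    "id,host_name,property_type,description,neighborhood_overview\n"
    "1,Anna,Entire home,Bright flat with a garden,Playground around the corner\n"
    "2,Ben,Private room,Quiet room near the canal,\n"
)
sent = []  # stand-in for the vector database: collects the batches
count = import_in_batches(rows_to_objects(sample), sent.append, batch_size=1)
print(count, len(sent))
```

Columns that are not listed in `LISTING_FIELDS`, such as `id` here, are dropped, and missing values become empty strings.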
Step 5: Retrieval-Augmented Generation to generate targeted advertisements
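Finally, you can build a RAG pipeline by implementing a generative search query on your Weaviate instance. For this, you will first define a prompt template in the form of an f-string that can take in the user query ({target_audience}) directly and the additional context ({{host_name}}, {{property_type}}, {{description}}, {{neighborhood_overview}}) from the vector database at runtime. Python fills in the single-brace field as soon as the f-string is evaluated, whereas the doubled braces are escaped and remain in the resulting string as single-brace placeholders, which Weaviate replaces with the properties of the retrieved object.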
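A small sketch of such a template, to make the brace handling concrete. The wording of the prompt and the function name `build_prompt` are illustrative assumptions. Only the placeholder names are taken from the description above.

```python
def build_prompt(target_audience):
    """Fill in the user query now and leave the listing fields as placeholders."""
    return f"""You are a copywriter.
Write a short advertisement for the following vacation stay.
Host: {{host_name}}
Property type: {{property_type}}
Description: {{description}}
Neighborhood: {{neighborhood_overview}}
Target audience: {target_audience}
"""


prompt = build_prompt("Family with small children")
print(prompt)
```

In the printed prompt, the target audience is already filled in, while the four listing fields still appear as single-brace placeholders such as `{host_name}`.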
Next, you will run a generative search query. This prompts the defined generative model with a prompt composed of the user query and the retrieved data. The following query retrieves one listing object (.with_limit(1)) from the Listings collection that is most similar to the user query (.with_near_text({"concepts": target_audience})). Then the user query (target_audience) and the retrieved listing’s properties (["description", "neighborhood", "host_name", "property_type"]) are fed into the prompt template.
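A minimal sketch of how this query might look through the v4 collections API. It assumes that the collection object exposes a `generate.near_text` method accepting `query`, `limit`, and `single_prompt`, and that each returned object carries `properties` and `generated` attributes. Since the client was in beta at the time of writing, verify these names against your installed version. The function name `generate_ad` and the stub collection, which lets the sketch run without a cluster, are purely illustrative.

```python
from types import SimpleNamespace


def generate_ad(listings, target_audience, prompt):
    """Retrieve the most similar listing and generate an advertisement for it."""
    response = listings.generate.near_text(
        query=target_audience,  # the user query drives the vector search
        limit=1,                # retrieve only the closest listing
        single_prompt=prompt,   # template filled per retrieved object
    )
    best = response.objects[0]
    return best.properties, best.generated


# Stub standing in for a real Listings collection (illustration only).
def _fake_near_text(query, limit, single_prompt):
    listing = {"host_name": "Anna", "description": "Bright flat with a garden"}
    generated = f"[generated text for: {query}]"
    return SimpleNamespace(
        objects=[SimpleNamespace(properties=listing, generated=generated)]
    )


fake_listings = SimpleNamespace(generate=SimpleNamespace(near_text=_fake_near_text))
properties, ad = generate_ad(
    fake_listings, "Family with small children", "Write an ad for {description}"
)
print(properties["host_name"], ad)
```

With a real connection, `fake_listings` would be replaced by the Listings collection obtained from the client, and `ad` would contain the text produced by the generative model.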
Below, you can see the results for target_audience = “Family with small children”.
Here is another example for an elderly couple.