Finetuning On Bedrock
Finetuning Claude 3 Haiku on Bedrock
In this notebook, we'll walk you through the process of finetuning Claude 3 Haiku on Amazon Bedrock.
What You'll Need
- An AWS account with access to Bedrock
- A dataset (or you can use the sample dataset provided here)
- A service role capable of accessing the S3 bucket where you save your training data
Install Dependencies
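The only dependency the rest of this notebook relies on is boto3. A minimal sketch of installing it from Python, assuming pip is available for the interpreter running the notebook; the helper names `pip_install_command` and `ensure_installed` are illustrative and not part of any library:

```python
import importlib.util
import subprocess
import sys


def pip_install_command(*packages):
    """Build a pip command that targets the interpreter running this notebook."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", *packages]


def ensure_installed(package):
    """Install `package` only if it cannot be imported. Returns True if pip was run."""
    if importlib.util.find_spec(package) is not None:
        return False
    subprocess.check_call(pip_install_command(package))
    return True


# In the notebook, run:
# ensure_installed("boto3")
```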
Prep a Dataset
Your dataset for Bedrock finetuning needs to be a JSONL file (i.e. a file with one JSON object on each line).
Each line in the JSONL file should be a JSON object with the following structure:
{
  "system": "<optional_system_message>",
  "messages": [
    {"role": "user", "content": "user message"},
    {"role": "assistant", "content": "assistant response"},
    ...
  ]
}
- The "system" field is optional.
- There must be at least two messages.
- The first message must be from the "user".
- The last message must be from the "assistant".
- User and assistant messages must alternate.
- No extraneous keys are allowed.
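The rules above can be checked locally before uploading anything. The following is a rough sketch rather than an official validator: `validate_record` is a hypothetical helper, and it assumes that "no extraneous keys" applies both to the top-level object and to each message, and that "system" is a string as in the structure shown above.

```python
ALLOWED_TOP_LEVEL_KEYS = {"system", "messages"}
ALLOWED_MESSAGE_KEYS = {"role", "content"}


def validate_record(record):
    """Return a list of rule violations for one parsed JSONL record (empty if valid)."""
    errors = []

    extra = set(record) - ALLOWED_TOP_LEVEL_KEYS
    if extra:
        errors.append(f"extraneous keys: {sorted(extra)}")

    # Assumption: when present, "system" is a plain string.
    if "system" in record and not isinstance(record["system"], str):
        errors.append("system must be a string")

    messages = record.get("messages")
    if not isinstance(messages, list) or len(messages) < 2:
        errors.append("need at least two messages")
        return errors

    for i, message in enumerate(messages):
        if not isinstance(message, dict) or set(message) != ALLOWED_MESSAGE_KEYS:
            errors.append(f"message {i} must have exactly 'role' and 'content'")
            continue
        # Even positions are user turns, odd positions are assistant turns.
        expected = "user" if i % 2 == 0 else "assistant"
        if message["role"] != expected:
            errors.append(f"message {i} should be from {expected!r}")

    last = messages[-1]
    if isinstance(last, dict) and last.get("role") != "assistant":
        errors.append("last message must be from 'assistant'")

    return errors
```

Running `validate_record(json.loads(line))` over every line of the file and stopping at the first non-empty result is usually enough to catch formatting mistakes before a job is submitted.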
Sample Dataset - JSON Mode
We've included a sample dataset that teaches a model to respond to all questions with JSON. Here's what that dataset looks like:
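One way to peek at the records is to read the first few lines of the JSONL file. This is a small sketch; `read_jsonl` is a hypothetical helper, and the file path in the usage comment is a placeholder for wherever the sample dataset lives in your checkout.

```python
import json


def read_jsonl(path, limit=None):
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            if limit is not None and len(records) >= limit:
                break
            records.append(json.loads(line))
    return records


# Example usage (placeholder path):
# for record in read_jsonl("path/to/sample_dataset.jsonl", limit=3):
#     print(json.dumps(record, indent=2))
```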
Upload your dataset to S3
Your dataset for finetuning needs to be available on S3. For this demo, we'll write the sample dataset to an S3 bucket you control.
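A minimal sketch of the upload using the boto3 S3 client's `upload_file` method. The helper names, bucket name, and object key are placeholders; the sketch also assumes your AWS credentials are already configured in the environment.

```python
def s3_uri(bucket, key):
    """Format the s3:// URI for an object in a bucket."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def upload_dataset(local_path, bucket, key):
    """Upload a local file to S3 and return its s3:// URI."""
    import boto3  # imported lazily so s3_uri works without boto3 installed

    boto3.client("s3").upload_file(local_path, bucket, key)
    return s3_uri(bucket, key)


# Example usage (placeholder names):
# training_data_uri = upload_dataset(
#     "path/to/sample_dataset.jsonl", "your-bucket-name", "datasets/train.jsonl"
# )
```

The returned URI is what a finetuning job expects as the location of its training data.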
Launch Bedrock Finetuning Job
Now that you have your dataset ready, you can launch a finetuning job using boto3. First we'll configure a few parameters for the job:
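The parameters might be collected along these lines. Everything here is illustrative: the job name, model name, output prefix, and hyperparameter values are placeholders, and the base model identifier is an assumption that should be checked against the model IDs listed in your Bedrock console. The dictionary keys follow the keyword arguments of boto3's `create_model_customization_job`, which expects hyperparameter values as strings.

```python
def build_job_config(
    bucket,
    train_key,
    role_arn,
    job_name="my-haiku-finetuning-job",          # placeholder
    custom_model_name="my-finetuned-haiku",      # placeholder
    base_model_id="anthropic.claude-3-haiku-20240307-v1:0:200k",  # assumption, verify
    epoch_count=5,
    batch_size=4,
    learning_rate_multiplier=1.0,
):
    """Collect the keyword arguments for a Bedrock model customization job."""
    return {
        "customizationType": "FINE_TUNING",
        "jobName": job_name,
        "customModelName": custom_model_name,
        "roleArn": role_arn,
        "baseModelIdentifier": base_model_id,
        # Bedrock takes hyperparameters as a map of string to string.
        "hyperParameters": {
            "epochCount": str(epoch_count),
            "batchSize": str(batch_size),
            "learningRateMultiplier": str(learning_rate_multiplier),
        },
        "trainingDataConfig": {"s3Uri": f"s3://{bucket}/{train_key}"},
        "outputDataConfig": {"s3Uri": f"s3://{bucket}/finetuning-output/"},
    }
```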
Then we can launch the job with boto3:
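A sketch of the launch call, assuming the job parameters are held in a dictionary whose keys match the keyword arguments of `create_model_customization_job` (`jobName`, `customModelName`, `roleArn`, `baseModelIdentifier`, `hyperParameters`, `trainingDataConfig`, `outputDataConfig`). The `launch_finetuning_job` wrapper is hypothetical; passing in the client keeps it easy to exercise without AWS access.

```python
def launch_finetuning_job(job_config, client=None):
    """Submit a model customization job and return its ARN."""
    if client is None:
        import boto3  # imported lazily; only needed for a real submission

        client = boto3.client("bedrock")
    response = client.create_model_customization_job(**job_config)
    return response["jobArn"]


# Example usage (job_config as described above):
# job_arn = launch_finetuning_job(job_config)
# print(job_arn)
```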
You can use this to check the status of your job while it's training:
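A minimal polling sketch built on `get_model_customization_job`, which reports the job's `status` field. The `wait_for_job` helper and the polling interval are illustrative, and the set of terminal statuses is an assumption based on the statuses Bedrock documents for customization jobs.

```python
import time

# Assumption: these are the statuses after which a job no longer changes.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_job(client, job_identifier, poll_seconds=60, sleep=time.sleep):
    """Poll a customization job until it reaches a terminal status, then return it."""
    while True:
        job = client.get_model_customization_job(jobIdentifier=job_identifier)
        status = job["status"]
        print(f"{job_identifier}: {status}")
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)


# Example usage (job name is a placeholder):
# import boto3
# final_status = wait_for_job(boto3.client("bedrock"), "my-haiku-finetuning-job")
```

For a one-off check, calling `get_model_customization_job` once and reading `["status"]` is enough; the loop is only a convenience for long-running jobs.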
Use your finetuned model!
To use your finetuned model, you'll need to host it using Provisioned Throughput in Amazon Bedrock. Once your model is ready with Provisioned Throughput, you can invoke it via the Bedrock API.
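A sketch of such an invocation through the `bedrock-runtime` client's `invoke_model` call, assuming the request uses the Anthropic Messages format on Bedrock. The helper names, the `max_tokens` default, and the ARN in the usage comment are placeholders.

```python
import json


def build_request_body(question, system=None, max_tokens=1000):
    """Build a Messages-format request body for a Claude model on Bedrock."""
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": question}],
    }
    if system is not None:
        body["system"] = system
    return body


def ask_finetuned_model(client, provisioned_model_arn, question, **kwargs):
    """Send one question to the provisioned model and return the reply text."""
    response = client.invoke_model(
        modelId=provisioned_model_arn,
        body=json.dumps(build_request_body(question, **kwargs)),
    )
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]


# Example usage (placeholder ARN):
# import boto3
# runtime = boto3.client("bedrock-runtime")
# print(ask_finetuned_model(runtime, "YOUR_PROVISIONED_THROUGHPUT_ARN",
#                           "What is a large language model?"))
```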