5. Advanced Optimizers
Welcome to the Notebook! 🥳
In this notebook, you will learn how to optimize your DSPy program using BootstrapFewShot, BootstrapFewShotWithRandomSearch, BootstrapFewShotWithOptuna, COPRO, and MIPRO!
We will use Cohere's Command-R+ and Command-R LLMs, as well as OpenAI's GPT-4. We will log LLM calls and pipeline traces with Arize Phoenix, and show how you can use Weights & Biases to monitor BootstrapFewShot runs, our first step in this integration!
We will, of course, also use the Weaviate database to store and index the Weaviate blog posts.
A few requirements:
- You'll need a running Weaviate instance
  - You can create a 14-day free cluster on WCS
  - Or run Weaviate locally (use the `yaml` file in this folder with `docker-compose up -d`)
- Generate Cohere and/or OpenAI API keys
- Installations
  - weaviate-client
  - dspy-ai
- Load your Weaviate cluster with data
  - If you want to use the Weaviate blogs as the dataset, refer to the `Weaviate-Import.ipynb` file in this folder.
Connect DSPy to our LLMs and Weaviate
Test Connection
["Hello! How's it going?"]
['Hello! How can I assist you today?']
Connect to Arize Phoenix Observability
Existing running Phoenix instance detected! Shutting it down and starting a new instance...
To view the Phoenix app in your browser, visit http://localhost:6006/
To view the Phoenix app in a notebook, run `px.active_session().view()`
For more information on how to use Phoenix, check out https://docs.arize.com/phoenix
Overriding of current TracerProvider is not allowed
Attempting to instrument while already instrumented
Load Dataset
Typed LLM Metrics
RAG
LGTM test query: What do cross encoders do?
Uncompiled Answer: Cross-encoders are ranking models used for content-based re-ranking. They employ a classification mechanism rather than producing vector embeddings. The input consists of a pair of data items, such as two sentences, and the output is a similarity score indicating how similar the pair is. Cross-encoders are known for achieving high in-domain accuracy, but they are time-consuming compared to bi-encoders.
LLM Metric Rating: 4.0
Evaluate the quality of a system's answer to a question according to a given criterion.
---
Follow the following format.
Criterion: The evaluation criterion.
Question: The question asked to the system.
Ground Truth Answer: An expert written Ground Truth Answer to the question.
Predicted Answer: The system's answer to the question.
Rating: A float rating between 1 and 5 (Respond with a single float value)
---
Criterion: How aligned is the predicted_answer with the ground_truth?
Question: What is the syntax error in the provided GraphQL query example related to the evaluation of n-gram matches?
Ground Truth Answer: The syntax error in the provided GraphQL query example is the missing comma between the `bm25` and `where` arguments in the `JobListing` function. This error could potentially affect the evaluation of n-gram matches by causing inaccurate keyword construction.
Predicted Answer: The context describes an evaluation method that looks for a match between keywords derived from queries and the corresponding generated queries. One of the keywords expected in the query is "bm25", however, the provided GraphQL query example lacks a comma after "bm25", which is necessary for the query to execute successfully. Therefore, the syntax error related to the evaluation of n-gram matches in the provided context is the missing comma after the "bm25" keyword.
Rating: 5.0 (and 2 other completions)
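The metric prompt above asks the model for a single float between 1 and 5. As a minimal sketch of how such a reply could be turned into a number, the snippet below uses a plain regular expression. This is an illustrative stand-in, not the typed-predictor mechanism the section title refers to, and `parse_rating` is a hypothetical helper name.

```python
import re


def parse_rating(completion, low=1.0, high=5.0):
    """Extract the first number from an LLM reply and clamp it to [low, high]."""
    match = re.search(r"-?\d+(?:\.\d+)?", completion)
    if match is None:
        raise ValueError(f"no numeric rating found in: {completion!r}")
    value = float(match.group())
    # Clamp out-of-range replies instead of trusting the model blindly.
    return min(max(value, low), high)


print(parse_rating("Rating: 4.5"))
print(parse_rating("5.0 (and 2 other completions)"))
```

Clamping is one possible policy; rejecting out-of-range replies outright would be an equally reasonable choice.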
Average Metric: 46.5 / 10 (465.0%)
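The reported percentage can exceed 100 because the metric returns a rating on the 1 to 5 scale rather than a 0/1 correctness flag. Below is a sketch of the arithmetic, assuming the evaluator sums the per-example ratings and reports `100 * total / count`. The individual ratings are made up so that they sum to 46.5.

```python
def summarize(ratings):
    """Return (total, count, percent) the way the log line appears to report them."""
    total = sum(ratings)
    count = len(ratings)
    percent = round(100 * total / count, 1)
    return total, count, percent


# Hypothetical per-example ratings on the 1-5 scale.
ratings = [5.0, 4.5, 4.5, 5.0, 4.0, 5.0, 4.5, 4.5, 5.0, 4.5]
total, count, percent = summarize(ratings)
print(f"Average Metric: {total} / {count} ({percent}%)")
print(f"Mean rating on the 1-5 scale: {total / count}")
```

Read this way, a score of 465.0 corresponds to a mean rating of 4.65 out of 5.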
BootstrapFewShot
Bootstrapped 1 full traces after 2 examples in round 0.
Average Metric: 47.5 / 10 (475.0%)
Compiled RAG Score at Demos = 1: 475.0
Bootstrapped 2 full traces after 3 examples in round 0.
Average Metric: 45.75 / 10 (457.5%)
Compiled RAG Score at Demos = 2: 457.5
Bootstrapped 3 full traces after 4 examples in round 0.
Average Metric: 45.5 / 10 (455.0%)
Compiled RAG Score at Demos = 3: 455.0
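A rough sketch of the bootstrapping loop that these log lines suggest, assuming the optimizer walks the training set, keeps each example whose prediction passes the metric, and stops once it has collected the requested number of demos. `program`, `metric`, and the dictionary layout are hypothetical placeholders, not DSPy's API.

```python
def bootstrap_demos(trainset, program, metric, max_demos):
    """Collect up to max_demos examples whose predictions pass the metric."""
    demos = []
    examined = 0
    for examined, example in enumerate(trainset, start=1):
        if len(demos) >= max_demos:
            break  # enough demos collected; this example is not run
        prediction = program(example["question"])
        score = metric(example, prediction)
        if score:  # any truthy score counts as a pass in this sketch
            demos.append({"question": example["question"],
                          "answer": prediction,
                          "metric_val": score})
    return demos, examined


def toy_program(question):
    # Stand-in for the RAG program: maps "q3" to "a3".
    return question.replace("q", "a")


def toy_metric(example, prediction):
    # Stand-in for the LLM metric: 5.0 on an exact match, else 0.0.
    return 5.0 if prediction == example["answer"] else 0.0


trainset = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(25)]
demos, examined = bootstrap_demos(trainset, toy_program, toy_metric, max_demos=1)
print(f"Bootstrapped {len(demos)} full traces after {examined} examples.")
```

With this toy setup the loop stops at the second example once one demo has been collected, which mirrors the "1 full traces after 2 examples" pattern in the output above.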
Weights & Biases
Learn more about how you can use Weights & Biases logging for BootstrapFewShot runs here!
In [PR #849] to DSPy, we introduce wandb logging so you can see the metric_val returned for each bootstrapped example. To motivate the use case: you may rate answers on a scale of 1 to 5 and only want examples that achieve a 5 to appear in your prompt. This is the first of many collaborations between Weaviate and Weights & Biases!
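The use case above amounts to a filter on the logged `metric_val`. A minimal sketch, assuming each bootstrapped example is available as a dictionary with a `metric_val` key (the record layout and the `keep_top_rated` helper are hypothetical, not part of wandb or DSPy):

```python
def keep_top_rated(logged_examples, min_rating=5.0):
    """Keep only bootstrapped examples whose metric_val reaches min_rating."""
    return [ex for ex in logged_examples if ex["metric_val"] >= min_rating]


# Hypothetical logged records, one per bootstrapped example.
logged_examples = [
    {"question": "What do cross encoders do?", "metric_val": 4.0},
    {"question": "What is ref2vec?", "metric_val": 5.0},
    {"question": "What is BM25?", "metric_val": 4.5},
]
top_rated = keep_top_rated(logged_examples)
print([ex["question"] for ex in top_rated])
```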
BootstrapFewShotWithRandomSearch
Going to sample between 1 and 2 traces per predictor. Will attempt to train 2 candidate sets.
Average Metric: 106.5 / 25 (426.0%)
Score: 426.0 for set: [0]
New best score: 426.0 for seed -3
Scores so far: [426.0]
Best score: 426.0
Average Metric: 106.5 / 25 (426.0%)
Score: 426.0 for set: [16]
Scores so far: [426.0, 426.0]
Best score: 426.0
Bootstrapped 2 full traces after 3 examples in round 0.
Average Metric: 104.5 / 25 (418.0%)
Score: 418.0 for set: [16]
Scores so far: [426.0, 426.0, 418.0]
Best score: 426.0
Average of max per entry across top 1 scores: 4.26
Average of max per entry across top 2 scores: 4.28
Average of max per entry across top 3 scores: 4.36
Average of max per entry across top 5 scores: 4.36
Average of max per entry across top 8 scores: 4.36
Average of max per entry across top 9999 scores: 4.36
Bootstrapped 2 full traces after 3 examples in round 0.
Average Metric: 106.5 / 25 (426.0%)
Score: 426.0 for set: [16]
Scores so far: [426.0, 426.0, 418.0, 426.0]
Best score: 426.0
Average of max per entry across top 1 scores: 4.26
Average of max per entry across top 2 scores: 4.28
Average of max per entry across top 3 scores: 4.42
Average of max per entry across top 5 scores: 4.46
Average of max per entry across top 8 scores: 4.46
Average of max per entry across top 9999 scores: 4.46
Bootstrapped 1 full traces after 2 examples in round 0.
Average Metric: 106.5 / 25 (426.0%)
Score: 426.0 for set: [16]
Scores so far: [426.0, 426.0, 418.0, 426.0, 426.0]
Best score: 426.0
Average of max per entry across top 1 scores: 4.26
Average of max per entry across top 2 scores: 4.28
Average of max per entry across top 3 scores: 4.42
Average of max per entry across top 5 scores: 4.62
Average of max per entry across top 8 scores: 4.62
Average of max per entry across top 9999 scores: 4.62
5 candidate programs found.
Average Metric: 46.5 / 10 (465.0%)
Compiled RAG Score: 465.0
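The log above reports 5 candidate programs even though only 2 candidate sets were requested. One reading, consistent with the seeds -3 and -1 that appear in the "New best score" lines, is that three baseline candidates are scored before the randomly sampled sets. The sketch below illustrates that schedule under this assumption. `score_candidate` is a hypothetical callback that builds and evaluates one candidate, and the meaning attached to the negative seeds is our guess rather than something the log states.

```python
import random


def random_search(trainset, score_candidate, num_candidate_sets,
                  min_demos=1, max_demos=2):
    """Score three assumed baselines (seeds -3..-1) plus the random candidate sets."""
    scores = []
    best_seed, best_score = None, float("-inf")
    for seed in range(-3, num_candidate_sets):
        examples = list(trainset)
        if seed < 0:
            # Assumed baselines: training set kept in its original order.
            num_demos = max_demos
        else:
            # Random candidates: shuffled order and a random number of demos.
            rng = random.Random(seed)
            rng.shuffle(examples)
            num_demos = rng.randint(min_demos, max_demos)
        score = score_candidate(seed, examples, num_demos)
        scores.append(score)
        if score > best_score:  # ties keep the earlier candidate
            best_seed, best_score = seed, score
    return best_seed, best_score, scores


def toy_score(seed, examples, num_demos):
    # Stand-in for compiling and evaluating a candidate program.
    return 430.0 if seed == -1 else 426.0


best_seed, best_score, scores = random_search(list(range(25)), toy_score,
                                              num_candidate_sets=2)
print(f"{len(scores)} candidate programs found.")
print(f"Best score: {best_score} for seed {best_seed}")
```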
Going to sample between 1 and 1 traces per predictor. Will attempt to train 5 candidate sets.
Average Metric: 106.5 / 25 (426.0%)
Score: 426.0 for set: [0]
New best score: 426.0 for seed -3
Scores so far: [426.0]
Best score: 426.0
Average Metric: 106.5 / 25 (426.0%)
Score: 426.0 for set: [16]
Scores so far: [426.0, 426.0]
Best score: 426.0
Bootstrapped 1 full traces after 2 examples in round 0.
Average Metric: 107.5 / 25 (430.0%)
Score: 430.0 for set: [16]
New best score: 430.0 for seed -1
Scores so far: [426.0, 426.0, 430.0]
Best score: 430.0
Average of max per entry across top 1 scores: 4.3
Average of max per entry across top 2 scores: 4.44
Average of max per entry across top 3 scores: 4.48
Average of max per entry across top 5 scores: 4.48
Average of max per entry across top 8 scores: 4.48
Average of max per entry across top 9999 scores: 4.48
Bootstrapped 1 full traces after 2 examples in round 0.
Average Metric: 107.36 / 25 (429.4%)
Score: 429.44 for set: [16]
Scores so far: [426.0, 426.0, 430.0, 429.44]
Best score: 430.0
Average of max per entry across top 1 scores: 4.3
Average of max per entry across top 2 scores: 4.5344
Average of max per entry across top 3 scores: 4.6344
Average of max per entry across top 5 scores: 4.6744
Average of max per entry across top 8 scores: 4.6744
Average of max per entry across top 9999 scores: 4.6744
Bootstrapped 1 full traces after 2 examples in round 0.
Average Metric: 105.5 / 25 (422.0%)
Score: 422.0 for set: [16]
Scores so far: [426.0, 426.0, 430.0, 429.44, 422.0]
Best score: 430.0
Average of max per entry across top 1 scores: 4.3
Average of max per entry across top 2 scores: 4.5344
Average of max per entry across top 3 scores: 4.6344
Average of max per entry across top 5 scores: 4.72
Average of max per entry across top 8 scores: 4.72
Average of max per entry across top 9999 scores: 4.72
Bootstrapped 1 full traces after 2 examples in round 0.
Average Metric: 106.0 / 25 (424.0%)
Score: 424.0 for set: [16]
Scores so far: [426.0, 426.0, 430.0, 429.44, 422.0, 424.0]
Best score: 430.0
Average of max per entry across top 1 scores: 4.3
Average of max per entry across top 2 scores: 4.5344
Average of max per entry across top 3 scores: 4.6344
Average of max per entry across top 5 scores: 4.72
Average of max per entry across top 8 scores: 4.76
Average of max per entry across top 9999 scores: 4.76
Bootstrapped 1 full traces after 2 examples in round 0.
Average Metric: 106.5 / 25 (426.0%)
Score: 426.0 for set: [16]
Scores so far: [426.0, 426.0, 430.0, 429.44, 422.0, 424.0, 426.0]
Best score: 430.0
Average of max per entry across top 1 scores: 4.3
Average of max per entry across top 2 scores: 4.5344
Average of max per entry across top 3 scores: 4.6344
Average of max per entry across top 5 scores: 4.7
Average of max per entry across top 8 scores: 4.78
Average of max per entry across top 9999 scores: 4.78
Bootstrapped 1 full traces after 2 examples in round 0.
Average Metric: 106.5 / 25 (426.0%)
Score: 426.0 for set: [16]
Scores so far: [426.0, 426.0, 430.0, 429.44, 422.0, 424.0, 426.0, 426.0]
Best score: 430.0
Average of max per entry across top 1 scores: 4.3
Average of max per entry across top 2 scores: 4.5344
Average of max per entry across top 3 scores: 4.6344
Average of max per entry across top 5 scores: 4.7
Average of max per entry across top 8 scores: 4.78
Average of max per entry across top 9999 scores: 4.78
8 candidate programs found.
Average Metric: 47.5 / 10 (475.0%)
Compiled RAG Score: 475.0
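The "Average of max per entry across top K scores" lines can be read as an oracle-style ensemble estimate: take the K best candidate programs, keep the best rating any of them achieved on each evaluation example, and average those maxima. A sketch under that reading, with made-up per-example ratings (the data layout is an assumption):

```python
def avg_of_max_per_entry(candidates, k):
    """candidates: list of (overall_score, per_example_ratings) pairs."""
    top_k = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    # Group the ratings that the top-k programs gave to the same example.
    per_example = zip(*(ratings for _, ratings in top_k))
    num_examples = len(top_k[0][1])
    return sum(max(entry) for entry in per_example) / num_examples


# Hypothetical candidates: overall score plus ratings on four examples.
candidates = [
    (450.0, [5.0, 4.0, 4.0, 5.0]),
    (425.0, [4.0, 5.0, 4.0, 4.0]),
    (400.0, [4.0, 4.0, 4.0, 4.0]),
]
for k in (1, 2, 3):
    value = avg_of_max_per_entry(candidates, k)
    print(f"Average of max per entry across top {k} scores: {value}")
```

By construction the value can only stay flat or rise as K grows, which matches the pattern within each block of the logs above.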
BootstrapFewShotWithOptuna
Going to sample between 1 and 2 traces per predictor. Will attempt to train 2 candidate sets.
[I 2024-04-14 19:53:22,635] A new study created in memory with name: no-name-2b4e80fd-1155-4f77-8f2c-bb040d481e89
Bootstrapped 2 full traces after 3 examples in round 0.
[I 2024-04-14 19:53:31,878] Trial 0 finished with value: 426.0 and parameters: {'demo_index_for_generate_answer': 12}. Best is trial 0 with value: 426.0.
Average Metric: 106.5 / 25 (426.0%)
[I 2024-04-14 19:53:40,901] Trial 1 finished with value: 426.0 and parameters: {'demo_index_for_generate_answer': 8}. Best is trial 0 with value: 426.0.
Average Metric: 106.5 / 25 (426.0%)
Best score: 426.0
Best program: generate_answer = Predict(GenerateAnswer(context, question -> answer
instructions='Assess the the context and answer the question.'
context = Field(annotation=str required=True json_schema_extra={'desc': 'Helpful information for answering the question.', '__dspy_field_type': 'input', 'prefix': 'Context:'})
question = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Question:', 'desc': '${question}'})
answer = Field(annotation=str required=True json_schema_extra={'desc': 'A detailed answer that is supported by the context.', '__dspy_field_type': 'output', 'prefix': 'Answer:'})
))
Average Metric: 46.5 / 10 (465.0%)
Compiled RAG Score: 465.0
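The trial parameters in the log (`demo_index_for_generate_answer`) suggest that each trial picks one bootstrapped demo per predictor and is scored by the evaluation metric. The sketch below mimics that search space with Python's `random` module standing in for Optuna's sampler, so it is plain random sampling rather than Optuna itself. `score_fn` and the shape of `candidate_demos` are hypothetical.

```python
import random


def search_demo_indices(candidate_demos, score_fn, num_trials, seed=0):
    """Try num_trials random demo choices and return the best trial."""
    rng = random.Random(seed)
    trials = []
    best = None
    for number in range(num_trials):
        # One integer parameter per predictor: which demo to put in its prompt.
        params = {
            f"demo_index_for_{name}": rng.randrange(len(demos))
            for name, demos in candidate_demos.items()
        }
        value = score_fn(params)
        trials.append((number, params, value))
        if best is None or value > best[2]:  # ties keep the earlier trial
            best = trials[-1]
    return best, trials


# Hypothetical pool of 17 bootstrapped demos for a single predictor.
candidate_demos = {"generate_answer": [f"demo {i}" for i in range(17)]}
best, trials = search_demo_indices(candidate_demos, lambda params: 426.0,
                                   num_trials=2)
print(f"Best is trial {best[0]} with value: {best[2]}")
```

When every trial scores the same, as in the run above, the first trial remains the best one.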
COPRO
Iteration Depth: 1/3.
At Depth 1/3, Evaluating Prompt Candidate #1/5 for Predictor 1 of 1.
Average Metric: 12.5 / 3 (416.7%)
/Users/cshorten/Desktop/DSPy-local/myenv/lib/python3.10/site-packages/dspy_ai-2.4.1-py3.10.egg/dspy/evaluate/evaluate.py:266: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['4.0' '4.5' '4.0']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first. df.loc[:, metric_name] = df[metric_name].apply(
At Depth 1/3, Evaluating Prompt Candidate #2/5 for Predictor 1 of 1.
Average Metric: 12.0 / 3 (400.0%)
At Depth 1/3, Evaluating Prompt Candidate #3/5 for Predictor 1 of 1.
Average Metric: 12.0 / 3 (400.0%)
At Depth 1/3, Evaluating Prompt Candidate #4/5 for Predictor 1 of 1.
Average Metric: 13.0 / 3 (433.3%)
At Depth 1/3, Evaluating Prompt Candidate #5/5 for Predictor 1 of 1.
Average Metric: 12.5 / 3 (416.7%)
Iteration Depth: 2/3.
At Depth 2/3, Evaluating Prompt Candidate #1/5 for Predictor 1 of 1.
Average Metric: 12.0 / 3 (400.0%)
At Depth 2/3, Evaluating Prompt Candidate #2/5 for Predictor 1 of 1.
Average Metric: 12.75 / 3 (425.0%)
At Depth 2/3, Evaluating Prompt Candidate #3/5 for Predictor 1 of 1.
Average Metric: 13.0 / 3 (433.3%)
At Depth 2/3, Evaluating Prompt Candidate #4/5 for Predictor 1 of 1.
Average Metric: 12.0 / 3 (400.0%)
At Depth 2/3, Evaluating Prompt Candidate #5/5 for Predictor 1 of 1.
Average Metric: 12.0 / 3 (400.0%)
Iteration Depth: 3/3.
At Depth 3/3, Evaluating Prompt Candidate #1/5 for Predictor 1 of 1.
Average Metric: 12.75 / 3 (425.0%)
At Depth 3/3, Evaluating Prompt Candidate #2/5 for Predictor 1 of 1.
Average Metric: 14.5 / 3 (483.3%)
At Depth 3/3, Evaluating Prompt Candidate #3/5 for Predictor 1 of 1.
Average Metric: 13.0 / 3 (433.3%)
At Depth 3/3, Evaluating Prompt Candidate #4/5 for Predictor 1 of 1.
Average Metric: 12.5 / 3 (416.7%)
At Depth 3/3, Evaluating Prompt Candidate #5/5 for Predictor 1 of 1.
Average Metric: 12.0 / 3 (400.0%)
Average Metric: 44.5 / 10 (445.0%)
445.0
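The COPRO log above follows a nested pattern: 3 depths, each evaluating 5 instruction candidates for the single predictor. A minimal sketch of such a loop, assuming new candidates at each depth are proposed from the best instruction found so far. `propose` and `evaluate` are hypothetical callbacks standing in for the prompt model and the metric evaluation, and the real optimizer may condition its proposals differently.

```python
def copro_search(initial_instruction, propose, evaluate, breadth=5, depth=3):
    """Evaluate breadth candidates per depth and track the best instruction."""
    best_instruction, best_score = initial_instruction, float("-inf")
    evaluated = []
    for d in range(1, depth + 1):
        # Candidates for this depth come from the best instruction so far.
        candidates = propose(best_instruction, breadth)
        for i, instruction in enumerate(candidates, start=1):
            score = evaluate(instruction)
            evaluated.append((d, i, score))
            if score > best_score:
                best_instruction, best_score = instruction, score
    return best_instruction, best_score, evaluated


def toy_propose(instruction, breadth):
    # Stand-in for the prompt model: append 1..breadth exclamation marks.
    return [instruction + "!" * n for n in range(1, breadth + 1)]


def toy_evaluate(instruction):
    # Stand-in for the metric: longer instructions score higher.
    return len(instruction)


best_instruction, best_score, evaluated = copro_search(
    "Answer", toy_propose, toy_evaluate)
print(f"Evaluated {len(evaluated)} candidates; best score: {best_score}")
```

With breadth 5 and depth 3 the loop evaluates 15 candidates, the same count as the "Evaluating Prompt Candidate" lines in the log.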
Ref2Vec, short for reference-to-vector, is a Weaviate 1.16 module that enables vectorization of data objects while incorporating cross-references to other objects. Essentially, it generates a vector representation of the referencing object by computing the average, or centroid vector, of its cross-referenced vectors. This approach holds value for use cases like recommendations, re-ranking, and representing lengthy objects. Ref2Vec offers an efficient and powerful way to implement recommendation systems. It allows for real-time updates of user vectors, factoring in user preferences and actions. The process is computationally lightweight, ensuring swift and relevant results. In essence, it transforms Weaviate into a tool for recommendations with "user-as-query," enabling swift development of features akin to a Home Feed.
Proposed Instruction: Delve into the provided material, meticulously parsing through each detail and subtlety to gain a comprehensive understanding of the subject matter. Once you have thoroughly analyzed the context, proceed to answer the question with precision and coherence. Your response should succinctly encapsulate the relevant points and be directly responsive to the question's core. For queries that are inherently subjective or require extrapolation beyond the given information, articulate a reasoned argument, underpinned by evidence from the material and sound logic. In instances where the information at hand is incomplete or ambiguous, transparently communicate the limitations, while providing an educated estimation or hypothesis, noting any premises upon which your inference is based. Your goal is to furnish an enlightening and instructive answer that not only addresses the query but also advances the overall understanding of the topic at hand.
---
Follow the following format.
Context: Helpful information for answering the question.
Question: ${question}
Enlightened Answer: A detailed answer that is supported by the context.
---
Context:
[1] «---
title: What is Ref2Vec and why you need it for your recommendation system
slug: ref2vec-centroid
authors: [connor]
date: 2022-11-23
tags: ['integrations', 'concepts']
image: ./img/hero.png
description: "Weaviate introduces Ref2Vec, a new module that utilises Cross-References for Recommendation!"
---

<!-- truncate -->
Weaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects. ## What is Ref2Vec? The name Ref2Vec is short for reference-to-vector, and it offers the ability to vectorize a data object with its cross-references to other objects. The Ref2Vec module currently holds the name ref2vec-**centroid** because it uses the average, or centroid vector, of the cross-referenced vectors to represent the **referencing** object.»
[2] «As you have seen above, we think Ref2Vec can add value for use cases such as recommendations, re-ranking, overcoming the cold start problem and representing long objects. We are also excited to see what you build with Ref2Vec, and excited to build on this module with its future iterations. Speaking of which, we have another blog post coming soon on the development directions of Ref2Vec for the future. We will discuss topics such as **collaborative filtering**, **multiple centroids**, **graph neural networks**, and more on **re-ranking** with Ref2Vec. Stay tuned!
import WhatNext from '/_includes/what-next.mdx'
<WhatNext />»
[3] «In other words, the User vector is being updated in real-time here to take into account their preferences and actions, which helps to produce more relevant results at speed. Another benefit of Ref2Vec is that this calculation is not compute-heavy, leading to low overhead. With Ref2Vec, you can use Weaviate to provide Recommendation with "user-as-query". This is a very common and powerful way to build Home Feed style features in apps. This can be done by sending queries like this to Weaviate:
```graphql
{
Get {
Product (
nearObject: {
id: "8abc5-4d5..." # id for the User object with vector defined by ref2vec-centroid
}
) {
product_name
price
}
}
}
```
This short query encapsulates the power of Ref2Vec.»
Question: What is ref2vec?
Enlightened Answer:Ref2Vec, short for reference-to-vector, is a Weaviate 1.16 module that enables vectorization of data objects while incorporating cross-references to other objects. Essentially, it generates a vector representation of the referencing object by computing the average, or centroid vector, of its cross-referenced vectors. This approach holds value for use cases like recommendations, re-ranking, and representing lengthy objects.
Ref2Vec offers an efficient and powerful way to implement recommendation systems. It allows for real-time updates of user vectors, factoring in user preferences and actions. The process is computationally lightweight, ensuring swift and relevant results. In essence, it transforms Weaviate into a tool for recommendations with "user-as-query," enabling swift development of features akin to a Home Feed. (and 831 other completions)
Proposed Instruction: Delve into the provided material, meticulously parsing through each detail and subtlety to gain a comprehensive understanding of the subject matter. Once you have thoroughly analyzed the context, proceed to answer the question with precision and coherence. Your response should succinctly encapsulate the relevant points and be directly responsive to the question's core. For queries that are inherently subjective or require extrapolation beyond the given information, articulate a reasoned argument, underpinned by evidence from the material and sound logic. In instances where the information at hand is incomplete or ambiguous, transparently communicate the limitations, while providing an educated estimation or hypothesis, noting any premises upon which your inference is based. Your goal is to furnish an enlightening and instructive answer that not only addresses the query but also advances the overall understanding of the topic at hand.
---
Follow the following format.
Context: Helpful information for answering the question.
Question: ${question}
Enlightened Answer: A detailed answer that is supported by the context.
---
Context:
[1] Β«---
title: What is Ref2Vec and why you need it for your recommendation system
slug: ref2vec-centroid
authors: [connor]
date: 2022-11-23
tags: ['integrations', 'concepts']
image: ./img/hero.png
description: "Weaviate introduces Ref2Vec, a new module that utilises Cross-References for Recommendation!"
---

<!-- truncate -->
Weaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects. ## What is Ref2Vec? The name Ref2Vec is short for reference-to-vector, and it offers the ability to vectorize a data object with its cross-references to other objects. The Ref2Vec module currently holds the name ref2vec-**centroid** because it uses the average, or centroid vector, of the cross-referenced vectors to represent the **referencing** object.Β»
[2] Β«As you have seen above, we think Ref2Vec can add value for use cases such as recommendations, re-ranking, overcoming the cold start problem and representing long objects. We are also excited to see what you build with Ref2Vec, and excited to build on this module with its future iterations. Speaking of which, we have another blog post coming soon on the development directions of Ref2Vec for the future. We will discuss topics such as **collaborative filtering**, **multiple centroids**, **graph neural networks**, and more on **re-ranking** with Ref2Vec. Stay tuned!
import WhatNext from '/_includes/what-next.mdx'
<WhatNext />Β»
[3] Β«In other words, the User vector is being updated in real-time here to take into account their preferences and actions, which helps to produce more relevant results at speed. Another benefit of Ref2Vec is that this calculation is not compute-heavy, leading to low overhead. With Ref2Vec, you can use Weaviate to provide Recommendation with "user-as-query". This is a very common and powerful way to build Home Feed style features in apps. This can be done by sending queries like this to Weaviate:
```graphql
{
Get {
Product (
nearObject: {
id: "8abc5-4d5..." # id for the User object with vector defined by ref2vec-centroid
}
) {
product_name
price
}
}
}
```
This short query encapsulates the power of Ref2Vec.Β»
Question: What is ref2vec?
Enlightened Answer: Ref2Vec, short for reference-to-vector, is a Weaviate 1.16 module that enables vectorization of data objects while incorporating cross-references to other objects. Essentially, it generates a vector representation of the referencing object by computing the average, or centroid vector, of its cross-referenced vectors. This approach holds value for use cases like recommendations, re-ranking, and representing lengthy objects.
Ref2Vec offers an efficient and powerful way to implement recommendation systems. It allows for real-time updates of user vectors, factoring in user preferences and actions. The process is computationally lightweight, ensuring swift and relevant results. In essence, it transforms Weaviate into a tool for recommendations with "user-as-query," enabling swift development of features akin to a Home Feed. (and 831 other completions)
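The answer above describes Ref2Vec as representing the referencing object by the average (centroid) of its cross-referenced vectors. A minimal sketch of that arithmetic in plain Python follows. It is only an illustration of the idea, not Weaviate's implementation; the `centroid` helper and the three-dimensional product vectors are hypothetical.

```python
from typing import Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> list[float]:
    """Element-wise mean of equally sized vectors (illustrative helper)."""
    if not vectors:
        raise ValueError("need at least one cross-referenced vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimensionality")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical embeddings of three products a user has interacted with.
liked_products = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
]

# The "user-as-query" vector is the mean of the referenced product vectors.
user_vector = centroid(liked_products)
print(user_vector)
```

Adding or removing a cross-reference only changes the inputs to this mean, which is consistent with the passage's point that the calculation is not compute-heavy.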
Typed COPRO
[Work in Progress]
from dspy.evaluate.evaluate import Evaluate
evaluator_for_TypedCOPRO = Evaluate(metric=MetricWrapper, devset=devset, num_threads=4, display_progress=False)
from dspy.teleprompt import optimize_signature
TypedCOPRO_compiled_RAG = optimize_signature(
    student=RAG(),
    evaluator=evaluator_for_TypedCOPRO,
    n_iterations=10,
    sorted_order="increasing",
    strategy="best",
    max_examples=20,
    prompt_model=command_r,
    initial_prompts=2,
    verbose=False,
)
eval_score = evaluate(TypedCOPRO_compiled_RAG, metric=MetricWrapper)
print(eval_score)
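One caveat for the work-in-progress cell above, stated as an assumption and not as confirmed behavior: in some DSPy versions `optimize_signature` appears to return a result object that wraps the optimized program (for example under a `program` attribute) instead of returning the program itself. A small defensive helper can cover both cases. The name `unwrap_program` is hypothetical, and the stand-in objects below only mimic the shape of such a result.

```python
from types import SimpleNamespace


def unwrap_program(result):
    """Return result.program when that attribute exists, else result itself.

    Assumption: the optimizer may hand back either a bare program or a
    wrapper object exposing the program as `.program`.
    """
    return getattr(result, "program", result)


# Stand-ins for the two possible return shapes (not real DSPy objects).
wrapped = SimpleNamespace(program="optimized-RAG", scores=[3.1, 4.3])
bare = "already-a-program"

print(unwrap_program(wrapped))
print(unwrap_program(bare))
```

With such a helper, the evaluation call could be written as `evaluate(unwrap_program(TypedCOPRO_compiled_RAG), metric=MetricWrapper)`, which behaves the same whichever shape the installed version returns.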
MIPRO
Start index: 0. DatasetDescriptor output: Observations: 1. The dataset contains a question-and-answer format, where each sample includes a specific question followed by a detailed answer that provides information or explains a concept related to technology, algorithms, or datasets. 2. The content is technical and seems to be focused on topics related to computer science, search algorithms, data compression, and machine learning libraries or tools. 3. The syntax is formal and precise, with a focus on clarity and accuracy. The language used is indicative of a knowledgeable source, possibly aimed at an audience familiar with the subject matter. 4. The answers are concise yet informative, aiming to provide a comprehensive explanation without unnecessary elaboration. They are structured to directly address the question posed. 5. There is a consistent use of technical terms and jargon, such as "Binary Independence Model," "Inverse Document Frequency," "CRUD support," and "semantic search," which suggests that the dataset is intended for users with a certain level of expertise or familiarity with these concepts. 6. The dataset may enable tasks such as training a machine learning model for a question-answering system, particularly one that specializes in technical domains or for use in educational tools that assist with learning about specific technical topics. 7. The presence of specific product names (e.g., Weaviate, LangChain, `text2vec-openai`) and the context provided suggests that the dataset could also be used for training models to support customer service or product documentation automation in the tech industry. 8. The answers sometimes reference the process of using the tools or concepts mentioned, which could indicate that the dataset may be used to train models that assist users in troubleshooting or implementing these technologies. 
Based on these observations, the dataset is likely designed to train AI models for technical question-answering systems, possibly for use in educational platforms, customer support for tech products, or as a knowledge base for developers and researchers in the field of computer science and data analytics. Summarizer output: The dataset is a technical Q&A collection focused on computer science topics, with formal and precise language aimed at an audience with some expertise. It appears to be designed for training AI models for specialized question-answering systems, potentially serving educational platforms, tech product support, or as a knowledge base for professionals in the field. The content is structured, concise, and rich with technical terminology, indicating its utility in advanced machine learning applications. Start index: 5. DatasetDescriptor output: Observations: 1. The dataset contains pairs of questions and answers, suggesting a format conducive to supervised learning, particularly for natural language processing tasks such as question answering or chatbot training. 2. The content is domain-specific, focusing on technology and software, which implies that the dataset is intended for applications requiring a certain level of technical understanding. 3. The answers are detailed and provide explanations or definitions, indicating that the dataset could be used to train models that need to generate informative and contextually relevant responses rather than simple factual replies. 4. The language used is formal and contains industry-specific jargon, which could mean that the dataset is less suitable for general-purpose language models and more for specialized applications. 5. The questions are direct and to the point, while the answers are comprehensive, suggesting that conciseness in questioning and thoroughness in response are characteristics of the dataset. 6. 
The dataset includes references to specific technologies, such as Weaviate and `all-miniLM-L6-v2`, which could be indicative of a focus on recent and emerging technologies, possibly for keeping an AI model updated with current trends. 7. The mention of practical applications, such as handling large-scale performance and providing real-time user-based recommendations, suggests that the dataset may be used to train models for real-world problem-solving in a business or production environment. Based on these observations, the dataset seems to be tailored for training specialized AI models that can understand and respond to technical queries in a professional context, such as customer support for tech companies, AI assistants for developers, or educational tools for computer science students. The emphasis on detailed, informative answers also indicates potential use in knowledge extraction and automated documentation systems. Summarizer output: The dataset is designed for supervised learning in natural language processing, specifically tailored to technology and software domains, and is suitable for training models to provide detailed, contextually relevant responses. It features formal language with industry jargon, direct questions, and comprehensive answers, indicating its suitability for specialized applications such as tech support chatbots or educational tools for technical subjects. The focus on current technologies and practical applications suggests its use in training AI for real-world tech industry problem-solving. Start index: 10. DatasetDescriptor output: Observations: - The dataset contains a question-and-answer format, which is consistent across all examples. This indicates that the dataset is likely intended for training models on question-answering tasks. - The questions are specific and technical, requiring detailed and precise answers, which suggests that the dataset is curated for a specialized domain rather than general knowledge. 
- The answers are concise yet informative, providing just enough detail to accurately address the question without unnecessary elaboration. This conciseness is important for models that aim to generate direct and to-the-point responses. - The content of the dataset includes references to specific technologies, programming concepts, and software tools, which implies that the dataset is tailored for an audience with a certain level of technical expertise. - The syntax used in the examples is indicative of programming and technical language, which could mean that the dataset is also useful for training models to understand and generate code-related text. - The dataset seems to cover a variety of topics within the tech domain, including search algorithms, programming syntax, AI models, real estate features, and AI methodologies, suggesting a broad coverage within the niche of technology. - The answers often include lists or enumerated points, which could help in training models to structure information in a logical and organized manner. Given the prior observations and the additional points noted, the dataset appears comprehensive for the intended task of training specialized AI models for technical question-answering systems. The observations cover topics, content, syntax, and conciseness, which are key aspects of the data. Therefore, I would say 'COMPLETE'. Summarizer output: The dataset is designed for training specialized AI models in technical question-answering tasks, with a focus on a variety of topics within the technology domain. It features a consistent question-and-answer format with precise, concise answers and technical language, indicating its suitability for an audience with technical expertise. Start index: 15. DatasetDescriptor output: Observations: 1. 
The dataset contains a structured format where each data point is a pair of a question and its corresponding answer, suggesting that it is likely intended for training conversational agents or question-answering systems. 2. The content is technical and domain-specific, focusing on the usage and features of a particular software product called Weaviate. This indicates that the dataset is tailored for a narrow AI application, possibly a chatbot or helpdesk AI that assists users with Weaviate-related queries. 3. The answers are not only factually informative but also include actionable instructions, such as command-line examples, which implies that the dataset could be used to train models that provide executable solutions rather than just descriptive information. 4. The syntax used in the dataset is formal and includes technical jargon, command-line code, and references to specific software features, which suggests that the model trained on this data would need to understand and generate domain-specific language. 5. The dataset shows a level of conciseness in the answers, aiming to provide the most relevant information without unnecessary elaboration. This brevity is important for user interfaces where quick and accurate responses are valued. 6. There is an educational aspect to the dataset, as it seems to be designed not only to answer questions but also to instruct users on how to perform certain tasks, which could indicate its use in an interactive learning environment or self-service portal. Given the prior observations and the additional points noted, the dataset appears comprehensive for the intended purpose of training a technical question-answering system for users of the Weaviate software. The observations cover topics, content, syntax, and conciseness, which are key aspects of the data. Therefore, unless there are other dimensions of the data not represented in the examples provided, I would say the observations are "COMPLETE". 
Summarizer output: The dataset is structured for training AI in technical support, specifically for the Weaviate software, with a focus on providing concise, actionable answers that include command-line instructions. It is designed to be used in a narrow AI application such as a chatbot or helpdesk AI, with an educational component to instruct users. The formal syntax and technical jargon indicate that the trained model will need to understand and generate domain-specific language. Start index: 20. DatasetDescriptor output: Observations: - The dataset consistently follows a question-and-answer format, with each question being directly related to technical aspects of Weaviate or related technologies. - Answers are structured in a step-by-step manner, often including specific commands, settings, or procedures to follow, which suggests a focus on practical, actionable guidance. - The language used is technical and assumes a certain level of prior knowledge about Weaviate, Kubernetes, and other related technologies, indicating that the target audience is likely to be users with some technical expertise. - There is an emphasis on troubleshooting and optimization, as seen in the discussions about metrics, memory usage, and performance, which implies that the dataset may be used to train AI to assist with system maintenance and performance tuning. - The dataset includes explanations of why certain steps or configurations are necessary, providing context to the instructions, which could be beneficial for educational purposes or to enhance user understanding. - The content is concise and to the point, avoiding unnecessary elaboration, which is suitable for users looking for quick solutions to their technical issues. Given the prior observations and the additional points noted, the dataset seems comprehensive for the intended purpose of training an AI model for technical support and assistance with Weaviate. 
The model trained on this dataset would likely be capable of providing specific, detailed technical support and could potentially serve as an educational tool for users looking to deepen their understanding of Weaviate and related technologies. If there are no significant deviations from the patterns observed in the provided examples throughout the rest of the dataset, I would say the observations are COMPLETE. Summarizer output: The dataset is tailored for an audience with technical expertise in Weaviate and related technologies, focusing on providing practical, step-by-step troubleshooting and optimization guidance. It is designed to train AI for technical support, offering concise, actionable instructions and contextual explanations, which could also serve educational purposes for users seeking to enhance their understanding of these technologies.
WARNING: Projected Language Model (LM) Calls Please be advised that based on the parameters you have set, the maximum number of LM calls is projected as follows: - Task Model: 25 examples in dev set * 3 trials * # of LM calls in your program = (75 * # of LM calls in your program) task model calls - Prompt Model: # data summarizer calls (max 10) + 10 * 1 lm calls in program = 20 prompt model calls Estimated Cost Calculation: Total Cost = (Number of calls to task model * (Avg Input Token Length per Call * Task Model Price per Input Token + Avg Output Token Length per Call * Task Model Price per Output Token) + (Number of calls to prompt model * (Avg Input Token Length per Call * Task Prompt Price per Input Token + Avg Output Token Length per Call * Prompt Model Price per Output Token). For a preliminary estimate of potential costs, we recommend you perform your own calculations based on the task and prompt models you intend to use. If the projected costs exceed your budget or expectations, you may consider: - Reducing the number of trials (`num_trials`), the size of the trainset, or the number of LM calls in your program. - Using a cheaper task model to optimize the prompt. To proceed with the execution of this program, please confirm by typing 'y' for yes or 'n' for no. If you would like to bypass this confirmation step in future executions, set the `requires_permission_to_run` flag to `False`. Awaiting your input... Do you wish to continue? (y/n): y
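The warning above gives a formula for projected LM calls and cost but leaves the arithmetic to the reader. The sketch below reproduces the two call counts from the log (75 task-model calls, 20 prompt-model calls) and plugs them into the stated cost formula. The token lengths and per-token prices are placeholders chosen only for illustration, and reading the `10` in `10 * 1` as the number of instruction candidates is my interpretation of the message; none of the names below come from DSPy.

```python
from dataclasses import dataclass


@dataclass
class ModelPricing:
    """Per-token prices in dollars (placeholder values, not real rates)."""
    price_per_input_token: float
    price_per_output_token: float


def projected_calls(dev_size, num_trials, lm_calls_in_program,
                    num_candidates, max_summarizer_calls=10):
    """Upper bounds on task-model and prompt-model calls, per the warning."""
    task_calls = dev_size * num_trials * lm_calls_in_program
    prompt_calls = max_summarizer_calls + num_candidates * lm_calls_in_program
    return task_calls, prompt_calls


def estimated_cost(calls, avg_input_tokens, avg_output_tokens, pricing):
    """calls * (input tokens * input price + output tokens * output price)."""
    per_call = (avg_input_tokens * pricing.price_per_input_token
                + avg_output_tokens * pricing.price_per_output_token)
    return calls * per_call


# Values taken from the warning: 25 dev examples, 3 trials, 1 LM call per run.
task_calls, prompt_calls = projected_calls(
    dev_size=25, num_trials=3, lm_calls_in_program=1, num_candidates=10
)
print(task_calls, prompt_calls)

# Hypothetical average token lengths and prices, for illustration only.
task_pricing = ModelPricing(price_per_input_token=1e-6, price_per_output_token=2e-6)
task_cost = estimated_cost(task_calls, avg_input_tokens=2000,
                           avg_output_tokens=300, pricing=task_pricing)
print(f"projected task-model cost: ${task_cost:.3f}")
```

The prompt-model term is computed the same way with its own call count and prices; the total is the sum of the two terms.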
0%| | 0/25 [00:00<?, ?it/s] 4%|██ | 1/25 [00:02<00:52, 2.18s/it]
Bootstrapped 1 full traces after 2 examples in round 0.
0%| | 0/25 [00:00<?, ?it/s] 4%|██ | 1/25 [00:01<00:24, 1.01s/it]
Bootstrapped 1 full traces after 2 examples in round 0.
0%| | 0/25 [00:00<?, ?it/s] 4%|██ | 1/25 [00:02<00:53, 2.23s/it]
Bootstrapped 1 full traces after 2 examples in round 0.
0%| | 0/25 [00:00<?, ?it/s] 4%|██ | 1/25 [00:02<01:08, 2.85s/it]
Bootstrapped 1 full traces after 2 examples in round 0.
0%| | 0/25 [00:00<?, ?it/s] 4%|██ | 1/25 [00:00<00:22, 1.07it/s]
Bootstrapped 1 full traces after 2 examples in round 0.
0%| | 0/25 [00:00<?, ?it/s] 4%|██ | 1/25 [00:01<00:41, 1.73s/it]
Bootstrapped 1 full traces after 2 examples in round 0.
0%| | 0/25 [00:00<?, ?it/s] 4%|██ | 1/25 [00:01<00:46, 1.95s/it]
Bootstrapped 1 full traces after 2 examples in round 0.
0%| | 0/25 [00:00<?, ?it/s] 4%|██ | 1/25 [00:01<00:34, 1.44s/it]
Bootstrapped 1 full traces after 2 examples in round 0.
0%| | 0/25 [00:00<?, ?it/s] 4%|██ | 1/25 [00:01<00:31, 1.30s/it]
Bootstrapped 1 full traces after 2 examples in round 0.
[I 2024-04-01 23:09:02,814] A new study created in memory with name: no-name-0544ab56-44df-45a8-85a0-896ca66f8887
Starting trial #0
Average Metric: 78.5 / 25 (314.0): 100%|███████| 25/25 [01:28<00:00, 3.55s/it]
[I 2024-04-01 23:10:31,498] Trial 0 finished with value: 314.0 and parameters: {'11300723232_predictor_instruction': 1, '11300723232_predictor_demos': 1}. Best is trial 0 with value: 314.0.
Average Metric: 78.5 / 25 (314.0%)
Starting trial #1
Average Metric: 108.0 / 25 (432.0): 100%|██████| 25/25 [01:18<00:00, 3.15s/it]
[I 2024-04-01 23:11:50,187] Trial 1 finished with value: 432.0 and parameters: {'11300723232_predictor_instruction': 5, '11300723232_predictor_demos': 4}. Best is trial 1 with value: 432.0.
Average Metric: 108.0 / 25 (432.0%)
Starting trial #2
Average Metric: 104.5 / 25 (418.0): 100%|██████| 25/25 [01:04<00:00, 2.58s/it]
[I 2024-04-01 23:12:54,594] Trial 2 finished with value: 418.0 and parameters: {'11300723232_predictor_instruction': 3, '11300723232_predictor_demos': 0}. Best is trial 1 with value: 432.0.
Average Metric: 104.5 / 25 (418.0%)
Returning generate_answer = Predict(StringSignature(context, question -> answer
instructions='Proposed Instruction: \nCarefully read the provided context, which may contain technical information, links, and code snippets related to advanced technology and machine learning topics. Your task is to interpret this information and provide a clear, concise, and accurate answer to the question posed. Use domain-specific language appropriate for an audience knowledgeable in computational linguistics or AI technologies. Ensure that your answer is context-rich and directly addresses the question, citing any specific resources or guides mentioned in the context when relevant.'
context = Field(annotation=str required=True json_schema_extra={'desc': 'Helpful information for answering the question.', '__dspy_field_type': 'input', 'prefix': 'Context:'})
question = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Question:', 'desc': '${question}'})
answer = Field(annotation=str required=True json_schema_extra={'desc': 'A detailed answer that is supported by the context.', '__dspy_field_type': 'output', 'prefix': 'Answer:'})
))
trial_logs[0][program].generate_answer = Predict(StringSignature(context, question -> answer
instructions='Proposed Instruction: Carefully read the provided context, which includes technical details about search ranking mechanisms and the fusion of results from BM25 and Dense search using reciprocal ranks. Your task is to analyze the information and apply the Reciprocal Rank Fusion (RRF) method as described to determine the combined rankings of the documents. Ensure your answer is precise, uses the correct technical terminology, and is structured in a way that would be clear to an individual with knowledge in computational linguistics or AI technologies. Provide a step-by-step explanation of the ranking fusion process and the final order of documents A, B, and C based on the combined rankings.'
context = Field(annotation=str required=True json_schema_extra={'desc': 'Helpful information for answering the question.', '__dspy_field_type': 'input', 'prefix': 'Context:'})
question = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Question:', 'desc': '${question}'})
answer = Field(annotation=str required=True json_schema_extra={'desc': 'A detailed answer that is supported by the context.', '__dspy_field_type': 'output', 'prefix': 'Answer with Explanation:'})
))
trial_logs[1][program].generate_answer = Predict(StringSignature(context, question -> answer
instructions='Proposed Instruction: \nCarefully read the provided context, which may contain technical information, links, and code snippets related to advanced technology and machine learning topics. Your task is to interpret this information and provide a clear, concise, and accurate answer to the question posed. Use domain-specific language appropriate for an audience knowledgeable in computational linguistics or AI technologies. Ensure that your answer is context-rich and directly addresses the question, citing any specific resources or guides mentioned in the context when relevant.'
context = Field(annotation=str required=True json_schema_extra={'desc': 'Helpful information for answering the question.', '__dspy_field_type': 'input', 'prefix': 'Context:'})
question = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Question:', 'desc': '${question}'})
answer = Field(annotation=str required=True json_schema_extra={'desc': 'A detailed answer that is supported by the context.', '__dspy_field_type': 'output', 'prefix': 'Answer:'})
))
trial_logs[2][program].generate_answer = Predict(StringSignature(context, question -> answer
instructions='Proposed Instruction: Carefully read the provided context, which includes technical details and descriptions related to advanced technology and machine learning topics. Your task is to synthesize the information and craft a precise, context-rich answer to the question posed. Ensure that your response is informative and uses domain-specific language appropriate for an audience knowledgeable in computational linguistics or AI technologies. If the exact answer is not directly stated in the context, use relevant information to provide a comprehensive explanation that addresses the question as thoroughly as possible. Highlight key features or functionalities when relevant to the question, and maintain a professional and authoritative tone throughout your response.'
context = Field(annotation=str required=True json_schema_extra={'desc': 'Helpful information for answering the question.', '__dspy_field_type': 'input', 'prefix': 'Context:'})
question = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Question:', 'desc': '${question}'})
answer = Field(annotation=str required=True json_schema_extra={'desc': 'A detailed answer that is supported by the context.', '__dspy_field_type': 'output', 'prefix': 'Expert Response:'})
)) from continue_program
Average Metric: 43.5 / 10 (435.0%)
435.0
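The scores above read as percentages well over 100 because the metric returns a rating (up to 5.0 in these logs) instead of a 0/1 match, and the evaluator reports the summed score divided by the number of examples, times 100. A short sketch of that arithmetic, using only numbers that appear in the logs, is below. The helper names are hypothetical, and the rounding is my assumption about how the figure is formatted.

```python
def mean_rating(total_score: float, num_examples: int) -> float:
    """Average rating per example, on the metric's own scale."""
    return total_score / num_examples


def as_reported_percent(total_score: float, num_examples: int) -> float:
    """The figure as printed in the logs: 100 * total / count."""
    return round(100 * total_score / num_examples, 1)


# Final evaluation above: 43.5 summed rating over 10 examples.
print(as_reported_percent(43.5, 10), mean_rating(43.5, 10))

# Trial values as logged by Optuna; dividing by 100 recovers the mean rating.
trial_values = {0: 314.0, 1: 432.0, 2: 418.0}
best_trial = max(trial_values, key=trial_values.get)
print(best_trial, trial_values[best_trial] / 100)
```

Read this way, the three MIPRO trials averaged 3.14, 4.32 and 4.18 on the development set, and the returned program averaged 4.35 on the 10-example evaluation.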
Ref2Vec, short for reference-to-vector, is a Weaviate module that enables vectorization of a data object by incorporating cross-references to other objects. It utilizes the centroid vector, also known as the average of cross-referenced vectors, to represent the referencing object. Ref2Vec is valuable for recommendation systems, re-ranking, and representing long objects. It allows for efficient real-time updates, making it suitable for building Home Feed features. The module's latest iteration is called ref2vec-centroid, and Weaviate's blog has more details on its development directions.
Proposed Instruction:
Carefully read the provided context, which may contain technical information, links, and code snippets related to advanced technology and machine learning topics. Your task is to interpret this information and provide a clear, concise, and accurate answer to the question posed. Use domain-specific language appropriate for an audience knowledgeable in computational linguistics or AI technologies. Ensure that your answer is context-rich and directly addresses the question, citing any specific resources or guides mentioned in the context when relevant.
---
Follow the following format.
Context: Helpful information for answering the question.
Question: ${question}
Answer: A detailed answer that is supported by the context.
---
Context:
[1] «For more information about this new feature, read this [blog post](/blog/ref2vec-centroid) by Connor Shorten. Weaviate also has a bunch of example use cases on [GitHub](https://github.com/weaviate/weaviate-examples). Find your favorite example, give it a star, and try to recreate it yourself!
## Feature Comparison - Library versus Database
The table below summarizes the differences between vector libraries and databases. This is by no means an exhaustive list of features, and not every library or database has the same features.
|**Feature**|**Vector Library**|**Vector Database** (Weaviate as an example)|
| ----- | ----- | ----- |
| Filtering (in combination with Vector Search)| No| Yes|
| Updatability (CRUD)| No (some do, e.g. hnswlib)| Yes|
| Incremental importing, concurrent reading while importing| No (some do, e.g. hnswlib) | Yes |
| Stores objects and vectors | No | Yes|
|Speed | Typically faster than full-blown database| Typically slower than pure library|
| Performance optimized for | In-memory similarity search| End2end-callstack, including: vector search, object retrieval from persistent storage, optional inverted index filtering, network requests, etc.»
[2] «This is why we have **Vector Databases** and **Vector Libraries**. They both use the Approximate Nearest Neighbor (ANN) algorithm to search through vectors in a tiny fraction of the time. You can learn more about this topic from ["Why Vectors Search is so Fast."](/blog/why-is-vector-search-so-fast)
## The Big Question
So, if both vector databases and vector libraries allow you to efficiently search through your vectors. What are the key differences between them, and why/when should you choose one over the other?
## Vector Libraries
Vector libraries store vector embeddings in in-memory indexes, in order to perform similarity search.»
[3] «Thus, an advantage of AI-native vector databases over vector-capable databases is their efficiency in vector search due to vector indexing.
### Vector Database vs. Vector Indexing Library
Similarly to vector databases, [vector libraries](https://weaviate.io/blog/vector-library-vs-vector-database) also enable fast vector search. However, vector libraries only store vector embeddings of data objects, and they store them in in-memory indexes. This results in two key differences:
1.»
Question: What are the differences between vector libraries and vector databases like Weaviate in terms of features such as filtering, updatability, and performance?
Answer: Here are the key differences between vector libraries and vector databases such as Weaviate:
- Filtering: Vector databases support filtering, while vector libraries do not.
- Updatability: Vector libraries typically do not support CRUD operations, whereas vector databases like Weaviate do.
- Importing: Vector databases allow incremental importing and concurrent reading while importing, which is not a common feature in vector libraries.
- Data Storage: Vector databases store objects and vectors, while vector libraries only store vectors.
- Performance: Vector libraries are generally faster at in-memory similarity searches. Vector databases are slower due to the additional operations they perform, such as object retrieval from persistent storage.
- Use Case: Vector libraries are optimized for similarity search, whereas vector databases are designed to handle the entire end2end callstack.
Overall, the choice between a vector library and a vector database depends on your specific use case and requirements. Vector libraries excel at fast in-memory searches but lack the functionality to handle more complex operations. Vector databases, like Weaviate, offer a more comprehensive solution at the cost of slightly reduced search speed.
---
Context:
[1] «---
title: What is Ref2Vec and why you need it for your recommendation system
slug: ref2vec-centroid
authors: [connor]
date: 2022-11-23
tags: ['integrations', 'concepts']
image: ./img/hero.png
description: "Weaviate introduces Ref2Vec, a new module that utilises Cross-References for Recommendation!"
---

<!-- truncate -->
Weaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects. ## What is Ref2Vec? The name Ref2Vec is short for reference-to-vector, and it offers the ability to vectorize a data object with its cross-references to other objects. The Ref2Vec module currently holds the name ref2vec-**centroid** because it uses the average, or centroid vector, of the cross-referenced vectors to represent the **referencing** object.»
[2] «As you have seen above, we think Ref2Vec can add value for use cases such as recommendations, re-ranking, overcoming the cold start problem and representing long objects. We are also excited to see what you build with Ref2Vec, and excited to build on this module with its future iterations. Speaking of which, we have another blog post coming soon on the development directions of Ref2Vec for the future. We will discuss topics such as **collaborative filtering**, **multiple centroids**, **graph neural networks**, and more on **re-ranking** with Ref2Vec. Stay tuned!
import WhatNext from '/_includes/what-next.mdx'
<WhatNext />»
[3] «In other words, the User vector is being updated in real-time here to take into account their preferences and actions, which helps to produce more relevant results at speed. Another benefit of Ref2Vec is that this calculation is not compute-heavy, leading to low overhead. With Ref2Vec, you can use Weaviate to provide Recommendation with "user-as-query". This is a very common and powerful way to build Home Feed style features in apps. This can be done by sending queries like this to Weaviate:
```graphql
{
  Get {
    Product (
      nearObject: {
        id: "8abc5-4d5..." # id for the User object with vector defined by ref2vec-centroid
      }
    ) {
      product_name
      price
    }
  }
}
```
This short query encapsulates the power of Ref2Vec.»
Question: What is ref2vec?
Answer:Ref2Vec, short for reference-to-vector, is a Weaviate module that enables vectorization of a data object by incorporating cross-references to other objects. It utilizes the centroid vector, also known as the average of cross-referenced vectors, to represent the referencing object. Ref2Vec is valuable for recommendation systems, re-ranking, and representing long objects. It allows for efficient real-time updates, making it suitable for building Home Feed features. The module's latest iteration is called ref2vec-centroid, and Weaviate's blog has more details on its development directions. (and 592 other completions)
Proposed Instruction:
Carefully read the provided context, which may contain technical information, links, and code snippets related to advanced technology and machine learning topics. Your task is to interpret this information and provide a clear, concise, and accurate answer to the question posed. Use domain-specific language appropriate for an audience knowledgeable in computational linguistics or AI technologies. Ensure that your answer is context-rich and directly addresses the question, citing any specific resources or guides mentioned in the context when relevant.
---
Follow the following format.
Context: Helpful information for answering the question.
Question: ${question}
Answer: A detailed answer that is supported by the context.
---
Context:
[1] «For more information about this new feature, read this [blog post](/blog/ref2vec-centroid) by Connor Shorten. Weaviate also has a bunch of example use cases on [GitHub](https://github.com/weaviate/weaviate-examples). Find your favorite example, give it a star, and try to recreate it yourself!
## Feature Comparison - Library versus Database
The table below summarizes the differences between vector libraries and databases. This is by no means an exhaustive list of features, and not every library or database has the same features. |**Feature**|**Vector Library**|**Vector Database** (Weaviate as an example)|
| ----- | ----- | ----- |
| Filtering (in combination with Vector Search)| No| Yes|
| Updatability (CRUD)| No (some do, e.g. hnswlib)| Yes|
| Incremental importing, concurrent reading while importing| No (some do, e.g. hnswlib) | Yes |
| Stores objects and vectors | No | Yes|
|Speed | Typically faster than full-blown database| Typically slower than pure library|
| Performance optimized for | In-memory similarity search| End2end-callstack, including: vector search, object retrieval from persistent storage, optional inverted index filtering, network requests, etc.»
[2] «This is why we have **Vector Databases** and **Vector Libraries**. They both use the Approximate Nearest Neighbor (ANN) algorithm to search through vectors in a tiny fraction of the time. You can learn more about this topic from ["Why Vectors Search is so Fast."](/blog/why-is-vector-search-so-fast)
## The Big Question
So, if both vector databases and vector libraries allow you to efficiently search through your vectors. What are the key differences between them, and why/when should you choose one over the other? ## Vector Libraries
Vector libraries store vector embeddings in in-memory indexes, in order to perform similarity search.»
[3] «Thus, an advantage of AI-native vector databases over vector-capable databases is their efficiency in vector search due to vector indexing. ### Vector Database vs. Vector Indexing Library
Similarly to vector databases, [vector libraries](https://weaviate.io/blog/vector-library-vs-vector-database) also enable fast vector search. However, vector libraries only store vector embeddings of data objects, and they store them in in-memory indexes. This results in two key differences:
1.»
Question: What are the differences between vector libraries and vector databases like Weaviate in terms of features such as filtering, updatability, and performance?
Answer: Here are the key differences between vector libraries and vector databases such as Weaviate: - Filtering: Vector databases support filtering, while vector libraries do not. - Updatability: Vector libraries typically do not support CRUD operations, whereas vector databases like Weaviate do. - Importing: Vector databases allow incremental importing and concurrent reading while importing, which is not a common feature in vector libraries. - Data Storage: Vector databases store objects and vectors, while vector libraries only store vectors. - Performance: Vector libraries are generally faster at in-memory similarity searches. Vector databases are slower due to the additional operations they perform, such as object retrieval from persistent storage. - Use Case: Vector libraries are optimized for similarity search, whereas vector databases are designed to handle the entire end2end callstack. Overall, the choice between a vector library and a vector database depends on your specific use case and requirements. Vector libraries excel at fast in-memory searches but lack the functionality to handle more complex operations. Vector databases, like Weaviate, offer a more comprehensive solution at the cost of slightly reduced search speed.
---
Context:
[1] «---
title: What is Ref2Vec and why you need it for your recommendation system
slug: ref2vec-centroid
authors: [connor]
date: 2022-11-23
tags: ['integrations', 'concepts']
image: ./img/hero.png
description: "Weaviate introduces Ref2Vec, a new module that utilises Cross-References for Recommendation!"
---

<!-- truncate -->
Weaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects. ## What is Ref2Vec? The name Ref2Vec is short for reference-to-vector, and it offers the ability to vectorize a data object with its cross-references to other objects. The Ref2Vec module currently holds the name ref2vec-**centroid** because it uses the average, or centroid vector, of the cross-referenced vectors to represent the **referencing** object.»
[2] «As you have seen above, we think Ref2Vec can add value for use cases such as recommendations, re-ranking, overcoming the cold start problem and representing long objects. We are also excited to see what you build with Ref2Vec, and excited to build on this module with its future iterations. Speaking of which, we have another blog post coming soon on the development directions of Ref2Vec for the future. We will discuss topics such as **collaborative filtering**, **multiple centroids**, **graph neural networks**, and more on **re-ranking** with Ref2Vec. Stay tuned!
import WhatNext from '/_includes/what-next.mdx'
<WhatNext />»
[3] «In other words, the User vector is being updated in real-time here to take into account their preferences and actions, which helps to produce more relevant results at speed. Another benefit of Ref2Vec is that this calculation is not compute-heavy, leading to low overhead. With Ref2Vec, you can use Weaviate to provide Recommendation with "user-as-query". This is a very common and powerful way to build Home Feed style features in apps. This can be done by sending queries like this to Weaviate:
```graphql
{
  Get {
    Product (
      nearObject: {
        id: "8abc5-4d5..." # id for the User object with vector defined by ref2vec-centroid
      }
    ) {
      product_name
      price
    }
  }
}
```
This short query encapsulates the power of Ref2Vec.»
Question: What is ref2vec?
Answer:Ref2Vec, short for reference-to-vector, is a Weaviate module that enables vectorization of a data object by incorporating cross-references to other objects. It utilizes the centroid vector, also known as the average of cross-referenced vectors, to represent the referencing object. Ref2Vec is valuable for recommendation systems, re-ranking, and representing long objects. It allows for efficient real-time updates, making it suitable for building Home Feed features. The module's latest iteration is called ref2vec-centroid, and Weaviate's blog has more details on its development directions. (and 592 other completions)