Question Answering Using Embeddings
Question answering using embeddings-based search
GPT excels at answering questions, but only on topics it remembers from its training data.
What should you do if you want GPT to answer questions about unfamiliar topics? E.g.,
- Recent events after October 2023 for GPT 4 series models
- Your non-public documents
- Information from past conversations
- etc.
This notebook demonstrates a two-step Search-Ask method for enabling GPT to answer questions using a library of reference text.
- Search: search your library of text for relevant text sections
- Ask: insert the retrieved text sections into a message to GPT and ask it the question
Why search is better than fine-tuning
GPT can learn knowledge in two ways:
- Via model weights (i.e., fine-tune the model on a training set)
- Via model inputs (i.e., insert the knowledge into an input message)
Although fine-tuning can feel like the more natural option—training on data is how GPT learned all of its other knowledge, after all—we generally do not recommend it as a way to teach the model knowledge. Fine-tuning is better suited to teaching specialized tasks or styles, and is less reliable for factual recall.
As an analogy, model weights are like long-term memory. When you fine-tune a model, it's like studying for an exam a week away. When the exam arrives, the model may forget details, or misremember facts it never read.
In contrast, message inputs are like short-term memory. When you insert knowledge into a message, it's like taking an exam with open notes. With notes in hand, the model is more likely to arrive at correct answers.
One downside of text search relative to fine-tuning is that each model is limited by a maximum amount of text it can read at once:
| Model | Maximum text length |
|---|---|
gpt-4o-mini | 128,000 tokens (~384 pages) |
gpt-4o | 128,000 tokens (~384 pages) |
Continuing the analogy, you can think of the model like a student who can only look at a few pages of notes at a time, despite potentially having shelves of textbooks to draw upon.
Therefore, to build a system capable of drawing upon large quantities of text to answer questions, we recommend using a Search-Ask approach.
Search
Text can be searched in many ways. E.g.,
- Lexical-based search
- Graph-based search
- Embedding-based search
This example notebook uses embedding-based search. Embeddings are simple to implement and work especially well with questions, as questions often don't lexically overlap with their answers.
Consider embeddings-only search as a starting point for your own system. Better search systems might combine multiple search methods, along with features like popularity, recency, user history, redundancy with prior search results, click rate data, etc. Q&A retrieval performance may also be improved with techniques like HyDE, in which questions are first transformed into hypothetical answers before being embedded. Similarly, GPT can also potentially improve search results by automatically transforming questions into sets of keywords or search terms.
Full procedure
Specifically, this notebook demonstrates the following procedure:
- Prepare search data (once per document)
- Collect: We'll download a few hundred Wikipedia articles about the 2022 Olympics
- Chunk: Documents are split into short, mostly self-contained sections to be embedded
- Embed: Each section is embedded with the OpenAI API
- Store: Embeddings are saved (for large datasets, use a vector database)
- Search (once per query)
- Given a user question, generate an embedding for the query from the OpenAI API
- Using the embeddings, rank the text sections by relevance to the query
- Ask (once per query)
- Insert the question and the most relevant sections into a message to GPT
- Return GPT's answer
Costs
Because GPT models are more expensive than embeddings search, a system with a decent volume of queries will have its costs dominated by step 3.
- For
gpt-4o, considering ~1000 tokens per query, it costs ~$0.0025 per query, or ~450 queries per dollar (as of Nov 2024) - For
gpt-4o-mini, using ~1000 tokens per query, it costs ~$0.00015 per query, or ~6000 queries per dollar (as of Nov 2024)
Of course, exact costs will depend on the system specifics and usage patterns.
Preamble
We'll begin by:
- Importing the necessary libraries
- Selecting models for embeddings search and question answering
Troubleshooting: Installing libraries
If you need to install any of the libraries above, run pip install {library_name} in your terminal.
For example, to install the openai library, run:
pip install openai
(You can also do this in a notebook cell with !pip install openai or %pip install openai.)
After installing, restart the notebook kernel so the libraries can be loaded.
Troubleshooting: Setting your API key
The OpenAI library will try to read your API key from the OPENAI_API_KEY environment variable. If you haven't already, you can set this environment variable by following these instructions.
Motivating example: GPT cannot answer questions about current events
Because the training data for gpt-4o-mini mostly ended in October 2023, the models cannot answer questions about more recent events, such as the 2024 Elections or recent games.
For example, let's try asking 'How many ?':
I'm sorry, but I don't have information on the outcomes of the 2024 Summer Olympics, including which athletes won the most gold medals. My training only includes data up to October 2023, and the Olympics are scheduled to take place in Paris from July 26 to August 11, 2024. You might want to check the latest updates from reliable sports news sources or the official Olympics website for the most current information.
In this case, the model has no knowledge of 2024 and is unable to answer the question. In a similar way, if you ask a question pertaining to a recent political event (that occured in Nov 2024 for example), GPT-4o-mini models will not be able to answer due to its knowledge cut-off date of Oct 2023.
I'm sorry, but I don't have information on events or elections that occurred after October 2023. For the latest updates on the 2024 US elections, I recommend checking reliable news sources.
You can give GPT knowledge about a topic by inserting it into an input message
To help give the model knowledge of curling at the 2022 Winter Olympics, we can copy and paste the top half of a relevant Wikipedia article into our message:
The countries that won the maximum number of gold, silver, and bronze medals respectively at the 2024 Summer Olympics are: - Gold: United States and China (tied with 40 gold medals each) - Silver: United States (44 silver medals) - Bronze: United States (42 bronze medals)
Thanks to the Wikipedia article included in the input message, GPT answers correctly.
Of course, this example partly relied on human intelligence. We knew the question was about summer olympics, so we inserted a Wikipedia article about 2024 paris olympics game.
The rest of this notebook shows how to automate this knowledge insertion with embeddings-based search.
1. Prepare search data
To save you the time & expense, we've prepared a pre-embedded dataset of a few hundred Wikipedia articles about the 2022 Winter Olympics.
To see how we constructed this dataset, or to modify it yourself, see Embedding Wikipedia articles for search.
2. Search
Now we'll define a search function that:
- Takes a user query and a dataframe with text & embedding columns
- Embeds the user query with the OpenAI API
- Uses distance between query embedding and text embeddings to rank the texts
- Returns two lists:
- The top N texts, ranked by relevance
- Their corresponding relevance scores
relatedness=0.630
'Curling at the 2022 Winter Olympics\n\n==Medal summary==\n\n===Medal table===\n\n{{Medals table\n | caption = \n | host = \n | flag_template = flagIOC\n | event = 2022 Winter\n | team = \n | gold_CAN = 0 | silver_CAN = 0 | bronze_CAN = 1\n | gold_ITA = 1 | silver_ITA = 0 | bronze_ITA = 0\n | gold_NOR = 0 | silver_NOR = 1 | bronze_NOR = 0\n | gold_SWE = 1 | silver_SWE = 0 | bronze_SWE = 2\n | gold_GBR = 1 | silver_GBR = 1 | bronze_GBR = 0\n | gold_JPN = 0 | silver_JPN = 1 | bronze_JPN - 0\n}}'relatedness=0.576
"Curling at the 2022 Winter Olympics\n\n==Results summary==\n\n===Men's tournament===\n\n====Playoffs====\n\n=====Gold medal game=====\n\n''Saturday, 19 February, 14:50''\n{{#lst:Curling at the 2022 Winter Olympics – Men's tournament|GM}}\n{{Player percentages\n| team1 = {{flagIOC|GBR|2022 Winter}}\n| [[Hammy McMillan Jr.]] | 95%\n| [[Bobby Lammie]] | 80%\n| [[Grant Hardie]] | 94%\n| [[Bruce Mouat]] | 89%\n| teampct1 = 90%\n| team2 = {{flagIOC|SWE|2022 Winter}}\n| [[Christoffer Sundgren]] | 99%\n| [[Rasmus Wranå]] | 95%\n| [[Oskar Eriksson]] | 93%\n| [[Niklas Edin]] | 87%\n| teampct2 = 94%\n}}"relatedness=0.569
"Curling at the 2022 Winter Olympics\n\n==Results summary==\n\n===Men's tournament===\n\n====Playoffs====\n\n{{4TeamBracket-with 3rd\n| Team-Width = 150\n| RD1 = Semifinals\n| RD2 = Gold medal game\n| RD2b = Bronze medal game\n\n| RD1-seed1 = 1\n| RD1-team1 = '''{{flagIOC|GBR|2022 Winter}}'''\n| RD1-score1 = '''8'''\n| RD1-seed2 = 4\n| RD1-team2 = {{flagIOC|USA|2022 Winter}}\n| RD1-score2 = 4\n| RD1-seed3 = 2\n| RD1-team3 = '''{{flagIOC|SWE|2022 Winter}}'''\n| RD1-score3 = '''5'''\n| RD1-seed4 = 3\n| RD1-team4 = {{flagIOC|CAN|2022 Winter}}\n| RD1-score4 = 3\n\n| RD2-seed1 = 1\n| RD2-team1 = {{flagIOC|GBR|2022 Winter}}\n| RD2-score1 = 4\n| RD2-seed2 = 2\n| RD2-team2 = '''{{flagIOC|SWE|2022 Winter}}'''\n| RD2-score2 = '''5'''\n\n| RD2b-seed1 = 4\n| RD2b-team1 = {{flagIOC|USA|2022 Winter}}\n| RD2b-score1 = 5\n| RD2b-seed2 = 3\n| RD2b-team2 = '''{{flagIOC|CAN|2022 Winter}}'''\n| RD2b-score2 = '''8'''\n}}"relatedness=0.565
"Curling at the 2022 Winter Olympics\n\n==Medal summary==\n\n===Medalists===\n\n{| {{MedalistTable|type=Event|columns=1}}\n|-\n|Men<br/>{{DetailsLink|Curling at the 2022 Winter Olympics – Men's tournament}}\n|{{flagIOC|SWE|2022 Winter}}<br>[[Niklas Edin]]<br>[[Oskar Eriksson]]<br>[[Rasmus Wranå]]<br>[[Christoffer Sundgren]]<br>[[Daniel Magnusson (curler)|Daniel Magnusson]]\n|{{flagIOC|GBR|2022 Winter}}<br>[[Bruce Mouat]]<br>[[Grant Hardie]]<br>[[Bobby Lammie]]<br>[[Hammy McMillan Jr.]]<br>[[Ross Whyte]]\n|{{flagIOC|CAN|2022 Winter}}<br>[[Brad Gushue]]<br>[[Mark Nichols (curler)|Mark Nichols]]<br>[[Brett Gallant]]<br>[[Geoff Walker (curler)|Geoff Walker]]<br>[[Marc Kennedy]]\n|-\n|Women<br/>{{DetailsLink|Curling at the 2022 Winter Olympics – Women's tournament}}\n|{{flagIOC|GBR|2022 Winter}}<br>[[Eve Muirhead]]<br>[[Vicky Wright]]<br>[[Jennifer Dodds]]<br>[[Hailey Duff]]<br>[[Mili Smith]]\n|{{flagIOC|JPN|2022 Winter}}<br>[[Satsuki Fujisawa]]<br>[[Chinami Yoshida]]<br>[[Yumi Suzuki]]<br>[[Yurika Yoshida]]<br>[[Kotomi Ishizaki]]\n|{{flagIOC|SWE|2022 Winter}}<br>[[Anna Hasselborg]]<br>[[Sara McManus]]<br>[[Agnes Knochenhauer]]<br>[[Sofia Mabergs]]<br>[[Johanna Heldin]]\n|-\n|Mixed doubles<br/>{{DetailsLink|Curling at the 2022 Winter Olympics – Mixed doubles tournament}}\n|{{flagIOC|ITA|2022 Winter}}<br>[[Stefania Constantini]]<br>[[Amos Mosaner]]\n|{{flagIOC|NOR|2022 Winter}}<br>[[Kristin Skaslien]]<br>[[Magnus Nedregotten]]\n|{{flagIOC|SWE|2022 Winter}}<br>[[Almida de Val]]<br>[[Oskar Eriksson]]\n|}"relatedness=0.561
"Curling at the 2022 Winter Olympics\n\n==Results summary==\n\n===Mixed doubles tournament===\n\n====Playoffs====\n\n{{4TeamBracket-with 3rd\n| Team-Width = 150\n| RD1 = Semifinals\n| RD2 = Gold medal game\n| RD2b = Bronze medal game\n\n| RD1-seed1 = 1\n| RD1-team1 = '''{{flagIOC|ITA|2022 Winter}}'''\n| RD1-score1 = '''8'''\n| RD1-seed2 = 4\n| RD1-team2 = {{flagIOC|SWE|2022 Winter}}\n| RD1-score2 = 1\n| RD1-seed3 = 2\n| RD1-team3 = '''{{flagIOC|NOR|2022 Winter}}'''\n| RD1-score3 = '''6'''\n| RD1-seed4 = 3\n| RD1-team4 = {{flagIOC|GBR|2022 Winter}}\n| RD1-score4 = 5\n\n| RD2-seed1 = 1\n| RD2-team1 = '''{{flagIOC|ITA|2022 Winter}}'''\n| RD2-score1 = '''8'''\n| RD2-seed2 = 2\n| RD2-team2 = {{flagIOC|NOR|2022 Winter}}\n| RD2-score2 = 5\n\n| RD2b-seed1 = 4\n| RD2b-team1 = '''{{flagIOC|SWE|2022 Winter}}'''\n| RD2b-score1 = '''9'''\n| RD2b-seed2 = 3\n| RD2b-team2 = {{flagIOC|GBR|2022 Winter}}\n| RD2b-score2 = 3\n}}" 3. Ask
With the search function above, we can now automatically retrieve relevant knowledge and insert it into messages to GPT.
Below, we define a function ask that:
- Takes a user query
- Searches for text relevant to the query
- Stuffs that text into a message for GPT
- Sends the message to GPT
- Returns GPT's answer
Example questions
Finally, let's ask our system our original question about gold medal curlers:
"The athletes who won the gold medal in curling at the 2022 Winter Olympics are:\n\n- Men's tournament: Niklas Edin, Oskar Eriksson, Rasmus Wranå, Christoffer Sundgren, and Daniel Magnusson from Sweden.\n- Women's tournament: Eve Muirhead, Vicky Wright, Jennifer Dodds, Hailey Duff, and Mili Smith from Great Britain.\n- Mixed doubles tournament: Stefania Constantini and Amos Mosaner from Italy."
With latest model and using embedding search, our search system was able to retrieve reference text for the model to read, allowing it to correctly list the gold medal winners in the Men's and Women's tournaments.
Troubleshooting wrong answers
In case we get any mistakes in the output, we can see whether a mistake is from a lack of relevant source text (i.e., failure of the search step) or a lack of reasoning reliability (i.e., failure of the ask step), you can look at the text GPT was given by setting print_message=True.
In this particular case, looking at the text below, it looks like the #1 article given to the model did contain medalists for all three events, but the later results emphasized the Men's and Women's tournaments, which may have distracted the model from giving a more complete answer.
Use the below articles on the 2022 Winter Olympics to answer the subsequent question. If the answer cannot be found in the articles, write "I could not find an answer."
Wikipedia article section:
"""
List of 2022 Winter Olympics medal winners
==Curling==
{{main|Curling at the 2022 Winter Olympics}}
{|{{MedalistTable|type=Event|columns=1|width=225|labelwidth=200}}
|-valign="top"
|Men<br/>{{DetailsLink|Curling at the 2022 Winter Olympics – Men's tournament}}
|{{flagIOC|SWE|2022 Winter}}<br/>[[Niklas Edin]]<br/>[[Oskar Eriksson]]<br/>[[Rasmus Wranå]]<br/>[[Christoffer Sundgren]]<br/>[[Daniel Magnusson (curler)|Daniel Magnusson]]
|{{flagIOC|GBR|2022 Winter}}<br/>[[Bruce Mouat]]<br/>[[Grant Hardie]]<br/>[[Bobby Lammie]]<br/>[[Hammy McMillan Jr.]]<br/>[[Ross Whyte]]
|{{flagIOC|CAN|2022 Winter}}<br/>[[Brad Gushue]]<br/>[[Mark Nichols (curler)|Mark Nichols]]<br/>[[Brett Gallant]]<br/>[[Geoff Walker (curler)|Geoff Walker]]<br/>[[Marc Kennedy]]
|-valign="top"
|Women<br/>{{DetailsLink|Curling at the 2022 Winter Olympics – Women's tournament}}
|{{flagIOC|GBR|2022 Winter}}<br/>[[Eve Muirhead]]<br/>[[Vicky Wright]]<br/>[[Jennifer Dodds]]<br/>[[Hailey Duff]]<br/>[[Mili Smith]]
|{{flagIOC|JPN|2022 Winter}}<br/>[[Satsuki Fujisawa]]<br/>[[Chinami Yoshida]]<br/>[[Yumi Suzuki]]<br/>[[Yurika Yoshida]]<br/>[[Kotomi Ishizaki]]
|{{flagIOC|SWE|2022 Winter}}<br/>[[Anna Hasselborg]]<br/>[[Sara McManus]]<br/>[[Agnes Knochenhauer]]<br/>[[Sofia Mabergs]]<br/>[[Johanna Heldin]]
|-valign="top"
|Mixed doubles<br/>{{DetailsLink|Curling at the 2022 Winter Olympics – Mixed doubles tournament}}
|{{flagIOC|ITA|2022 Winter}}<br/>[[Stefania Constantini]]<br/>[[Amos Mosaner]]
|{{flagIOC|NOR|2022 Winter}}<br/>[[Kristin Skaslien]]<br/>[[Magnus Nedregotten]]
|{{flagIOC|SWE|2022 Winter}}<br/>[[Almida de Val]]<br/>[[Oskar Eriksson]]
|}
"""
Wikipedia article section:
"""
Curling at the 2022 Winter Olympics
==Medal summary==
===Medal table===
{{Medals table
| caption =
| host =
| flag_template = flagIOC
| event = 2022 Winter
| team =
| gold_CAN = 0 | silver_CAN = 0 | bronze_CAN = 1
| gold_ITA = 1 | silver_ITA = 0 | bronze_ITA = 0
| gold_NOR = 0 | silver_NOR = 1 | bronze_NOR = 0
| gold_SWE = 1 | silver_SWE = 0 | bronze_SWE = 2
| gold_GBR = 1 | silver_GBR = 1 | bronze_GBR = 0
| gold_JPN = 0 | silver_JPN = 1 | bronze_JPN - 0
}}
"""
Wikipedia article section:
"""
Curling at the 2022 Winter Olympics
==Medal summary==
===Medalists===
{| {{MedalistTable|type=Event|columns=1}}
|-
|Men<br/>{{DetailsLink|Curling at the 2022 Winter Olympics – Men's tournament}}
|{{flagIOC|SWE|2022 Winter}}<br>[[Niklas Edin]]<br>[[Oskar Eriksson]]<br>[[Rasmus Wranå]]<br>[[Christoffer Sundgren]]<br>[[Daniel Magnusson (curler)|Daniel Magnusson]]
|{{flagIOC|GBR|2022 Winter}}<br>[[Bruce Mouat]]<br>[[Grant Hardie]]<br>[[Bobby Lammie]]<br>[[Hammy McMillan Jr.]]<br>[[Ross Whyte]]
|{{flagIOC|CAN|2022 Winter}}<br>[[Brad Gushue]]<br>[[Mark Nichols (curler)|Mark Nichols]]<br>[[Brett Gallant]]<br>[[Geoff Walker (curler)|Geoff Walker]]<br>[[Marc Kennedy]]
|-
|Women<br/>{{DetailsLink|Curling at the 2022 Winter Olympics – Women's tournament}}
|{{flagIOC|GBR|2022 Winter}}<br>[[Eve Muirhead]]<br>[[Vicky Wright]]<br>[[Jennifer Dodds]]<br>[[Hailey Duff]]<br>[[Mili Smith]]
|{{flagIOC|JPN|2022 Winter}}<br>[[Satsuki Fujisawa]]<br>[[Chinami Yoshida]]<br>[[Yumi Suzuki]]<br>[[Yurika Yoshida]]<br>[[Kotomi Ishizaki]]
|{{flagIOC|SWE|2022 Winter}}<br>[[Anna Hasselborg]]<br>[[Sara McManus]]<br>[[Agnes Knochenhauer]]<br>[[Sofia Mabergs]]<br>[[Johanna Heldin]]
|-
|Mixed doubles<br/>{{DetailsLink|Curling at the 2022 Winter Olympics – Mixed doubles tournament}}
|{{flagIOC|ITA|2022 Winter}}<br>[[Stefania Constantini]]<br>[[Amos Mosaner]]
|{{flagIOC|NOR|2022 Winter}}<br>[[Kristin Skaslien]]<br>[[Magnus Nedregotten]]
|{{flagIOC|SWE|2022 Winter}}<br>[[Almida de Val]]<br>[[Oskar Eriksson]]
|}
"""
Wikipedia article section:
"""
Curling at the 2022 Winter Olympics
==Results summary==
===Men's tournament===
====Playoffs====
=====Gold medal game=====
''Saturday, 19 February, 14:50''
{{#lst:Curling at the 2022 Winter Olympics – Men's tournament|GM}}
{{Player percentages
| team1 = {{flagIOC|GBR|2022 Winter}}
| [[Hammy McMillan Jr.]] | 95%
| [[Bobby Lammie]] | 80%
| [[Grant Hardie]] | 94%
| [[Bruce Mouat]] | 89%
| teampct1 = 90%
| team2 = {{flagIOC|SWE|2022 Winter}}
| [[Christoffer Sundgren]] | 99%
| [[Rasmus Wranå]] | 95%
| [[Oskar Eriksson]] | 93%
| [[Niklas Edin]] | 87%
| teampct2 = 94%
}}
"""
Wikipedia article section:
"""
Curling at the 2022 Winter Olympics
==Results summary==
===Men's tournament===
====Playoffs====
{{4TeamBracket-with 3rd
| Team-Width = 150
| RD1 = Semifinals
| RD2 = Gold medal game
| RD2b = Bronze medal game
| RD1-seed1 = 1
| RD1-team1 = '''{{flagIOC|GBR|2022 Winter}}'''
| RD1-score1 = '''8'''
| RD1-seed2 = 4
| RD1-team2 = {{flagIOC|USA|2022 Winter}}
| RD1-score2 = 4
| RD1-seed3 = 2
| RD1-team3 = '''{{flagIOC|SWE|2022 Winter}}'''
| RD1-score3 = '''5'''
| RD1-seed4 = 3
| RD1-team4 = {{flagIOC|CAN|2022 Winter}}
| RD1-score4 = 3
| RD2-seed1 = 1
| RD2-team1 = {{flagIOC|GBR|2022 Winter}}
| RD2-score1 = 4
| RD2-seed2 = 2
| RD2-team2 = '''{{flagIOC|SWE|2022 Winter}}'''
| RD2-score2 = '''5'''
| RD2b-seed1 = 4
| RD2b-team1 = {{flagIOC|USA|2022 Winter}}
| RD2b-score1 = 5
| RD2b-seed2 = 3
| RD2b-team2 = '''{{flagIOC|CAN|2022 Winter}}'''
| RD2b-score2 = '''8'''
}}
"""
Wikipedia article section:
"""
Curling at the 2022 Winter Olympics
==Participating nations==
A total of 114 athletes from 14 nations (including the IOC's designation of ROC) were scheduled to participate (the numbers of athletes are shown in parentheses). Some curlers competed in both the 4-person and mixed doubles tournament, therefore, the numbers included on this list are the total athletes sent by each NOC to the Olympics, not how many athletes they qualified. Both Australia and the Czech Republic made their Olympic sport debuts.
{{columns-list|colwidth=20em|
* {{flagIOC|AUS|2022 Winter|2}}
* {{flagIOC|CAN|2022 Winter|12}}
* {{flagIOC|CHN|2022 Winter|12}}
* {{flagIOC|CZE|2022 Winter|2}}
* {{flagIOC|DEN|2022 Winter|10}}
* {{flagIOC|GBR|2022 Winter|10}}
* {{flagIOC|ITA|2022 Winter|6}}
* {{flagIOC|JPN|2022 Winter|5}}
* {{flagIOC|NOR|2022 Winter|6}}
* {{flagIOC|ROC|2022 Winter|10}}
* {{flagIOC|KOR|2022 Winter|5}}
* {{flagIOC|SWE|2022 Winter|11}}
* {{flagIOC|SUI|2022 Winter|12}}
* {{flagIOC|USA|2022 Winter|11}}
}}
"""
Wikipedia article section:
"""
Curling at the 2022 Winter Olympics
==Teams==
===Mixed doubles===
{| class=wikitable
|-
!width=200|{{flagIOC|AUS|2022 Winter}}
!width=200|{{flagIOC|CAN|2022 Winter}}
!width=200|{{flagIOC|CHN|2022 Winter}}
!width=200|{{flagIOC|CZE|2022 Winter}}
!width=200|{{flagIOC|GBR|2022 Winter}}
|-
|
'''Female:''' [[Tahli Gill]]<br>
'''Male:''' [[Dean Hewitt]]
|
'''Female:''' [[Rachel Homan]]<br>
'''Male:''' [[John Morris (curler)|John Morris]]
|
'''Female:''' [[Fan Suyuan]]<br>
'''Male:''' [[Ling Zhi]]
|
'''Female:''' [[Zuzana Paulová]]<br>
'''Male:''' [[Tomáš Paul]]
|
'''Female:''' [[Jennifer Dodds]]<br>
'''Male:''' [[Bruce Mouat]]
|-
!width=200|{{flagIOC|ITA|2022 Winter}}
!width=200|{{flagIOC|NOR|2022 Winter}}
!width=200|{{flagIOC|SWE|2022 Winter}}
!width=200|{{flagIOC|SUI|2022 Winter}}
!width=200|{{flagIOC|USA|2022 Winter}}
|-
|
'''Female:''' [[Stefania Constantini]]<br>
'''Male:''' [[Amos Mosaner]]
|
'''Female:''' [[Kristin Skaslien]]<br>
'''Male:''' [[Magnus Nedregotten]]
|
'''Female:''' [[Almida de Val]]<br>
'''Male:''' [[Oskar Eriksson]]
|
'''Female:''' [[Jenny Perret]]<br>
'''Male:''' [[Martin Rios]]
|
'''Female:''' [[Vicky Persinger]]<br>
'''Male:''' [[Chris Plys]]
|}
"""
Wikipedia article section:
"""
Curling at the 2022 Winter Olympics
==Results summary==
===Men's tournament===
====Playoffs====
=====Bronze medal game=====
''Friday, 18 February, 14:05''
{{#lst:Curling at the 2022 Winter Olympics – Men's tournament|BM}}
{{Player percentages
| team1 = {{flagIOC|USA|2022 Winter}}
| [[John Landsteiner]] | 80%
| [[Matt Hamilton (curler)|Matt Hamilton]] | 86%
| [[Chris Plys]] | 74%
| [[John Shuster]] | 69%
| teampct1 = 77%
| team2 = {{flagIOC|CAN|2022 Winter}}
| [[Geoff Walker (curler)|Geoff Walker]] | 84%
| [[Brett Gallant]] | 86%
| [[Mark Nichols (curler)|Mark Nichols]] | 78%
| [[Brad Gushue]] | 78%
| teampct2 = 82%
}}
"""
Wikipedia article section:
"""
Curling at the 2022 Winter Olympics
==Results summary==
===Women's tournament===
====Playoffs====
{{4TeamBracket-with 3rd
| Team-Width = 150
| RD1 = Semifinals
| RD2 = Gold medal game
| RD2b = Bronze medal game
| RD1-seed1 = 1
| RD1-team1 = {{flagIOC|SUI|2022 Winter}}
| RD1-score1 = 6
| RD1-seed2 = 4
| RD1-team2 = '''{{flagIOC|JPN|2022 Winter}}'''
| RD1-score2 = '''8'''
| RD1-seed3 = 2
| RD1-team3 = {{flagIOC|SWE|2022 Winter}}
| RD1-score3 = 11
| RD1-seed4 = 3
| RD1-team4 = '''{{flagIOC|GBR|2022 Winter}}'''
| RD1-score4 = '''12'''
| RD2-seed1 = 4
| RD2-team1 = {{flagIOC|JPN|2022 Winter}}
| RD2-score1 = 3
| RD2-seed2 = 3
| RD2-team2 = '''{{flagIOC|GBR|2022 Winter}}'''
| RD2-score2 = '''10'''
| RD2b-seed1 = 1
| RD2b-team1 = {{flagIOC|SUI|2022 Winter}}
| RD2b-score1 = 7
| RD2b-seed2 = 2
| RD2b-team2 = '''{{flagIOC|SWE|2022 Winter}}'''
| RD2b-score2 = '''9'''
}}
"""
Question: Which athletes won the gold medal in curling at the 2022 Winter Olympics?
"The athletes who won the gold medal in curling at the 2022 Winter Olympics are:\n\n- Men's tournament: Niklas Edin, Oskar Eriksson, Rasmus Wranå, Christoffer Sundgren, and Daniel Magnusson from Sweden.\n- Women's tournament: Eve Muirhead, Vicky Wright, Jennifer Dodds, Hailey Duff, and Mili Smith from Great Britain.\n- Mixed doubles tournament: Stefania Constantini and Amos Mosaner from Italy."
Knowing that sometimes, this mistake can be due to imperfect reasoning in the ask step, than imperfect retrieval in the search step, one can focus on improving the ask step.
The easiest way to improve results is to use a more capable models, such as GPT-4o-mini or GPT-4o models. Let's try it.
"The gold medal in curling at the 2022 Winter Olympics was won by the following athletes:\n\n- Men's tournament: Niklas Edin, Oskar Eriksson, Rasmus Wranå, Christoffer Sundgren, Daniel Magnusson from Sweden.\n- Women's tournament: Eve Muirhead, Vicky Wright, Jennifer Dodds, Hailey Duff, Mili Smith from Great Britain.\n- Mixed doubles: Stefania Constantini and Amos Mosaner from Italy."
GPT-4 models tend to succeed, correctly identifying all 12 gold medal winners in curling.
More examples
Below are a few more examples of the system in action. Feel free to try your own questions, and see how it does. In general, search-based systems do best on questions that have a simple lookup, and worst on questions that require multiple partial sources to be combined and reasoned about.
'There were 2 world records and 24 Olympic records set at the 2022 Winter Olympics.'
"Jamaica had more athletes at the 2022 Winter Olympics. Jamaica's team consisted of seven athletes. There is no information provided about Cuba's participation in the 2022 Winter Olympics, so I cannot determine the number of athletes they had, if any."
'I could not find an answer.'
'I could not find an answer.'
'I am here to provide information about the 2022 Winter Olympics. If you have any questions related to that topic, feel free to ask!'
"In the marsh, a silhouette stark,\nStands the elegant Shoebill Stork.\nWith a gaze so keen and bill so bold,\nNature's marvel, a sight to behold."
"The gold medal winners in curling at the 2022 Winter Olympics were:\n\n- Men's tournament: Sweden (Niklas Edin, Oskar Eriksson, Rasmus Wranå, Christoffer Sundgren, Daniel Magnusson)\n- Women's tournament: Great Britain (Eve Muirhead, Vicky Wright, Jennifer Dodds, Hailey Duff, Mili Smith)\n- Mixed doubles tournament: Italy (Stefania Constantini, Amos Mosaner)"
'I could not find an answer.'
'I could not find an answer.'
"COVID-19 had a significant impact on the 2022 Winter Olympics in several ways:\n\n1. **Qualification Changes**: The pandemic led to changes in the qualification process for sports like curling and women's ice hockey due to the cancellation of tournaments in 2020. Qualification for curling was based on placement in the 2021 World Curling Championships and an Olympic Qualification Event, while the IIHF used existing world rankings for women's ice hockey.\n\n2. **Biosecurity Protocols**: The IOC announced strict biosecurity protocols, requiring all athletes to remain within a bio-secure bubble, undergo daily COVID-19 testing, and only travel to and from Games-related venues. Athletes who were not fully vaccinated or did not have a valid medical exemption had to quarantine for 21 days upon arrival.\n\n3. **Spectator Restrictions**: Initially, only residents of the People's Republic of China were allowed to attend as spectators. Later, ticket sales to the general public were canceled, and only limited numbers of spectators were admitted by invitation, making it the second consecutive Olympics closed to the general public.\n\n4. **NHL Withdrawal**: The National Hockey League (NHL) withdrew its players from the men's hockey tournament due to COVID-19 concerns and the need to make up postponed games.\n\n5. **Quarantine and Testing**: Everyone present at the Games had to use the My2022 mobile app for health reporting and COVID-19 testing records. Concerns about the app's security led some delegations to advise athletes to use burner phones and laptops.\n\n6. **Athlete Absences**: Some top athletes, considered medal contenders, were unable to travel to China after testing positive for COVID-19, even if asymptomatic. This included athletes like Austrian ski jumper Marita Kramer and Russian skeletonist Nikita Tregubov.\n\n7. **Complaints and Controversies**: There were complaints from athletes and team officials about quarantine conditions, including issues with food, facilities, and lack of training equipment. Some athletes expressed frustration over the testing process and quarantine management.\n\n8. **COVID-19 Cases**: A total of 437 COVID-19 cases were reported during the Olympics, with 171 cases among the protective bubble residents and 266 detected from airport testing. Despite strict containment efforts, the number of cases was only slightly lower than those reported during the 2020 Tokyo Summer Olympics.\n\nOverall, COVID-19 significantly influenced the organization, participation, and experience of the 2022 Winter Olympics."