Haystack Chat With Docs

Using Mistral AI with Haystack

In this cookbook, we will use Mistral embeddings and generative models in two Haystack pipelines:

  1. We will build an indexing pipeline that creates embeddings for the contents of URLs and indexes them into a vector database
  2. We will build a retrieval-augmented chat pipeline to chat with the contents of the URLs

First, we install our dependencies
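The install cell itself is not reproduced on this page. As a rough sketch, assuming the dependencies are the `haystack-ai`, `mistral-haystack`, and `trafilatura` distributions (these names are assumptions, not taken from this page), the helper below reports which of them are present after installation. `installed_version` is a hypothetical helper built on the standard library only.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Assumed distribution names; adjust them to match your environment.
for name in ("haystack-ai", "mistral-haystack", "trafilatura"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```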

[ ]
[11]
'2.10.3'

Next, we need to set the MISTRAL_API_KEY environment variable 👇
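One possible way to do this from a notebook, sketched with the standard library only: read the key from the environment and prompt for it only when it is missing. The helper name `ensure_api_key` is hypothetical; only the variable name MISTRAL_API_KEY comes from the text above.

```python
import os
from getpass import getpass


def ensure_api_key(env_var: str = "MISTRAL_API_KEY") -> str:
    """Return the key from the environment, prompting for it only if unset."""
    if not os.environ.get(env_var):
        # getpass hides the typed value, so the key does not end up in cell output.
        os.environ[env_var] = getpass(f"Enter {env_var}: ")
    return os.environ[env_var]
```

Calling `ensure_api_key()` once at the top of the notebook is enough, since later components can then read the variable from the environment.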

[12]

Index URLs with Mistral Embeddings

Below, we are using mistral-embed in a full Haystack indexing pipeline. We create embeddings for the contents of the chosen URLs with mistral-embed using the MistralDocumentEmbedder, and write them to an InMemoryDocumentStore with a DocumentWriter.

💡 This document store is the simplest to get started with, as it has no setup requirements. Feel free to swap it for any of the vector databases available for Haystack 2.0, such as Weaviate, Chroma, or AstraDB.
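The indexing cell is not reproduced on this page, so here is a minimal sketch of how such a pipeline could be assembled. The component names and connections follow the pipeline listing in this section. The helper names `build_indexing_pipeline` and `index_urls` are hypothetical, and the imports are done lazily inside the function so the sketch can be loaded even where Haystack is not installed. It assumes MISTRAL_API_KEY is set before the pipeline is built.

```python
from typing import Any, Dict, List


def build_indexing_pipeline(document_store: Any = None):
    """Assemble fetcher -> converter -> embedder -> writer and return (pipeline, store)."""
    # Lazy imports: Haystack and the Mistral integration are only needed when called.
    from haystack import Pipeline
    from haystack.components.converters import HTMLToDocument
    from haystack.components.fetchers import LinkContentFetcher
    from haystack.components.writers import DocumentWriter
    from haystack.document_stores.in_memory import InMemoryDocumentStore
    from haystack_integrations.components.embedders.mistral import MistralDocumentEmbedder

    store = document_store if document_store is not None else InMemoryDocumentStore()

    pipeline = Pipeline()
    pipeline.add_component("fetcher", LinkContentFetcher())
    pipeline.add_component("converter", HTMLToDocument())
    # Reads MISTRAL_API_KEY from the environment.
    pipeline.add_component("embedder", MistralDocumentEmbedder())
    pipeline.add_component("writer", DocumentWriter(document_store=store))

    pipeline.connect("fetcher.streams", "converter.sources")
    pipeline.connect("converter.documents", "embedder.documents")
    pipeline.connect("embedder.documents", "writer.documents")
    return pipeline, store


def index_urls(pipeline: Any, urls: List[str]) -> Dict[str, Any]:
    """Run the indexing pipeline on a list of URLs (hypothetical convenience wrapper)."""
    return pipeline.run({"fetcher": {"urls": urls}})
```

Returning the store alongside the pipeline keeps it available for the retrieval step later on. The URLs to index are your own choice; none are implied by this sketch.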

[13]
<haystack.core.pipeline.pipeline.Pipeline object at 0x1370196a0>
🚅 Components
  - fetcher: LinkContentFetcher
  - converter: HTMLToDocument
  - embedder: MistralDocumentEmbedder
  - writer: DocumentWriter
🛤️ Connections
  - fetcher.streams -> converter.sources (List[ByteStream])
  - converter.documents -> embedder.documents (List[Document])
  - embedder.documents -> writer.documents (List[Document])
[14]
Calculating embeddings: 1it [00:00,  3.69it/s]
{'embedder': {'meta': {'model': 'mistral-embed',
   'usage': {'prompt_tokens': 1658,
    'total_tokens': 1658,
    'completion_tokens': 0}}},
 'writer': {'documents_written': 2}}

Chat With the URLs with Mistral Generative Models

Now that we have indexed the contents and embeddings of various URLs, we can create a RAG pipeline that uses the MistralChatGenerator component with mistral-small. A few more things to know about this pipeline:

  • We are using the MistralTextEmbedder to embed our question and retrieve the single most relevant document
  • We are enabling streaming responses by providing a streaming_callback
  • The retriever supplies documents to the chat template, while we provide query when we run the pipeline.
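The points above can be sketched as follows. This is an illustration under assumptions, not the notebook's original cell: the prompt wording in `CHAT_TEMPLATE` and the helper names `build_rag_pipeline` and `rag_inputs` are hypothetical, while the component names, connections, and the mistral-small model name come from this section. Imports are lazy so the sketch loads without Haystack installed.

```python
from typing import Any, Dict

# Hypothetical prompt wording; `query` and `documents` are the template variables.
CHAT_TEMPLATE = """Answer the following question based on the contents of the documents.

Question: {{ query }}

Documents:
{% for document in documents %}
{{ document.content }}
{% endfor %}"""


def build_rag_pipeline(document_store: Any):
    """Assemble text_embedder -> retriever -> prompt_builder -> llm."""
    from haystack import Pipeline
    from haystack.components.builders import ChatPromptBuilder
    from haystack.components.generators.utils import print_streaming_chunk
    from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
    from haystack.dataclasses import ChatMessage
    from haystack_integrations.components.embedders.mistral import MistralTextEmbedder
    from haystack_integrations.components.generators.mistral import MistralChatGenerator

    pipeline = Pipeline()
    pipeline.add_component("text_embedder", MistralTextEmbedder())
    # top_k=1 keeps only the single most relevant document.
    pipeline.add_component(
        "retriever", InMemoryEmbeddingRetriever(document_store=document_store, top_k=1)
    )
    pipeline.add_component(
        "prompt_builder", ChatPromptBuilder(template=[ChatMessage.from_user(CHAT_TEMPLATE)])
    )
    # The streaming callback prints chunks as they arrive.
    pipeline.add_component(
        "llm",
        MistralChatGenerator(model="mistral-small", streaming_callback=print_streaming_chunk),
    )

    pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
    pipeline.connect("retriever.documents", "prompt_builder.documents")
    pipeline.connect("prompt_builder.prompt", "llm.messages")
    return pipeline


def rag_inputs(question: str) -> Dict[str, Dict[str, str]]:
    """Build the run() payload: the question is embedded and also fills `query`."""
    return {"text_embedder": {"text": question}, "prompt_builder": {"query": question}}
```

With a populated document store, `build_rag_pipeline(store).run(rag_inputs(question))` would embed the question, retrieve one document, render the prompt, and stream the answer. The same question string is passed twice because the embedder and the prompt template each need it.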
[15]
[16]
<haystack.core.pipeline.pipeline.Pipeline object at 0x13705e2d0>
🚅 Components
  - text_embedder: MistralTextEmbedder
  - retriever: InMemoryEmbeddingRetriever
  - prompt_builder: ChatPromptBuilder
  - llm: MistralChatGenerator
🛤️ Connections
  - text_embedder.embedding -> retriever.query_embedding (List[float])
  - retriever.documents -> prompt_builder.documents (List[Document])
  - prompt_builder.prompt -> llm.messages (List[ChatMessage])
[17]
The Mistral platform has three generative endpoints: mistral-tiny, mistral-small, and mistral-medium. Each endpoint serves a different model with varying performance and language support. Mistral-tiny serves Mistral 7B Instruct v0.2, which is the most cost-effective and only supports English. Mistral-small serves Mixtral 8x7B, which supports English, French, Italian, German, Spanish, and code. Mistral-medium serves a prototype model with higher performance, also supporting the same languages and code as Mistral-small. Additionally, the platform offers an embedding endpoint called Mistral-embed, which serves an embedding model with a 1024 embedding dimension designed for retrieval capabilities.