Haystack Chat With Docs
Using Mistral AI with Haystack
In this cookbook, we will use Mistral embeddings and generative models in two Haystack pipelines:
- We will build an indexing pipeline that creates embeddings for the contents of URLs and indexes them into a vector database
- We will build a retrieval-augmented chat pipeline to chat with the contents of those URLs
First, we install our dependencies
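A minimal setup cell, assuming the `mistral-haystack` integration package name (it installs `haystack-ai` as a dependency); checking `haystack.__version__` afterwards produces the version string shown below:

```python
# Notebook cell: install the Mistral integration for Haystack 2.x.
# Assumption: the published package name is mistral-haystack.
%pip install mistral-haystack

import haystack

haystack.__version__
```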
'2.10.3'
Next, we need to set the MISTRAL_API_KEY environment variable
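One common pattern is to prompt for the key only if it is not already set:

```python
import os
from getpass import getpass

# Prompt for the Mistral API key if it is not already in the environment.
if "MISTRAL_API_KEY" not in os.environ:
    os.environ["MISTRAL_API_KEY"] = getpass("Enter your Mistral API key: ")
```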
Index URLs with Mistral Embeddings
Below, we use mistral-embed in a full Haystack indexing pipeline: we create embeddings for the contents of the chosen URLs with the MistralDocumentEmbedder and write them to an InMemoryDocumentStore.
💡 This document store is the simplest to get started with, as it requires no setup. Feel free to change it to any of the vector databases available for Haystack 2.0, such as Weaviate, Chroma, or AstraDB.
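A sketch of the indexing pipeline, assuming the `haystack_integrations` import path provided by the mistral-haystack package:

```python
from haystack import Pipeline
from haystack.components.converters import HTMLToDocument
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.mistral import MistralDocumentEmbedder

document_store = InMemoryDocumentStore()

indexing = Pipeline()
indexing.add_component("fetcher", LinkContentFetcher())
indexing.add_component("converter", HTMLToDocument())
indexing.add_component("embedder", MistralDocumentEmbedder())  # uses mistral-embed by default
indexing.add_component("writer", DocumentWriter(document_store=document_store))

indexing.connect("fetcher", "converter")
indexing.connect("converter", "embedder")
indexing.connect("embedder", "writer")

indexing  # displaying the pipeline produces the summary below
```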
<haystack.core.pipeline.pipeline.Pipeline object at 0x1370196a0>
🚅 Components
  - fetcher: LinkContentFetcher
  - converter: HTMLToDocument
  - embedder: MistralDocumentEmbedder
  - writer: DocumentWriter
🛤️ Connections
  - fetcher.streams -> converter.sources (List[ByteStream])
  - converter.documents -> embedder.documents (List[Document])
  - embedder.documents -> writer.documents (List[Document])
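We then run the pipeline on the URLs we want to index. The URLs here are placeholders; substitute the pages you want to chat with:

```python
urls = [
    "https://example.com/page-1",  # placeholder URL
    "https://example.com/page-2",  # placeholder URL
]

indexing.run({"fetcher": {"urls": urls}})
```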
Calculating embeddings: 1it [00:00, 3.69it/s]
{'embedder': {'meta': {'model': 'mistral-embed',
                       'usage': {'prompt_tokens': 1658,
                                 'total_tokens': 1658,
                                 'completion_tokens': 0}}},
 'writer': {'documents_written': 2}}

Chat With the URLs with Mistral Generative Models
Now that we have indexed the contents and embeddings of various URLs, we can create a RAG pipeline that uses the MistralChatGenerator component with mistral-small.
A few more things to know about this pipeline:
- We are using the MistralTextEmbedder to embed our question and retrieve only the single most relevant document
- We are enabling streaming responses by providing a streaming_callback
- documents is provided to the chat prompt template by the retriever, while we provide query to the pipeline when we run it (see the sketch below)
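A sketch of the full chat pipeline, again assuming the `haystack_integrations` import paths from the mistral-haystack package; the prompt template is illustrative, not the notebook's exact wording:

```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.utils import print_streaming_chunk
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.embedders.mistral import MistralTextEmbedder
from haystack_integrations.components.generators.mistral import MistralChatGenerator

# Illustrative chat template: `documents` arrives from the retriever,
# `query` is supplied when the pipeline is run.
template = [
    ChatMessage.from_user(
        "Answer the question based on the given context.\n\n"
        "Context:\n"
        "{% for document in documents %}{{ document.content }}\n{% endfor %}\n"
        "Question: {{ query }}"
    )
]

rag = Pipeline()
rag.add_component("text_embedder", MistralTextEmbedder())
rag.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store, top_k=1))
rag.add_component("prompt_builder", ChatPromptBuilder(template=template))
rag.add_component(
    "llm",
    MistralChatGenerator(model="mistral-small", streaming_callback=print_streaming_chunk),
)

rag.connect("text_embedder.embedding", "retriever.query_embedding")
rag.connect("retriever.documents", "prompt_builder.documents")
rag.connect("prompt_builder.prompt", "llm.messages")

rag  # displaying the pipeline produces the summary below
```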
<haystack.core.pipeline.pipeline.Pipeline object at 0x13705e2d0>
🚅 Components
  - text_embedder: MistralTextEmbedder
  - retriever: InMemoryEmbeddingRetriever
  - prompt_builder: ChatPromptBuilder
  - llm: MistralChatGenerator
🛤️ Connections
  - text_embedder.embedding -> retriever.query_embedding (List[float])
  - retriever.documents -> prompt_builder.documents (List[Document])
  - prompt_builder.prompt -> llm.messages (List[ChatMessage])
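Finally, we run the pipeline with a question. The question below is a hypothetical example; because of the streaming_callback, the answer is printed chunk by chunk as it arrives:

```python
question = "What generative endpoints does the Mistral platform have?"  # hypothetical example question

result = rag.run(
    {
        "text_embedder": {"text": question},
        "prompt_builder": {"query": question},
    }
)
```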
The Mistral platform has three generative endpoints: mistral-tiny, mistral-small, and mistral-medium. Each endpoint serves a different model with varying performance and language support. Mistral-tiny serves Mistral 7B Instruct v0.2, which is the most cost-effective and only supports English. Mistral-small serves Mixtral 8x7B, which supports English, French, Italian, German, Spanish, and code. Mistral-medium serves a prototype model with higher performance, also supporting the same languages and code as Mistral-small. Additionally, the platform offers an embedding endpoint called Mistral-embed, which serves an embedding model with a 1024 embedding dimension designed for retrieval capabilities.