Pinecone Tutorial
Pinecone Quickstart
1. Load API Key with .env
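A minimal sketch of this step, assuming the key is stored in a `.env` file as `PINECONE_API_KEY=...` and that the `python-dotenv` package is available. The variable name and the helper `load_api_key` are assumptions for illustration, not names taken from the original notebook.

```python
import os


def load_api_key(var_name="PINECONE_API_KEY"):
    """Return the API key from the environment, loading .env first if possible."""
    try:
        # python-dotenv is optional here: load_dotenv() copies KEY=value pairs
        # from a .env file into os.environ without overriding existing values.
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # fall back to variables already exported in the shell

    api_key = os.environ.get(var_name)
    if not api_key:
        raise RuntimeError(f"{var_name} is not set; add it to .env or export it")
    return api_key
```

Keeping the key in `.env` (and out of version control) means the notebook can be shared without exposing credentials.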
2. Initialize Pinecone client
Next, use your API key to initialize your client:
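One way this could look, assuming a recent `pinecone` Python package (v3 or later) that exposes the `Pinecone` class. The helper name `make_client` is hypothetical.

```python
def make_client(api_key):
    """Build a Pinecone client from an API key (sketch, not the notebook's code)."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    # Imported here so the check above still works before the package is installed.
    from pinecone import Pinecone
    return Pinecone(api_key=api_key)


# Typical use, assuming the key was read from the environment in step 1:
# pc = make_client(os.environ["PINECONE_API_KEY"])
```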
3. Prepare a language model as the vector encoder
We use a small transformer language model to create 364-dimensional embeddings. You can use other models to generate embeddings.
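The sketch below shows one way to wrap an encoder, assuming the `sentence-transformers` package. The model name in the comments is an assumption, since the notebook does not name its model, and `embed_texts` is a hypothetical helper.

```python
def embed_texts(model, texts):
    """Encode strings into plain Python lists of floats.

    `model` is any object with an `encode(list_of_str)` method that returns
    one vector per input, such as a SentenceTransformer.
    """
    vectors = model.encode(list(texts))
    # Convert NumPy rows (or similar) to JSON-friendly lists of floats.
    return [[float(x) for x in vec] for vec in vectors]


# Typical use (model name is an assumption):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("all-MiniLM-L6-v2")
# dim = model.get_sentence_embedding_dimension()  # must match the index dimension
# vectors = embed_texts(model, ["Pinecone is a vector database."])
```

Reading the dimension from the model rather than hard-coding it avoids a mismatch with the index if you swap in a different encoder.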
4. Create a Pinecone index
This creates a serverless index named "quickstart" that performs nearest-neighbor search over your vectors using the Euclidean distance metric.
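A hedged sketch of index creation with the `pinecone` package (v3 or later). The cloud and region defaults are assumptions; substitute whatever your project supports. `ensure_index` is a hypothetical helper that skips creation when the index already exists, so the cell can be re-run safely.

```python
def ensure_index(pc, name, dimension, metric="euclidean",
                 cloud="aws", region="us-east-1"):
    """Create a serverless index unless one with this name already exists.

    Returns True if the index was created, False if it was already there.
    """
    if name in pc.list_indexes().names():
        return False
    from pinecone import ServerlessSpec  # only needed when actually creating
    pc.create_index(
        name=name,
        dimension=dimension,  # must equal the embedding size of your model
        metric=metric,
        spec=ServerlessSpec(cloud=cloud, region=region),  # assumed cloud/region
    )
    return True


# Typical use, assuming `pc` is the client from step 2:
# ensure_index(pc, "quickstart", dimension=dim)
# index = pc.Index("quickstart")
```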
5. Generate vector values from Wikipedia text
We retrieve a Wikipedia-based dataset with Hugging Face's datasets library. Note that this dataset already contains Cohere's vectors, but we generate our own in this notebook.
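The following sketch turns dataset rows into records ready for Pinecone. The dataset name and the `id` / `text` field names in the comments are guesses, since the notebook text does not state them; `build_records` and its `encode` callback are hypothetical.

```python
def build_records(rows, encode, text_key="text", id_key="id"):
    """Embed each row's text and package it as a Pinecone-style record.

    `encode` takes a list of strings and returns one vector per string.
    """
    rows = list(rows)
    texts = [row[text_key] for row in rows]
    vectors = encode(texts)
    return [
        {
            "id": str(row[id_key]),  # Pinecone IDs are strings
            "values": [float(x) for x in vec],
            "metadata": {"text": row[text_key]},  # keep text for display later
        }
        for row, vec in zip(rows, vectors)
    ]


# Typical use (dataset name and fields are assumptions):
# from itertools import islice
# from datasets import load_dataset
# rows = load_dataset("Cohere/wikipedia-22-12-simple-embeddings",
#                     split="train", streaming=True)
# records = build_records(islice(rows, 100), encode=model.encode)
```

Streaming with a small slice keeps the download short; the precomputed Cohere vectors in each row are simply ignored.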
6. Upsert vectors
Now that you’ve created your index and the vector embeddings of your Wikipedia data, you can upsert these vectors into your index.
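A minimal sketch of the upsert step, assuming `records` is a list of dicts with `id`, `values`, and optional `metadata` keys. Sending the data in batches keeps each request small; the batch size of 100 and the helper name `upsert_in_batches` are assumptions.

```python
def upsert_in_batches(index, records, batch_size=100):
    """Upsert records into a Pinecone index in fixed-size batches.

    Returns the number of records sent.
    """
    sent = 0
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        index.upsert(vectors=batch)
        sent += len(batch)
    return sent


# Typical use, assuming `pc` is the client from step 2:
# index = pc.Index("quickstart")
# upsert_in_batches(index, records)
```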
7. Check the index
Pinecone is eventually consistent, so there can be a delay before your upserted vectors are available to query. Use the describe_index_stats operation to check if the current vector count matches the number of vectors you upserted:
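Because of this delay, a single check may report a lower count than expected. The sketch below polls `describe_index_stats` until the count catches up; the timeout and polling interval are arbitrary assumptions, and `wait_for_vectors` is a hypothetical helper.

```python
import time


def wait_for_vectors(index, expected, timeout_s=60.0, poll_s=1.0):
    """Poll the index until it reports at least `expected` vectors.

    Returns the reported count, or raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        count = index.describe_index_stats().total_vector_count
        if count >= expected:
            return count
        if time.monotonic() >= deadline:
            raise TimeoutError(f"index reports {count} of {expected} vectors")
        time.sleep(poll_s)


# Typical use, assuming `index = pc.Index("quickstart")`:
# wait_for_vectors(index, expected=len(records))
```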
8. Run a similarity search
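A hedged sketch of a query, assuming the records were upserted with a `text` metadata field and that the query vector comes from the same encoder used for the data. The helper name `search` and the `top_k` default are assumptions.

```python
def search(index, query_vector, top_k=3):
    """Return (id, score, text) tuples for the nearest neighbors of a vector."""
    result = index.query(
        vector=query_vector,
        top_k=top_k,
        include_metadata=True,  # needed to get the stored text back
    )
    return [
        (m.id, m.score, (m.metadata or {}).get("text", ""))
        for m in result.matches
    ]


# Typical use, assuming `model` is the encoder from step 3:
# query_vector = model.encode("Who founded Wikipedia?").tolist()
# for doc_id, score, text in search(index, query_vector):
#     print(doc_id, score, text[:80])
```

With the Euclidean metric, a lower score means a closer match, so the best results are the ones with the smallest scores.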
9. Deploy an app with port forwarding and share it publicly
10. Clean up
When you no longer need the index, use the delete_index operation to delete it.
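A small sketch of the cleanup, assuming `pc` is the client from step 2. The existence check is an added convenience so the cell does not fail if the index is already gone; `delete_index_if_exists` is a hypothetical helper.

```python
def delete_index_if_exists(pc, name):
    """Delete the named index; return True if it existed, False otherwise."""
    if name not in pc.list_indexes().names():
        return False
    pc.delete_index(name)
    return True


# Typical use:
# delete_index_if_exists(pc, "quickstart")
```

Deleting an index removes all of its vectors, so run this only once you are finished with the tutorial data.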