Pinecone Tutorial

Vector Database, embeddings

Pinecone Quickstart

1. Load API Key with .env

[ ]
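
A minimal sketch of this step, assuming the key is stored under the name `PINECONE_API_KEY` in a `.env` file next to the notebook and read with the `python-dotenv` package. The variable name and the `load_api_key` helper are illustrative assumptions.

```python
import os


def load_api_key(var_name="PINECONE_API_KEY", env_path=".env"):
    """Return the API key from the environment, reading a .env file first if possible."""
    try:
        # python-dotenv (pip install python-dotenv) copies KEY=value lines from
        # the file into os.environ without overriding values that are already set.
        from dotenv import load_dotenv
        load_dotenv(env_path)
    except ImportError:
        # Assumption: without python-dotenv, the key was exported in the shell instead.
        pass

    api_key = os.getenv(var_name)
    if not api_key:
        raise RuntimeError(f"{var_name} is not set; add it to {env_path} or export it")
    return api_key
```

Keeping the key in `.env` (and out of version control) means it never appears in the notebook itself.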

2. Initialize Pinecone client

Next, use your API key to initialize your client:

[ ]
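
A minimal sketch of client initialization, assuming version 3 or later of the Pinecone Python SDK, which exposes a `Pinecone` class. The `make_client` wrapper is a hypothetical helper that only adds an early check for a missing key.

```python
def make_client(api_key):
    """Create a Pinecone client from an API key (requires the Pinecone SDK)."""
    if not api_key:
        raise ValueError("an API key is required to initialize the Pinecone client")
    # Imported lazily so the missing-key check runs even before the SDK is installed.
    from pinecone import Pinecone
    return Pinecone(api_key=api_key)


# Usage, assuming the key is available in the environment:
# import os
# pc = make_client(os.environ["PINECONE_API_KEY"])
```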

3. Prepare language model for vector encoder

We use a small transformers language model to create 384-dimensional embeddings. You can swap in other models to generate embeddings.

[ ]
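
A minimal sketch of an encoder, assuming the `sentence-transformers` package and the `all-MiniLM-L6-v2` model, which outputs 384-dimensional vectors. The model choice and both helper names are assumptions, and any model with a compatible `encode` method could be substituted.

```python
def load_encoder(model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Download (on first use) and load a small sentence-embedding model."""
    # Imported lazily because the package is large and pulls in PyTorch.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_name)


def embed(model, texts):
    """Encode texts and return one plain list of floats per input text."""
    vectors = model.encode(list(texts))
    # Convert array rows to Python floats so the vectors are JSON-serializable.
    return [[float(x) for x in vector] for vector in vectors]


# Usage:
# model = load_encoder()
# vectors = embed(model, ["Vector databases store embeddings."])
# len(vectors[0]) should equal the dimension of the index you create.
```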

4. Create Pinecone index

This creates a serverless index named "quickstart" that performs nearest-neighbor search using the Euclidean distance metric for your vectors.

[ ]
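
A minimal sketch of index creation that skips the call when the index already exists. The `ensure_index` helper, the default `dimension` of 384, and the cloud and region in the usage comment are assumptions; the dimension must equal the length of the vectors your encoder produces.

```python
def ensure_index(pc, spec, name="quickstart", dimension=384, metric="euclidean"):
    """Create the index if it is missing, then return a handle to it."""
    if name not in pc.list_indexes().names():
        pc.create_index(name=name, dimension=dimension, metric=metric, spec=spec)
    return pc.Index(name)


# Usage (needs the Pinecone SDK and a valid API key; cloud/region are examples):
# from pinecone import Pinecone, ServerlessSpec
# pc = Pinecone(api_key="...")
# index = ensure_index(pc, ServerlessSpec(cloud="aws", region="us-east-1"))
```

Checking `list_indexes()` first makes the cell safe to re-run, since creating an index whose name is already taken raises an error.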

5. Generate vector values from Wikipedia text

We retrieve a Wikipedia-based dataset with Hugging Face's datasets library. Note that this dataset already contains Cohere's vectors, but we generate our own in this notebook.

[ ]
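
A minimal sketch of this step. The dataset name `Cohere/wikipedia-22-12-simple-embeddings` and the row fields `id`, `title`, and `text` are assumptions about which dataset is meant; `embed_fn` stands for any function that maps a list of strings to a list of vectors.

```python
from itertools import islice


def load_rows(limit=100, dataset_name="Cohere/wikipedia-22-12-simple-embeddings"):
    """Stream the first `limit` rows so the full dataset is never downloaded."""
    from datasets import load_dataset  # pip install datasets
    stream = load_dataset(dataset_name, split="train", streaming=True)
    return list(islice(stream, limit))


def build_records(rows, embed_fn):
    """Embed each row's text and pair the vector with an id and metadata."""
    texts = [row["text"] for row in rows]
    vectors = embed_fn(texts)  # our own embeddings, not the dataset's stored vectors
    return [
        {
            "id": str(row["id"]),
            "values": list(vector),
            "metadata": {"title": row["title"], "text": row["text"]},
        }
        for row, vector in zip(rows, vectors)
    ]
```

Storing the passage text in `metadata` lets a later query return readable results instead of bare ids.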

6. Upsert vectors

Now that you’ve created your index and the vector embeddings of your Wikipedia data, you can upsert these vectors into your index.

[ ]
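
A minimal sketch of a batched upsert, assuming `index` is a Pinecone index handle and each record is a dict with `id`, `values`, and optional `metadata`. The `upsert_in_batches` helper and the batch size of 100 are illustrative choices.

```python
def upsert_in_batches(index, records, batch_size=100):
    """Send records to the index in fixed-size batches; return how many were sent."""
    sent = 0
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        index.upsert(vectors=batch)
        sent += len(batch)
    return sent
```

Batching keeps each request small, which matters once metadata such as passage text is attached to every vector.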

7. Check the index

Pinecone is eventually consistent, so there can be a delay before your upserted vectors are available to query. Use the describe_index_stats operation to check if the current vector count matches the number of vectors you upserted:

[ ]
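
A minimal sketch of polling `describe_index_stats` until the reported `total_vector_count` reaches the number of vectors you upserted. The `wait_for_vector_count` helper, the timeout, and the polling interval are assumptions.

```python
import time


def wait_for_vector_count(index, expected, timeout_s=60, poll_s=2):
    """Poll index stats until at least `expected` vectors are visible."""
    deadline = time.monotonic() + timeout_s
    while True:
        count = index.describe_index_stats().total_vector_count
        if count >= expected:
            return count
        if time.monotonic() >= deadline:
            raise TimeoutError(f"only {count} of {expected} vectors visible")
        time.sleep(poll_s)
```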

8. Run a similarity search

[ ]
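
A minimal sketch of a similarity search: embed the question with the same encoder used for the stored vectors, then query the index for its nearest neighbors. The `search` helper and the `text` metadata key are assumptions; `embed_fn` stands for any function that maps a list of strings to a list of vectors.

```python
def search(index, embed_fn, question, top_k=3):
    """Return (id, score, text) tuples for the vectors nearest to the question."""
    query_vector = embed_fn([question])[0]
    response = index.query(vector=query_vector, top_k=top_k, include_metadata=True)
    return [
        (match.id, match.score, (match.metadata or {}).get("text", ""))
        for match in response.matches
    ]
```

With the Euclidean metric, a lower score means a closer vector, so the best match is the one with the smallest score.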

9. Deploy an app, forward its port, and share it publicly

[ ]
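
A minimal sketch of one way to serve search results on a local port using only the Python standard library. The `make_server` helper, the `/search` route, and port 8000 are all illustrative assumptions, and how the port is forwarded and shared depends on your hosting environment. `search_fn` stands for any callable that maps a query string to JSON-serializable results.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def make_server(search_fn, host="127.0.0.1", port=8000):
    """Build an HTTP server that answers GET /search?q=... with JSON results."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            url = urlparse(self.path)
            query = parse_qs(url.query).get("q", [""])[0]
            if url.path != "/search" or not query:
                self.send_error(400, "expected /search?q=text")
                return
            body = json.dumps(search_fn(query)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            # Keep notebook output quiet; remove this override to see request logs.
            pass

    return ThreadingHTTPServer((host, port), Handler)


# Usage: blocks until interrupted, serving on the chosen port.
# make_server(my_search_function).serve_forever()
```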

10. Clean up

When you no longer need the index, use the delete_index operation to delete it.

[ ]
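
A minimal sketch of the cleanup step, assuming `pc` is an initialized Pinecone client. The `delete_index_if_present` helper is hypothetical and only wraps `delete_index` with an existence check so the cell can be re-run safely.

```python
def delete_index_if_present(pc, name="quickstart"):
    """Delete the named index if it exists; return True when a delete was issued."""
    if name in pc.list_indexes().names():
        pc.delete_index(name)
        return True
    return False
```

Deleting an index is permanent and removes all of its vectors.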