SaneGenius — Traverse Time. Master Technology.

Embeddings and Vector DBs

Introduction

How do we make an AI search for information conceptually rather than relying on exact keyword matches? The answer lies in embeddings and vector databases.

What are Embeddings?

An embedding is a numerical representation of a piece of data (such as text) as a vector in a high-dimensional space. Words or sentences with similar meanings are mapped to points that lie closer together in this space. For example, the vectors for "dog" and "puppy" will typically be very close to each other, while those for "dog" and "car" will be much farther apart.
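
To make the idea of "closeness" concrete, here is a minimal sketch that compares vectors with cosine similarity, one common way to measure how closely two vectors point in the same direction. The three-dimensional vectors below are hand-picked toy values chosen for illustration, not the output of a real embedding model (real embeddings usually have hundreds or thousands of dimensions), and the names `cosine_similarity` and `toy_embeddings` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b.

    Values near 1.0 mean the vectors point in nearly the same direction;
    values near 0.0 mean they are close to unrelated (orthogonal).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# ASSUMPTION: hand-picked toy vectors, not produced by any real model.
toy_embeddings = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}

dog_puppy = cosine_similarity(toy_embeddings["dog"], toy_embeddings["puppy"])
dog_car = cosine_similarity(toy_embeddings["dog"], toy_embeddings["car"])

print(f"dog vs puppy: {dog_puppy:.3f}")
print(f"dog vs car:   {dog_car:.3f}")
```

With these toy values, the "dog"/"puppy" score comes out much higher than the "dog"/"car" score, which is the pattern a trained embedding model is expected to produce for related and unrelated words.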

Vector Databases

Traditional databases typically search for exact values or keyword matches. Vector databases store embeddings and perform "similarity searches": they compare a query's embedding against the stored embeddings using a measure such as cosine similarity and return the closest matches. This lets you find documents that are semantically relevant to a query, even if they share no exact keywords.
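
The similarity search described above can be sketched with a tiny in-memory store. This is only an illustration under stated assumptions: `ToyVectorStore` is a hypothetical class, not the API of ChromaDB, Pinecone, or any other product, and the vectors are made-up values standing in for what an embedding model would return. It scores every stored vector against the query (a brute-force scan), whereas production vector databases generally rely on specialized indexes to avoid comparing against every item.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text, vector):
        """Store a document together with its embedding vector."""
        self._items.append((text, vector))

    def search(self, query_vector, k=2):
        """Return the k (score, text) pairs most similar to the query."""
        scored = [
            (cosine_similarity(query_vector, vector), text)
            for text, vector in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = ToyVectorStore()
# ASSUMPTION: made-up vectors, not produced by a real embedding model.
store.add("How to train a puppy", [0.9, 0.8, 0.1])
store.add("Choosing food for your dog", [0.8, 0.9, 0.2])
store.add("Changing a flat tire", [0.1, 0.2, 0.95])

# Made-up vector standing in for the embedded query "canine care tips".
query_vector = [0.85, 0.85, 0.1]
results = store.search(query_vector, k=2)

for score, text in results:
    print(f"{score:.3f}  {text}")
```

The query text "canine care tips" shares no words with either pet document, yet both rank above the tire document because their toy vectors point in a similar direction to the query vector.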

Assignment

Read this blog post on how Text Embeddings work under the hood.
Explore the documentation for ChromaDB or Pinecone and set up a local instance or free account.
