Vector Stores
PostgreSQL
As we need a backend SQL database to store conversation history and other info, using Postgres as a vector store is very attractive for us. Implementeing all this functionalities using the same technology reduces deployment overhead and complexity.
See the recipes for database configs here
pip install psycopg2-binary pgvector
# backend/config.yaml
VectorStoreConfig: &VectorStoreConfig
source: PGVector
source_config:
connection_string: {{ DATABASE_URL }}
retriever_search_type: similarity_score_threshold
retriever_config:
k: 20
score_threshold: 0.5
insertion_mode: null
top_k
: maximum number of documents to fetch.
score_threshold
: score below which a document is deemed irrelevant and not fetched.
insertion_mode
: null
| full
| incremental
. How document indexing and insertion in the vector store is handled.
Local Chroma
# backend/config.yaml
VectorStoreConfig: &VectorStoreConfig
source: Chroma
source_config:
persist_directory: vector_database/
collection_metadata:
hnsw:space: cosine
retriever_search_type: similarity_score_threshold
retriever_config:
k: 20
score_threshold: 0.5
insertion_mode: null
persist_directory
: where, locally the Chroma database will be persisted.
hnsw:space: cosine
: distance function used. Default is l2
. Cosine is bounded [0; 1], making it easier to set a score threshold for retrival.
top_k
: maximum number of documents to fetch.
score_threshold
: score below which a document is deemed irrelevant and not fetched.
insertion_mode
: null
| full
| incremental
. How document indexing and insertion in the vector store is handled.