DocuMancer AI

Vector Database Backend

FAISS (Facebook AI Similarity Search) is the vector store used in this project. It stores embeddings and performs high-performance similarity search, enabling fast and accurate retrieval of the document chunks most relevant to the user's query.
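
To make this concrete, here is a minimal standalone sketch of FAISS similarity search (not part of index.py; random vectors stand in for real embeddings),

import numpy as np
from faiss import IndexFlatL2

dim = 4
index = IndexFlatL2(dim)                             # flat index using L2 distance
vectors = np.random.random((10, dim)).astype("float32")
index.add(vectors)                                   # store the embeddings
query = np.random.random((1, dim)).astype("float32")
distances, ids = index.search(query, 3)              # find the 3 nearest vectors
print(ids, distances)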

Install the required libraries. Here is the requirements.txt file,

langchain-openai
langchain_community
langchain_core
faiss-cpu
markdown
tiktoken
fastapi
pydantic
uvicorn
python-dotenv
scikit-learn
requests

Inside the project directory, activate the virtual environment (Windows users),

C:/project-directory/> venv/Scripts/activate

(venv)C:/project-directory/>cd vector_store

(venv)C:/project-directory/vector_store>

Then install the libraries,

pip install -r requirements.txt

Import the libraries in the index.py file,

from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
import uvicorn
import os
from typing import List
from langchain_community.vectorstores import FAISS
from langchain_openai import AzureOpenAIEmbeddings
from langchain_community.docstore.in_memory import InMemoryDocstore
from langchain_core.documents import Document
from faiss import IndexFlatL2
from dotenv import load_dotenv

Load environment variables (such as the Azure credentials and the host/port),

load_dotenv()

# Constants
VECTOR_STORE_PATH = "vector_store"
INDEX_NAME = "index"
EMBEDDING_DIM = 1536  # for Azure text-embedding-ada-002, adjust if needed
PORT = int(os.environ["PORT"])
HOST = os.environ["HOST"]
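
For reference, a matching .env file might look like this (all values are placeholders; substitute your own Azure resource details),

AZURE_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_EMBEDDING_DEPLOYMENT=your-embedding-deployment
AZURE_API_KEY=your-api-key
AZURE_EMBEDDING_VERSION=2023-05-15
HOST=localhost
PORT=8001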

Set up FastAPI, CORS, and the Azure embedding model that converts the query into a vector for searching the database.

# Fast API Setup
app = FastAPI()

# enable cors
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Azure embedding model
embedding_model = AzureOpenAIEmbeddings(
    azure_endpoint=os.environ["AZURE_ENDPOINT"],
    deployment=os.environ["AZURE_EMBEDDING_DEPLOYMENT"],
    api_key=os.environ["AZURE_API_KEY"],
    openai_api_version=os.environ["AZURE_EMBEDDING_VERSION"]
)

Initialize the vector store: if a FAISS index file already exists on disk, load it; otherwise, create a new IndexFlatL2 index, a flat index that uses the standard L2 (Euclidean) distance.

def load_vector_store():
    if os.path.exists(os.path.join(VECTOR_STORE_PATH, f"{INDEX_NAME}.faiss")):
        return FAISS.load_local(VECTOR_STORE_PATH, embedding_model, index_name=INDEX_NAME, allow_dangerous_deserialization=True)
    else:
        index = IndexFlatL2(EMBEDDING_DIM)
        return FAISS(embedding_model, index, InMemoryDocstore({}), {})

vector_store = load_vector_store()
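
The endpoints below rely on two Pydantic request models that are not shown elsewhere in this section. Here is a minimal sketch, with the fields inferred from how the handlers use them (the top_k default is an assumption),

# Request models (fields inferred from the endpoint handlers below)
class EmbeddingItem(BaseModel):
    content: str    # text chunk to embed and store
    metadata: dict  # metadata attached to the chunk (e.g., source file)

class QueryRequest(BaseModel):
    query: str      # user query to embed and search with
    top_k: int = 5  # number of matches to return (assumed default)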

/store endpoint,

  • Accepts a list of items and converts them into Document objects.
  • Calls vector_store.add_texts(...) to embed and store them.
  • Saves the vector store to disk using save_local.

@app.post("/store")
async def store_embeddings(data: List[EmbeddingItem]):
    try:
        docs = [
            Document(
                page_content=item.content,
                metadata=item.metadata
            )
            for item in data
        ]
        # add to vector db
        vector_store.add_texts(
            texts=[doc.page_content for doc in docs],
            metadatas=[doc.metadata for doc in docs]
        )
        # Persist store
        vector_store.save_local(VECTOR_STORE_PATH, index_name=INDEX_NAME)
        return {"status": "success", "stored": len(data)}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
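
As a quick check, you can call the endpoint with the requests library (a hypothetical payload, assuming the service runs at http://localhost:8001),

import requests

items = [
    {"content": "FAISS is a library for similarity search.",
     "metadata": {"source": "docs/faiss.md"}},
    {"content": "Embeddings are dense vector representations of text.",
     "metadata": {"source": "docs/embeddings.md"}},
]
resp = requests.post("http://localhost:8001/store", json=items)
print(resp.json())  # expected: {"status": "success", "stored": 2}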

/search endpoint,

  • Accepts a query and embeds it using the Azure OpenAI embedding model.
  • Searches the vector store for the most similar documents and returns the content and metadata of the top matches.

@app.post("/search")
async def search_query(request: QueryRequest):
    try:
        # similarity_search returns plain Documents (no scores)
        docs = vector_store.similarity_search(
            query=request.query,
            k=request.top_k
        )
        result = [
            {
                "content": doc.page_content,
                "metadata": doc.metadata
            }
            for doc in docs
        ]
        return result
    except Exception as e:
        import traceback
        traceback.print_exc()
        raise HTTPException(status_code=500, detail=str(e))
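
A matching hypothetical client call,

import requests

payload = {"query": "What is FAISS?", "top_k": 3}
resp = requests.post("http://localhost:8001/search", json=payload)
for match in resp.json():
    print(match["metadata"], match["content"][:60])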

The vector store backend starts from the main block; let's say the host is localhost and the port is 8001 for local development.

if __name__ == "__main__":
    uvicorn.run(app, host=HOST, port=PORT)

To run the code,

C:/rag_chatbot_k8>venv/Scripts/activate
(venv)C:/rag_chatbot_k8>cd vector_store
(venv)C:/rag_chatbot_k8/vector_store>python index.py

Now the service is running at http://localhost:8001