Overview of DocuMancer AI
DocuMancer AI is an open-source AI chatbot that reads your Markdown (.md) files and generates responses using the RAG (Retrieval-Augmented Generation) technique. In this project, the source documents are the Kubernetes documentation pulled from GitHub.
Whether you're building an internal support tool, documentation assistant, or a smart chatbot for your team, this project shows you how to build one from scratch using your own docs and modern AI tools.
How DocuMancer AI Works using RAG
RAG combines two powerful techniques:
- Retrieval: Searches for the most relevant documents from your Markdown (.md) files.
- Generation: Passes that information to an LLM (like GPT) to generate an accurate answer.
Instead of training a new model, you plug the Kubernetes GitHub docs into GPT as context, which helps it give better-grounded answers.
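The two steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the project's code: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory ranking stands in for a vector database, and the names `embed`, `retrieve`, and `build_prompt` are hypothetical. The resulting prompt is what you would hand to the LLM in the generation step.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): bag-of-words counts, not a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Retrieval step: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Generation step input: put the retrieved chunks in front of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical document chunks, made up for this example.
chunks = [
    "A Pod is the smallest deployable unit in Kubernetes.",
    "A Service exposes a set of Pods as a network service.",
    "Helm charts package Kubernetes manifests.",
]
question = "What is a Pod in Kubernetes?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real setup, `embed` would call an embedding model and `retrieve` would query a vector index, but the flow stays the same: embed the query, rank the chunks, and pass the best ones to the LLM alongside the question.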
Tools and Technologies
- LangChain to orchestrate the RAG workflow
- FastAPI for building the backend API
- React to create a friendly frontend UI
- Azure OpenAI GPT-4o model (or any LLM provider) to answer questions
- Azure OpenAI text-embedding-3-small for text embeddings
- FAISS vector database to store and search your document chunks efficiently
Developing DocuMancer AI
Clone the repository; the rag_chatbot_k8 folder is your DocuMancer AI project.
The folder structure of this project:
rag_chatbot_k8
|__ frontend
|__ main_backend
|__ sync_backend
|__ vector_store
|__ k8s/ # for Helm charts
|__ Readme.md
There are three services:
- main_backend holds the query processing logic: it accepts the user query from the frontend and sends it to vector_store.
- vector_store is the vector database service, which converts the query to an embedding and finds the relevant documents using similarity search. FAISS is used as the vector database for this project.
- sync_backend is a cron job service rather than an API service. It is a batch process scheduled to run weekly or monthly. It clones and copies the Kubernetes GitHub docs, computes text embeddings for those documents, and saves the embeddings in the vector store.
The frontend is the UI, written in React.
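Before sync_backend can embed the Kubernetes docs, each Markdown file has to be cut into chunks. The sketch below shows one plausible way to do that, splitting at headings and capping the chunk length. It is an assumption for illustration, not the project's actual splitter: the function name `split_markdown` and the `max_chars` parameter are hypothetical, and a real pipeline might use a library text splitter instead.

```python
import re


def split_markdown(text: str, max_chars: int = 500) -> list[str]:
    """Split Markdown into chunks at headings, then cap each chunk's length."""
    # Split just before every line that starts with 1-6 '#' characters and a space.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks = []
    for section in sections:
        section = section.strip()
        # Cut oversized sections so no chunk exceeds max_chars characters.
        for start in range(0, len(section), max_chars):
            piece = section[start:start + max_chars].strip()
            if piece:
                chunks.append(piece)
    return chunks


# Hypothetical Markdown input, made up for this example.
doc = "# Pods\nA Pod runs containers.\n\n## Services\nA Service exposes Pods.\n"
chunks = split_markdown(doc)
for chunk in chunks:
    print(repr(chunk))
```

Note that this simple version treats any line starting with `# ` as a heading, including comment lines inside code samples, and the length cap can cut a sentence in half. Both are acceptable for a sketch, but worth handling before embedding real documentation.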