Initial setup: Knowledge base RAG system with LlamaIndex and ChromaDB

- Add Python project with uv package manager
- Implement LlamaIndex + ChromaDB RAG pipeline
- Add sentence-transformers for local embeddings (all-MiniLM-L6-v2)
- Create MCP server with semantic search, indexing, and stats tools
- Add Markdown chunker with heading/wikilink/frontmatter support
- Add Dockerfile and docker-compose.yaml for self-hosted deployment
- Include sample Obsidian vault files for testing
- Add .gitignore and .env.example
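The committed chunker splits notes on headings and tracks wikilinks and frontmatter. A minimal sketch of what such a Markdown chunker might look like (the function names, the dict-based chunk shape, and the simplistic `key: value` frontmatter scan are assumptions for illustration, not the committed implementation):

```python
import re


def parse_frontmatter(text: str) -> tuple[dict, str]:
    """Split optional YAML frontmatter from a Markdown document.

    Returns a (metadata, body) pair; metadata parsing here is a
    simplistic key: value scan, not a full YAML parser.
    """
    match = re.match(r"^---\n(.*?)\n---\n", text, re.DOTALL)
    if not match:
        return {}, text
    meta = {}
    for line in match.group(1).splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, text[match.end():]


def extract_wikilinks(text: str) -> list[str]:
    """Collect [[wikilink]] targets, dropping any |alias part."""
    return [t.split("|")[0].strip() for t in re.findall(r"\[\[([^\]]+)\]\]", text)]


def chunk_markdown(text: str) -> list[dict]:
    """Split a note into one chunk per heading section."""
    meta, body = parse_frontmatter(text)
    chunks = []
    current_heading, lines = None, []

    def flush():
        content = "\n".join(lines).strip()
        if content:
            chunks.append({
                "heading": current_heading,
                "text": content,
                "links": extract_wikilinks(content),
                "metadata": meta,
            })

    for line in body.splitlines():
        if line.startswith("#"):  # naive: also matches '#' inside code fences
            flush()
            current_heading, lines = line.lstrip("#").strip(), []
        else:
            lines.append(line)
    flush()
    return chunks
```

Each chunk carries its heading and outgoing links as metadata, which is what makes heading-scoped retrieval and link-aware filtering possible downstream.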
2026-03-03 20:42:42 -05:00
parent 94dd158d1c
commit 11c3f705ce
11 changed files with 5319 additions and 0 deletions

docker-compose.yaml (new file, 32 lines)

@@ -0,0 +1,32 @@
version: "3.8"

services:
  knowledge-rag:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: knowledge-rag
    volumes:
      # Mount your Obsidian vault here
      - ${VAULT_PATH:-./knowledge}:/data/vault
      # Persist ChromaDB vector store
      - ./data/chroma_db:/data/chroma_db
      # Persist embeddings cache
      - ./data/embeddings_cache:/data/embeddings_cache
    environment:
      - VAULT_PATH=/data/vault
      - EMBEDDING_MODEL=${EMBEDDING_MODEL:-all-MiniLM-L6-v2}
      - EMBEDDINGS_CACHE_DIR=/data/embeddings_cache
    restart: unless-stopped

  # Optional: Watchtower for auto-updates
  # watchtower:
  #   image: containrrr/watchtower
  #   container_name: watchtower
  #   volumes:
  #     - /var/run/docker.sock:/var/run/docker.sock
  #   environment:
  #     - WATCHTOWER_CLEANUP=true
  #     - WATCHTOWER_INCLUDE_STOPPED=true
  #   command: --interval 3600 knowledge-rag
  #   restart: unless-stopped
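Typical first-run commands for this compose file might look like the following (a sketch assuming Docker with the Compose plugin is installed, and that the committed `.env.example` holds the `VAULT_PATH` and `EMBEDDING_MODEL` defaults):

```shell
# Create the local state directories the volume mounts expect,
# and seed an env file from the committed template.
mkdir -p data/chroma_db data/embeddings_cache
cp .env.example .env

# Build the image and start the service in the background.
docker compose up -d --build

# Tail logs to confirm the index builds against the mounted vault.
docker compose logs -f knowledge-rag
```

Because the vector store and embeddings cache are bind-mounted under `./data`, the container can be rebuilt or auto-updated (e.g. by the optional Watchtower service) without re-embedding the whole vault.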