Initial setup: Knowledge base RAG system with LlamaIndex and ChromaDB

- Add Python project with uv package manager
- Implement LlamaIndex + ChromaDB RAG pipeline
- Add sentence-transformers for local embeddings (all-MiniLM-L6-v2)
- Create MCP server with semantic search, indexing, and stats tools
- Add Markdown chunker with heading/wikilink/frontmatter support
- Add Dockerfile and docker-compose.yaml for self-hosted deployment
- Include sample Obsidian vault files for testing
- Add .gitignore and .env.example
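The chunker itself lives in the `src/` tree, which this hunk does not show. A minimal sketch of the heading/wikilink/frontmatter handling the bullet describes (function name and return shape are assumptions, not the committed API) might look like:

```python
import re

def chunk_markdown(text: str) -> list[dict]:
    """Split a Markdown note into heading-delimited chunks.

    Returns one dict per section with its heading, body, any
    [[wikilinks]] found in the body, and the note's frontmatter.
    """
    # Strip YAML frontmatter (--- ... ---) if present at the top
    frontmatter = None
    m = re.match(r"^---\n(.*?)\n---\n", text, re.DOTALL)
    if m:
        frontmatter = m.group(1)
        text = text[m.end():]

    # Start a new chunk at every ATX heading (#, ##, ... ######)
    chunks = []
    current = {"heading": None, "body": []}
    for line in text.splitlines():
        if re.match(r"^#{1,6} ", line):
            if current["heading"] or current["body"]:
                chunks.append(current)
            current = {"heading": line.lstrip("#").strip(), "body": []}
        else:
            current["body"].append(line)
    chunks.append(current)

    # Attach wikilinks ([[Target]] or [[Target|alias]]) and frontmatter
    for chunk in chunks:
        body = "\n".join(chunk["body"]).strip()
        chunk["body"] = body
        chunk["wikilinks"] = re.findall(r"\[\[([^\]|]+)(?:\|[^\]]+)?\]\]", body)
        chunk["frontmatter"] = frontmatter
    return chunks
```

Chunking on headings keeps each embedded passage topically coherent, which generally improves retrieval precision over fixed-size windows.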
2026-03-03 20:42:42 -05:00
parent 94dd158d1c
commit 11c3f705ce
11 changed files with 5319 additions and 0 deletions

Dockerfile

@@ -0,0 +1,33 @@
FROM python:3.11-slim
# Install system dependencies for sentence-transformers
RUN apt-get update && apt-get install -y --no-install-recommends \
gcc \
g++ \
&& rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Install uv
RUN pip install uv
# Copy dependency manifests first for layer caching
# (uv sync --frozen requires the lockfile)
COPY pyproject.toml uv.lock ./
# Install third-party dependencies only; the project source
# is not copied yet, so skip installing the project itself
RUN uv sync --frozen --no-dev --no-install-project
# Copy source code, then install the project into the environment
COPY src/ ./src/
RUN uv sync --frozen --no-dev
# Create data directories
RUN mkdir -p /data/vault /data/chroma_db /data/embeddings_cache
# Set environment variables
ENV PYTHONUNBUFFERED=1 \
VAULT_PATH=/data/vault \
EMBEDDINGS_CACHE_DIR=/data/embeddings_cache
# Default command runs the MCP server inside the uv-managed environment
CMD ["uv", "run", "python", "-m", "knowledge_rag.server"]
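The commit also adds a docker-compose.yaml, not shown in this hunk. A sketch consistent with the Dockerfile's `/data` paths and environment variables (the service and volume names here are assumptions) could be:

```yaml
services:
  knowledge-rag:
    build: .
    environment:
      VAULT_PATH: /data/vault
      EMBEDDINGS_CACHE_DIR: /data/embeddings_cache
    volumes:
      # Mount the Obsidian vault read-only; persist index and model cache
      - ./vault:/data/vault:ro
      - chroma_db:/data/chroma_db
      - embeddings_cache:/data/embeddings_cache

volumes:
  chroma_db:
  embeddings_cache:
```

Named volumes for the ChromaDB index and embeddings cache survive container rebuilds, so the vault does not need re-indexing after every image update.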