Initial setup: Knowledge base RAG system with LlamaIndex and ChromaDB

- Add Python project with uv package manager
- Implement LlamaIndex + ChromaDB RAG pipeline
- Add sentence-transformers for local embeddings (all-MiniLM-L6-v2)
- Create MCP server with semantic search, indexing, and stats tools
- Add Markdown chunker with heading/wikilink/frontmatter support
- Add Dockerfile and docker-compose.yaml for self-hosted deployment
- Include sample Obsidian vault files for testing
- Add .gitignore and .env.example
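The chunker itself lives in the `src/` tree, which this hunk does not show. A minimal sketch of the heading/wikilink/frontmatter handling the bullet describes (function name and return shape are assumptions, not the committed API) might look like:

```python
import re

def chunk_markdown(text: str) -> list[dict]:
    """Split a Markdown note into heading-delimited chunks.

    Returns one dict per section with its heading, body, any
    [[wikilinks]] found in the body, and the note's frontmatter.
    """
    # Strip YAML frontmatter (--- ... ---) if present at the top
    frontmatter = None
    m = re.match(r"^---\n(.*?)\n---\n", text, re.DOTALL)
    if m:
        frontmatter = m.group(1)
        text = text[m.end():]

    # Start a new chunk at every ATX heading (#, ##, ... ######)
    chunks = []
    current = {"heading": None, "body": []}
    for line in text.splitlines():
        if re.match(r"^#{1,6} ", line):
            if current["heading"] or current["body"]:
                chunks.append(current)
            current = {"heading": line.lstrip("#").strip(), "body": []}
        else:
            current["body"].append(line)
    chunks.append(current)

    # Attach wikilinks ([[Target]] or [[Target|alias]]) and frontmatter
    for chunk in chunks:
        body = "\n".join(chunk["body"]).strip()
        chunk["body"] = body
        chunk["wikilinks"] = re.findall(r"\[\[([^\]|]+)(?:\|[^\]]+)?\]\]", body)
        chunk["frontmatter"] = frontmatter
    return chunks
```

Chunking on headings keeps each embedded passage topically coherent, which generally improves retrieval precision over fixed-size windows.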
2026-03-03 20:42:42 -05:00
parent 94dd158d1c
commit 11c3f705ce
11 changed files with 5319 additions and 0 deletions

Dockerfile

@@ -0,0 +1,33 @@
FROM python:3.11-slim
# Install system dependencies for sentence-transformers
RUN apt-get update && apt-get install -y --no-install-recommends \
gcc \
g++ \
&& rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Install uv
RUN pip install uv
# Copy dependency manifests first for layer caching
# (uv sync --frozen requires the lockfile)
COPY pyproject.toml uv.lock ./
# Install third-party dependencies only; the project source
# is not copied yet, so skip installing the project itself
RUN uv sync --frozen --no-dev --no-install-project
# Copy source code, then install the project into the environment
COPY src/ ./src/
RUN uv sync --frozen --no-dev
# Create data directories
RUN mkdir -p /data/vault /data/chroma_db /data/embeddings_cache
# Set environment variables
ENV PYTHONUNBUFFERED=1 \
VAULT_PATH=/data/vault \
EMBEDDINGS_CACHE_DIR=/data/embeddings_cache
# Default command runs the MCP server inside the uv-managed environment
CMD ["uv", "run", "python", "-m", "knowledge_rag.server"]
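The commit also adds a docker-compose.yaml, not shown in this hunk. A sketch consistent with the Dockerfile's `/data` paths and environment variables (the service and volume names here are assumptions) could be:

```yaml
services:
  knowledge-rag:
    build: .
    environment:
      VAULT_PATH: /data/vault
      EMBEDDINGS_CACHE_DIR: /data/embeddings_cache
    volumes:
      # Mount the Obsidian vault read-only; persist index and model cache
      - ./vault:/data/vault:ro
      - chroma_db:/data/chroma_db
      - embeddings_cache:/data/embeddings_cache

volumes:
  chroma_db:
  embeddings_cache:
```

Named volumes for the ChromaDB index and embeddings cache survive container rebuilds, so the vault does not need re-indexing after every image update.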