Remove Docker, update README with setup and auto-start instructions
- Remove Dockerfile and docker-compose.yaml (not suitable for this project)
- Update README.md with comprehensive setup documentation
- Add systemd, tmux, and rc.local auto-start options
- Add troubleshooting section
172
README.md
@@ -1,21 +1,171 @@
# Knowledge Base
# Knowledge Base RAG System

Personal knowledge base repository for storing useful information, notes, and documentation.
A self-hosted RAG (Retrieval Augmented Generation) system for your Obsidian vault with MCP server integration.

## Contents
## Features

- [Getting Started](#getting-started)
- [Contributing](#contributing)
- [License](#license)
- **Semantic Search**: Find relevant content using embeddings, not just keywords
- **MCP Server**: Exposes search, indexing, and stats tools over MCP
- **Local-first**: No external APIs; everything runs locally
- **Obsidian Compatible**: Works with your existing Markdown vault

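To make the "embeddings, not just keywords" idea concrete, here is a minimal sketch of similarity ranking. It is not the project's implementation: the three-dimensional vectors are hand-made stand-ins for real model embeddings, and `cosine_similarity`, `rank_chunks`, and `TOY_CHUNKS` are hypothetical names chosen for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec: list[float],
                chunk_vecs: dict[str, list[float]],
                top_k: int = 5) -> list[tuple[str, float]]:
    """Return the top_k (chunk_id, score) pairs, best match first."""
    scored = [(cid, cosine_similarity(query_vec, vec))
              for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hand-made vectors standing in for real embeddings (assumption, not real data).
TOY_CHUNKS = {
    "notes/cooking.md": [0.9, 0.1, 0.0],
    "notes/rag.md": [0.1, 0.9, 0.2],
    "notes/travel.md": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    query = [0.2, 0.8, 0.1]  # pretend embedding of a question about retrieval
    for chunk_id, score in rank_chunks(query, TOY_CHUNKS, top_k=2):
        print(f"{score:.3f}  {chunk_id}")
```

A query vector that points in roughly the same direction as a chunk's vector scores close to 1.0 even when the two texts share no words, which is what separates this from keyword matching.
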
## Getting Started
## Requirements

This repository contains various knowledge articles, how-to guides, and reference documentation.
- Python 3.11+
- ~2GB disk space for the embedding model

## Contributing
## Quick Start

Feel free to contribute by creating issues or submitting pull requests.
### 1. Install uv (if not already installed)

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.local/bin/env
```

### 2. Clone and set up

```bash
# Enter the cloned repository (this README assumes ~/knowledge-base)
cd ~/knowledge-base
# Create your local configuration from the template
cp .env.example .env
```

### 3. Configure

Edit `.env` to set your vault path:

```bash
VAULT_PATH=/path/to/your/obsidian-vault
EMBEDDING_MODEL=all-MiniLM-L6-v2  # optional
```

### 4. Install dependencies

```bash
uv sync
```

### 5. Run the server

```bash
source .venv/bin/activate
VAULT_PATH=./knowledge python -m knowledge_rag.server
```

The server will:
- Auto-index your vault on startup
- Listen for MCP requests via stdio

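As a rough picture of what indexing a vault on startup involves, the sketch below collects the Markdown files under `VAULT_PATH`. It is an assumption about the general approach, not the code in `server.py`; `find_markdown_files` is a hypothetical helper, and skipping hidden folders such as `.obsidian` is a design choice made here for illustration.

```python
from pathlib import Path


def find_markdown_files(vault_path: str) -> list[Path]:
    """Recursively collect .md files, skipping hidden folders such as .obsidian."""
    root = Path(vault_path).expanduser()
    if not root.is_dir():
        raise FileNotFoundError(f"Vault path is not a directory: {root}")
    files = []
    for path in root.rglob("*.md"):
        # Ignore anything that lives inside a dot-directory or is a dotfile.
        relative_parts = path.relative_to(root).parts
        if any(part.startswith(".") for part in relative_parts):
            continue
        if path.is_file():
            files.append(path)
    return sorted(files)
```

Calling `find_markdown_files("./knowledge")` from the repository root would list the notes that a startup index pass has to read, chunk, and embed.
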
## MCP Tools

Once running, these tools are available:

| Tool | Description |
|------|-------------|
| `search_knowledge` | Semantic search across your vault |
| `index_knowledge` | Re-index the vault (use after adding files) |
| `get_knowledge_stats` | View indexing statistics |

## Usage Example

```python
# Example: Searching the knowledge base
# (via MCP client or Claude Desktop integration)
# Illustrative pseudocode: the exact call syntax depends on your MCP client.

await search_knowledge({
    "query": "how does the RAG system work",
    "top_k": 5
})
```

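Under the hood, an MCP client turns a tool call like the one above into a JSON-RPC 2.0 `tools/call` request written to the server's stdin. The sketch below only builds that message; `build_tool_call` is a hypothetical helper, the `query` and `top_k` argument names are taken from the example above, and a real session also has to perform the MCP `initialize` handshake first, which is omitted here.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so that responses can be matched to them.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request as a single line."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # The stdio transport is newline-delimited; json.dumps escapes any
    # newlines inside strings, so the result is always exactly one line.
    return json.dumps(request, separators=(",", ":"))


if __name__ == "__main__":
    print(build_tool_call(
        "search_knowledge",
        {"query": "how does the RAG system work", "top_k": 5},
    ))
```

In practice an MCP client library or Claude Desktop produces these messages for you; the sketch is mainly useful for reading logs or debugging the stdio stream.
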
## Auto-Start on Boot

### Option 1: Systemd Service

Create `/etc/systemd/system/knowledge-rag.service`:

```ini
[Unit]
Description=Knowledge Base RAG MCP Server
After=network.target

[Service]
Type=simple
User=ernie
WorkingDirectory=/home/ernie/knowledge-base
Environment="VAULT_PATH=/home/ernie/knowledge"
Environment="PATH=/home/ernie/.local/bin:/usr/bin:/bin"
ExecStart=/home/ernie/knowledge-base/.venv/bin/python -m knowledge_rag.server
Restart=always

[Install]
WantedBy=multi-user.target
```

Then reload systemd, enable the service at boot, and start it:

```bash
sudo systemctl daemon-reload
sudo systemctl enable knowledge-rag.service
sudo systemctl start knowledge-rag.service
```

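The unit file in this section hard-codes the user `ernie` and paths under `/home/ernie`. If you would rather generate it for your own account, a small sketch along these lines may help. `render_unit` and `UNIT_TEMPLATE` are hypothetical names, and the layout (`<home>/knowledge-base` for the project, `<home>/knowledge` for the vault) is assumed to mirror the example unit.

```python
from string import Template

# Same directives as the example unit, with the user-specific parts templated.
UNIT_TEMPLATE = Template("""\
[Unit]
Description=Knowledge Base RAG MCP Server
After=network.target

[Service]
Type=simple
User=$user
WorkingDirectory=$home/knowledge-base
Environment="VAULT_PATH=$home/knowledge"
Environment="PATH=$home/.local/bin:/usr/bin:/bin"
ExecStart=$home/knowledge-base/.venv/bin/python -m knowledge_rag.server
Restart=always

[Install]
WantedBy=multi-user.target
""")


def render_unit(user: str, home: str) -> str:
    """Fill in the template for one user; `home` is that user's home directory."""
    return UNIT_TEMPLATE.substitute(user=user, home=home.rstrip("/"))


if __name__ == "__main__":
    # Redirect this output to a file, review it, then copy it into place with sudo.
    print(render_unit("ernie", "/home/ernie"), end="")
```

Printing the unit rather than writing it directly to `/etc/systemd/system/` keeps the script free of root privileges and lets you inspect the result first.
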
### Option 2: tmux/screen

```bash
# Start in tmux
tmux new -s knowledge-rag
source .venv/bin/activate
VAULT_PATH=./knowledge python -m knowledge_rag.server
# Detach: Ctrl+b, then d
```

### Option 3: rc.local or startup script

Add to your `~/.bashrc` (which runs when you open an interactive shell, not at boot) or to a startup script such as `rc.local`:

```bash
# Only start if not already running
if ! pgrep -f "knowledge_rag.server" > /dev/null; then
    # Use a subshell so the cd and venv activation do not leak into the calling shell
    (
        cd ~/knowledge-base || exit 1
        source .venv/bin/activate
        VAULT_PATH=./knowledge nohup python -m knowledge_rag.server > /tmp/knowledge-rag.log 2>&1 &
    )
fi
```

## Project Structure

```
knowledge-base/
├── src/knowledge_rag/     # Source code
│   ├── server.py          # MCP server
│   ├── chunker.py         # Markdown chunking
│   ├── embeddings.py      # Sentence-transformers wrapper
│   └── vector_store.py    # ChromaDB wrapper
├── knowledge/             # Your Obsidian vault (gitignored)
├── pyproject.toml         # Project config
└── .env.example           # Environment template
```

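The tree lists `chunker.py` as responsible for Markdown chunking. The README does not say how it splits notes, so the following is only one plausible strategy, sketched under that assumption: start a new chunk at every ATX heading and ignore `#` lines inside fenced code blocks. `chunk_markdown` is a hypothetical name and not necessarily what the module exposes.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")
FENCE = "`" * 3  # code-fence marker, built this way to keep this example readable


def chunk_markdown(text: str) -> list[dict]:
    """Split Markdown into chunks, starting a new chunk at every ATX heading."""
    chunks = []
    heading = ""      # heading the current chunk belongs to ("" means preamble)
    lines = []
    in_fence = False  # '#' comments inside code fences are not headings

    def flush():
        # Store the lines collected so far; sections with no body are dropped.
        body = "\n".join(lines).strip()
        if body:
            chunks.append({"heading": heading, "text": body})

    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            flush()
            heading = match.group(2).strip()
            lines = []
        else:
            lines.append(line)
    flush()
    return chunks
```

Keeping the heading alongside each chunk's text gives search results a human-readable label, and heading-sized chunks tend to stay within what a small embedding model handles well.
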
## Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `VAULT_PATH` | `/data/vault` | Path to your Obsidian vault |
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Sentence-transformers model |
| `EMBEDDINGS_CACHE_DIR` | `/data/embeddings_cache` | Model cache location |

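For reference, reading these three variables with their documented defaults could look like the sketch below. The variable names and defaults come from the table; the `Settings` dataclass and `load_settings` function are hypothetical and may not match how the server actually loads its configuration.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    vault_path: Path
    embedding_model: str
    embeddings_cache_dir: Path


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the variables from the table, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        vault_path=Path(env.get("VAULT_PATH", "/data/vault")),
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        embeddings_cache_dir=Path(
            env.get("EMBEDDINGS_CACHE_DIR", "/data/embeddings_cache")
        ),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"VAULT_PATH": "./knowledge"})`.
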
## Troubleshooting

### First run is slow
The embedding model (~90MB) downloads on first run. Subsequent runs are faster.

### No search results
Run the `index_knowledge` tool to index your vault, or restart the server, which re-indexes on startup.

### Out of memory
The default model is lightweight. For an even smaller model, set `EMBEDDING_MODEL` to `paraphrase-MiniLM-L3-v2`.

## License

MIT License
MIT