RAGrep — Retrieval-Augmented Code Search

Key Features

Semantic Search: Vector-powered retrieval surfaces contextually relevant code and prose.
Agent-Friendly CLI: Designed so automated agents can index, query, and inspect repositories safely.
Local-First: Runs entirely on your machine — no external APIs or credentials required.
Chunk-Aware Processing: Smart document processor for Markdown, Python, JavaScript, HTML, and more.
Persistent Vector Store: Uses ChromaDB to persist embeddings on disk, so an index survives between runs.
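
To make the chunk-then-retrieve idea behind these features concrete, here is a minimal, dependency-free sketch. It is not RAGrep's implementation: the function names (chunk_markdown, embed, cosine, search) are hypothetical, the heading-based splitter is a simplification, and the word-count "embedding" is a toy stand-in for the sentence-transformers model and ChromaDB store that the real tool uses.

```python
import math
import re
from collections import Counter


def chunk_markdown(text):
    """Split Markdown into chunks, starting a new chunk at each heading line.

    Simplification: any line starting with '#' counts as a heading, so a
    '#' comment inside a fenced code block would also start a new chunk.
    """
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def embed(text):
    """Toy embedding: a sparse word-count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query, chunks, limit=5):
    """Return up to `limit` (score, chunk) pairs, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


DOC = """# Authentication
The middleware checks the session token on every request.

# Logging
Requests are written to a rotating log file.
"""

chunks = chunk_markdown(DOC)
top = search("session token middleware", chunks, limit=1)
print(top[0][1].splitlines()[0])  # heading of the best-matching chunk
```

A real embedding model also matches on meaning rather than shared words, which is what lets a query like "login checks" find the authentication chunk; the toy vectors above only reward literal word overlap.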

Quick Start

RAGrep requires Python 3.9+ and a working pip installation.

git clone https://github.com/pierce403/ragrep.git
cd ragrep
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .

If the default torch wheels do not cover your architecture, install a prerelease torch wheel instead.

CLI Essentials

# Index the current repository
ragrep index

# Index a specific directory
ragrep index ./src

# Retrieve relevant chunks
ragrep dump "authentication middleware" --limit 5

# Inspect vector store stats
ragrep stats

Use --db-path to point commands at a custom vector store directory and --verbose for richer progress logs.
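
For agents that drive the CLI from Python, the commands above could be wrapped roughly as follows. This is a sketch under assumptions: the helper names build_dump_command and run_dump are hypothetical, the ./.ragrep_db path is only an example, and placing --db-path and --verbose after the subcommand is a guess about argument order. The output format of ragrep dump is not specified here, so the wrapper returns stdout as raw text instead of parsing it.

```python
import shlex
import subprocess


def build_dump_command(query, limit=5, db_path=None, verbose=False):
    """Build the argv list for `ragrep dump` using only the flags shown above."""
    cmd = ["ragrep", "dump", query, "--limit", str(limit)]
    if db_path is not None:
        # Assumption: --db-path is accepted after the subcommand.
        cmd += ["--db-path", str(db_path)]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_dump(query, **options):
    """Run the command and return its stdout as text.

    Raises subprocess.CalledProcessError if ragrep exits with a non-zero
    status, and FileNotFoundError if ragrep is not on PATH.
    """
    result = subprocess.run(
        build_dump_command(query, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Build the command without executing it; the db path is a made-up example.
cmd = build_dump_command("authentication middleware", limit=5, db_path="./.ragrep_db")
print(shlex.join(cmd))
```

Passing the arguments as a list, without a shell, means a query containing spaces or quotes reaches ragrep as a single argument with no escaping needed.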

Project Highlights

Vector Store: Built on chromadb ≥ 1.2.0 to support Python 3.9 through 3.12.
Embeddings: Powered by sentence-transformers and torch.
Agent UX: Clean, deterministic CLI output optimized for machine consumption.
CI/CD: GitHub Actions matrix builds across modern Python versions with automated packaging checks.

Stay in the Loop

RAGrep is evolving quickly. Track updates, file issues, and contribute improvements: