# Local Semantic Search Engine (Local-First Notes)
A fully local, privacy-focused semantic search engine for personal notes.
The goal is simple: search by intent, not by keywords, while keeping all data and inference on-device.
## Current Architecture (Semantic-First)

This version uses pure semantic retrieval (ANN over embeddings).
There is no lexical retrieval stage (no FTS5/BM25) in the current pipeline.
### Components
- DuckDB as the embedded database
- VSS (Vector Similarity Search) extension for vector indexes
- HNSW (ANN) index for fast nearest-neighbor retrieval
- Embedding model → converts text into vectors (e.g. `FLOAT[1024]`)
- Reranker model (optional but recommended) → re-scores query–chunk pairs for higher precision
- (Optional) Tagger / Topics → enriches chunks with tags/topics for better UX and filtering
## Models

This repo assumes local models stored under something like:

```
core-rs/models/
├── bge-m3.gguf
├── bge-reranker-v2-m3-Q8_0.gguf
└── paraphrase-multilingual-mpnet-base-v2-onnx/
```
### 1) Embeddings (Semantic Retrieval)

Default (GGUF via llama.cpp):

- Model: `bge-m3.gguf`
- Output: 1024-dim embedding (stored as `FLOAT[1024]`)
How it’s used:
- At import time: each chunk → embedding
- At search time: query string → embedding
- DuckDB VSS + HNSW retrieves Top-K nearest embeddings
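The retrieval step above can be sketched in plain Rust. This is a brute-force illustration for intuition only: it computes cosine similarity and scans every vector, which is exactly the full scan the HNSW index exists to avoid (in the real pipeline, DuckDB VSS does this work). The function names and toy 3-dim vectors are hypothetical.

```rust
// Illustration only: cosine similarity plus a brute-force Top-K scan.
// The real pipeline delegates this to DuckDB's VSS extension + HNSW index.

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Return the indices of the `k` chunks whose embeddings are closest to `query`.
fn top_k(query: &[f32], chunks: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = chunks
        .iter()
        .enumerate()
        .map(|(i, e)| (i, cosine_similarity(query, e)))
        .collect();
    // Highest similarity first
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

fn main() {
    // Toy 3-dim embeddings stand in for the real FLOAT[1024] vectors.
    let chunks = vec![
        vec![1.0, 0.0, 0.0],
        vec![0.0, 1.0, 0.0],
        vec![0.9, 0.1, 0.0],
    ];
    let query = vec![1.0, 0.05, 0.0];
    println!("{:?}", top_k(&query, &chunks, 2)); // prints [0, 2]
}
```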
### 2) Reranking (Precision Boost)

Default (GGUF via llama.cpp):

- Model: `bge-reranker-v2-m3-Q8_0.gguf`
- Output: a score for each (query, chunk_text) pair
How it’s used:
- After ANN retrieval (Top-K candidates), reranker re-scores and reorders them.
- This usually improves precision on ambiguous or very short queries.
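Structurally, the rerank pass is just "re-score and sort". The sketch below is not the repo's actual code: `Candidate`, `rerank`, and the word-overlap scorer are hypothetical names, and the `score` closure stands in for the real cross-encoder call (bge-reranker-v2-m3 scoring each (query, chunk_text) pair).

```rust
// Illustrative rerank pass: re-score ANN candidates, then sort by the new score.

#[derive(Debug)]
struct Candidate {
    chunk_text: String,
    ann_score: f32, // similarity carried over from the HNSW stage
}

fn rerank(
    query: &str,
    candidates: Vec<Candidate>,
    score: impl Fn(&str, &str) -> f32,
) -> Vec<Candidate> {
    let mut scored: Vec<(f32, Candidate)> = candidates
        .into_iter()
        .map(|c| (score(query, &c.chunk_text), c))
        .collect();
    // Highest reranker score first; the ANN ordering is discarded here.
    scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    scored.into_iter().map(|(_, c)| c).collect()
}

fn main() {
    // Toy scorer: count query words that appear in the chunk
    // (the real model is a learned cross-encoder, not word overlap).
    let overlap = |q: &str, t: &str| q.split_whitespace().filter(|w| t.contains(*w)).count() as f32;
    let candidates = vec![
        Candidate { chunk_text: "gazpacho is a cold Andalusian soup".into(), ann_score: 0.81 },
        Candidate { chunk_text: "hot ramen broth experiments".into(), ann_score: 0.84 },
    ];
    let ranked = rerank("cold soup", candidates, overlap);
    println!("{} (ann_score {})", ranked[0].chunk_text, ranked[0].ann_score);
}
```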
### 3) ONNX (Optional Runtime)

- Folder: `paraphrase-multilingual-mpnet-base-v2-onnx/`
- Intended use: run a model via ONNX Runtime inside the application runtime (no external binary).
- Status: optional / future-facing (helps remove the dependency on `llama.cpp` for some pipelines).
## Database

DuckDB database file (example): `data/remin.duckdb`

### Tables (high-level)

- `notes` — note metadata (title, timestamps, etc.)
- `chunks` — chunked note text
- `chunk_embeddings` — embedding vectors per chunk (`FLOAT[1024]`)
- `chunk_tags` / `chunk_topics` — optional enrichment layer (if enabled)
### Vector Index

VSS creates an HNSW index on `chunk_embeddings.embedding` so searches avoid a full scan over all vectors.
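As a sketch of what the schema and index could look like (table and column names are illustrative, following the overview above; the VSS statements follow DuckDB's documented syntax, and persisting an HNSW index in a database file currently requires an experimental flag):

```sql
INSTALL vss;
LOAD vss;

-- Required while HNSW persistence in database files is experimental
SET hnsw_enable_experimental_persistence = true;

CREATE TABLE IF NOT EXISTS chunk_embeddings (
    chunk_id  BIGINT PRIMARY KEY,
    embedding FLOAT[1024]
);

-- HNSW index so nearest-neighbor search avoids a full scan
CREATE INDEX idx_chunk_embeddings_hnsw
    ON chunk_embeddings USING HNSW (embedding);

-- Top-K query shape (the index accelerates this ORDER BY ... LIMIT pattern)
SELECT chunk_id
FROM chunk_embeddings
ORDER BY array_distance(embedding, [/* 1024 query floats */]::FLOAT[1024])
LIMIT 10;
```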
## Setup

### Requirements
- Rust (stable)
- DuckDB (embedded, shipped via dependency)
- DuckDB VSS extension available/enabled at runtime
- Local models (GGUF and/or ONNX)
### Configuration

Suggested environment variables:

```sh
export REMIN_DB_PATH="data/remin.duckdb"
export REMIN_MODELS_DIR="core-rs/models"
export REMIN_EMBED_MODEL="bge-m3.gguf"
export REMIN_RERANK_MODEL="bge-reranker-v2-m3-Q8_0.gguf" # optional
# export REMIN_ONNX_DIR="core-rs/models/paraphrase-multilingual-mpnet-base-v2-onnx" # optional
```
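These variable names are only suggestions; one way to consume them from Rust with the same fallbacks (the helper function names here are hypothetical, not the repo's actual API):

```rust
use std::env;
use std::path::PathBuf;

// Reads the suggested REMIN_* variables, falling back to the
// defaults shown above when a variable is unset.
fn models_dir() -> PathBuf {
    PathBuf::from(env::var("REMIN_MODELS_DIR").unwrap_or_else(|_| "core-rs/models".into()))
}

fn embed_model_path() -> PathBuf {
    let name = env::var("REMIN_EMBED_MODEL").unwrap_or_else(|_| "bge-m3.gguf".into());
    models_dir().join(name)
}

// The reranker is optional, so its absence is represented as None.
fn rerank_model_path() -> Option<PathBuf> {
    env::var("REMIN_RERANK_MODEL").ok().map(|name| models_dir().join(name))
}

fn main() {
    println!("embedding model: {}", embed_model_path().display());
}
```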
### Initialize the Database

Create the DB file, schema, and vector index:

```sh
cargo run -- init-db
```
This should:
- Create the DuckDB database file (if missing)
- Enable VSS extension
- Create tables
- Create HNSW vector index
## Import Notes

Import a JSONL dataset:

```sh
cargo run -- import notes.jsonl
```
Import pipeline (semantic-first):

- Parse notes from JSONL
- Split each note into chunks
- Generate an embedding per chunk (BGE-M3 → `FLOAT[1024]`)
- Insert into `notes`, `chunks`, `chunk_embeddings`
- (Optional) generate tags/topics and store them
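The chunking step is not specified in detail here; as a minimal illustration, a fixed-size word window with overlap (the real splitter's parameters may differ, e.g. it may be sentence-aware):

```rust
/// Illustrative chunker: fixed-size word windows with overlap.
/// Overlap keeps context that would otherwise be cut at a chunk boundary.
fn chunk_note(text: &str, chunk_words: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < chunk_words);
    let words: Vec<&str> = text.split_whitespace().collect();
    let step = chunk_words - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + chunk_words).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break;
        }
        start += step;
    }
    chunks
}

fn main() {
    let note = "one two three four five six seven";
    // prints ["one two three four", "four five six seven"]
    println!("{:?}", chunk_note(note, 4, 1));
}
```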
## Search

Run semantic search:

```sh
cargo run -- search "Andalusian cold soup"
```
Search pipeline:
- Generate query embedding (BGE-M3)
- ANN search via HNSW (DuckDB VSS) → Top-K candidates
- (Optional) rerank candidates (BGE reranker) for precision
- Return final ranked results (with note + chunk context)
## Roadmap

- ONNX Runtime integration to reduce/replace `llama.cpp` where appropriate
- Mobile packaging (local-first)
- Multimodal notes (OCR + image embeddings)
- Better tagging/topic extraction and UI-level filtering
## Philosophy
- Local-first
- Privacy by default
- Intent over keywords
- Engineering-driven AI systems
## License
This project is licensed under the GNU General Public License v3.0 (GPLv3).
### Why GPLv3?
Remin is built around the idea of local-first, privacy-focused AI systems. Choosing GPLv3 ensures that:
- Any distributed derivative work must remain open-source.
- Improvements to the core engine benefit the community.
- The project cannot be closed off into proprietary forks.
If you plan to use this project in a commercial or closed-source context, please review the implications of GPLv3 carefully.
For full license terms, see the LICENSE file or visit: https://www.gnu.org/licenses/gpl-3.0.html