Remin is a local-first semantic search engine for personal notes. It combines embeddings, ANN vector indexing (HNSW), and reranking models to retrieve information based on intent rather than keywords — fully running on-device for privacy. Built with Rust, DuckDB (VSS), and modern embedding models.

Local Semantic Search Engine (Local-First Notes)

A fully local, privacy-focused semantic search engine for personal notes.

The goal is simple: search by intent, not by keywords, while keeping all data and inference on-device.


Current Architecture (Semantic-First)

This version is pure semantic retrieval (ANN over embeddings).
There is no lexical retrieval stage (no FTS5/BM25) in the current pipeline.

Components

  • DuckDB as the embedded database
  • VSS (Vector Similarity Search) extension for vector indexes
  • HNSW (ANN) index for fast nearest-neighbor retrieval
  • Embedding model → converts text into vectors (e.g. FLOAT[1024])
  • Reranker model (optional but recommended) → re-scores (query, chunk) pairs for higher precision
  • (Optional) Tagger / Topics → enriches chunks with tags/topics for better UX and filtering

Models

This repo assumes local models stored under something like:

core-rs/models/
  bge-m3.gguf
  bge-reranker-v2-m3-Q8_0.gguf
  paraphrase-multilingual-mpnet-base-v2-onnx/

1) Embeddings (Semantic Retrieval)

Default (GGUF via llama.cpp)

  • Model: bge-m3.gguf
  • Output: 1024-dim embedding (stored as FLOAT[1024])

How it's used:

  • At import time: each chunk → embedding
  • At search time: query string → embedding
  • DuckDB VSS + HNSW retrieves Top-K nearest embeddings
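In SQL terms, the retrieval step looks roughly like the following (illustrative only — `$query_embedding` is a placeholder for the vector produced from the query string, and the table/column names follow the schema described below):

```sql
-- Illustrative Top-K ANN retrieval over chunk embeddings.
LOAD vss;
SELECT chunk_id
FROM chunk_embeddings
ORDER BY array_distance(embedding, $query_embedding::FLOAT[1024])
LIMIT 20;  -- Top-K candidates, optionally passed to the reranker
```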

2) Reranking (Precision Boost)

Default (GGUF via llama.cpp)

  • Model: bge-reranker-v2-m3-Q8_0.gguf
  • Output: a score for each (query, chunk_text) pair

How it's used:

  • After ANN retrieval (Top-K candidates), reranker re-scores and reorders them.
  • This usually improves quality on ambiguous or short queries.
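The rerank step itself is just "score each candidate, then reorder." A minimal sketch in Rust — the scores here stand in for real cross-encoder outputs; the actual model call is not shown:

```rust
/// Reorder Top-K candidates by reranker score, highest first.
/// In the real pipeline each score would come from the cross-encoder
/// (bge-reranker-v2-m3) applied to a (query, chunk_text) pair.
fn rerank(mut candidates: Vec<(String, f32)>) -> Vec<String> {
    // Sort descending by score; total_cmp gives a total order over f32.
    candidates.sort_by(|a, b| b.1.total_cmp(&a.1));
    candidates.into_iter().map(|(text, _)| text).collect()
}

fn main() {
    let ranked = rerank(vec![
        ("gazpacho recipe".to_string(), 0.92),
        ("meeting notes".to_string(), 0.11),
        ("salmorejo notes".to_string(), 0.87),
    ]);
    println!("{:?}", ranked);
}
```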

3) ONNX (Optional Runtime)

  • Folder: paraphrase-multilingual-mpnet-base-v2-onnx/
  • Intended use: run a model via ONNX Runtime inside the application runtime (no external binary).
  • Status: optional / future-facing (helps remove dependency on llama.cpp for some pipelines).

Database

DuckDB database file (example): data/remin.duckdb

Tables (high-level)

  • notes — note metadata (title, timestamps, etc.)
  • chunks — chunked note text
  • chunk_embeddings — embedding vectors per chunk (FLOAT[1024])
  • chunk_tags / chunk_topics — optional enrichment layer (if enabled)

Vector Index

VSS creates an HNSW index on chunk_embeddings.embedding so searches avoid a full scan over all vectors.
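A sketch of what index creation looks like with the VSS extension (the index name is illustrative; `l2sq` is VSS's default metric, with `cosine` and `ip` as alternatives):

```sql
INSTALL vss;
LOAD vss;
-- Required for HNSW indexes in a persistent (on-disk) DuckDB database:
SET hnsw_enable_experimental_persistence = true;

CREATE INDEX idx_chunk_embeddings_hnsw
ON chunk_embeddings USING HNSW (embedding)
WITH (metric = 'l2sq');
```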


Setup

Requirements

  • Rust (stable)
  • DuckDB (embedded, shipped via dependency)
  • DuckDB VSS extension available/enabled at runtime
  • Local models (GGUF and/or ONNX)

Configuration

Suggested environment variables:

export REMIN_DB_PATH="data/remin.duckdb"
export REMIN_MODELS_DIR="core-rs/models"
export REMIN_EMBED_MODEL="bge-m3.gguf"
export REMIN_RERANK_MODEL="bge-reranker-v2-m3-Q8_0.gguf"   # optional
# export REMIN_ONNX_DIR="core-rs/models/paraphrase-multilingual-mpnet-base-v2-onnx"  # optional

Initialize the Database

Create the DB file, schema and vector index:

cargo run -- init-db

This should:

  • Create DuckDB database file (if missing)
  • Enable VSS extension
  • Create tables
  • Create HNSW vector index

Import Notes

Import a JSONL dataset:

cargo run -- import notes.jsonl

Import pipeline (semantic-first):

  1. Parse notes from JSONL
  2. Split each note into chunks
  3. Generate an embedding per chunk (BGE-M3 → FLOAT[1024])
  4. Insert notes, chunks, chunk_embeddings
  5. (Optional) generate tags/topics and store them
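The chunking in step 2 can be sketched as overlapping word windows. The window size and overlap below are illustrative parameters, not the project's actual configuration:

```rust
/// Split a note's text into overlapping word-window chunks.
/// `window` = words per chunk, `overlap` = words shared between
/// consecutive chunks (both illustrative defaults).
fn chunk_text(text: &str, window: usize, overlap: usize) -> Vec<String> {
    let words: Vec<&str> = text.split_whitespace().collect();
    // Advance by (window - overlap), but always at least one word.
    let step = window.saturating_sub(overlap).max(1);
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + window).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break;
        }
        start += step;
    }
    chunks
}

fn main() {
    for chunk in chunk_text("split each note into overlapping chunks", 3, 1) {
        println!("{chunk}");
    }
}
```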

Search

Run a semantic search:

cargo run -- search "Andalusian cold soup"

Search pipeline:

  1. Generate query embedding (BGE-M3)
  2. ANN search via HNSW (DuckDB VSS) → Top-K candidates
  3. (Optional) rerank candidates (BGE reranker) for precision
  4. Return final ranked results (with note + chunk context)
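The ANN step ranks candidates by a vector distance or similarity. For intuition, cosine similarity between two embeddings looks like this (the vector dimension is reduced for illustration; the real embeddings are 1024-dimensional):

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns a value in [-1.0, 1.0]; higher means more similar.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0 // degenerate zero vector: define similarity as 0
    } else {
        dot / (norm_a * norm_b)
    }
}

fn main() {
    let sim = cosine_similarity(&[1.0, 2.0, 3.0], &[1.0, 2.0, 3.0]);
    println!("similarity = {sim}");
}
```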

Roadmap

  • ONNX Runtime integration to reduce/replace llama.cpp where appropriate
  • Mobile packaging (local-first)
  • Multimodal notes (OCR + image embeddings)
  • Better tagging/topic extraction and UI-level filtering

Philosophy

  • Local-first
  • Privacy by default
  • Intent over keywords
  • Engineering-driven AI systems

License

This project is licensed under the GNU General Public License v3.0 (GPLv3).

Why GPLv3?

Remin is built around the idea of local-first, privacy-focused AI systems. Choosing GPLv3 ensures that:

  • Any distributed derivative work must remain open-source.
  • Improvements to the core engine benefit the community.
  • The project cannot be closed off into proprietary forks.

If you plan to use this project in a commercial or closed-source context, please review the implications of GPLv3 carefully.

For full license terms, see the LICENSE file or visit: https://www.gnu.org/licenses/gpl-3.0.html