A modern bookmark manager focused on intelligent organization, fast retrieval, and contextual discovery. Designed to keep knowledge structured, accessible, and alive.
- Python 47.4%
- Go 27.2%
- Vue 12.6%
- TypeScript 6%
- Shell 5.2%
- Other 1.5%
Bookmarks Manager
A multi-user bookmarks manager focused on fast import, async enrichment, and hybrid search foundations.
Current MVP includes:
- Go API (go/cmd/api)
- Go worker (go/cmd/worker)
- Python embeddings service (python/ml_service)
- Browser renderer service (python/browser_service)
- Vue 3 + Vite frontend (web/frontend)
- PostgreSQL + pgvector + pg_trgm
- Local runner-based autotag pipeline (python/scripts/autotag_runner_qwen_mlx.py)
- Raindrop CSV import with idempotent upserts
- Admin/dev endpoints for reset, job monitoring, and SSE events
Architecture (High Level)
- Frontend:
- Vue 3 SPA with Vue Router and TanStack Query
- calls the Go API directly
- API:
- HTTP endpoints (/health, import, admin endpoints)
- writes bookmarks/tags/jobs in Postgres
- Worker:
- consumes bookmark embedding jobs
- persists vectors after autotag enrichment is stored
- Bookmark embeddings service:
- exposes /health and /embed
- keeps bge-m3 loaded in memory
- Browser renderer service:
- keeps Chromium running in memory
- exposes /health, /thumbnail, and /extract
- generates fallback thumbnails when preview_image_url is missing
- Local runner:
- claims runner-backed jobs from the API
- runs local Qwen MLX processing
- sends semantic results back to the API
- PostgreSQL:
- relational data + queue table
- queue state, bookmark metadata, and semantic fields
Local Quickstart (Docker Compose)
- Choose env file
- default: deploy/env/.env.dev
- copy it if you need a variant (example: deploy/env/.env.local)
- Start stack
cd deploy
docker compose --env-file ./env/.env.dev up -d
- Common commands
# stop (keep data)
docker compose stop
# start existing containers
docker compose start
# rebuild app images/config changes
docker compose up -d --build
# stop + remove containers and network (keep named volumes)
docker compose down
# stop + remove everything including DB volume
docker compose down -v
# follow logs
docker compose logs -f api
docker compose logs -f worker
docker compose logs -f bookmark-embeddings
docker compose logs -f browser-renderer
Import Raindrop CSV
curl -X POST \
-F "file=@/absolute/path/to/export.csv" \
http://127.0.0.1:8080/integrations/raindrop/import
Typical response:
{
"ok": true,
"total": 377,
"imported": 375,
"enqueued": 375,
"duplicated": 2,
"skipped": 0
}
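The same upload can be sketched in Python with only the standard library. This is a minimal sketch, not the project's client: the helper names (build_multipart, counts_add_up, import_csv) are hypothetical, and the check that imported + duplicated + skipped equals total is an assumption inferred from the sample response above, not a documented guarantee.

```python
import json
import urllib.request
import uuid


def build_multipart(field: str, filename: str, content: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body, like curl's -F flag."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def counts_add_up(resp: dict) -> bool:
    # Assumption: every CSV row ends up imported, duplicated, or skipped.
    return resp["imported"] + resp["duplicated"] + resp["skipped"] == resp["total"]


def import_csv(path: str, base_url: str = "http://127.0.0.1:8080") -> dict:
    """POST the CSV to the import endpoint and return the decoded JSON response."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart("file", "export.csv", fh.read())
    req = urllib.request.Request(
        f"{base_url}/integrations/raindrop/import",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling import_csv("/absolute/path/to/export.csv") requires the stack to be running; counts_add_up can then be used as a quick sanity check on the returned counters.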
Dev Admin Endpoints
Admin endpoints are enabled only when APP_ENV=development.
- Reset imported data:
curl -X POST http://127.0.0.1:8080/admin/reset-ddbb
- List jobs:
curl "http://127.0.0.1:8080/admin/jobs?status=queued&type=runner_autotag_bookmark&limit=20"
- Jobs summary:
curl "http://127.0.0.1:8080/admin/jobs/summary"
- Seed curated tags (supports slug, label, description, optional keywords[]):
curl -X POST http://127.0.0.1:8080/admin/tags/seed
- Rebuild tag embeddings (required before autotagging):
curl -X POST http://127.0.0.1:8080/admin/tags/rebuild-embeddings
- Backfill enrich jobs for bookmarks with missing/failed enrichment:
curl -X POST http://127.0.0.1:8080/admin/enrich/backfill
- Quarantine report for bookmarks without auto-tags (grouped by enrich status/http status):
curl "http://127.0.0.1:8080/admin/enrich/quarantine?limit=20"
- Backfill auto-tag jobs for ready bookmarks without auto tags:
curl -X POST http://127.0.0.1:8080/admin/autotag/backfill
- Clear only auto tags (keeps import/manual tags untouched):
curl -X POST http://127.0.0.1:8080/admin/autotag/clear
- Inspect ready bookmarks still missing auto tags:
curl "http://127.0.0.1:8080/admin/autotag/missing?limit=20"
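For scripting against these endpoints, the URLs above can be assembled with a small helper. This is a minimal sketch: admin_url and BASE_URL are hypothetical names introduced for illustration, and only the paths and query parameters shown in the curl examples are taken from this README.

```python
from urllib.parse import urlencode

# Assumption: the API listens on the address used in the curl examples.
BASE_URL = "http://127.0.0.1:8080"


def admin_url(path: str, **params) -> str:
    """Build an /admin/... URL, appending query parameters when any are given."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/admin/{path}" + (f"?{query}" if query else "")


if __name__ == "__main__":
    print(admin_url("jobs", status="queued", type="runner_autotag_bookmark", limit=20))
    print(admin_url("jobs/summary"))
```

Parameters set to None are dropped, so optional filters can be passed through without building the query string by hand.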
API Documentation
- OpenAPI spec: GET /openapi.yaml
- Swagger UI: GET /docs
Examples:
curl http://127.0.0.1:8080/openapi.yaml
open http://127.0.0.1:8080/docs
SSE Events
Current SSE endpoint is dev-admin scoped:
GET /admin/events
Example frontend subscription:
<script>
const es = new EventSource('http://127.0.0.1:8080/admin/events');
es.addEventListener('hello', (ev) => console.log('hello', JSON.parse(ev.data)));
es.addEventListener('job.updated', (ev) => console.log('job', JSON.parse(ev.data)));
es.addEventListener('bookmark.updated', (ev) => console.log('bookmark', JSON.parse(ev.data)));
es.onerror = (err) => console.error('sse error', err);
</script>
Note: a public /events endpoint can be added later when auth/roles are in place.
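Outside the browser, the stream can be consumed by parsing the text/event-stream format directly. The sketch below is a simplified parser for that wire format (event:, data:, blank-line terminator, : comments); it assumes every data payload is JSON, as the JSON.parse calls in the example above suggest. The function name parse_sse and the payload fields in the sample are hypothetical.

```python
import json


def parse_sse(stream_text: str):
    """Yield (event_name, decoded_json) for each complete event in an SSE payload."""
    event, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line terminates the event; dispatch only if it carried data.
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].removeprefix(" "))


if __name__ == "__main__":
    # Hypothetical payloads; the real field names are not documented here.
    sample = 'event: hello\ndata: {"ok": true}\n\nevent: job.updated\ndata: {"id": 1}\n\n'
    for name, payload in parse_sse(sample):
        print(name, payload)
```

For a live connection, the same logic can be applied line by line to the response body of GET /admin/events instead of a complete string.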
Runner Notes
- Bookmark autotagging is driven by the local runner via:
- POST /runner/autotag/claim
- POST /runner/autotag/result
- When autotagging finishes, the API enqueues worker_generate_bookmark_embedding
- If autotagging returns no preview_image_url, the API also enqueues worker_generate_bookmark_thumbnail
- The Go worker claims the embedding job, reloads the enriched bookmark from Postgres, calls bookmark-embeddings, and upserts bookmark_embeddings
- Thumbnail fallbacks are generated through browser-renderer, stored under STORAGE_DIR, and served by the Go API from /media/bookmarks/:id/thumbnail
- Semantic fields persisted on bookmarks:
- autotag_topic
- autotag_description_refined
- autotag_page_objective
- Auto tags are stored in bookmark_tags with source='auto'
- Search queries and tag rebuilds also use the same embeddings service via ML_URL
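The enqueue rules described in the notes above can be summarized as a small decision function. This is an illustrative sketch only: the actual logic lives in the Go API, and the function names and the shape of the result dictionary are assumptions. The job type names and the media path come from the notes.

```python
def follow_up_jobs(autotag_result: dict) -> list[str]:
    """Return the job types enqueued once an autotag result is stored."""
    # The embedding job is always enqueued when autotagging finishes.
    jobs = ["worker_generate_bookmark_embedding"]
    # A thumbnail job is added only when no preview image came back.
    if not autotag_result.get("preview_image_url"):
        jobs.append("worker_generate_bookmark_thumbnail")
    return jobs


def thumbnail_path(bookmark_id: int) -> str:
    """Path under which the Go API serves a generated fallback thumbnail."""
    return f"/media/bookmarks/{bookmark_id}/thumbnail"


if __name__ == "__main__":
    print(follow_up_jobs({"autotag_topic": "databases"}))
    print(follow_up_jobs({"preview_image_url": "https://example.com/p.png"}))
    print(thumbnail_path(42))
```

Treating an empty preview_image_url the same as a missing one is a choice made for this sketch; the API may distinguish the two cases.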
Verify auto tags in Postgres:
SELECT b.id, LEFT(COALESCE(b.title,''), 60) AS title, t.slug, bt.score
FROM bookmark_tags bt
JOIN bookmarks b ON b.id = bt.bookmark_id
JOIN tags t ON t.id = bt.tag_id
WHERE bt.source = 'auto'
ORDER BY bt.created_at DESC
LIMIT 30;