A modern bookmark manager focused on intelligent organization, fast retrieval, and contextual discovery. Designed to keep knowledge structured, accessible, and alive.

Bookmarks Manager

A multi-user bookmarks manager focused on fast import, async enrichment, and hybrid search foundations.

Current MVP includes:

  • Go API (go/cmd/api)
  • Go worker (go/cmd/worker)
  • Python embeddings service (python/ml_service)
  • Browser renderer service (python/browser_service)
  • Vue 3 + Vite frontend (web/frontend)
  • PostgreSQL + pgvector + pg_trgm
  • Local runner-based autotag pipeline (python/scripts/autotag_runner_qwen_mlx.py)
  • Raindrop CSV import with idempotent upserts
  • Admin/dev endpoints for reset, job monitoring, and SSE events

Architecture (High Level)

  • Frontend:
    • Vue 3 SPA with Vue Router and TanStack Query
    • calls the Go API directly
  • API:
    • HTTP endpoints (/health, import, admin endpoints)
    • writes bookmarks/tags/jobs in Postgres
  • Worker:
    • consumes bookmark embedding jobs
    • persists vectors after autotag enrichment is stored
  • Bookmark embeddings service:
    • exposes /health and /embed
    • keeps bge-m3 loaded in memory
  • Browser renderer service:
    • keeps Chromium running in memory
    • exposes /health, /thumbnail, and /extract
    • generates fallback thumbnails when preview_image_url is missing
  • Local runner:
    • claims runner-backed jobs from the API
    • runs local Qwen MLX processing
    • sends semantic results back to the API
  • PostgreSQL:
    • relational data + queue table
    • queue state, bookmark metadata, and semantic fields

Local Quickstart (Docker Compose)

  1. Choose env file
  • default: deploy/env/.env.dev
  • copy it if you need a variant (example: deploy/env/.env.local)
  2. Start stack
cd deploy
docker compose --env-file ./env/.env.dev up -d
  3. Common commands
# stop (keep data)
docker compose stop

# start existing containers
docker compose start

# rebuild app images/config changes
docker compose up -d --build

# stop + remove containers and network (keep named volumes)
docker compose down

# stop + remove everything including DB volume
docker compose down -v

# follow logs
docker compose logs -f api
docker compose logs -f worker
docker compose logs -f bookmark-embeddings
docker compose logs -f browser-renderer
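
After starting the stack it can help to wait until the API answers on /health before importing anything. The following is a minimal sketch, assuming the API is published on 127.0.0.1:8080 as in the curl examples in this README; the helper names (http_ok, wait_for_health) and the retry defaults are hypothetical and not part of the repository.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True when a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_for_health(
    url: str,
    attempts: int = 30,
    delay_s: float = 1.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until `probe` succeeds or `attempts` run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # back off before the next probe
    return False
```

A typical call would be wait_for_health("http://127.0.0.1:8080/health"), which returns True as soon as the endpoint responds with 200 and False once the attempts are exhausted. The probe and sleep parameters exist only so the loop can be exercised without a running stack.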

Import Raindrop CSV

curl -X POST \
  -F "file=@/absolute/path/to/export.csv" \
  http://127.0.0.1:8080/integrations/raindrop/import

Typical response:

{
  "ok": true,
  "total": 377,
  "imported": 375,
  "enqueued": 375,
  "duplicated": 2,
  "skipped": 0
}
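
The same upload can be scripted without curl. The sketch below uses only the Python standard library and assumes the form field is named file, as in the curl command above; the text/csv part type, the helper names, and the 60-second timeout are assumptions made for illustration.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple

IMPORT_URL = "http://127.0.0.1:8080/integrations/raindrop/import"


def build_multipart(field: str, filename: str, content: bytes, boundary: str) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n"  # assumed part type for a CSV export
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def import_raindrop_csv(csv_path: str, url: str = IMPORT_URL) -> dict:
    """POST the CSV export and return the decoded JSON response."""
    path = Path(csv_path)
    body, content_type = build_multipart("file", path.name, path.read_bytes(), uuid.uuid4().hex)
    req = urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling import_raindrop_csv("/absolute/path/to/export.csv") should return a dictionary with the counters shown in the typical response.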

Dev Admin Endpoints

Admin endpoints are enabled only when APP_ENV=development.

  • Reset imported data:
curl -X POST http://127.0.0.1:8080/admin/reset-ddbb
  • List jobs:
curl "http://127.0.0.1:8080/admin/jobs?status=queued&type=runner_autotag_bookmark&limit=20"
  • Jobs summary:
curl "http://127.0.0.1:8080/admin/jobs/summary"
  • Seed curated tags (supports slug, label, description, optional keywords[]):
curl -X POST http://127.0.0.1:8080/admin/tags/seed
  • Rebuild tag embeddings (required before autotagging):
curl -X POST http://127.0.0.1:8080/admin/tags/rebuild-embeddings
  • Backfill enrich jobs for bookmarks with missing/failed enrichment:
curl -X POST http://127.0.0.1:8080/admin/enrich/backfill
  • Quarantine report for bookmarks without auto tags (grouped by enrich status/http status):
curl "http://127.0.0.1:8080/admin/enrich/quarantine?limit=20"
  • Backfill auto-tag jobs for ready bookmarks without auto tags:
curl -X POST http://127.0.0.1:8080/admin/autotag/backfill
  • Clear only auto tags (keeps import/manual tags untouched):
curl -X POST http://127.0.0.1:8080/admin/autotag/clear
  • Inspect ready bookmarks still missing auto tags:
curl "http://127.0.0.1:8080/admin/autotag/missing?limit=20"
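
When scripting against the job listing, building the query string programmatically avoids quoting mistakes. This is a small sketch: the parameter names status, type, and limit come from the curl example above, while the helper name admin_jobs_url and the idea that each filter may be omitted are assumptions.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "http://127.0.0.1:8080"


def admin_jobs_url(
    status: Optional[str] = None,
    job_type: Optional[str] = None,
    limit: Optional[int] = None,
    base: str = API_BASE,
) -> str:
    """Build the /admin/jobs URL, leaving out filters that are not set."""
    params = {"status": status, "type": job_type, "limit": limit}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base}/admin/jobs" + (f"?{query}" if query else "")
```

For example, admin_jobs_url("queued", "runner_autotag_bookmark", 20) reproduces the URL used in the List jobs example.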

API Documentation

  • OpenAPI spec: GET /openapi.yaml
  • Swagger UI: GET /docs

Examples:

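# fetch the raw OpenAPI spec
curl http://127.0.0.1:8080/openapi.yaml
# open Swagger UI in the default browser (macOS; use xdg-open on Linux)
open http://127.0.0.1:8080/docs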

SSE Events

The SSE endpoint is currently available only under the dev admin routes:

  • GET /admin/events

Example frontend subscription:

<script>
  const es = new EventSource('http://127.0.0.1:8080/admin/events');

  es.addEventListener('hello', (ev) => console.log('hello', JSON.parse(ev.data)));
  es.addEventListener('job.updated', (ev) => console.log('job', JSON.parse(ev.data)));
  es.addEventListener('bookmark.updated', (ev) => console.log('bookmark', JSON.parse(ev.data)));

  es.onerror = (err) => console.error('sse error', err);
</script>

Note: a public /events endpoint can be added later when auth/roles are in place.
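
For scripts and tests outside the browser, the stream can also be consumed from Python. The sketch below parses the standard text/event-stream framing (event and data fields, blank line as separator) using only the standard library. It assumes each data payload is JSON, as the JSON.parse calls in the browser example suggest; the function names parse_sse and listen are hypothetical.

```python
import json
import urllib.request
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from text/event-stream lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the buffered event if any data arrived.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def listen(url: str = "http://127.0.0.1:8080/admin/events") -> None:
    """Print decoded events until the connection closes."""
    req = urllib.request.Request(url, headers={"Accept": "text/event-stream"})
    with urllib.request.urlopen(req) as resp:
        decoded = (raw.decode("utf-8") for raw in resp)
        for event, data in parse_sse(decoded):
            print(event, json.loads(data))
```

Keeping the parser separate from the network call makes it easy to feed it recorded lines in a unit test.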

Runner Notes

  • Bookmark autotagging is driven by the local runner via:
    • POST /runner/autotag/claim
    • POST /runner/autotag/result
  • When autotagging finishes, the API enqueues worker_generate_bookmark_embedding
  • If autotagging returns no preview_image_url, the API also enqueues worker_generate_bookmark_thumbnail
  • The Go worker claims the worker_generate_bookmark_embedding job, reloads the enriched bookmark from Postgres, calls bookmark-embeddings, and upserts bookmark_embeddings
  • Thumbnail fallbacks are generated through browser-renderer, stored under STORAGE_DIR, and served by the Go API from /media/bookmarks/:id/thumbnail
  • Semantic fields persisted on bookmarks:
    • autotag_topic
    • autotag_description_refined
    • autotag_page_objective
  • Auto tags are stored in bookmark_tags with source='auto'
  • Search queries and tag rebuilds also use the same embeddings service via ML_URL
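
The follow-up rule in the notes above can be summarized in a few lines. This is an illustrative Python sketch, not the Go implementation: the job type names are taken from this README, while the function name followup_jobs, the dictionary shape of the result, and treating an empty preview_image_url as missing are assumptions.

```python
from typing import Any, List, Mapping

EMBEDDING_JOB = "worker_generate_bookmark_embedding"
THUMBNAIL_JOB = "worker_generate_bookmark_thumbnail"


def followup_jobs(autotag_result: Mapping[str, Any]) -> List[str]:
    """Return the job types to enqueue once an autotag result is stored."""
    jobs = [EMBEDDING_JOB]  # always enqueued when autotagging finishes
    if not autotag_result.get("preview_image_url"):
        # No preview image: request a fallback thumbnail from browser-renderer.
        jobs.append(THUMBNAIL_JOB)
    return jobs
```

A result that carries a preview_image_url yields only the embedding job; a result without one yields both jobs.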

Verify auto tags in Postgres:

SELECT b.id, LEFT(COALESCE(b.title,''), 60) AS title, t.slug, bt.score
FROM bookmark_tags bt
JOIN bookmarks b ON b.id = bt.bookmark_id
JOIN tags t ON t.id = bt.tag_id
WHERE bt.source = 'auto'
ORDER BY bt.created_at DESC
LIMIT 30;

More Docs