Jean-Luc Makiola 443802fc68 docs: complete project research

2026-04-03 22:14:27 +02:00

Project Research Summary

Project: GearBox v2.0 Platform Foundation
Domain: Single-user to multi-user gear management and discovery platform
Researched: 2026-04-03
Confidence: HIGH

Executive Summary

GearBox v2.0 is a structural migration, not a feature addition. The project transforms a proven single-user gear tracker (React 19 + Hono + Drizzle + SQLite + Bun) into a multi-user platform with a global item database, structured community reviews, and public setup sharing. The research is clear on the sequencing: database migration and multi-user data scoping must come first because every other feature depends on user-owned records and Postgres. Skipping or deferring either creates cascading rework across all downstream features.

The recommended stack additions are conservative and well-documented: PostgreSQL 16 (required by the auth provider and for concurrent access), Bun's native S3 client against self-hosted MinIO (zero new dependencies), and an external OIDC auth provider replacing the current cookie-session system. The existing stack is validated and stays intact. The largest implementation risk is the SQLite-to-Postgres migration — it is a full schema rewrite, not an automated conversion, and it forces every service function to become async, cascading through 6 service files, 7 route files, and 19 MCP tools.

Open decision — auth provider: STACK.md recommends Logto (TypeScript-native, purpose-built for app auth, React SDK with hooks, requires only Postgres, MIT-licensed). ARCHITECTURE.md recommends Authentik (Python-based, full OIDC/OAuth2, self-hosted, requires Postgres and Redis). Both are valid. Logto integrates at the React layer via @logto/react and validates JWTs on the Hono backend via jose. Authentik integrates at the server layer via @hono/oidc-auth middleware and handles all session state — no React SDK needed. This decision must be resolved before the auth phase begins and affects infrastructure dependencies, React integration complexity, and future capability for proxy-mode SSO. See the Gaps section for resolution criteria.

Key Findings

Recommended Stack

The existing stack (React 19, Hono, Drizzle ORM, TanStack Router/Query, Tailwind v4, Zustand, Zod, Bun) is unchanged and validated. The following are net-new additions for v2.0.

Core technologies:

PostgreSQL 16: Primary database — replaces SQLite for concurrent multi-user access; required by both auth provider candidates; enables full-text search for the global item catalog
Drizzle ORM pg-core: Existing ORM, different dialect — switch from drizzle-orm/bun-sqlite to drizzle-orm/bun-sql (or postgres-js as fallback); schema must be rewritten from scratch using pgTable, not migrated from sqliteTable
Bun native S3Client + MinIO: Zero-dependency image storage — replaces ./uploads/ local filesystem; Bun ships a native S3 client with documented MinIO compatibility; no @aws-sdk/client-s3 needed
jose (^6.2.2): JWT validation — verifies OIDC access tokens on the Hono backend via JWKS; zero dependencies, explicit Bun support; needed regardless of auth provider choice
@logto/react (^4.0.13) OR @hono/oidc-auth: Auth provider SDK — one of these two depending on the provider decision (see open decision above)
@electric-sql/pglite: In-process Postgres for tests — replaces bun:sqlite in-memory test setup; Drizzle supports it via drizzle-orm/pglite; avoids Docker dependency in unit/integration tests
postgres (postgres.js, ^3.4.8): Dev dependency only — required for drizzle-kit CLI (db:generate, db:push) due to known issue #4122 where drizzle-kit does not support the Bun native SQL driver

Infrastructure additions: Docker Compose running Postgres + auth provider + MinIO alongside the Bun app. The app itself continues to run on bare Bun for HMR.

Expected Features

Must have for v2.0 launch (P1):

External auth provider integration — nothing works without multi-user identity
PostgreSQL migration — concurrent access and auth provider dependency
Multi-user data model (userId FK on items, categories, threads, setups, settings) — data isolation foundation
User profiles (minimal: display name, avatar, bio, public setups list) — required for attribution on shared content
Setup visibility controls (public/private toggle, default private) — table stakes for any sharing feature
Public setup detail pages — shareable read-only view with item list, totals, creator attribution
Global item database with seed data — canonical product catalog enabling reviews and aggregation
Link personal items to global items — the bridge enabling owner counts, crowd specs, and weight data
Search global items — full-text search powering item linking and discovery browsing
Structured reviews (overall + dimension ratings, no freeform text) — community intelligence layer
Item detail pages (aggregated specs, owner count, average ratings) — integration hub for all platform data
Discovery browse page (recent public setups, recently reviewed items, popular gear) — entry point for community value

Should have after validation (P2):

Crowd-verified specs display (manufacturer vs. community-measured weight, needs 3+ owners per item)
Setup composition insights ("commonly paired with" co-occurrence analysis)
Planning thread global item integration (candidates auto-populate from global DB)
Copy/fork public setups (one-click template from public setups)
Popular gear rankings by category (most owned, highest rated)

Defer to v3+:

Freeform reviews with moderation (explicitly deferred until moderation infrastructure exists)
Comments on setups (moderation burden)
Follow users / activity feed (social-network complexity, against the discovery-first principle)
OAuth / social login (after external auth is stable)

Anti-features to reject explicitly: real-time collaborative setups, marketplace/buy-sell, AI gear recommendations, wiki-style open item editing, gamification, Instagram-style infinite scroll feed.

Architecture Approach

The v2.0 architecture is a layered structural migration where each layer depends on the one below it: Postgres first (database), then user identity (auth), then data scoping (userId on all entities), then the global item catalog, then community features on top. Every existing service file gains a userId parameter and becomes async — this is a mechanical but wide-ranging change touching 6 services, 7 routes, 19 MCP tools, and all tests. The component topology stays the same (Hono routes -> services -> Drizzle); only the wiring within each layer changes.

Major components:

Database layer (Postgres + pg-core) — Full schema rewrite; all entity tables gain userId FK; new globalItems and reviews tables; sessions table removed; categories unique constraint changes to composite (userId, name)
Auth middleware (OIDC) — Replaces requireAuth; resolves OIDC subject to local userId via getOrCreateUser; keeps API key system intact for MCP and programmatic access; sessions table deleted (OIDC handles state via signed JWT cookies)
User-scoped services — All existing services gain userId parameter and async signature; 4 new services added: globalItem.service, review.service, profile.service, discover.service
Image storage layer (MinIO via Bun S3Client) — Replaces filesystem writes; abstraction interface allows local dev vs. S3 production swap; existing /uploads/* static route replaced by proxy or presigned URL handler
Global item catalog — Separate globalItems table (not the user items table); admin-seeded initially; user items optionally reference global items via nullable globalItemId FK; reviews and owner counts attach to global items, not user items
Public content layer — Setup isPublic flag; public profile pages; discovery queries over public content with indexes on owner_count, (is_public, updated_at), and global_item_id
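
To make the split between user items and the global catalog concrete, here is a minimal TypeScript sketch of how owner counts could be derived from the nullable globalItemId link. The row shape and the ownerCounts helper are hypothetical; only the userId and globalItemId names come from this summary. A real implementation would more likely run as a Postgres aggregate or maintain a denormalized counter than scan rows in memory.

```typescript
// Hypothetical row shape: userId and globalItemId follow the summary,
// the remaining fields are assumptions for illustration.
interface UserItem {
  id: number;
  userId: number;
  name: string;
  globalItemId: number | null; // null means "not linked to the catalog"
}

// Count distinct owners per global item. A user who owns two copies of the
// same product is counted once; unlinked items are ignored.
function ownerCounts(items: UserItem[]): Map<number, number> {
  const owners = new Map<number, Set<number>>();
  for (const item of items) {
    if (item.globalItemId === null) continue;
    const set = owners.get(item.globalItemId) ?? new Set<number>();
    set.add(item.userId);
    owners.set(item.globalItemId, set);
  }
  const counts = new Map<number, number>();
  owners.forEach((set, globalItemId) => counts.set(globalItemId, set.size));
  return counts;
}

const sampleItems: UserItem[] = [
  { id: 1, userId: 1, name: "Tent", globalItemId: 10 },
  { id: 2, userId: 1, name: "Tent (spare)", globalItemId: 10 },
  { id: 3, userId: 2, name: "Tent", globalItemId: 10 },
  { id: 4, userId: 2, name: "Homemade stove", globalItemId: null },
  { id: 5, userId: 3, name: "Quilt", globalItemId: 11 },
];
const sampleCounts = ownerCounts(sampleItems);
```

Because unlinked items carry a null globalItemId, they stay entirely private to their owner and never influence catalog statistics, which is the property the separate-table design is meant to guarantee.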

Critical Pitfalls

Missing userId filters leak data between users: Any service function that is not updated to filter by userId returns every user's rows, and there are 30+ query sites to update. Prevention: declare userId NOT NULL in the schema so TypeScript compiler errors guide the updates; add Postgres Row-Level Security as a safety net; write cross-user isolation tests per entity (create as User A, query as User B, assert empty results).
Drizzle schema rewrite is a replacement, not a migration: sqlite-core and pg-core are incompatible. real() in Postgres is a 4-byte float, whereas SQLite's REAL is 8-byte, so use doublePrecision() for weight values. Timestamps change from integer epoch to native timestamp. Every service function must become async, because .get() and .all() are synchronous SQLite methods while Postgres queries must be awaited. Prevention: rewrite the schema from scratch, update all .get() / .all() calls, and run the full test suite against Postgres.
Test infrastructure collapses during DB switch: createTestDb() uses bun:sqlite in-memory SQLite. After the switch, every test needs Postgres. Prevention: adopt PGlite (@electric-sql/pglite) for unit/integration tests immediately; never let the SQLite test setup coexist with Postgres production code past the migration sprint.
Auth migration breaks sessions, API keys, and MCP: Switching to external OIDC touches the login flow, session management, API key ownership, MCP authentication, E2E test setup, and the onboarding flow. Prevention: keep API keys in the local database (do not delegate to the auth provider); maintain a local users table with an externalId column holding the OIDC subject; keep the apiKeys table with a userId FK to local users; update E2E tests to authenticate via API keys.
Global item database creates a data model fork if built wrong: An isGlobal flag or NULL userId on the user items table makes queries unmaintainable and blurs permission boundaries. Prevention: separate globalItems table from day one; user items get a nullable globalItemId FK; reviews and owner counts attach to globalItems only.
Existing data has no owner after migration: Current SQLite data has no userId. Adding userId NOT NULL breaks the migration if existing rows are not assigned an owner first. Prevention: the data migration script must create the original user first, assign all existing data to that userId, then enforce NOT NULL. Never make userId permanently nullable as a migration workaround.
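
The real() versus doublePrecision() point in the schema-rewrite pitfall can be checked without a database. The sketch below uses Math.fround, which rounds a JavaScript number (an 8-byte double) to the nearest 4-byte float, as an approximation of what a 4-byte Postgres real column would hand back. The helper names and the sample weight are hypothetical.

```typescript
// Approximates a round trip through a 4-byte float column.
function storedAsReal(value: number): number {
  return Math.fround(value);
}

// True only when the value is exactly representable in 4 bytes.
function survivesReal(value: number): boolean {
  return storedAsReal(value) === value;
}

const weightGrams = 453.59237; // hypothetical weight value
const roundTripError = Math.abs(storedAsReal(weightGrams) - weightGrams);

console.log(survivesReal(weightGrams)); // whether the exact value survives
console.log(roundTripError);            // size of the drift, if any
```

The drift is tiny for a single value, but it can show up in equality comparisons and in summed totals, which is why the summary recommends doublePrecision() for weights.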

Implications for Roadmap

The dependency chain is strict: Postgres -> Auth -> Multi-user scoping -> Global items -> Community features. Attempting to parallelize across this chain creates rework. Suggested phase structure:

Phase 1: Database Migration (SQLite to PostgreSQL)

Rationale: Everything else depends on Postgres. Auth providers require it. Concurrent access requires it. Full-text search for the global item catalog requires it. Must be done first and tested completely before any feature work begins.
Delivers: Postgres running locally and in CI; schema rewritten in pg-core; all service functions async; PGlite test infrastructure replacing bun:sqlite; one-time data migration script for existing SQLite data; drizzle.config.ts updated; all PRAGMA statements removed
Addresses: Postgres migration (FEATURES.md P1)
Avoids: Pitfall 3 (schema rewrite), Pitfall 4 (test infrastructure), Pitfall 10 (SQLite-specific patterns), Pitfall 12 (existing data ownership)
Research flag: Well-documented migration pattern. STACK.md and ARCHITECTURE.md agree on the approach. No additional research needed; pitfalls are comprehensively documented with GearBox-specific code references.
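
As an illustration of what the one-time data migration script has to do per row, the sketch below converts a legacy SQLite-shaped row into a Postgres-shaped one: it assigns an owner and turns an integer epoch into a native Date. The row shapes, the migrateItems name, and the choice of epoch seconds (rather than milliseconds) are assumptions; the actual column set lives in src/db/schema.ts and should be checked there.

```typescript
// Hypothetical legacy row: createdAt is assumed to be epoch SECONDS.
interface SqliteItemRow {
  id: number;
  name: string;
  weightGrams: number | null;
  createdAt: number;
}

// Hypothetical target row: every row has an owner and a native timestamp.
interface PgItemRow {
  id: number;
  userId: number;
  name: string;
  weightGrams: number | null;
  createdAt: Date;
}

// The owner must already exist before any row is migrated, so that
// userId can be NOT NULL from the first insert.
function migrateItems(rows: SqliteItemRow[], ownerId: number): PgItemRow[] {
  if (!Number.isInteger(ownerId) || ownerId <= 0) {
    throw new Error("create the original user before migrating rows");
  }
  return rows.map((row) => ({
    id: row.id,
    userId: ownerId,
    name: row.name,
    weightGrams: row.weightGrams,
    createdAt: new Date(row.createdAt * 1000), // seconds to milliseconds
  }));
}

const legacyRows: SqliteItemRow[] = [
  { id: 1, name: "Tent", weightGrams: 1234.5, createdAt: 1704067200 },
];
const migratedRows = migrateItems(legacyRows, 1);
```

Rejecting a missing owner up front keeps the order of operations explicit: create the user, assign the rows, and only then enforce NOT NULL.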

Phase 2: Authentication Provider Integration

Rationale: User identity must exist before userId can be added to any table. This phase also resolves the open Logto vs. Authentik decision.
Delivers: External auth provider running in Docker; OIDC middleware on Hono; local users table with externalId (OIDC subject); API key system preserved and userId-scoped; E2E tests updated to use API key authentication; onboarding flow replaced with auth provider registration; MCP auth updated
Uses: jose (JWT validation), @logto/react or @hono/oidc-auth depending on provider decision, Docker Compose
Avoids: Pitfall 5 (auth breaks sessions/keys/MCP); integration gotcha (keep local users table with externalId, do not remove it)
Research flag: NEEDS RESOLUTION before this phase can be planned: the Logto vs. Authentik decision. See Gaps section for resolution criteria. Once the provider is chosen, integration patterns are well-documented in official docs.
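
Whichever provider is chosen, the local users table is what ties an OIDC subject to a GearBox userId. The sketch below shows that mapping with an in-memory Map standing in for the table. The getOrCreateUser name appears in the architecture notes; its signature here, the LocalUser shape, and the UserDirectory class are assumptions made for illustration.

```typescript
// Hypothetical local user record; externalId holds the OIDC subject claim.
interface LocalUser {
  id: number;
  externalId: string;
  displayName: string;
}

// In-memory stand-in for the local users table.
class UserDirectory {
  private byExternalId = new Map<string, LocalUser>();
  private nextId = 1;

  // Returns the existing local user for this subject, or creates one on
  // first login. An existing record is returned unchanged.
  getOrCreateUser(externalId: string, displayName: string): LocalUser {
    const existing = this.byExternalId.get(externalId);
    if (existing) return existing;
    const user: LocalUser = { id: this.nextId++, externalId, displayName };
    this.byExternalId.set(externalId, user);
    return user;
  }
}

const directory = new UserDirectory();
const firstLogin = directory.getOrCreateUser("subject-abc", "Alex");
const secondLogin = directory.getOrCreateUser("subject-abc", "Alex");
const otherUser = directory.getOrCreateUser("subject-xyz", "Sam");
```

Keeping this lookup local means API keys and all userId foreign keys reference a stable integer id that does not depend on the auth provider's internals.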

Phase 3: Multi-User Data Model

Rationale: With user identity established, all entity tables can be scoped. This is the highest-risk phase because it touches every query in the codebase and is where data leaks occur if anything is missed.
Delivers: userId NOT NULL on items, categories, threads, setups, settings, apiKeys; composite unique on (userId, name) for categories; isPublic boolean on setups; resolveThread propagates userId to newly created items; all service functions filter by userId; cross-user isolation tests passing per entity; Postgres RLS policies active; MCP tools user-scoped; settings migrated to per-user
Addresses: Multi-user data model, setup visibility controls, user profile data model (FEATURES.md P1)
Avoids: Pitfall 1 (missing userId filters), Pitfall 2 (category uniqueness), Pitfall 8 (thread resolution userId), Pitfall 9 (public content defaults to private), Pitfall 11 (setup sync race conditions)
Research flag: No additional research needed; pitfall documentation is comprehensive with specific per-table and per-function guidance for the existing GearBox codebase.
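
The cross-user isolation tests and the composite (userId, name) constraint can be sketched against an in-memory stand-in for a scoped service. The InMemoryCategoryService class below is hypothetical and is not the project's category service; it only demonstrates the shape of the check (create as User A, query as User B, expect nothing).

```typescript
interface Category {
  id: number;
  userId: number;
  name: string;
}

// Hypothetical user-scoped service; every method takes userId explicitly.
class InMemoryCategoryService {
  private rows: Category[] = [];
  private nextId = 1;

  create(userId: number, name: string): Category {
    // Mirrors a composite unique constraint on (userId, name).
    const duplicate = this.rows.some((r) => r.userId === userId && r.name === name);
    if (duplicate) {
      throw new Error(`category "${name}" already exists for user ${userId}`);
    }
    const row: Category = { id: this.nextId++, userId, name };
    this.rows.push(row);
    return row;
  }

  // The userId filter is the line that must never be forgotten.
  list(userId: number): Category[] {
    return this.rows.filter((r) => r.userId === userId);
  }
}

const categoryService = new InMemoryCategoryService();
const USER_A = 1;
const USER_B = 2;
categoryService.create(USER_A, "Shelter");
const seenByB = categoryService.list(USER_B); // isolation check: should be empty
```

The same three-step pattern can be repeated per entity (items, threads, setups, settings) once the real services accept a userId parameter.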

Phase 4: Image Storage Migration (MinIO)

Rationale: Move image storage before public profiles and discovery ship. Once images are served to unauthenticated users at scale, the local filesystem approach fails. Better to resolve this before public content launches.
Delivers: MinIO running in Docker; Bun native S3Client configured; existing ./uploads/ migrated to MinIO bucket; image service updated; proxy route for S3 reads; MCP upload_image_from_url tool updated; image storage abstraction (local filesystem for dev, S3 for production)
Uses: Bun native S3Client (built-in, no install), MinIO Docker container
Avoids: Pitfall 7 (image URL breakage after storage migration)
Research flag: Standard pattern. Bun S3Client docs and MinIO compatibility are well-documented. No research needed.
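
One way to picture the image storage abstraction is a small interface that both a local backend and an S3-backed backend could implement. The ImageStore interface, MemoryImageStore class, and serveImage helper below are hypothetical names; the in-memory class only stands in for a real backend, and no Bun S3Client calls are shown because their exact wiring is left to implementation.

```typescript
interface StoredImage {
  bytes: Uint8Array;
  contentType: string;
}

// Hypothetical storage contract shared by all backends.
interface ImageStore {
  put(key: string, image: StoredImage): Promise<void>;
  get(key: string): Promise<StoredImage | null>;
}

// In-memory backend, useful for tests; a filesystem or S3 backend would
// implement the same two methods.
class MemoryImageStore implements ImageStore {
  private objects = new Map<string, StoredImage>();

  async put(key: string, image: StoredImage): Promise<void> {
    this.objects.set(key, image);
  }

  async get(key: string): Promise<StoredImage | null> {
    return this.objects.get(key) ?? null;
  }
}

interface ProxyResult {
  status: 200 | 404;
  image: StoredImage | null;
}

// Backend-agnostic read path that a proxy route could call.
async function serveImage(store: ImageStore, key: string): Promise<ProxyResult> {
  const image = await store.get(key);
  return image ? { status: 200, image } : { status: 404, image: null };
}
```

Because route handlers depend only on the interface, swapping the backend between development and production becomes a configuration change rather than a code change.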

Phase 5: Global Item Database

Rationale: The global item catalog is the second major foundation. Reviews, item detail pages, owner counts, and discovery all depend on canonical product records existing before those features can be built.
Delivers: globalItems table in pg-core; admin seeding workflow with 200-500 initial items across core categories; nullable globalItemId FK on user items; item linking flow in collection UI; full-text search via Postgres tsvector; GET /api/global-items endpoints (public, no auth); globalItem.service.ts
Addresses: Global item database, link personal items to global items, search global items (FEATURES.md P1)
Avoids: Pitfall 6 (data model fork: separate globalItems table, not a flag on user items)
Research flag: May benefit from targeted research on Postgres full-text search (tsvector/tsquery) configuration, specifically index design and query tuning for the expected catalog size and query patterns (brand + model name search). Schema is specified; FTS tuning is domain-dependent.

Phase 6: Community Features (Reviews, Profiles, Discovery)

Rationale: With the global item catalog seeded and users scoped, community features can be built on top. These three feature areas are bundled because they form the public-facing value proposition together: profiles without discovery, or discovery without content, delivers nothing meaningful to users.
Delivers: Structured reviews (overall + dimension ratings, one per user per global item, no freeform text); user public profiles (display name, avatar, bio, joined date, public setups list); public setup detail pages; discovery browse page (recent public setups, recently reviewed items, popular items by owner count); item detail pages with aggregated stats (owner count, average ratings, crowd-verified weight)
Addresses: Structured reviews, user profiles, public setup pages, item detail pages, discovery browse (FEATURES.md P1)
Uses: Review schema with composite unique (userId, globalItemId); denormalized avgRating and ownerCount on globalItems; cursor-paginated discovery queries; indexes on (is_public, updated_at) and owner_count
Avoids: Pitfall 9 (private by default enforced in all discovery queries); N+1 query trap in feed (use joins, not per-item queries)
Research flag: Discovery feed pagination (cursor vs. offset) and feed composition are well-documented standard patterns at this scale. No additional research needed for v2.0.
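
The cursor-paginated discovery query can be sketched in memory to show the two properties that matter: private setups are filtered out before anything else, and the cursor is a stable (updatedAt, id) pair rather than an offset. The Setup shape, the listPublicSetups name, and the use of numeric timestamps are assumptions; in Postgres the same logic would be a WHERE clause served by the (is_public, updated_at) index.

```typescript
interface Setup {
  id: number;
  userId: number;
  name: string;
  isPublic: boolean;
  updatedAt: number; // assumption: numeric timestamp for simplicity
}

interface Cursor {
  updatedAt: number;
  id: number;
}

interface Page {
  items: Setup[];
  nextCursor: Cursor | null;
}

function listPublicSetups(all: Setup[], limit: number, cursor: Cursor | null): Page {
  // Private-by-default: visibility is applied before ordering or paging.
  const visible = all
    .filter((s) => s.isPublic)
    .sort((a, b) => b.updatedAt - a.updatedAt || b.id - a.id);

  let remaining = visible;
  if (cursor !== null) {
    const c = cursor;
    // Keep only rows strictly after the cursor in (updatedAt desc, id desc) order.
    remaining = visible.filter(
      (s) => s.updatedAt < c.updatedAt || (s.updatedAt === c.updatedAt && s.id < c.id),
    );
  }

  const items = remaining.slice(0, limit);
  const last = items[items.length - 1];
  const hasMore = remaining.length > limit;
  return {
    items,
    nextCursor: hasMore && last ? { updatedAt: last.updatedAt, id: last.id } : null,
  };
}

const demoSetups: Setup[] = [
  { id: 1, userId: 1, name: "Summer overnight", isPublic: true, updatedAt: 300 },
  { id: 2, userId: 1, name: "Draft", isPublic: false, updatedAt: 250 },
  { id: 3, userId: 2, name: "Winter day hike", isPublic: true, updatedAt: 200 },
  { id: 4, userId: 3, name: "Bikepacking", isPublic: true, updatedAt: 200 },
  { id: 5, userId: 2, name: "Ultralight", isPublic: true, updatedAt: 100 },
];
const pageOne = listPublicSetups(demoSetups, 2, null);
const pageTwo = listPublicSetups(demoSetups, 2, pageOne.nextCursor);
```

Including id as a tiebreaker keeps pagination stable when two setups share the same updatedAt value, which an offset-based query would not guarantee while new rows are being inserted.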

Phase Ordering Rationale

Phases 1-3 form an unbreakable dependency chain: each phase is a prerequisite for the next. No parallelization is possible without creating rework.
Phase 4 (images) is inserted before community features because public profiles and discovery serve images to unauthenticated users — local filesystem is not viable at that point.
Phase 5 (global items) must precede Phase 6 (community features) because reviews require global item records to attach to; the dependency is one-directional.
Phases 5 and 6 could be split across releases (global items + linking as v2.0, community features as v2.1) if schedule pressure exists — the global item catalog delivers standalone value through item linking even before reviews exist.

Research Flags

Needs deeper research during planning:

Phase 2 (Auth): Auth provider decision (Logto vs. Authentik) must be resolved before this phase is planned. The integration pattern differs significantly between the two options — React SDK + backend JWT validation (Logto) vs. server-side middleware only (Authentik).
Phase 5 (Global Items): Postgres full-text search index design and query tuning for the global item catalog. The schema is specified; the FTS configuration (tsvector column type, GIN index, query parser) needs validation against expected query patterns and initial catalog size.

Phases with standard patterns (skip research-phase):

Phase 1 (Database): Migration path is thoroughly documented across ARCHITECTURE.md, PITFALLS.md, and STACK.md. Per-file implementation checklist is complete.
Phase 3 (Multi-user): Per-table userId requirements are fully specified. Pitfall checklist covers all edge cases including MCP tools, thread resolution, settings, and the threadCandidates join path.
Phase 4 (Images): Bun S3Client + MinIO is documented by the Bun team. Standard proxy-then-presigned-URL migration path is specified.
Phase 6 (Community): Schema, services, and query patterns are fully specified in ARCHITECTURE.md. No novel patterns.

Confidence Assessment

Area	Confidence	Notes
Stack	HIGH	All recommendations backed by official docs. Known issue #4122 (drizzle-kit + Bun SQL) is documented with a clear workaround. The Logto vs. Authentik disagreement is an open decision requiring resolution, not a confidence gap — both options are well-validated.
Features	HIGH	Competitor analysis is comprehensive (LighterPack, GearGrams, Trailspace, MyGear). Feature dependency chain is fully mapped. P1/P2/P3 prioritization is grounded in implementation cost and dependency analysis.
Architecture	HIGH	Based on direct codebase analysis of GearBox v1.4 plus official Drizzle and Hono docs. Component-level change inventory is complete (new files, modified files, removed files enumerated). Data flow diagrams are concrete and code-level.
Pitfalls	HIGH	12 pitfalls, each with specific GearBox v1.4 codebase context (file names, function names, column names, specific patterns). Confidence is high because research analyzed the actual codebase, not generic migration advice.

Overall confidence: HIGH

Gaps to Address

Open decision — auth provider (Logto vs. Authentik): Must be resolved before Phase 2 planning begins. Resolution criteria: (1) If the project plans to use the auth provider for infrastructure SSO beyond the GearBox app (Portainer, Grafana, Gitea, etc.), choose Authentik — it handles proxy-mode SSO that Logto does not. (2) If GearBox is the only app needing auth, choose Logto — simpler infrastructure (no Redis dependency), React SDK eliminates manual OIDC redirect handling, TypeScript-native. STACK.md's Logto recommendation is correct for the app-auth-only use case.
Review dimension configuration tension: FEATURES.md specifies "3-5 dimension ratings per product category, admin-configurable" (flexible reviewDimensions table with categoryId FK). ARCHITECTURE.md uses hardcoded columns (weightRating, durabilityRating, valueRating). For v2.0, use the hardcoded column approach (simpler, no dimension management UI needed). The flexible schema is a v2.x concern. This must be noted explicitly in Phase 6 planning to prevent scope creep.
E2E test authentication strategy post-auth migration: PITFALLS.md recommends switching E2E tests to API key authentication after auth moves external. The mechanism for creating a test user in the new auth system (direct Postgres insert bypassing the OIDC provider vs. auth provider admin API) needs to be decided during Phase 2 planning.
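
For the review dimension gap above, the hardcoded-column approach recommended for v2.0 can be sketched as a plain record type plus two helpers. The weightRating, durabilityRating, and valueRating names come from ARCHITECTURE.md as quoted above; the overall field name, the 1 to 5 scale, and both helper functions are assumptions for illustration.

```typescript
// One review per (userId, globalItemId); ratings assumed to be 1 to 5.
interface Review {
  userId: number;
  globalItemId: number;
  overall: number;
  weightRating: number;
  durabilityRating: number;
  valueRating: number;
}

// Mirrors a composite unique constraint on (userId, globalItemId):
// a second review from the same user replaces the first.
function upsertReview(reviews: Review[], next: Review): Review[] {
  const others = reviews.filter(
    (r) => !(r.userId === next.userId && r.globalItemId === next.globalItemId),
  );
  return [...others, next];
}

// Average overall rating for one global item, or null when unreviewed.
function averageOverall(reviews: Review[], globalItemId: number): number | null {
  const matching = reviews.filter((r) => r.globalItemId === globalItemId);
  if (matching.length === 0) return null;
  const total = matching.reduce((sum, r) => sum + r.overall, 0);
  return total / matching.length;
}

let demoReviews: Review[] = [];
demoReviews = upsertReview(demoReviews, {
  userId: 1, globalItemId: 10, overall: 4, weightRating: 5, durabilityRating: 3, valueRating: 4,
});
demoReviews = upsertReview(demoReviews, {
  userId: 2, globalItemId: 10, overall: 5, weightRating: 4, durabilityRating: 5, valueRating: 5,
});
const averageBeforeEdit = averageOverall(demoReviews, 10);
```

With fixed columns there is no dimension management UI to build; moving to a configurable reviewDimensions table later would change the record type but not the one-review-per-user rule.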

Sources

Primary (HIGH confidence)

GearBox v1.4 codebase — Direct analysis of src/db/schema.ts, service files, auth middleware, MCP server, test helpers, db/index.ts, E2E seed (direct codebase reference)
Logto official docs — React quickstart — SDK setup, LogtoProvider config
Logto API protection — JWT validation — jose-based middleware pattern
Logto OSS getting started — Docker deployment, Postgres requirements
Drizzle ORM — Bun SQL driver — Native Postgres via Bun
Drizzle ORM — PostgreSQL column types — pg-core schema definitions
Bun S3 documentation — Native S3 client, MinIO config
jose GitHub — JWT library v6.2.2, explicit Bun support
postgres.js npm — v3.4.8, fallback driver

Secondary (MEDIUM confidence)

drizzle-kit Bun SQL issue #4122 — Known CLI limitation with Bun driver
Authentik vs Zitadel comparison — Auth provider tradeoff analysis
Keycloak vs Authentik vs Zitadel 2026 — Ecosystem overview
LighterPack, GearGrams, Trailspace, MyGear — Competitor feature analysis
Multi-tenant architecture guide (WorkOS) — Multi-user data isolation patterns
SQLite to PostgreSQL migration pitfalls (Open WebUI) — Migration risk validation
How to migrate from SQLite to PostgreSQL (Render) — Data migration script patterns

Tertiary (LOW confidence)

Drizzle ORM PostgreSQL best practices 2025 (GitHub Gist) — Schema patterns (validate against official docs during implementation)
GetStream Social Feed Architecture — Feed implementation patterns referenced for anti-patterns to avoid

Research completed: 2026-04-03
Ready for roadmap: yes
