# Architecture Research

**Domain:** Public-first discovery platform with catalog enrichment (v2.1 milestone)
**Researched:** 2026-04-09
**Confidence:** HIGH (based on direct codebase inspection)

## Standard Architecture

### System Overview

```
┌──────────────────────────────────────────────────────────────────────┐
│                        CLIENT (React 19 SPA)                         │
├──────────────────────────────────────────────────────────────────────┤
│  ┌──────────────────────┐       ┌─────────────────────────────────┐  │
│  │ Public Shell         │       │ Auth Shell (isAuthenticated)    │  │
│  │ Discovery / Catalog  │       │ Collection / Threads / Setups   │  │
│  │ Public Setups        │       │ Settings / FAB / TotalsBar      │  │
│  └──────────┬───────────┘       └──────────────┬──────────────────┘  │
│             │                                  │                     │
│  ┌──────────┴──────────────────────────────────┴──────────────────┐  │
│  │ __root.tsx: single root layout, conditional chrome             │  │
│  │ TanStack Router (file-based) + React Query + Zustand           │  │
│  └────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────┘
                                   │ fetch /api/*
┌──────────────────────────────────────────────────────────────────────┐
│                         SERVER (Hono on Bun)                         │
├──────────────────────────────────────────────────────────────────────┤
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ Auth Middleware: public bypass list + three-way auth           │  │
│  │ Existing bypasses: GET /api/global-items, GET /api/tags,       │  │
│  │ GET /api/setups/:id/public, GET /api/users/:id/profile         │  │
│  │ NEW bypass: GET /api/discovery/*                               │  │
│  └────────────────────────────────────────────────────────────────┘  │
│  ┌──────────┐ ┌──────────┐ ┌──────────────┐ ┌─────────────────────┐  │
│  │ items    │ │ setups   │ │ global-items │ │ discovery [NEW]     │  │
│  │ threads  │ │ profiles │ │ tags         │ │ bulk import [NEW]   │  │
│  │categories│ │ auth     │ │ images       │ │                     │  │
│  └──────────┘ └──────────┘ └──────────────┘ └─────────────────────┘  │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ Service Layer (db as first param)                              │  │
│  └────────────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────────────┤
│  ┌──────────────────────┐    ┌────────────────────┐                  │
│  │ PostgreSQL (Drizzle) │    │ MinIO (S3)         │                  │
│  └──────────────────────┘    └────────────────────┘                  │
├──────────────────────────────────────────────────────────────────────┤
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │ MCP Server (/mcp, streamable-http)                             │  │
│  │ 19 existing tools + NEW catalog seeding tools                  │  │
│  └────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────┘
```

### Component Responsibilities

| Component | Responsibility | Status for v2.1 |
|-----------|----------------|-----------------|
| `__root.tsx` | Auth gate, layout shell, global modals | MODIFY: remove hard redirect for public routes |
| `routes/index.tsx` | Home page | REPLACE: becomes Discovery landing page |
| `routes/global-items/` | Catalog browsing and item detail | EXTEND: show enrichment fields, attribution |
| `server/index.ts` auth bypass list | Public route exceptions | EXTEND: add discovery feed bypass |
| `server/routes/global-items.ts` | Catalog CRUD API | EXTEND: add bulk import endpoint |
| `server/services/global-item.service.ts` | Catalog queries | EXTEND: trending query, bulk upsert |
| `db/schema.ts` globalItems table | Catalog data model | EXTEND: attribution and provenance fields |
| `server/mcp/` | Agent tool interface | EXTEND: add catalog seeding tools |

## Recommended Project Structure

New files slot into existing conventions. Nothing moves; additions only (except `routes/index.tsx` replacement).
```
src/
├── client/
│   ├── routes/
│   │   ├── index.tsx                 # REPLACE: Discovery landing (was Dashboard)
│   │   └── global-items/
│   │       ├── index.tsx             # EXTEND: enrichment fields in catalog list
│   │       └── $globalItemId.tsx     # EXTEND: attribution, source URL display
│   ├── components/
│   │   ├── DiscoveryFeed.tsx         # NEW: trending setups + popular items feed
│   │   ├── FeedCard.tsx              # NEW: card component for feed items
│   │   └── CatalogSearchBar.tsx      # NEW: prominent hero search bar
│   └── hooks/
│       └── useDiscovery.ts           # NEW: React Query hook for /api/discovery/feed
│
└── server/
    ├── routes/
    │   ├── global-items.ts           # EXTEND: add POST /bulk endpoint
    │   └── discovery.ts              # NEW: GET /feed, GET /trending
    ├── services/
    │   ├── global-item.service.ts    # EXTEND: bulkUpsert, getTrending functions
    │   └── discovery.service.ts      # NEW: feed composition queries
    └── mcp/
        └── tools/
            ├── catalog.ts            # NEW: upsert_catalog_item, bulk_upsert_catalog
            └── items.ts              # UNCHANGED (user collection tools)
```

### Structure Rationale

- **Replace `routes/index.tsx` directly**: The home route IS the discovery page for v2.1. No separate `/discover` URL is needed; that would create two entry points and split SEO value.
- **`discovery.ts` route separate from `global-items.ts`**: Feed queries are read-only, public, and compositional (they join multiple tables). Catalog CRUD stays in `global-items.ts`. Separation keeps route files single-responsibility.
- **`catalog.ts` MCP tools separate from `items.ts`**: User collection tools (`create_item`) and global catalog tools (`upsert_catalog_item`) have different semantics. Mixing them invites agents to use the wrong tool.

## Architectural Patterns

### Pattern 1: Auth-Aware Root Layout (Modify Existing)

**What:** `__root.tsx` currently hard-redirects all unauthenticated users to `/login` except on `/users/*` and `/login` itself. The `isPublicRoute` check must be expanded to include the discovery landing page and catalog routes.

**When to use:** Every new public-facing route requires an addition to this check.

**Current code (lines 130-132 of `__root.tsx`):**

```typescript
const isPublicRoute =
  location.pathname.startsWith("/users/") ||
  location.pathname === "/login";
```

**Change required:**

```typescript
const isPublicRoute =
  location.pathname === "/" ||
  location.pathname.startsWith("/users/") ||
  location.pathname.startsWith("/global-items/") ||
  location.pathname === "/login";
```

**Trade-offs:** Minimal change, zero new infrastructure. Risk: the list grows and becomes the source of security-adjacent bugs (forgetting to add a route). Consider extracting it to a named constant `PUBLIC_ROUTE_PREFIXES` so it's discoverable. Note that `startsWith("/global-items/")` does not match the bare `/global-items` list route; add an exact match if the catalog list should also be public.

### Pattern 2: Discovery Feed as a Composed Read Endpoint

**What:** A new `GET /api/discovery/feed` endpoint returns pre-composed content: trending public setups + popular global items in a single response. No auth required.

**When to use:** Discovery landing page initial load. Client calls once on mount.

**Server-side composition (discovery.service.ts):**

```typescript
import { count, desc, eq, getTableColumns } from "drizzle-orm";
// Import path assumed from /src/db/schema.ts relative to /src/server/services/.
import { globalItems, items, setups, users } from "../../db/schema";
// `Db` is the project's Drizzle database type (its import is omitted here).

export async function getDiscoveryFeed(db: Db) {
  // Trending setups: public setups, most recently updated.
  // The inner join drops setups whose owner row is missing.
  const trendingSetups = await db
    .select({
      id: setups.id,
      name: setups.name,
      userId: setups.userId,
      updatedAt: setups.updatedAt,
    })
    .from(setups)
    .innerJoin(users, eq(users.id, setups.userId))
    .where(eq(setups.isPublic, true))
    .orderBy(desc(setups.updatedAt))
    .limit(10);

  // Popular catalog items: most widely owned.
  // count(items.id) skips the NULLs produced by the left join,
  // so catalog items nobody owns get ownerCount = 0.
  const popularItems = await db
    .select({ ...getTableColumns(globalItems), ownerCount: count(items.id) })
    .from(globalItems)
    .leftJoin(items, eq(items.globalItemId, globalItems.id))
    .groupBy(globalItems.id)
    .orderBy(desc(count(items.id)))
    .limit(6);

  return { trendingSetups, popularItems };
}
```

**Trade-offs:** Single round-trip for the landing page. Risk: the query grows expensive as the setups table grows. Mitigation: composite index on `(is_public, updated_at DESC)`.

### Pattern 3: Catalog Enrichment via Schema Extension

**What:** Add attribution and provenance fields to `globalItems`.
These are optional columns: existing records are unaffected until an agent or admin populates them.

**Schema additions:**

```typescript
// In globalItems pgTable definition
sourceUrl: text("source_url"),                 // Product page or spec sheet
manufacturer: text("manufacturer"),            // Normalized manufacturer name
imageAttribution: text("image_attribution"),   // Credit text for catalog image
verifiedAt: timestamp("verified_at"),          // Last verification date
updatedAt: timestamp("updated_at")             // Track catalog edits
  .defaultNow().notNull(),
```

**Trade-offs:** The four attribution columns are nullable, so existing rows need no backfill. `updatedAt` is `NOT NULL` but carries a `now()` default, which PostgreSQL applies to existing rows when the column is added, so it is equally safe to migrate. The `updatedAt` column is useful for cache invalidation and agent re-verification workflows.

### Pattern 4: Bulk Upsert for Agent Catalog Seeding

**What:** A new `POST /api/global-items/bulk` endpoint accepts an array of catalog items and upserts on the natural key `(brand, model)`. Returns counts of created/updated/skipped.

**Auth:** Required (API key or MCP OAuth Bearer token). This is a write operation.

**Upsert strategy:**

```sql
INSERT INTO global_items (brand, model, category, weight_grams, ...)
VALUES (...)
ON CONFLICT (brand, model) DO UPDATE SET
  source_url = EXCLUDED.source_url,
  manufacturer = EXCLUDED.manufacturer,
  updated_at = NOW()
WHERE global_items.verified_at IS NULL
   OR EXCLUDED.source_url IS NOT NULL;
```

**Trade-offs:** Natural key upsert is robust for seeding. `ON CONFLICT (brand, model)` requires a unique constraint or index on those two columns; add one in the migration if it does not already exist. Risk: "Osprey" vs "Osprey Packs" creates duplicates. Mitigation: run `normalizeText()` before insert, and have the agent prompt instruct canonical brand naming.

### Pattern 5: MCP Catalog Tools (No User Scope)

**What:** New MCP tools write to `globalItems` (shared catalog), not `items` (per-user collection). The existing MCP server passes `userId` to every tool handler; catalog tools must accept `userId` for auth but ignore it for data scope.

**New tools:**

```
upsert_catalog_item   insert or update a single global catalog entry
bulk_upsert_catalog   batch version for efficiency (up to 50 items)
get_catalog_stats     item counts by category for agent planning
search_catalog        wrapper over existing searchGlobalItems
```

**Registration pattern (mirrors existing tools):**

```typescript
// catalog.ts
export const catalogToolDefinitions = [
  { name: "upsert_catalog_item", description: "...", inputSchema: {...} },
  { name: "bulk_upsert_catalog", description: "...", inputSchema: {...} },
  { name: "get_catalog_stats", description: "...", inputSchema: {...} },
];

export function registerCatalogTools(db: Db) {
  // Note: no userId param; catalog tools are not user-scoped
  return {
    upsert_catalog_item: ...,
    bulk_upsert_catalog: ...,
    get_catalog_stats: ...
  };
}
```

**Trade-offs:** Keeps catalog seeding distinct from personal collection management. The `userId` is available in the MCP server context, but catalog tools do not use it for data scope; they use it only for audit logging if needed.

## Data Flow

### Public Discovery Page Load (Unauthenticated)

```
Browser (no session)
  → GET /
  → React SPA loads (served as static file in prod)
  → __root.tsx: isAuthenticated=false, isPublicRoute=true → render layout
  → DiscoveryPage mounts
  → useDiscovery() → GET /api/discovery/feed (auth bypassed)
  → discovery.service.getDiscoveryFeed() → queries setups + globalItems
  → Returns { trendingSetups: [...], popularItems: [...] }
  → DiscoveryFeed renders FeedCard list
  → CatalogSearchBar renders (calls existing GET /api/global-items?q=...)
  → User clicks item → /global-items/:id (public) or /users/:userId (public)
```

### Agent Catalog Seeding

```
Claude agent (API key or MCP OAuth)
  → MCP: get_catalog_stats
  → Returns: { byCategory: [{ name: "Bags", count: 3 }, ...] }
  → Agent identifies "Bags" as underserved (target: 20 items)
  → Agent researches 17 bag products
  → MCP: bulk_upsert_catalog([{ brand, model, weightGrams, ... }])
  → global-item.service.bulkUpsert() → normalizeText() → INSERT ON CONFLICT
  → Returns { created: 14, updated: 2, skipped: 1 }
  → Agent repeats per category until coverage target met
```

### Catalog Enrichment Display

```
User navigates to /global-items/:id (public or authenticated)
  → GET /api/global-items/:id
  → getGlobalItemWithOwnerCount() → item + ownerCount
  → Response includes: sourceUrl, manufacturer, imageAttribution, verifiedAt
  → $globalItemId.tsx renders attribution section if sourceUrl present
  → "Source: [manufacturer] via [domain]" with external link
```

### Authenticated User (Unchanged)

```
Browser (OIDC session)
  → __root.tsx: isAuthenticated=true → existing behavior
  → / → DiscoveryPage (same component, but "Go to Collection" CTA visible)
  → TotalsBar, FAB, OnboardingWizard shown as today
  → All collection/thread/setup routes unchanged
```

## Scaling Considerations

| Scale | Architecture Adjustments |
|-------|--------------------------|
| Current (< 1k users, ~18 catalog items) | Monolith is fine; no changes needed beyond feature additions |
| 1k-50k users | Add indexes: `CREATE INDEX ON setups (is_public, updated_at DESC)` and `CREATE INDEX ON items (global_item_id)` for ownerCount aggregation |
| 50k+ users | Cache the `/api/discovery/feed` response server-side (Redis or in-memory with 60s TTL). Feed accuracy does not need to be real-time. |

### Scaling Priorities

1. **First bottleneck:** The ownerCount aggregation in `getDiscoveryFeed` (and `getGlobalItemWithOwnerCount`) joins `items` on `global_item_id`. Without an index, the aggregation scans the whole `items` table, so its cost grows linearly with the number of items. Add an index on `items.global_item_id` immediately; it likely does not exist yet, because the column is a foreign key rather than a primary key and PostgreSQL does not index foreign-key columns automatically.
2. **Second bottleneck:** Public setup listing for the feed scans the `setups` table for `is_public = true`. A composite index on `(is_public, updated_at DESC)` makes this a fast index scan.
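
The 50k+ row of the scaling table suggests caching the feed in memory with a 60s TTL. A minimal sketch of such a cache follows, assuming a single server process; `createTtlCache` and its injectable `now` clock are hypothetical names for illustration, not part of the existing codebase.

```typescript
// Hypothetical helper (not in the codebase): caches whatever `load` returns
// and reloads it once `ttlMs` milliseconds have passed.
function createTtlCache<T>(
  ttlMs: number,
  load: () => T,
  now: () => number = Date.now, // injectable clock, mainly for tests
): () => T {
  let entry: { value: T; expiresAt: number } | undefined;

  return () => {
    const t = now();
    if (entry === undefined || t >= entry.expiresAt) {
      // Miss or expired: reload and start a new TTL window.
      entry = { value: load(), expiresAt: t + ttlMs };
    }
    return entry.value;
  };
}

// Example with a stand-in loader; the real loader would call the feed service.
let loads = 0;
const getCachedFeed = createTtlCache(60_000, () => ({ generation: ++loads }));
getCachedFeed(); // loads once
getCachedFeed(); // served from the cache while the TTL window is open
```

Wrapping the service call as `createTtlCache(60_000, () => getDiscoveryFeed(db))` would cache the promise itself, so concurrent requests share one query. A rejected promise would also stay cached until the TTL expires, which a real implementation would need to handle.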

## Anti-Patterns

### Anti-Pattern 1: Growing the Auth Bypass List Indefinitely

**What people do:** Add more regex path checks to the 15-line bypass block in `server/index.ts` every time a new public endpoint appears.

**Why it's wrong:** The bypass list in `server/index.ts` already has 5 special cases (lines 125-137). Each addition is a security decision made in the wrong place. A typo in a regex silently exposes an endpoint or silently breaks a public one.

**Do this instead:** For this milestone, add the one needed bypass (`GET /api/discovery/*`) cleanly. Longer term, consider route-level middleware via Hono's `.use()` on specific route groups, moving auth decisions to where routes are defined.

### Anti-Pattern 2: Two Separate Root Layouts for Public vs Auth

**What people do:** Create a new `__public-root.tsx` with completely different structure for unauthenticated users.

**Why it's wrong:** TanStack Router file-based routing would require a `_public` layout segment and routing decisions at the top that duplicate `__root.tsx` logic. The existing root already does conditional rendering of TotalsBar and FAB based on `isAuthenticated`. Extend that pattern rather than duplicating the layout.

**Do this instead:** One root, conditional chrome. Public users see the page content without TotalsBar/FAB/OnboardingWizard. The auth check gates those components, not the entire layout.

### Anti-Pattern 3: Using `create_item` MCP Tool for Catalog Seeding

**What people do:** Use the existing `create_item` tool during agent seeding sessions, since it already exists and takes brand/model/weight fields.

**Why it's wrong:** `create_item` writes to the user-scoped `items` table, not `globalItems`. Items added this way belong to the service account, are invisible to other users as catalog entries, pollute that account's weight/cost totals, and cannot be found via catalog search.

**Do this instead:** Use dedicated `upsert_catalog_item` / `bulk_upsert_catalog` tools that target the `globalItems` table. The distinction should be documented clearly in tool descriptions.

### Anti-Pattern 4: Fetching ownerCount on Every Feed Card Render

**What people do:** Call `getGlobalItemWithOwnerCount()` for each item in the discovery feed, resulting in N+1 queries.

**Why it's wrong:** The feed might render 6-10 catalog items, and each triggers a separate COUNT query. This is invisible at low scale but becomes a noticeable latency hit at medium scale on the most-loaded endpoint (the public landing page).

**Do this instead:** Compute ownerCount in the feed query itself via a single LEFT JOIN + COUNT in the `getDiscoveryFeed` service function. One query returns all items with their counts.

## Integration Points

### Existing Architecture: What Changes

| Boundary | Change | Risk |
|----------|--------|------|
| `__root.tsx` `isPublicRoute` | Add `/` and `/global-items/*` | Low: additive change to conditional |
| `server/index.ts` bypass list | Add `GET /api/discovery/*` | Low: same pattern as existing bypasses |
| `db/schema.ts` globalItems | Add 5 columns (4 nullable, `updated_at` with a default) | Low: no backfill needed for existing rows |
| `routes/index.tsx` | Replace Dashboard with Discovery page | Medium: existing authenticated users see a different home page |
| `server/routes/global-items.ts` | Add `POST /bulk` route | Low: new route, existing routes unchanged |
| `server/mcp/index.ts` | Register catalogToolDefinitions | Low: existing registration pattern, additive |

### New Components: No Existing Touch

| Component | Location | Depends On |
|-----------|----------|------------|
| `discovery.service.ts` | `server/services/` | Schema migration (globalItems.updatedAt), setups table |
| `discovery.ts` route | `server/routes/` | `discovery.service.ts` |
| `useDiscovery.ts` hook | `client/hooks/` | `GET /api/discovery/feed` endpoint |
| `DiscoveryFeed.tsx` | `client/components/` | `useDiscovery.ts`, `FeedCard.tsx` |
| `FeedCard.tsx` | `client/components/` | None (pure presentational) |
| `CatalogSearchBar.tsx` | `client/components/` | Existing `GET /api/global-items` endpoint |
| `catalog.ts` MCP tools | `server/mcp/tools/` | `bulkUpsert` function in `global-item.service.ts` |

### External Services

| Service | Change | Notes |
|---------|--------|-------|
| MinIO (S3) | None | Agent can already use `upload_image_from_url` MCP tool for catalog images |
| Logto (OIDC) | None | Public routes bypass Logto entirely |
| PostgreSQL | Schema migration | One `ALTER TABLE global_items ADD COLUMN ...` migration |

## Build Order (Dependency-Ordered)

**Phase 1: Foundation (no UI yet)**

1. Schema migration: add `sourceUrl`, `manufacturer`, `imageAttribution`, `verifiedAt`, `updatedAt` to `globalItems`. Run `bun run db:generate && bun run db:push`. Unblocks all subsequent work.
2. Auth bypass: add `GET /api/discovery/*` to the bypass list in `server/index.ts`. Trivial change, enables endpoint testing.
3. Add indexes: `global_item_id` on the items table, `(is_public, updated_at DESC)` on the setups table. Drizzle migration.

**Phase 2: Server (can run in parallel with Phase 3)**

4. `discovery.service.ts` + `discovery.ts` route + register in `server/index.ts`. Pure reads, testable independently.
5. `bulkUpsert` in `global-item.service.ts` + `POST /api/global-items/bulk` endpoint.

**Phase 3: Client (can run in parallel with Phase 2)**

6. Modify `__root.tsx` to expand `isPublicRoute`. Must land before the discovery page renders for anonymous users.
7. Replace `routes/index.tsx` with the Discovery landing page. Requires Phase 3 step 6 and Phase 2 step 4 (or mock data while the API is in progress).

**Phase 4: MCP and Polish**

8. `catalog.ts` MCP tools + register in `server/mcp/index.ts`. Requires the bulk upsert endpoint (Phase 2 step 5).
9. Update `global-items/$globalItemId.tsx` to display attribution fields. Requires the schema migration (Phase 1 step 1).
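
Step 6 expands the `isPublicRoute` check, and Pattern 1 suggests extracting it into a named constant such as `PUBLIC_ROUTE_PREFIXES`. One possible shape is sketched below. `PUBLIC_ROUTE_EXACT` and `isPublicPath` are hypothetical names, and treating the bare `/global-items` list route as public is an assumption rather than something the current code does.

```typescript
// Exact-match public paths. "/global-items" (the catalog list) is an
// assumption: the check shown in Pattern 1 only covers "/global-items/...".
const PUBLIC_ROUTE_EXACT: readonly string[] = ["/", "/login", "/global-items"];

// Prefix-match public paths; the trailing slash keeps "/usersettings"
// from matching "/users/".
const PUBLIC_ROUTE_PREFIXES: readonly string[] = ["/users/", "/global-items/"];

// Hypothetical helper: true when the path may render without a session.
function isPublicPath(pathname: string): boolean {
  return (
    PUBLIC_ROUTE_EXACT.includes(pathname) ||
    PUBLIC_ROUTE_PREFIXES.some((prefix) => pathname.startsWith(prefix))
  );
}
```

In `__root.tsx` the existing expression would then reduce to `const isPublicRoute = isPublicPath(location.pathname);`, leaving the two arrays as the single place to review when a new public route is added.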

## Sources

- Direct inspection: `/src/server/index.ts` (auth bypass list at lines 121-139, route registration)
- Direct inspection: `/src/client/routes/__root.tsx` (isPublicRoute logic at lines 130-143, auth gate)
- Direct inspection: `/src/db/schema.ts` (globalItems table definition)
- Direct inspection: `/src/server/routes/global-items.ts` (existing catalog endpoints)
- Direct inspection: `/src/server/services/global-item.service.ts` (query patterns, ILIKE search)
- Direct inspection: `/src/server/mcp/index.ts` (tool registration pattern)
- Direct inspection: `/src/server/middleware/auth.ts` (three-way auth flow)
- Direct inspection: `/src/client/routes/index.tsx` (current dashboard, which is being replaced)
- `.planning/PROJECT.md` (v2.1 milestone goals and constraints)

---

*Architecture research for: GearBox v2.1 Public Discovery milestone*
*Researched: 2026-04-09*