docs: complete project research

2026-04-09 14:44:12 +02:00
parent f9c69a1366
commit c4ad5c1b2a
4 changed files with 910 additions and 1592 deletions
--- a/.planning/research/ARCHITECTURE.md
+++ b/.planning/research/ARCHITECTURE.md
--- a/.planning/research/FEATURES.md
+++ b/.planning/research/FEATURES.md
@@ -1,28 +1,26 @@
 # Feature Research
-**Domain:** Multi-user gear management and discovery platform
+**Domain:** Public-first gear discovery platform with catalog enrichment
-**Researched:** 2026-04-03
+**Researched:** 2026-04-09
 **Confidence:** MEDIUM-HIGH
 **Milestone scope:** v2.1 Public Discovery — builds on v2.0 multi-user foundation
 ---
-## Context
+## Context: What Already Exists (v2.0)
-This is the feature research for **v2.0 Platform Foundation** -- transforming GearBox from a single-user gear tracker into a multi-user platform with discovery, global item database, structured reviews, and setup sharing.
+These are shipped. New features below only mention them when v2.1 extends them:
-**Existing features (already built through v1.4):**
+- Full gear collection CRUD with weight/price tracking, categories, images
 - Gear collection CRUD with categories, weight/price, images, quantity
 - Planning threads with candidate comparison, ranking, pros/cons, impact preview
- Named setups (loadouts) with classification, donut chart visualization
+- Named setups with classification, donut chart, weight breakdowns
- Search/filter, CSV import/export, item duplication
+- PostgreSQL multi-user data model, Logto OIDC external auth
- Dashboard home page, onboarding wizard
+- S3 image storage (MinIO), global item catalog with tags and search
- Single-user auth (cookie sessions + API keys), MCP server (19 tools)
+- User profiles with avatar/bio, public setup sharing
 - Catalog-driven add flow, global FAB, item/catalog detail pages
 - MCP server (19 tools), API key + OAuth auth methods
-**Key project constraints:**
+All features below are **new for v2.1** unless explicitly marked "extend existing."
 - No freeform UGC until moderation infrastructure exists (structured input only)
 - Discovery-first, not social-first
 - External auth provider (self-hosted, open-source)
 - Postgres for multi-user platform
 ---
@@ -30,150 +28,113 @@ This is the feature research for **v2.0 Platform Foundation** -- transforming Ge
 ### Table Stakes (Users Expect These)
-Features users assume exist on any multi-user gear platform. Missing these makes the platform feel broken or pointless.
+Features that public-first gear discovery platforms are expected to provide. Missing these makes the product feel broken or hostile to new visitors.
 | Feature | Why Expected | Complexity | Notes |
 |---------|--------------|------------|-------|
-| **User registration and authentication** | Cannot have multi-user without accounts. Every platform has sign-up/login. | HIGH | External auth provider integration (Authentik, Keycloak, or similar). Replaces current single-user cookie auth. All existing entities need userId FK. |
+| Browse catalog and setups without login | All comparable platforms (Lighterpack shared lists, BikeGearDatabase, RTINGS) allow full read access. Forcing login before browse kills SEO and casual discovery. | LOW | Middleware change: lift auth guard from all GET /api/* endpoints. Public setup sharing already exists at v2.0 — generalize to all read routes. Session-optional pattern already proven. |
-| **User profiles (public)** | Every community platform has profiles. Users need identity to share and be discovered. | LOW | Minimal: display name, avatar URL, bio text, joined date. Public profile page lists user's public setups. No follower counts needed. |
+| Discovery landing page with catalog search prominent | RTINGS, Wirecutter, and BikeGearDatabase all lead with search or category browse above the fold. Users arriving from search engines expect to search immediately, not to log in. | MEDIUM | Replace dashboard for unauthenticated visitors. Search bar + tag chips already exist as FAB overlay — promote to inline page hero. Authenticated users still see their dashboard. Route-level auth split. |
-| **Setup visibility controls** | Users will not share setups if they cannot control what is public. Privacy is table stakes for any sharing platform. | LOW | Binary public/private toggle per setup. Default to private (opt-in sharing). Existing setups migrated as private. |
+| Contextual auth prompt only on write actions | Users must understand the access model without reading documentation. "Browse freely, sign in to save" must be self-evident. Confusing this causes drop-off. | LOW | Inline "Sign in to add to your collection" CTA on catalog item detail pages. No login wall on any browse action. |
-| **Public setup detail pages** | Shared setup links must resolve to a readable page. If sharing is a feature, the shared thing must be viewable. | MEDIUM | Read-only view with item list, weight/cost totals, donut chart, creator attribution. No auth required for public setups. Extends existing setup detail view. |
+| Product attribution: brand and manufacturer fields | Any gear database users trust shows where a product originates. Missing attribution makes catalog look scraped or unverifiable. | LOW | Add `brand`, `manufacturer` fields to catalog items schema. Already has `name` — add structured attribution alongside. Display prominently on detail pages and cards. |
-| **Global item database (searchable)** | Users expect to find gear by name rather than entering specs from scratch every time. LighterPack's weakness is fully manual data entry. | HIGH | Central product catalog with brand, model, category, manufacturer weight, MSRP, product URL, image. Users search and link rather than re-enter. Seed with 200-500 items in core categories to bootstrap. This is the foundational dependency for reviews, aggregation, and item detail pages. |
+| Image source attribution display | Legal requirement and trust signal. Gear Patrol, BikeGearDatabase, and manufacturer catalogs all credit image source. Omitting creates IP risk on manufacturer-supplied images. | LOW | Add `imageCredit` (display text, e.g. "Apidura") and `imageSourceUrl` fields to catalog items. Display as "Photo: [credit]" beneath product images on detail pages. |
-| **Link personal items to global items** | Once a global DB exists, users expect to connect their gear to canonical entries for richer data. | MEDIUM | Optional FK from user items to global items. Enables aggregation (owner count, avg weight, reviews). Must handle items not yet in global DB gracefully. |
+| Community usage signal on catalog items | Users expect to see "owned by N people" or "in N setups" to gauge real-world adoption. Lighterpack shows this per shared list. RTINGS shows review counts. | LOW | `ownerCount` already exists on catalog items in v2.0. Surface it prominently on catalog cards and detail pages. Add "appears in N setups" count derived from setupItems. |
-| **Item detail page (aggregated)** | When browsing gear, clicking an item should show consolidated info: specs, who owns it, ratings. Standard on any product platform. | HIGH | Aggregated view combining: manufacturer specs from global DB, owner count, setup appearances, average ratings, crowd-reported weights. This is the integration hub for all platform features. |
+| Shareable catalog item and setup URLs resolve without login | Public-first means deep-linking works. If a setup or catalog item URL is shared, it must render for anyone — no login redirect. | LOW | Detail pages already exist at v2.0. Verify: unauthenticated API responses work end-to-end, meta tags render, no auth redirect on page load. Likely already 90% working given public setup sharing. |
 | **Structured reviews (ratings)** | Any product-oriented community needs evaluation. Users expect to rate gear and see what others think. | MEDIUM | Overall 1-5 star rating plus 3-5 dimension ratings (varies by product category). Attached to global items, not personal items. One review per user per global item. No freeform text per project constraint. |
 | **Discovery browse page** | Users expect a way to find interesting setups and gear beyond their own collection. Without this, multi-user adds no value. | MEDIUM | Not algorithmic for v2.0. Three sections: recent public setups, recently reviewed items, popular gear (most owned). Simple sorted lists with pagination. |
 | **Search global items** | Must be able to find products by name/brand in the global database. Powers linking, browsing, and review discovery. | MEDIUM | Full-text search on name, brand, category. Used in "link my item" flow, discovery browsing, and review lookup. Postgres full-text search or trigram index. |
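
The "browse without login" row above (lift the auth guard from GET routes, require sign-in only for writes) can be sketched as a pure decision function. This is a minimal sketch under stated assumptions, not the project's middleware: `decideAccess`, `IncomingRequestInfo`, and the `/api/` prefix check are hypothetical names chosen for illustration, and the real framework hooks are not shown in this document.

```typescript
// Hypothetical sketch: a session-optional guard as a pure function.
// None of these names come from the GearBox codebase.
type AccessDecision = "allow" | "require-auth";

interface IncomingRequestInfo {
  method: string;      // HTTP method, e.g. "GET"
  path: string;        // request path, e.g. "/api/catalog/42"
  hasSession: boolean; // true when a valid session or API key was presented
}

const READ_METHODS = new Set(["GET", "HEAD", "OPTIONS"]);

function decideAccess(req: IncomingRequestInfo): AccessDecision {
  // Authenticated callers pass through unchanged.
  if (req.hasSession) return "allow";
  // Anonymous callers may read under /api/, but never write.
  const isRead = READ_METHODS.has(req.method.toUpperCase());
  if (isRead && req.path.startsWith("/api/")) return "allow";
  return "require-auth";
}
```

A guard like this only decides whether a session is required. Read handlers would still need their own visibility checks so that private setups are not returned to anonymous callers.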
 ### Differentiators (Competitive Advantage)
-Features that set GearBox apart from LighterPack, GearGrams, Trailspace, and MyGear. Aligned with core value: "help people make better gear decisions."
+Features that set GearBox apart from Lighterpack (lists only, no catalog), BikeGearDatabase (editorial, not user collections), and generic wishlist tools.
 | Feature | Value Proposition | Complexity | Notes |
 |---------|-------------------|------------|-------|
-| **Crowd-verified specs** | LighterPack trusts user-entered data blindly. GearBox can show "manufacturer says 450g, 12 owners measured avg 478g." Real-world weight verification is unique and high-value for weight-conscious users. | MEDIUM | Aggregate weightGrams from all user items linked to a global item. Compare against manufacturer spec. Display on item detail page. Needs sufficient linked items to be meaningful (threshold: 3+ owners). |
+| Discovery landing feed (community setups + catalog items) | No direct competitor combines a global gear catalog with user setup feeds. Lighterpack has no discovery page. BikeGearDatabase is editorial, not community-driven. GearBox can show real user gear choices with weight data. | MEDIUM | Two feed sections: (a) recently shared public setups sorted by recency, filterable by category; (b) popular/new catalog items by ownerCount. No algorithm needed at launch — recency + ownerCount is sufficient and honest. |
-| **Review dimensions per product category** | Trailspace and OutdoorGearLab use editorial ratings with fixed dimensions. GearBox crowd-sources structured ratings with category-specific dimensions: a tent gets "weather protection, ventilation, setup ease" while a stove gets "boil time, fuel efficiency, packability." More relevant than one-size-fits-all. | MEDIUM | Define 3-5 rating dimensions per product category via admin config. Store dimension ratings alongside overall rating. Display as radar chart or bar chart on item detail page. |
+| Agent-powered catalog seeding via MCP tools | Unique to GearBox. No other gear platform has agent-friendly structured import. Enables rapid catalog population by Claude agent swarms without manual data entry. Programmatic SEO value compounds with catalog size. | HIGH | Requires: bulk create MCP tool, structured import with dry-run/preview mode, attribution tracking on agent-inserted records. GearBox already has MCP server and API key auth — foundation exists. |
-| **"X people own this" social proof** | Shows popularity and real adoption. No gear tracker does this because they lack a global item database. Simple count, powerful signal. | LOW | Count of users who linked a collection item to this global item. Displayed prominently on item detail page and in search results. Zero implementation complexity once linking exists. |
+| Catalog enrichment infrastructure with provenance tracking | Enables crowd + agent contributions with full source tracking. Comparable to Wikipedia's citation model but structured. Builds long-term trust in catalog data quality. | MEDIUM | New schema fields: `sourceUrl`, `sourceType` (enum: manufacturer / community / agent / import), `contributedBy` (userId or agent identifier string), `verifiedAt`. Mostly a schema migration; only a lightweight UI is needed initially. |
-| **Setup composition insights** | "This item appears in 47 bikepacking setups, commonly paired with Y and Z." Cross-setup analysis no competitor offers. Answers "what do people use this with?" | MEDIUM | Query across all public setups containing a given global item. Show co-occurrence patterns. Powerful but can be deferred to v2.x if query performance is a concern. |
+| SEO-indexable catalog pages ranking for product searches | Public catalog pages that rank for "[product name] weight specs" are a major organic acquisition channel. RTINGS built a durable traffic moat this way via programmatic SEO. GearBox can do the same for gear. | MEDIUM | Pages already exist. Add: `<title>` tags with product name + category, OG meta tags, JSON-LD Product schema markup. Primary complexity: TanStack Router is client-rendered — crawlers need either SSR or static prerender for bots. This is the phase's primary technical risk. |
-| **Setup impact preview with global items** | Already built for personal items. Extending to global items lets users preview "adding this from the store to my setup changes weight by X." Bridges research and collection management. | LOW | Already exists for personal items. Add "preview in my setup" button on global item detail pages. Reuse existing impact preview logic. |
+| Setup impact preview teaser on public catalog pages | Showing "add this to your setup and base weight changes by +Xg" is unique. No other gear catalog does this. Showing the feature on public pages teases value and drives sign-up intent. | MEDIUM | Extend existing impact preview (v1.3) to show a teaser CTA on unauthenticated catalog detail pages: "See how this affects your setup → [Sign in to try]". Requires no new backend work — frontend auth-conditional render. |
 | **Planning threads with global item integration** | Research threads that pull in specs, reviews, and owner data from the global DB. Candidates link to global items for richer comparison than manual data entry. | MEDIUM | Add optional globalItemId to thread candidates. Auto-populate weight, price, image from global item. Show community ratings and owner count inline on candidates. |
 | **Real-world weight distribution** | Histogram showing "owners report weights between 440g-490g" for a product. Beats a single manufacturer number. Valuable for ultralight community. | LOW | Aggregate weightGrams from all linked items. Display min/max/avg. Histogram if 10+ data points. |
 | **Copy/fork public setups** | Use someone else's setup as a starting template. LighterPack has clunky CSV-based copying. One-click fork is much better UX. | LOW | Create new setup copying all items from a public setup. Items must exist in user's collection (or be linked to same global items). Clear UX for "items you do not own yet." |
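
The provenance fields listed in the catalog enrichment row (`sourceUrl`, `sourceType`, `contributedBy`, `verifiedAt`) could be modelled roughly as follows. This is a hedged sketch: the field names and enum values come from the table, but the `Provenance` interface, the `validateProvenance` helper, the nullable `verifiedAt`, and the http(s)-only URL rule are assumptions made for illustration.

```typescript
// Hypothetical shape for the provenance fields named in the table above.
const SOURCE_TYPES = ["manufacturer", "community", "agent", "import"] as const;
type SourceType = (typeof SOURCE_TYPES)[number];

interface Provenance {
  sourceUrl: string;
  sourceType: SourceType;
  contributedBy: string;     // userId or agent identifier string
  verifiedAt: string | null; // ISO timestamp; null until verified (assumption)
}

// Untrusted input: same keys as Provenance, values not yet checked.
type ProvenanceInput = { [K in keyof Provenance]?: unknown };

// Returns a list of problems; an empty list means the record is acceptable.
function validateProvenance(input: ProvenanceInput): string[] {
  const problems: string[] = [];
  if (typeof input.sourceUrl !== "string" || !/^https?:\/\//.test(input.sourceUrl)) {
    problems.push("sourceUrl must be an http(s) URL");
  }
  if (!SOURCE_TYPES.includes(input.sourceType as SourceType)) {
    problems.push("sourceType must be one of: " + SOURCE_TYPES.join(", "));
  }
  if (typeof input.contributedBy !== "string" || input.contributedBy.length === 0) {
    problems.push("contributedBy is required");
  }
  return problems;
}
```

Returning a list of problems rather than throwing lets a bulk import report every rejected field for a record in one pass.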
 ### Anti-Features (Commonly Requested, Often Problematic)
 | Feature | Why Requested | Why Problematic | Alternative |
 |---------|---------------|-----------------|-------------|
-| **Freeform text reviews** | Users want to explain their experience in detail | Requires moderation, spam filtering, content policy, reporting infrastructure. PROJECT.md explicitly defers until moderation exists. | Structured ratings with predefined dimensions. Short predefined tags for pros/cons (e.g., "lightweight", "durable", "runs small"). |
+| Algorithmic feed ranking using engagement signals | "Show popular content" feels natural | Requires engagement data volume that does not exist at v2.1 scale. An empty or manipulated feed is worse than no feed. Gaming and spam become immediate risks. | Simple recency + ownerCount sort. Add engagement signals only when data volume and moderation infrastructure justify it. |
-| **Comments on setups** | Social engagement, questions about gear choices | Moderation burden, notification system, spam, harassment risk. Deferred in PROJECT.md. | Link to user profile. Contact happens outside platform. |
+| Open wiki-style catalog editing (anyone edits any item) | Fastest path to catalog enrichment | Data quality collapses without moderation. Adversarial edits, edit wars. Requires revert/history infrastructure. Already decided out-of-scope in PROJECT.md. | Structured contributions: users submit items, agents bulk-seed with attribution, admins verify. Provenance fields track every change. |
-| **Follow users / activity feed** | Social graph, staying updated on people | Turns a gear tool into a social network. Notification infrastructure, feed ranking, engagement metrics, retention loops. Project decision: discovery-first, not social-first. | Discovery feed shows popular/recent content without requiring social connections. |
+| Bulk catalog import from scraped external sources | "Just import all BikeGearDB items" | Copyright risk. Data quality issues. Stale data. Attribution is impossible because content ownership is unknown. Legal exposure. | Agent-seeding via MCP with explicit source tracking. Manual and agent entry together create a clean provenance chain with `sourceUrl` per item. |
-| **Marketplace / buy-sell** | Users want to trade used gear | Payment processing, fraud prevention, disputes, shipping logistics, tax compliance. Massive liability. | Link to product URLs on global items. Users buy through retailers. |
+| Real-time "X users viewing this" presence indicators | Social proof, FOMO feeling | Zero signal value at current traffic scale, adds WebSocket complexity, privacy concern for a utility tool. | ownerCount ("X people own this") is sufficient social proof without live presence tracking. |
-| **AI gear recommendations** | "What tent should I buy for bikepacking?" | Training data requirements, bias, liability for bad recommendations, hallucination risk. | Global item pages with ratings, owner counts, and setup co-occurrence do implicit recommendation. "People who own X also own Y." |
+| Comments on catalog items or setups | Community enrichment, Q&A | Freeform UGC explicitly blocked in PROJECT.md until moderation infrastructure exists. Moderation requires policy, tooling, reporting. | Structured fields only: tags, ratings, attribution. Defer freeform to future milestone after moderation is designed. |
-| **Wiki-style open item editing** | Community wants to correct/enrich global item specs | Edit wars, vandalism, quality degradation, dispute resolution. PROJECT.md explicitly rules this out. | Structured contributions only: report measured weight, submit rating. Admin approval for spec corrections. Trusted contributor program later. |
+| Social follow / activity feed | "See what friends added" | Social graph is a separate product. Deferred explicitly in PROJECT.md. Notification infrastructure, feed ranking, retention loops all out of scope. | Public setup browsing by category or recency is sufficient discovery without requiring a follow graph. |
-| **Price tracking / deal alerts** | Users want to know when gear goes on sale | Requires scraping retailer sites, fragile, legal gray area, maintenance burden. PROJECT.md rules this out. | Store product URL so users can check prices manually. |
+| Infinite scroll personalized feed | "Netflix for gear" | Personalization requires user history. Unauthenticated visitors have no history. Personalized recommendations require ML infrastructure far beyond v2.1 scope. | Category-filtered browse + search. Personalization post-login once collection data exists is a v3+ feature. |
 | **Real-time collaborative setups** | "Plan a group trip together" | WebSocket infrastructure, conflict resolution, permissions model, presence indicators. Massive complexity for niche use case. | Each user builds their own setup. Fork public setups as templates. |
 | **Gamification (badges, points, levels)** | Drive engagement and contributions | Incentivizes quantity over quality. Users game systems for points rather than providing genuine data. Creates toxic dynamics. | Soft social proof: "contributed X reviews" on profile. No points, no leaderboards. |
 | **Instagram-style infinite scroll feed** | Addictive browsing experience | Engagement-maximizing design conflicts with utility-focused tool. Users come to research decisions, not scroll endlessly. | Paginated, filterable discovery page. Browse with intent, not addiction. |
 ---
 ## Feature Dependencies
 ```
-[External Auth Provider]
+Public browse without login
-    |
+    └──prerequisite for──> Discovery landing page (needs unauth API render)
-    v
+    └──prerequisite for──> SEO-indexable catalog pages (bots must reach pages)
-[Multi-User Data Model (userId FK on all entities)]
+    └──prerequisite for──> Setup impact preview teaser on public pages
-    |
+    └──prerequisite for──> Shareable URLs confirmed working without auth
-    +---> [Postgres Migration] (concurrent access, auth provider needs Postgres)
+
-    |
+Catalog enrichment schema (attribution fields)
-    +---> [User Profiles (public)]
+    └──prerequisite for──> Agent-powered MCP catalog seeding (tools write into these fields)
-    |         |
+    └──prerequisite for──> Image attribution display (imageCredit field must exist)
-    |         +---> [Public Profile Pages]
+    └──prerequisite for──> Source provenance display on detail pages
-    |         |         |
+
-    |         |         +---> [Discovery Feed (browse users' public content)]
+Agent-powered MCP catalog seeding tools
-    |         |
+    └──requires──> Catalog enrichment schema (attribution fields must exist first)
-    |         +---> [Setup Visibility Controls (public/private)]
+    └──enhances──> Discovery landing feed (more items = richer feed)
-    |                   |
+    └──enhances──> SEO surface area (more pages = more potential rankings)
-    |                   +---> [Public Setup Detail Pages]
+
-    |                             |
+Discovery landing page
-    |                             +---> [Copy/Fork Public Setups]
+    └──requires──> Public browse without login
-    |
+    └──requires──> Feed query API (popular setups + recent catalog items)
-    +---> [Global Item Database]
+    └──uses existing──> Catalog search (FAB overlay promoted to page hero)
-              |
+
-              +---> [Search Global Items]
+SEO metadata on catalog pages
-              |
+    └──requires──> Public browse without login (bots must reach pages)
-              +---> [Link Personal Items to Global Items]
+    └──depends on──> Crawlability solution (SSR or prerender for TanStack Router)
-              |         |
+    └──enhances──> Agent-seeded catalog (more items = more indexed pages)
-              |         +---> [Owner Count ("X people own this")]
+
-              |         |
+Setup impact preview teaser (public)
-              |         +---> [Crowd-Verified Specs (aggregated weight)]
+    └──requires──> Public browse without login
-              |         |
+    └──depends on existing──> Impact preview feature (v1.3, already shipped)
              |         +---> [Setup Appearances Count]
              |         |
              |         +---> [Real-World Weight Distribution]
              |
              +---> [Structured Reviews]
              |         |
              |         +---> [Review Dimensions per Category]
              |         |
              |         +---> [Average Ratings Display]
              |
              +---> [Item Detail Pages (aggregated hub)]
              |         |
              |         +---> [Setup Composition Insights]
              |
              +---> [Planning Thread Global Item Integration]
                        |
                        +---> [Candidate Auto-populate from Global DB]
 ```
 ### Dependency Notes
- **Multi-user data model is the absolute foundation.** Every feature depends on userId ownership. Items, setups, threads, categories, reviews -- all need user scoping. This is the biggest single migration.
+- **Public browse is the prerequisite for everything.** Auth middleware change must land first. All other v2.1 features depend on unauthenticated API access working correctly.
- **Postgres migration is coupled with auth.** The external auth provider (Authentik, Keycloak) needs Postgres. Migrating the app DB at the same time avoids running two databases. Do these together.
+- **Catalog enrichment schema must precede agent MCP tools.** The bulk create and import MCP tools write attribution fields. Building the tools before the schema exists would force breaking schema changes later.
- **Global item database is the second foundation.** Reviews, item detail pages, owner counts, crowd-verified specs, and planning thread integration all depend on canonical global item records. Without this, multi-user is just "LighterPack with accounts."
+- **SEO crawlability is the primary technical risk.** TanStack Router renders client-side. Many crawlers (link-preview bots, some search bots) do not execute JavaScript, and Googlebot renders it only through a separate rendering queue that can delay indexing. Without SSR or a static prerender pass, catalog pages may be indexed late or not at all. This is a known gap with the current stack and needs a solution before SEO-targeted work makes sense. Defer SEO metadata work to P2 until crawlability is resolved.
- **Structured reviews require global items.** Reviews attach to global items, not personal collection items. Otherwise reviews fragment across duplicate user-entered items with no way to aggregate.
+- **Agent seeding is high complexity but high leverage.** It is both a catalog population tool and a v2.1 launch enabler. Without sufficient catalog items, the discovery feed is thin and the platform feels empty. Prioritize MCP tooling early so catalog seeding can run in parallel with UI work.
 - **Item detail pages are the integration point.** They combine global item specs, aggregated user data, reviews, owner count, and setup appearances. Should be built after all data sources exist.
 - **Discovery feed requires profiles + public content.** Cannot browse without user identity and visibility controls producing public content to show.
 - **Linking is the bridge.** Personal items link to global items. This single FK enables owner count, crowd-verified specs, weight distribution, and setup appearances. Prioritize this flow.
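
For the SEO note above, the JSON-LD Product markup could be generated along these lines once a crawlability solution exists. This is a sketch under assumptions: `CatalogItem` and `toProductJsonLd` are hypothetical names, the field set is a guess at what a catalog item exposes, and only widely used schema.org `Product` properties are emitted (`GRM` is the UN/CEFACT common code for grams).

```typescript
// Hypothetical catalog item shape; only weightGrams appears in the source text.
interface CatalogItem {
  name: string;
  brand: string;
  category: string;
  weightGrams?: number;
  imageUrl?: string;
}

// Builds the string to place inside <script type="application/ld+json">.
function toProductJsonLd(item: CatalogItem): string {
  const doc: Record<string, unknown> = {
    "@context": "https://schema.org",
    "@type": "Product",
    name: item.name,
    brand: { "@type": "Brand", name: item.brand },
    category: item.category,
  };
  if (item.imageUrl) doc.image = item.imageUrl;
  if (item.weightGrams !== undefined) {
    doc.weight = {
      "@type": "QuantitativeValue",
      value: item.weightGrams,
      unitCode: "GRM",
    };
  }
  // Escape "<" so the payload cannot close the surrounding <script> tag.
  return JSON.stringify(doc).replace(/</g, "\\u003c");
}
```

The markup only helps if the crawler receives it in the initial HTML, which is why it stays behind the SSR or prerender decision.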
 ---
 ## MVP Definition
-### Launch With (v2.0 Platform Foundation)
+This is a follow-on milestone for an already shipped product. MVP here means the minimum needed to deliver the v2.1 goal: a public-first discovery platform.
- [ ] **External auth provider integration** -- Nothing works without multi-user identity
+### Launch With (v2.1 core)
 - [ ] **Postgres migration** -- Required for concurrent access; auth provider dependency
 - [ ] **Multi-user data model** -- userId on items, setups, threads, categories; data isolation
 - [ ] **User profiles (minimal)** -- Display name, avatar, bio; public profile page
 - [ ] **Setup visibility controls** -- Public/private toggle, default private
 - [ ] **Public setup detail pages** -- Shareable read-only view with attribution
 - [ ] **Global item database with seed data** -- Schema, admin seeding, search
 - [ ] **Link personal items to global items** -- Association flow in collection UI
 - [ ] **Structured reviews** -- Overall rating + dimension ratings on global items
 - [ ] **Item detail pages** -- Aggregated specs, owner count, average ratings
 - [ ] **Discovery browse page** -- Recent public setups, recently reviewed, popular items
-### Add After Validation (v2.x)
+- [ ] Public browse without login — lift auth guard from all GET routes. Every other feature depends on this.
 - [ ] Discovery landing page — replaces dashboard for unauthenticated visitors. Catalog search hero + two feed sections (recent setups, popular catalog items). Recency + ownerCount sort, no algorithm.
 - [ ] Catalog enrichment schema migration — add `brand`, `manufacturer`, `sourceUrl`, `sourceType`, `imageCredit`, `imageSourceUrl`, `contributedBy` fields. Schema first, UI follows.
 - [ ] Image attribution display on catalog detail pages — "Photo: [credit]" below product images, sourced from new `imageCredit` field.
 - [ ] Agent MCP catalog seeding tools — bulk create endpoint/tool, structured import with attribution, dry-run/preview mode, batch result reporting.
 - [ ] Initial catalog population via agent — run agent seeding for 3-5 priority categories (bikepacking bags, tents, sleeping bags, navigation devices, cycling computers). Target: 100+ catalog items with attribution.
 - [ ] Community usage signals surfaced — ownerCount and "appears in N setups" count prominent on catalog cards and detail pages.
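
The dry-run/preview mode and batch result reporting in the MCP seeding item above can be sketched as a single function that validates every record and only persists when `dryRun` is false. This is an illustrative sketch, not the planned tool: `bulkCreate`, `CatalogItemInput`, `BatchResult`, and the `persist` callback are hypothetical names, and the duplicate rule (same brand and name within one batch) is an assumption.

```typescript
// Hypothetical input and result shapes for a bulk-create tool.
interface CatalogItemInput {
  name: string;
  brand: string;
  sourceUrl: string;
}

interface BatchResult {
  dryRun: boolean;
  accepted: number; // items that passed validation
  written: number;  // items actually persisted (always 0 in dry-run)
  rejected: { index: number; reason: string }[];
}

function bulkCreate(
  items: CatalogItemInput[],
  persist: (item: CatalogItemInput) => void,
  dryRun: boolean,
): BatchResult {
  const result: BatchResult = { dryRun, accepted: 0, written: 0, rejected: [] };
  const seen = new Set<string>();
  items.forEach((item, index) => {
    const key = `${item.brand}|${item.name}`.toLowerCase();
    if (item.name.trim() === "") {
      result.rejected.push({ index, reason: "name is empty" });
    } else if (item.sourceUrl.trim() === "") {
      result.rejected.push({ index, reason: "sourceUrl is required for attribution" });
    } else if (seen.has(key)) {
      result.rejected.push({ index, reason: "duplicate within batch" });
    } else {
      seen.add(key);
      result.accepted += 1;
      if (!dryRun) {
        persist(item);
        result.written += 1;
      }
    }
  });
  return result;
}
```

Because both modes run the same validation path, a dry run reports exactly the rejections a real run would produce for the same batch.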
- [ ] **Crowd-verified specs display** -- "Manufacturer: 450g, Community avg: 478g" (needs 3+ owners per item to be meaningful)
+### Add After Core is Stable (v2.1.x)
 - [ ] **Setup composition insights** -- "Commonly paired with" co-occurrence analysis
 - [ ] **Planning thread global item integration** -- Candidates auto-populate from global DB
 - [ ] **Popular gear rankings by category** -- Most owned, highest rated per category
 - [ ] **Copy/fork public setups** -- One-click template from public setups
 - [ ] **Review dimension customization** -- Admin configures rating dimensions per product category
 - [ ] **Real-world weight distribution** -- Histogram on item detail pages
 - [ ] **Global item suggestion workflow** -- Users propose new items for admin review
-### Future Consideration (v3+)
+- [ ] Contextual "See how this affects your setup" CTA on public catalog pages — setup impact preview teaser with login prompt. Add once public browse is confirmed stable.
 - [ ] Manufacturer/brand filter on catalog browse — add brand as a filterable facet. Only valuable once catalog volume justifies filtering (target: after initial seeding).
 - [ ] SEO metadata on catalog pages — `<title>`, OG tags, JSON-LD Product schema. Add after crawlability solution is determined.
- [ ] **Freeform reviews with moderation** -- After moderation infrastructure exists
+### Future Consideration (v2.2+)
- [ ] **Comments on setups** -- After moderation infrastructure exists
+
- [ ] **Follow users / activity feed** -- After discovery model is validated
+- [ ] Personalized discovery feed post-login — requires collection data volume and recommendation design.
- [ ] **OAuth / social login** -- After external auth provider is stable
+- [ ] Verified catalog item badge — admin-marked verified items. Requires admin tooling.
- [ ] **Trusted contributor program** -- Verified users can edit global item specs
+- [ ] User-submitted catalog enrichment — structured form to suggest corrections or add missing items. Requires contribution review workflow.
 - [ ] Engagement signals in feed — view count, saves. Requires data volume to be meaningful.
 ---
@@ -181,122 +142,57 @@ Features that set GearBox apart from LighterPack, GearGrams, Trailspace, and MyG
 | Feature | User Value | Implementation Cost | Priority |
 |---------|------------|---------------------|----------|
-| External auth provider | HIGH | HIGH | P1 |
+| Public browse without login | HIGH | LOW | P1 |
-| Postgres migration | HIGH | HIGH | P1 |
+| Discovery landing page | HIGH | MEDIUM | P1 |
-| Multi-user data model (userId on entities) | HIGH | HIGH | P1 |
+| Catalog enrichment schema (attribution fields) | HIGH | LOW | P1 |
-| User profiles (basic) | HIGH | LOW | P1 |
+| Image attribution display | MEDIUM | LOW | P1 |
-| Setup visibility controls | HIGH | LOW | P1 |
+| Agent MCP catalog seeding tools | HIGH | HIGH | P1 |
-| Public setup detail pages | HIGH | MEDIUM | P1 |
+| Initial catalog population (agent run) | HIGH | MEDIUM (depends on MCP tools) | P1 |
-| Global item database (schema + seed) | HIGH | HIGH | P1 |
+| Community usage signals (ownerCount visible) | MEDIUM | LOW | P1 |
-| Link personal items to global items | HIGH | MEDIUM | P1 |
+| Shareable URL audit (confirm unauth render) | HIGH | LOW | P1 |
-| Search global items | HIGH | MEDIUM | P1 |
+| Setup impact preview teaser (public) | MEDIUM | MEDIUM | P2 |
-| Structured reviews | HIGH | MEDIUM | P1 |
+| Brand/manufacturer filter on catalog browse | LOW | LOW | P2 |
-| Item detail pages (aggregated) | HIGH | HIGH | P1 |
+| SEO metadata on catalog pages | MEDIUM | MEDIUM (crawlability dependency) | P2 |
-| Discovery browse page | MEDIUM | MEDIUM | P1 |
+| Personalized discovery feed | MEDIUM | HIGH | P3 |
-| Crowd-verified specs | HIGH | LOW | P2 |
+| Verified catalog badge | LOW | MEDIUM | P3 |
-| Setup composition insights | MEDIUM | MEDIUM | P2 |
+| User-submitted enrichment form | LOW | MEDIUM | P3 |
 | Planning thread global DB integration | MEDIUM | MEDIUM | P2 |
 | Copy/fork public setups | MEDIUM | LOW | P2 |
 | Popular gear rankings | MEDIUM | LOW | P2 |
 | Freeform reviews + moderation | MEDIUM | HIGH | P3 |
 | Follow users | LOW | MEDIUM | P3 |
 | Setup comments | LOW | MEDIUM | P3 |
 **Priority key:**
- P1: Must have for v2.0 platform launch
+- P1: Required for v2.1 milestone goal
- P2: Should have, add in v2.x once core is validated
+- P2: Add once v2.1 core is validated
- P3: Future consideration, requires new infrastructure (moderation, notifications)
+- P3: Future consideration, requires new infrastructure
 ---
 ## Competitor Feature Analysis
-| Feature | LighterPack | GearGrams | Trailspace | MyGear | GearBox v2.0 |
+| Feature | LighterPack | Bike Gear Database | RTINGS | GearBox v2.1 |
-|---------|-------------|-----------|------------|--------|-------------|
+|---------|-------------|------------------|--------|--------------|
-| Gear lists/setups | Yes, drag-and-drop | Yes, trip-based | No (review only) | Yes, "Locker" | Yes, named setups with classification |
+| Browse without login | Yes (shared list links only) | Yes (all content public) | Yes (fully public) | Yes — all catalog + setups public |
-| Weight tracking | Base/worn/consumable | Carried/worn/consumable | No | Basic | Base/worn/consumable + unit conversion + donut charts |
+| Discovery landing page | No (no general discovery; only shared list links are public) | Yes (editorial feed + categories) | Yes (category browse + new/updated) | Yes (catalog search hero + community feed) |
-| User profiles | Minimal (no bio) | Minimal | Review history page | Full social profile | Display name, avatar, bio, public setups |
+| Global gear catalog | No (fully user-entered) | Editorial reviews only | Product test database | Yes — crowd + agent-seeded with attribution |
-| Sharing | Public link, embed code | Public link | N/A | Social feed posts | Public/private toggle, shareable URLs |
+| Image attribution | N/A (no images) | Editorial photo credit | Manufacturer-supplied images | Explicit imageCredit + imageSourceUrl fields |
-| Global item database | No (all user-entered) | No | Yes (editorial catalog) | No | Yes, seeded + crowd-enriched with verified specs |
+| Community setups visible publicly | Yes (shared list links) | No | No | Yes — public setups with weight data |
-| Structured reviews | No | No | Yes (summary/pros/cons + rating) | Basic star rating | Dimension ratings per product category |
+| Setup weight analysis | Yes (per list) | No | No | Yes + impact preview |
-| Item aggregation | No | No | Editorial scores only | No | Owner count, avg weight, setup appearances, crowd specs |
+| Agent-friendly catalog API (MCP) | No | No | No | Yes — unique differentiator |
-| Discovery/browse | No | No | Browse by category | AI-tagged social feed | Browse setups, items, popular gear (intent-driven, not feed) |
+| SEO catalog pages | No | Yes (editorial articles) | Yes (programmatic product pages) | Target for v2.1.x after crawlability resolved |
-| Purchase research | No | No | Price comparison links | No | Planning threads with candidates, ranking, impact preview |
+| Provenance / source tracking | No | Editorial byline only | "Tested by RTINGS staff" | Yes — sourceType enum, contributedBy, sourceUrl |
 | Crowd-verified specs | No | No | No | No | Manufacturer vs. community-measured weight comparison |
 | Mobile app | No | Yes (iOS/Android) | No | Yes (iOS/Android) | No (responsive web, per project constraint) |
 ### Competitive Positioning
 GearBox occupies a unique niche: the only platform combining **gear management** (LighterPack's strength), **structured community reviews** (Trailspace's strength), and **crowd-verified specs** (nobody does this). The planning threads feature has no direct competitor equivalent in the gear domain.
 **Key advantages over each competitor:**
 - **vs. LighterPack:** Global item database eliminates manual spec entry. Multi-user with profiles and sharing. Structured reviews provide community intelligence.
 - **vs. GearGrams:** Richer comparison tools (planning threads). Crowd-verified specs. Item detail pages with aggregated data.
 - **vs. Trailspace:** Not just reviews -- full gear management and setup composition. Users own and track their gear, not just review it. Crowd ratings, not editorial-only.
 - **vs. MyGear:** Not social-first (no engagement loops, no AI tagging gimmicks). Utility-focused: research decisions, verify specs, compare options. Hobby-agnostic data model.
 **Accepted gaps:**
 - No mobile native app (web-first, responsive design sufficient per project constraints)
 - No social feed in the Instagram sense (intentional: discovery-first, not social-first)
 - No freeform text content (intentional: structured input only until moderation exists)
 ---
 ## Implementation Notes for Key Features
 ### Global Item Database Schema
 The global item table is distinct from user items. It represents canonical products:
 - `globalItems`: id, brand, model, name (display), categoryId, manufacturerWeightGrams, manufacturerPriceCents, productUrl, imageFilename, description, createdAt, updatedAt, createdByUserId
 - User items get optional `globalItemId` FK for linking
 - Admin-seeded initially; later users can suggest additions via a proposal workflow
 ### Structured Review Schema
 - `reviews`: id, userId, globalItemId, overallRating (1-5), createdAt, updatedAt
 - `reviewDimensionRatings`: id, reviewId, dimensionId, rating (1-5)
 - `reviewDimensions`: id, categoryId, name (e.g., "durability", "packability"), sortOrder
 - Unique constraint: one review per user per global item
 - Dimensions are per-category, admin-defined
 ### Discovery Feed Approach
 Not a personalized algorithmic feed. Three content streams, each a simple sorted query:
 1. **Recent public setups** -- ORDER BY createdAt DESC, paginated
 2. **Recently reviewed items** -- Global items with recent reviews, ORDER BY latest review date
 3. **Popular gear** -- Global items ORDER BY linked owner count DESC
 No recommendation engine. No engagement scoring. Users browse with intent.
 ### User Profile Data
 Minimal profile extending the auth provider's user record:
 - Display name (from auth provider or custom)
 - Avatar URL (from auth provider or uploaded)
 - Bio (short text, 280 char limit)
 - Joined date
 - Public setups list (derived from setup visibility)
 - Review count (derived)
 - Collection size (count of items, public stat)
 ---
 ## Sources
- [LighterPack](https://lighterpack.com/) -- Gear list builder, community standard for ultralight hikers. Public sharing via link, no profiles or reviews.
+- [LighterPack](https://lighterpack.com/) — public list sharing model, community usage patterns. Public browse only via shared links, no general discovery. (MEDIUM confidence, WebSearch)
- [LighterPack tutorial (99Boulders)](https://www.99boulders.com/lighterpack-tutorial) -- Feature overview including sharing, linking, limitations.
+- [Bike Gear Database](https://www.bikegeardatabase.com/) — public editorial gear catalog, category browse patterns, ~30k monthly visitors. (MEDIUM confidence, WebSearch)
- [GearGrams](https://www.geargrams.com/) -- Trip-based gear list tracker with weight classification.
+- [RTINGS SEO Case Study — Ahrefs](https://ahrefs.com/blog/rtings-seo-case-study/) — programmatic SEO via catalog pages, category-based navigation, discovery-oriented layout. (MEDIUM confidence)
- [Trailspace](https://www.trailspace.com/) -- User gear reviews with structured Summary/Pros/Cons format and Review Corps program.
+- [NN/G E-commerce Homepages and Listing Pages](https://www.nngroup.com/articles/ecommerce-homepages-listing-pages/) — subcategory surfacing above listings improves discoverability; 30-50% of product interactions come from unintended category navigation. (HIGH confidence)
- [Trailspace Review Form](https://www.trailspace.com/blog/2012/02/29/new-gear-review-form.html) -- Details on structured review fields with category-specific suggestions.
+- [Sales Layer MCP Server for catalog enrichment](https://www.saleslayer.com/ai-pim/mcp) — agent-powered product information management, bulk update patterns, audit and quality scoring via MCP tools. (MEDIUM confidence)
- [MyGear](https://mygear.world/) -- Social app for sports gear with Locker, feed, AI gear recognition, challenges.
+- [Creative Commons Attribution Best Practices](https://wiki.creativecommons.org/wiki/Recommended_practices_for_attribution) — TASL attribution standard; attribution must be visible and associated with the image. (HIGH confidence)
- [Outdoor Gear Lab](https://www.outdoorgearlab.com/) -- Professional structured gear reviews with side-by-side comparison methodology.
+- [Pixsy Image Credits Guide](https://www.pixsy.com/image-licensing/image-credits) — legal requirements and UX placement for image credits; "image courtesy of" as standard phrasing. (HIGH confidence)
- [Ultralight App](https://trailsmag.net/blogs/hiker-box/ultralight-the-gear-tracking-app-i-m-leaving-lighterpack-for) -- LighterPack alternative analysis showing community pain points.
+- [GS1 Image Standards](https://orbitvu.com/blog/gs1-image-standards-how-automation-can-help-effective-product-representation/) — product image metadata standards including GTIN linkage and consistent attribution for catalog platforms. (MEDIUM confidence)
- [Ready Set Sim](https://www.readysetsim.com/) -- Sim racing gear profiles and build sharing (cross-domain reference for hobby-agnostic patterns).
+- PROJECT.md — existing feature set, out-of-scope decisions, constraints, v2.1 milestone definition. (HIGH confidence, first-party)
 - [GetStream Social Feed Architecture](https://getstream.io/blog/social-media-feed/) -- Feed implementation patterns and anti-patterns.
 ---
-*Feature research for: GearBox v2.0 Platform Foundation -- multi-user gear discovery platform*
+
-*Researched: 2026-04-03*
+*Feature research for: GearBox v2.1 Public Discovery — public-first gear discovery platform*
 *Researched: 2026-04-09*
--- a/.planning/research/PITFALLS.md
+++ b/.planning/research/PITFALLS.md
@@ -1,314 +1,187 @@
 # Pitfalls Research
-**Domain:** Single-user to multi-user gear platform migration (GearBox v2.0)
+**Domain:** Public-first discovery platform with catalog enrichment (GearBox v2.1)
-**Researched:** 2026-04-03
+**Researched:** 2026-04-09
-**Confidence:** HIGH (based on direct codebase analysis of v1.4 + established migration patterns)
+**Confidence:** HIGH (based on direct codebase inspection of v2.0 + verified ecosystem patterns)
 > v2.0 migration pitfalls (SQLite→Postgres, single→multi-user) are archived in git history.
 > This document covers pitfalls specific to the v2.1 milestone: public access model, discovery feed, catalog enrichment, and agent-powered seeding.
 ---
 ## Critical Pitfalls
-### Pitfall 1: Missing userId Filters Leak Data Between Users
+### Pitfall 1: Frontend Auth Guard Blocks All New Public Routes
 **What goes wrong:**
-Every query in the existing codebase operates without a `userId` filter. After adding `userId` columns to `items`, `categories`, `threads`, `setups`, and `settings`, any service function not updated to filter by `userId` will return or mutate other users' data. The current `getAllItems()` returns `db.select().from(items).innerJoin(...)` with zero WHERE clauses. One missed function means User A sees User B's gear.
+The root layout (`__root.tsx`) hard-redirects any unauthenticated visitor to `/login` unless they are already on `/users/*` or `/login`. When public routes are added (a discovery landing page at `/`, or a public catalog at `/global-items/` that is meant to be the new entry point), the guard will silently redirect anonymous users before anything renders. The server already correctly skips auth middleware for `GET /api/global-items` (line 136 of `src/server/index.ts`), but the frontend guard is a separate allowlist that has not been updated.
 The surface area is large: 6 service files, 19 MCP tools, 7 route files, aggregate queries in `totals`, the `duplicateItem` function, the `getCollectionSummary` MCP resource, setup-item joins, and thread resolution (which creates a new item).
 **Why it happens:**
-Developers add `userId` to the schema, update the obvious CRUD functions, but miss edge cases. The codebase has enough query sites (~30+) that manual "find all queries" misses something. Thread resolution is particularly dangerous because it creates an item as a side effect of updating a thread.
+The client-side guard and the server-side middleware allowlist live in different files (`__root.tsx` vs `server/index.ts`) and can drift. Developers add routes to the server-side skip list but forget the frontend guard, then wonder why authenticated users see the feature but unauthenticated visitors hit the login page.
 **How to avoid:**
-1. Enable Postgres Row-Level Security (RLS) as a safety net -- even if the app filters by `userId`, RLS prevents cross-user access at the database level.
+Refactor the auth guard before building any public UI. Invert the logic: instead of allowlisting public routes, define a small `PROTECTED_ROUTES` set (collection, planning, settings, threads) and use TanStack Router's `beforeLoad` to protect those specific routes. Everything else renders without auth. The root layout should not gate rendering; it should only determine which UI chrome elements to show based on auth state.
 2. Add `userId` as NOT NULL to the Drizzle schema first, then use TypeScript compiler errors to find every query that needs updating (insert calls will fail where `userId` is required but not provided).
 3. Write one integration test per entity: create data as User A, query as User B, assert empty results.
 4. Grep the codebase for every `.from(items)`, `.from(categories)`, `.from(threads)`, `.from(setups)`, `.from(settings)` and verify each has a `userId` filter.
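The inverted guard described for the public routes can be sketched as a pure function that a route-level `beforeLoad` hook could call. This is a minimal sketch, not the project's implementation: the route prefixes, `requiresAuth`, and `guardRedirect` are hypothetical names, and the real paths in the GearBox router may differ.

```typescript
// Hypothetical route prefixes; the actual protected paths may differ.
const PROTECTED_ROUTES: readonly string[] = [
  "/collection",
  "/planning",
  "/settings",
  "/threads",
];

// True when the pathname is a protected route or is nested under one.
function requiresAuth(pathname: string): boolean {
  return PROTECTED_ROUTES.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + "/"),
  );
}

// Returns the redirect target for a navigation, or null to render as-is.
// Anything not listed in PROTECTED_ROUTES renders without auth by default.
function guardRedirect(pathname: string, isAuthenticated: boolean): string | null {
  return requiresAuth(pathname) && !isAuthenticated ? "/login" : null;
}
```

With this shape a newly added public route needs no guard change at all; only a new private area requires touching `PROTECTED_ROUTES`, which is the safer direction for the list to drift.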
 **Warning signs:**
- Any service function that does not accept a `userId` parameter after migration.
+- Loading `/global-items/` in a private browser window redirects to `/login`
- Tests that pass without specifying which user is performing the action.
+- The `isPublicRoute` check in `__root.tsx` is a string allowlist that grows as features are added
- MCP tools that work without user context.
+- New routes work for authenticated users but are invisible to anonymous users during testing
 **Phase to address:**
-Multi-user data model phase. This is the single most important thing to get right. Do not add public content or discovery features until every query is provably user-scoped.
+Public access auth model phase — must be the first change made. Every other public feature depends on this being correct.
 ---
-### Pitfall 2: Category Name Uniqueness Breaks in Multi-User
+### Pitfall 2: `useAuth()` Spinner Blocks Public Page First Contentful Paint
 **What goes wrong:**
-The current schema has `name: text("name").notNull().unique()` on the `categories` table -- a global unique constraint. When User A creates a "Bikepacking" category, User B cannot. The migration must change this to a composite unique constraint on `(userId, name)`.
+The root layout shows a full-screen spinner while `useAuth()` resolves. For authenticated users this is imperceptible (~50ms for a cached session). For anonymous visitors on a public discovery page, it means 300–800ms of spinner before any content appears, because the auth check hits `/api/auth/me` and must complete before the page renders. This directly undercuts the "public-first" positioning.
 Additionally, `useOnboardingComplete()` fires for all users. For anonymous visitors it will hit an auth-required endpoint and produce a 401. Even though it is conditionally rendered, verify the hook itself does not fetch when `isAuthenticated` is false.
 **Why it happens:**
-Single-user apps use simple unique constraints. Developers add `userId` to the table but forget to update the unique constraint from `unique(name)` to `unique(userId, name)`. The migration runs fine on an empty database but fails the moment a second user creates a category with a common name.
+Login-first apps legitimately gate the entire UI on auth resolution — there is nothing useful to show an unauthenticated user. The same pattern applied to a public discovery page creates a perceived login wall.
 **How to avoid:**
-Audit every `.unique()` constraint in the schema during migration. `categories.name` must become a composite unique on `(userId, name)`. The `users.username` unique stays global (desired). No other tables currently have unique constraints, but new tables (reviews, products) should use composite uniqueness from the start.
+Public routes must render immediately with unauthenticated defaults. Auth state loads in the background and hydrates progressive elements (nav user avatar, "Add to collection" CTAs) without blocking content. Use React Query's `enabled: isAuthenticated` on all hooks that call auth-required endpoints. The `useAuth()` query itself should never block page render — only auth-gated actions should wait on it.
 **Warning signs:**
- Database constraint errors when a second user creates categories.
+- Full-screen spinner visible to anonymous visitors on the landing page
- Tests that only ever use one user.
+- Lighthouse FCP score degrades after the public access change
 - Network tab shows 401 on `/api/settings` or `/api/totals` for logged-out users
 **Phase to address:**
-Multi-user data model phase, during schema migration.
+Public access auth model phase — same as Pitfall 1, tackled together.
 ---
-### Pitfall 3: Drizzle Schema Rewrite Is a Replacement, Not a Migration
+### Pitfall 3: Root-Level Components Fire Auth-Required Queries for Anonymous Users
 **What goes wrong:**
-Drizzle ORM schemas are dialect-specific. The current schema imports from `drizzle-orm/sqlite-core` and uses `sqliteTable`, `integer().primaryKey({ autoIncrement: true })`, and `real()`. The Postgres schema must import from `drizzle-orm/pg-core` and use `pgTable`, `serial()` or `integer().generatedAlwaysAsIdentity()`, and `doublePrecision()`. This is not a migration Drizzle can auto-generate -- it requires a full schema rewrite and a fresh migration history.
+`TotalsBar` is rendered at the root layout level for all routes and calls `useTotals()` which hits `GET /api/totals`. The auth middleware does not skip `/api/totals` for GET requests (verified in `server/index.ts`) — it requires a `userId`. Anonymous visitors will receive a 401 on every public page load, and React Query will retry the failed query three times. `FabMenu`, `CatalogSearchOverlay`, `AddToCollectionModal`, and `AddToThreadModal` are similarly rendered at root level and may trigger auth-gated operations.
 Specific differences that will cause bugs if missed:
 - `integer("id").primaryKey({ autoIncrement: true })` becomes `serial("id").primaryKey()` or `integer("id").primaryKey().generatedAlwaysAsIdentity()`.
 - `integer("created_at", { mode: "timestamp" })` -- SQLite stores timestamps as integers. Postgres has native `timestamp` type. Must decide: keep integer storage or switch to Postgres `timestamp()`.
 - `real("weight_grams")` -- SQLite `REAL` is 8-byte float. Postgres `real` is 4-byte float (less precision). Use `doublePrecision()` for equivalent behavior.
 - SQLite `text("status")` with string values works as pseudo-enum. Postgres has native `pgEnum` for type safety.
 - The `Db` type alias (`typeof prodDb`) changes entirely -- every service file and MCP tool imports this type.
 **Why it happens:**
-Developers assume Drizzle abstracts away database differences. It does not at the schema layer. The query builder is mostly compatible, but schema definition is dialect-specific by design.
+Root layout components were designed when every user was authenticated. Adding public routes does not automatically suppress these components' data fetches.
 **How to avoid:**
-1. Write a new `schema.ts` from scratch using `pg-core`, not edit the existing one.
+Audit every component rendered in the root layout. For each one: (1) does it make an API call? (2) does that endpoint require auth? If yes, add `enabled: isAuthenticated` to the query, or conditionally render the component itself behind `{isAuthenticated && <TotalsBar />}`. `TotalsBar` should not appear on the new public discovery landing page at all — it is a user-specific widget.
 2. Start a fresh Drizzle migration history for Postgres. SQLite migrations are irrelevant.
 3. Write a data migration script that reads from old SQLite and inserts into new Postgres.
 4. Update the `Db` type alias in all service files.
 5. Use `doublePrecision()` not `real()` for weight values to maintain precision parity with SQLite.
 **Warning signs:**
- Weight values losing precision (245.5g becoming 245.49999...).
+- Network tab shows 401 on `/api/totals` for anonymous users
- Timestamps behaving differently (integer epoch vs. native timestamp).
+- React Query error boundaries firing on public pages for components that are not relevant to anonymous users
- drizzle-kit refusing to generate migrations against the wrong dialect.
+- Console shows `[auth] OIDC auth failed` log spam from root-level queries
 **Phase to address:**
-Database migration phase. Must complete before any other v2.0 feature.
+Public access auth model phase — audit and guard every root-level component before deploying the public landing page.
 ---
-### Pitfall 4: Test Infrastructure Collapses During Database Switch
+### Pitfall 4: Discovery Feed Built as Per-Card Queries (N+1)
 **What goes wrong:**
-The entire test infrastructure is built on SQLite. `createTestDb()` uses `bun:sqlite` with `Database(":memory:")` and `drizzle-orm/bun-sqlite`. E2E tests use a file-based SQLite (`e2e/test.db`). After switching to Postgres, every test needs a Postgres connection -- no more in-memory databases.
+A discovery feed showing popular public setups or recently added catalog items typically starts as a list query followed by per-item detail fetches. For example: `getAllPublicSetups()` returns 20 setup IDs, then the frontend or backend fetches each setup's item count, owner display name, and total weight individually. At 20 items this is invisible; at 100+ items or with multiple feed sections it causes 2+ second response times and high DB connection pressure.
-The MCP server hard-codes `db as prodDb` which is an SQLite Drizzle instance. The Hono context variable type for `db` changes. Every route handler that does `c.get("db")` gets a different type.
+The existing `getPublicSetupWithItems()` service function is optimized for a single-setup detail view. Reusing it in a loop for a feed is the most common trap.
 **Why it happens:**
-In-memory SQLite is the best testing story in the Bun ecosystem -- fast, isolated, no external services. Postgres testing requires either: (a) a running Postgres instance, (b) testcontainers with Docker, or (c) PGlite (lightweight Postgres in WebAssembly). Developers delay updating tests and end up with a broken test suite for weeks.
+Developers reach for familiar service functions. The function works. Performance issues only appear under real data volumes, not in development with 3 test setups.
 **How to avoid:**
-1. Adopt PGlite (`@electric-sql/pglite`) for unit/integration tests. It provides in-memory Postgres without Docker. Drizzle supports PGlite via `drizzle-orm/pglite`.
+Write dedicated feed query functions using Drizzle joins from day one. A single SQL query should return all feed cards with their aggregates (item count, total weight in grams, owner display name). Add PostgreSQL indexes on `setups.is_public`, `setups.created_at`, and `setups.updated_at` before building the feed query. Mirror the pattern already used for aggregate totals (computed via SQL on read, not stored).
 2. Update `createTestDb()` to use PGlite instead of bun:sqlite.
 3. For E2E tests, use Docker Compose with a test Postgres instance, or PGlite if performance is acceptable.
 4. Update the Hono context variable type to the new Postgres Drizzle instance type.
 5. Migrate test infrastructure in the same phase as the schema, not after.
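The feed guidance prefers doing the aggregation in SQL. As a complementary sketch, the snippet below shows the same single-round-trip idea with the fold done in application code: one joined result set (one row per setup/item pair) is reduced to one card per setup, with no per-card lookups. All type and field names here are hypothetical and not taken from the GearBox schema.

```typescript
// One row per (setup, item) pair, shaped like the output of a LEFT JOIN.
// itemId is null for a public setup that has no items yet.
interface FeedJoinRow {
  setupId: number;
  setupName: string;
  ownerDisplayName: string;
  itemId: number | null;
  itemWeightGrams: number | null;
  itemQuantity: number | null;
}

interface FeedCard {
  setupId: number;
  setupName: string;
  ownerDisplayName: string;
  itemCount: number;
  totalWeightGrams: number;
}

// Folds joined rows into one card per setup, keeping first-seen order
// so the ORDER BY of the underlying query is preserved.
function buildFeedCards(rows: FeedJoinRow[]): FeedCard[] {
  const cards = new Map<number, FeedCard>();
  for (const row of rows) {
    let card = cards.get(row.setupId);
    if (!card) {
      card = {
        setupId: row.setupId,
        setupName: row.setupName,
        ownerDisplayName: row.ownerDisplayName,
        itemCount: 0,
        totalWeightGrams: 0,
      };
      cards.set(row.setupId, card);
    }
    if (row.itemId !== null) {
      const quantity = row.itemQuantity ?? 1;
      card.itemCount += quantity;
      card.totalWeightGrams += (row.itemWeightGrams ?? 0) * quantity;
    }
  }
  return Array.from(cards.values());
}
```

A function like this can also serve as a reference in tests: the cards it produces from fixture rows should match what the SQL-aggregated feed query returns for the same data.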
 **Warning signs:**
- `bun test` fails across the board after schema change.
+- Feed query time scales linearly with results count
- "Type 'BunSQLiteDatabase' is not assignable to type 'PgDatabase'" errors everywhere.
+- `pg_stat_statements` shows repeated single-row lookups for users or items
- E2E tests silently skipped or disabled "temporarily."
+- Adding a second feed section doubles total response time
 **Phase to address:**
-Database migration phase. Tests must migrate alongside the schema.
+Discovery landing page phase — design feed queries as joins from the first implementation, not as a later optimization.
 ---
-### Pitfall 5: Auth Provider Integration Breaks Existing Sessions, API Keys, and MCP
+### Pitfall 5: Image Attribution Stored as Unstructured Text
 **What goes wrong:**
-The current auth stores users, sessions, and API keys in the local database. Switching to an external auth provider means: (1) user identity moves external, (2) session management changes (JWT or OAuth flow vs. cookie sessions), (3) existing API keys become orphaned because they reference the old user table, (4) the MCP server authenticates via API keys stored locally, (5) E2E tests authenticate via `POST /api/auth/login` with a seeded user, (6) the onboarding flow (`POST /api/auth/setup`) creates the first user.
+If image attribution for catalog items is stored as a single `attribution: text` field (the fastest approach), it becomes impossible to: programmatically render a copyright badge, distinguish manufacturer press images from community uploads from AI-generated placeholders, enforce a "no scraped retailer images" policy, or filter catalog items by image source type. Agent-seeded catalog items will have inconsistent attribution formats that are expensive to clean up retroactively.
 The current `globalItems` schema has only `imageUrl: text`. There is no `imageSourceType` or structured attribution.
 **Why it happens:**
-Auth migration is treated as "swap the login page" when it touches the entire authentication surface: user identity, session lifecycle, API key management, MCP authentication, E2E test setup, and onboarding.
+"We'll add a text note" is the zero-friction path. Attribution structure seems like a nice-to-have until you need to answer "how many catalog items have manufacturer-licensed images?" or build a compliance filter.
 **How to avoid:**
-1. Keep API keys in the local database even after auth moves external. API keys are long-lived credentials managed by the application, not the auth provider.
+Define a structured attribution model at schema design time before any seeding. Minimum: `imageSourceType: text` (enum: `manufacturer`, `community`, `agent_seeded`, `no_image`), `imageAttribution: text` (human-readable credit line), and `imageSourceUrl: text` (already exists on items but not on globalItems). This allows source-type-specific rendering and filtering without a schema migration mid-catalog-build.
 2. Map external provider user IDs to a local `users` table. The external provider handles authentication; the local table handles application-level data (userId foreign keys, API keys, preferences). Foreign keys reference local `users.id`, not the provider's UUID.
 3. Replace the onboarding flow: instead of "create admin account," it becomes "sign up via external provider, first user gets admin role."
 4. Update E2E tests to either mock the auth provider or use API key authentication exclusively for E2E.
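To illustrate why structured attribution pays off, the sketch below models the three fields named for the catalog (`imageSourceType`, `imageAttribution`, `imageSourceUrl`) and derives a credit line from them. The interface name, the `creditLine` helper, and the fallback wording are assumptions made for illustration only.

```typescript
// Source types as listed in the attribution model.
type ImageSourceType = "manufacturer" | "community" | "agent_seeded" | "no_image";

// Hypothetical shape; only the three field names come from the attribution model.
interface CatalogImageAttribution {
  imageSourceType: ImageSourceType;
  imageAttribution: string | null; // human-readable credit line
  imageSourceUrl: string | null; // where the image was obtained
}

// Assumed fallback labels, used when no explicit credit text was recorded.
const FALLBACK_CREDIT: Record<ImageSourceType, string> = {
  manufacturer: "manufacturer",
  community: "community contribution",
  agent_seeded: "source unverified",
  no_image: "",
};

// Returns the credit text to render next to an image, or null when there is no image.
function creditLine(a: CatalogImageAttribution): string | null {
  if (a.imageSourceType === "no_image") return null;
  const credit = a.imageAttribution?.trim();
  return `Image: ${credit || FALLBACK_CREDIT[a.imageSourceType]}`;
}

// Filtering by source type becomes a plain equality check.
function onlyFromSource<T extends CatalogImageAttribution>(
  items: T[],
  source: ImageSourceType,
): T[] {
  return items.filter((item) => item.imageSourceType === source);
}
```

Neither function would be possible with a single free-text `attribution` column, which is the practical argument for settling the field structure before any seeding run.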
 **Warning signs:**
- MCP server stops working after auth migration.
+- Seeding agent instructions say "put attribution in the description field"
- E2E tests that log in via `POST /api/auth/login` all fail.
+- Catalog items display images without any credit indication
- API keys created before migration stop working.
+- No way to query "show me only manufacturer-sourced images"
 - No local `users` table -- everything delegated to external provider.
 **Phase to address:**
-Auth migration phase. Should be done early because user identity is the foundation.
+Catalog enrichment infrastructure phase — schema changes must be in the migration before any seeding begins.
 ---
-### Pitfall 6: Global Item Database Creates a Data Model Fork
+### Pitfall 6: Agent Catalog Seeding Creates Duplicate Global Items
 **What goes wrong:**
-The current `items` table represents user-owned gear. The v2.0 vision includes a "global item database" with manufacturer specs. These are fundamentally different entities: a user's item has quantity, personal notes, setup associations, and belongs to a user. A global item is a product definition with canonical specs, owned by nobody. Conflating them in one table (via `isGlobal` flag or `NULL userId`) creates an unmaintainable mess. Separating them creates a sync problem.
+Without a unique constraint on `(brand, model)` in the `globalItems` table (which currently has none), running an MCP agent seeding pass twice creates duplicate rows for the same product. Agents also retry on API errors, compounding the issue. The current `create_item` MCP tool creates a new row unconditionally because it was designed for personal collection management, where duplicates are intentional (a user can own two of the same item). Reusing it for catalog seeding provides no deduplication.
 **Why it happens:**
-It seems efficient to add an `isGlobal` flag. But then queries need to handle both cases, user items need to link to global items for spec inheritance, and the API surface doubles with different permission models.
+The catalog seeding flow is built on top of existing personal item tools because they are already available via MCP. The semantic mismatch (user-owned vs. global reference item) is not obvious until duplicates appear.
 **How to avoid:**
-1. Create a separate `products` table for the global database. A product has: name, manufacturer, canonical weight, canonical price, product URL, image, category.
+Add a unique constraint on `globalItems(brand, model)` as part of the catalog enrichment schema migration. Create a dedicated `upsert_catalog_item` MCP tool or admin API endpoint that uses `ON CONFLICT (brand, model) DO UPDATE` semantics. This tool should be explicitly different from personal collection tools: no `userId`, upsert behavior, admin-scoped access.
 2. User `items` gets a nullable `productId` foreign key. When set, the item inherits specs from the product but can override them (user's measured weight vs. manufacturer spec).
 3. User items without a `productId` are standalone (backward-compatible with all existing items).
 4. Reviews, owner counts, and setup appearances link to `products`, not user `items`.
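The upsert behavior proposed for catalog seeding can be illustrated with an in-memory stand-in for `ON CONFLICT (brand, model) DO UPDATE`. This is a sketch under stated assumptions: `CatalogItem`, `catalogKey`, and `CatalogStore` are hypothetical names, and the case and whitespace normalization is an extra assumption that a plain database unique constraint would not apply by itself.

```typescript
// Hypothetical catalog row; field names are illustrative only.
interface CatalogItem {
  brand: string;
  model: string;
  weightGrams: number | null;
}

// Builds the conflict key. Normalizing case and whitespace is an assumption;
// a plain unique constraint on (brand, model) compares values exactly.
function catalogKey(brand: string, model: string): string {
  const norm = (s: string) => s.trim().toLowerCase().replace(/\s+/g, " ");
  return JSON.stringify([norm(brand), norm(model)]);
}

// In-memory stand-in for a table with upsert semantics on (brand, model).
class CatalogStore {
  private rows = new Map<string, CatalogItem>();

  // Inserts a new row or overwrites the existing one with the same key.
  upsert(item: CatalogItem): "inserted" | "updated" {
    const key = catalogKey(item.brand, item.model);
    const existed = this.rows.has(key);
    this.rows.set(key, { ...item, brand: item.brand.trim(), model: item.model.trim() });
    return existed ? "updated" : "inserted";
  }

  get(brand: string, model: string): CatalogItem | undefined {
    return this.rows.get(catalogKey(brand, model));
  }

  get size(): number {
    return this.rows.size;
  }
}
```

The property that matters for agent seeding is idempotence: running the same seed pass twice, or retrying after an API error, leaves the catalog the same size and only refreshes the stored fields.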
 **Warning signs:**
- `items` table query complexity increases beyond what is reasonable.
+- Catalog search returns two entries for the same product ("Apidura Backcountry Food Pouch")
- Ambiguity about whether an operation affects "my item" or "the global product."
+- Owner count on a duplicate item is 0 because user-owned items link to the wrong copy
- Permission model becomes unclear (who can edit a global product?).
+- Re-running a seed script doubles the catalog size
 **Phase to address:**
-Global item database phase. Must come after multi-user data model is stable.
+Catalog enrichment infrastructure phase — unique constraint and upsert endpoint before any agent seeding run.
 ---
-### Pitfall 7: Image Storage Migration Breaks Existing URLs and the MCP Tool
+### Pitfall 7: Storing Third-Party Product Images in S3 Creates Legal and Cost Exposure
 **What goes wrong:**
-Images are stored in `./uploads/` on the filesystem, served via `app.use("/uploads/*", serveStatic({ root: "./" }))`, and referenced by `imageFilename` in the database. Moving to object storage changes URLs from `/uploads/uuid.jpg` to `https://bucket.s3.region.amazonaws.com/uuid.jpg`. Every existing `imageFilename` reference becomes a broken image.
+The existing `upload_image_from_url` MCP tool fetches a URL and saves it to MinIO/S3. If an agent uses this to seed manufacturer product images from brand websites, retailer pages, or Amazon listings, those images are copyright-protected. Storing and publicly serving them creates: (1) legal liability for hosting images without a license (US statutory damages can reach $150,000 per work for willful infringement); (2) storage and egress costs that grow with public traffic; (3) dependency on external URLs that 404 silently when retailers change their CDN paths.
 Both `items` and `threadCandidates` have `imageFilename` and `imageSourceUrl` fields. The MCP tool `upload_image_from_url` saves to the local filesystem. The image route `POST /api/images` saves to `./uploads/`.
 **Why it happens:**
-The current design stores only the filename, not the full URL. The serving path is implicit (prepend `/uploads/`). When storage moves to S3, the "prepend `/uploads/`" pattern breaks.
+"Just grab the product image from the brand website" produces accurate images immediately. It feels like fair use. It is not — attribution does not create a license, and copyright does not require a watermark or notice.
 **How to avoid:**
-1. Add a reverse proxy route: keep `/uploads/*` working but proxy to S3 instead of local filesystem. This maintains backward compatibility during transition.
+Define a clear image sourcing policy before seeding begins. Safest options in order: (1) store `imageUrl` as a reference to the external source without copying to S3; (2) use manufacturer-provided press/media kit images that explicitly grant redistribution; (3) use Creative Commons–licensed images from Wikimedia Commons or similar. Document which sources are permitted in the seeding agent's prompt. Option (1) avoids copying but still depends on third-party URLs staying live, so treat it as a stopgap rather than a long-term answer. Distinguish permitted images from unverified ones using `imageSourceType`.
 2. Or migrate `imageFilename` to store full URLs. Existing filenames get prefixed with the S3 URL during data migration.
 3. Write a migration script that uploads all `./uploads/` files to S3 and updates database references.
 4. Update `POST /api/images`, `POST /api/images/from-url`, and the MCP `upload_image_from_url` tool to write to S3.
 5. Create an image storage abstraction layer so dev can use local filesystem and production uses S3.
 **Warning signs:**
- Broken images after deployment.
+- Seeding instructions tell the agent to call `upload_image_from_url` on Amazon product listing URLs
- Mixed URLs (some `/uploads/`, some `https://s3...`) in the database.
+- All catalog items have `imageFilename` values from manufacturer/retailer URLs
- MCP tool `upload_image_from_url` silently failing.
+- No documented image licensing policy before seeding starts
 **Phase to address:**
-Infrastructure phase. Should be done before discovery/public profiles (which serve images to many users).
+Catalog enrichment infrastructure phase — establish policy and `imageSourceType` schema before any seeding.
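A sourcing policy like the one above could be enforced in code rather than only in the agent prompt. The sketch below is one possible shape: the `ImageSourceType` values, `CatalogImage`, and `imageAction` are assumptions for illustration, since this document names the `imageSourceType` column but does not define its values.

```typescript
// Hypothetical values for `imageSourceType`; the real set is not defined here.
type ImageSourceType =
  | "press_kit"
  | "creative_commons"
  | "external_reference"
  | "unverified";

// Source types whose images may be copied into S3 and served publicly.
const COPY_PERMITTED: ReadonlySet<ImageSourceType> = new Set<ImageSourceType>([
  "press_kit",
  "creative_commons",
]);

interface CatalogImage {
  imageSourceType: ImageSourceType;
  imageSourceUrl: string;
}

type ImageAction = "copy_to_s3" | "reference_only" | "reject";

// Decide what the seeding path is allowed to do with an image.
function imageAction(img: CatalogImage): ImageAction {
  if (COPY_PERMITTED.has(img.imageSourceType)) return "copy_to_s3";
  if (img.imageSourceType === "external_reference") return "reference_only";
  return "reject"; // unverified images never reach S3
}

console.log(
  imageAction({
    imageSourceType: "unverified",
    imageSourceUrl: "https://example.com/product.jpg",
  }),
);
```

Keeping the permitted set in one constant means a policy change is a one-line edit, and the recovery step of finding affected items by `imageSourceType` stays straightforward.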
 ---
-### Pitfall 8: Thread Resolution Creates Items Without Proper User Scoping
+### Pitfall 8: MCP Catalog Tools Share the Seeding Agent's Personal userId
 **What goes wrong:**
-Thread resolution copies a candidate's data into a new item. In multi-user, the newly created item must inherit the thread owner's `userId`. If the resolution logic does not explicitly set `userId` on the new item, it either fails (NOT NULL constraint) or creates an orphaned item.
+The MCP server binds every tool invocation to the `userId` of the authenticated API key or OAuth token. When an agent uses a regular user API key to create catalog items, those items are implicitly associated with that user's account context. This creates two problems: (1) catalog items appear in the seeding user's personal collection or produce permission collisions; (2) running the seeding agent as a specific user creates a "ghost user" with thousands of catalog entries that pollutes collection analytics and owner counts.
 This is a specific instance of Pitfall 1 but deserves its own callout because resolution is a multi-step transaction: update thread status, set `resolvedCandidateId`, create new item. Any step that forgets `userId` breaks the chain.
 **Why it happens:**
-The resolution logic is tested as a unit but the test does not set a `userId` because none existed. After adding `userId`, the test still passes if using a default/NULL value. The bug only surfaces with a second user.
+There is no separation between personal collection MCP tools and catalog admin tools in the current implementation. The `userId` context flows through all tool handlers automatically.
 **How to avoid:**
-1. Make `userId` NOT NULL on all entity tables from day one.
+Catalog write operations must not carry a personal `userId`. Options: (1) create a separate admin-scoped API key that maps to a "system" user with no personal collection; (2) build dedicated catalog MCP tools that explicitly ignore `userId` for the globalItems table while still requiring authentication for authorization; (3) use a separate REST endpoint (`POST /api/admin/catalog-items`) with admin-only auth, bypassing the user-scoped MCP tools entirely.
 2. Update `resolveThread` to accept and propagate `userId`.
 3. Write a test: resolve thread as User A, verify created item belongs to User A.
 **Warning signs:**
- Items appearing in the wrong user's collection after resolution.
+- Running the seeding agent creates items visible in someone's personal collection
- Thread resolution failing with constraint violations.
+- Owner count on seeded global items starts at 1 (the seeding user's implicit ownership)
 - Catalog items appear in the seeding user's dashboard totals
 **Phase to address:**
-Multi-user data model phase.
+Catalog enrichment infrastructure phase — design catalog write path before building seeding tooling.
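The separation between authorization and ownership can be sketched as follows. This is a minimal illustration, not the project's actual handler: the `catalog_admin` scope, `AuthContext`, and `buildCatalogRow` are hypothetical names. The point is that the caller's identity is checked but never copied onto the catalog row.

```typescript
// Hypothetical auth context; the scope names are assumptions.
type Scope = "user" | "catalog_admin";

interface AuthContext {
  userId: string;
  scopes: Scope[];
}

interface CatalogWrite {
  brand: string;
  model: string;
}

// The stored row deliberately has no userId field.
interface GlobalItemRow {
  brand: string;
  model: string;
  createdBy: "system";
}

function buildCatalogRow(auth: AuthContext, input: CatalogWrite): GlobalItemRow {
  if (!auth.scopes.includes("catalog_admin")) {
    throw new Error("catalog writes require the catalog_admin scope");
  }
  // auth is used for authorization only; userId is not propagated to the row.
  return { brand: input.brand, model: input.model, createdBy: "system" };
}

const admin: AuthContext = { userId: "seed-agent", scopes: ["catalog_admin"] };
const row = buildCatalogRow(admin, { brand: "Apidura", model: "Backcountry Food Pouch" });
console.log(Object.keys(row).includes("userId"));
```

Because `GlobalItemRow` has no `userId` property, a handler that tries to attach one fails type checking instead of silently creating a ghost owner.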
 ---
 ### Pitfall 9: Public Content Without Explicit Privacy Controls
 **What goes wrong:**
 The v2.0 plan includes "public user profiles with shared setups" and a "discovery feed." Without explicit visibility controls, the default state is ambiguous: are new setups public? Are all items in a public setup visible? Can someone discover gear a user has not chosen to share? Users expecting a private gear tracker are surprised when their collection appears in search results.
 **Why it happens:**
 The developer defaults to "everything public" because it is simpler to build discovery features. Privacy controls are added as an afterthought, requiring a retroactive audit of all existing data.
 **How to avoid:**
 1. Default to private. Every entity (setup, profile) is private unless explicitly published.
 2. Add a `visibility` column (`private` | `public`) to setups. Items are visible publicly only through public setups.
 3. User profiles are private by default. Public profile is opt-in.
 4. Public API endpoints (discovery, search) only query entities with `visibility = 'public'`.
 5. Build the visibility model in the data layer before building any discovery UI.
 **Warning signs:**
 - No `visibility` or `isPublic` column in the schema.
 - Discovery queries that do not filter by visibility.
 - User complaints about unexpected data exposure.
 **Phase to address:**
 Multi-user data model phase (add visibility columns) and discovery phase (enforce in queries).
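The private-by-default rule can be sketched in a few lines. The `visibility` column and its two values come from the steps above; `Setup`, `createSetup`, and `discoverableSetups` are hypothetical names, and an in-memory array stands in for the database.

```typescript
type Visibility = "private" | "public";

interface Setup {
  id: number;
  userId: string;
  name: string;
  visibility: Visibility;
}

// New setups are private unless the caller explicitly publishes them.
function createSetup(
  id: number,
  userId: string,
  name: string,
  visibility: Visibility = "private",
): Setup {
  return { id, userId, name, visibility };
}

// Every public endpoint goes through this filter; there is no unfiltered path.
function discoverableSetups(setups: Setup[]): Setup[] {
  return setups.filter((s) => s.visibility === "public");
}

const setups: Setup[] = [
  createSetup(1, "alice", "Weekend bikepacking"),
  createSetup(2, "alice", "Tour Divide kit", "public"),
  createSetup(3, "bob", "Day hike"),
];

console.log(discoverableSetups(setups).map((s) => s.name));
```

In the real data layer the same filter would be a `WHERE visibility = 'public'` clause shared by all discovery queries, so that a new endpoint cannot forget it.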
 ---
 ### Pitfall 10: SQLite-Specific Patterns That Silently Break on Postgres
 **What goes wrong:**
 The codebase has SQLite-specific patterns that will not error but will behave differently on Postgres:
 - `src/db/index.ts` runs `PRAGMA journal_mode = WAL` and `PRAGMA foreign_keys = ON` -- Postgres has no PRAGMAs. Foreign keys are always enforced. WAL is always on.
 - `bun:sqlite` is used as the driver. Postgres needs `postgres` (postgres.js) or `pg` (node-postgres) as the driver.
 - The existing Drizzle migrator import is `drizzle-orm/bun-sqlite/migrator`. Postgres uses `drizzle-orm/node-postgres/migrator` or `drizzle-orm/postgres-js/migrator`.
 - SQLite allows inserting strings into integer columns silently. Postgres will error.
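 - SQLite `AUTOINCREMENT` guarantees IDs are never reused. Postgres `serial` and identity columns draw from a sequence that also does not hand out the same value twice unless it is reset or declared `CYCLE`, but the sequence can leave gaps (rolled-back inserts still consume values) and is not advanced by rows inserted with explicit IDs, which matters when migrating existing data.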
 - The test helper's `Database(":memory:")` has no Postgres equivalent without PGlite.
 **Why it happens:**
 These patterns are invisible in a working SQLite app. They only surface during or after the switch, often as runtime errors in production.
 **How to avoid:**
 1. Remove all PRAGMA statements when switching to Postgres.
 2. Replace `bun:sqlite` driver with `postgres` (postgres.js is recommended for Bun compatibility).
 3. Update all migrator imports.
 4. Run the full test suite against Postgres to catch type strictness differences.
 5. Use `serial` or `identity` columns for auto-increment; accept that IDs may have gaps (this should not matter for a web app), and reset each sequence past the highest migrated ID after importing existing rows.
 **Warning signs:**
 - "PRAGMA" in the Postgres codebase.
 - `bun:sqlite` imports anywhere in production code after migration.
 - Tests passing against SQLite but failing against Postgres.
 **Phase to address:**
 Database migration phase.
 ---
 ### Pitfall 11: Setup-Item Delete-All-Reinsert Pattern Causes Phantom Reads
 **What goes wrong:**
 The current setup item sync uses delete-all-then-re-insert: `DELETE FROM setup_items WHERE setupId = X`, then re-insert all items. In single-user SQLite this is fine. In multi-user Postgres with concurrent writes: (a) race conditions if two users modify setups simultaneously, (b) brief windows where a public setup appears empty to concurrent readers.
 **Why it happens:**
 The pattern was chosen for simplicity (noted in CLAUDE.md: "Simpler than diffing, atomic in transaction"). That guarantee depends on the delete and re-insert actually sharing one transaction, and even then Postgres's default `READ COMMITTED` level lets a concurrent reader that issues several statements see different snapshots between them.
 **How to avoid:**
 1. Wrap in an explicit transaction with `SERIALIZABLE` or `REPEATABLE READ` isolation for the sync operation.
 2. Or switch to diff-based approach for public setups: compare existing vs. new list, delete removed, insert added.
 3. For private setups, the delete-reinsert pattern with a basic transaction is acceptable.
 **Warning signs:**
 - Public setups briefly appearing empty.
 - Foreign key violations in concurrent scenarios.
 **Phase to address:**
 Multi-user data model phase, when updating the setup service.
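The diff-based alternative from step 2 reduces to a set comparison. The following sketch assumes setup membership can be represented as a list of item IDs; `diffSetupItems` and `SetupItemDiff` are hypothetical names.

```typescript
interface SetupItemDiff {
  toDelete: number[];
  toInsert: number[];
}

// Compare the stored item IDs with the desired ones and return only the changes.
function diffSetupItems(existing: number[], desired: number[]): SetupItemDiff {
  const existingSet = new Set(existing);
  const desiredSet = new Set(desired);
  return {
    // Rows present now but absent from the new list.
    toDelete: Array.from(existingSet).filter((id) => !desiredSet.has(id)),
    // Rows in the new list that are not stored yet.
    toInsert: Array.from(desiredSet).filter((id) => !existingSet.has(id)),
  };
}

const diff = diffSetupItems([1, 2, 3], [2, 3, 4]);
console.log(diff);
```

Items 2 and 3 are never touched, so a concurrent reader of a public setup keeps seeing them throughout the sync; only the changed rows are written.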
 ---
 ### Pitfall 12: Existing Data Has No Owner After Multi-User Migration
 **What goes wrong:**
 The existing SQLite database has items, categories, threads, setups -- all without a `userId` column. When the schema adds `userId NOT NULL`, the existing data needs an owner. If the migration script does not assign existing data to the original user, the data is either lost (NOT NULL violation prevents migration) or orphaned.
 **Why it happens:**
 The developer writes the new schema with `userId NOT NULL`, runs `db:push`, and the migration fails because existing rows have no `userId`. The "fix" is to make `userId` nullable, which undermines the entire data isolation model.
 **How to avoid:**
 1. The data migration script must: (a) create the original user in the new system, (b) assign all existing data to that user's ID, (c) then apply the NOT NULL constraint.
 2. Migration order: create tables with `userId` nullable, insert data with the owner's userId, then ALTER to NOT NULL.
 3. Verify row counts match before and after migration.
 **Warning signs:**
 - `userId` column is nullable in the final schema "because of migration."
 - Existing data missing after migration.
 - Migration script that only handles schema, not data.
 **Phase to address:**
 Database migration phase, specifically the data migration step.
 ---
@@ -316,121 +189,116 @@ Database migration phase, specifically the data migration step.
 | Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
 |----------|-------------------|----------------|-----------------|
-| Keeping SQLite test infrastructure while developing Postgres features | Tests keep passing during migration | Two database dialects to maintain, false confidence from tests that do not match production | Never -- migrate tests alongside schema |
+| Single `isPublicRoute` allowlist in `__root.tsx` | Simple to reason about | Every new public route requires updating this list; lists drift | Never — use per-route `beforeLoad` guards on protected routes instead |
-| Storing both old `/uploads/` paths and new S3 URLs | Avoid data migration script | Every image-rendering component handles both URL formats forever | Only as a 1-2 week transition |
+| Reuse personal item MCP tools for catalog seeding | No new tools to build | Creates wrong userId semantics, no deduplication, wrong ownership | Never for bulk ops — build a dedicated catalog upsert tool |
-| Using `userId` as nullable during migration | Existing data does not need backfilling | Every query must handle NULL userId, privacy bugs when userId is missing | Only during the migration transaction itself, then enforce NOT NULL |
+| `attribution: text` free-form field for image credit | Zero schema change | Cannot programmatically distinguish source types, filter, or enforce licensing policy | Only for internal admin-only catalog; never for public content |
-| Skipping RLS and relying only on app-level userId filtering | Faster to implement | Single missed WHERE clause = data leak | Never for multi-user platforms |
+| Hotlink external product images without copying to S3 | Zero storage cost | Silent 404s when retailers change CDN URLs; external dependency | Only for dev/prototype with a clear plan to replace |
-| Deferring visibility controls to "after discovery ships" | Ship discovery faster | Retroactive privacy audit, potential data exposure, user trust damage | Never |
+| Discovery feed as multiple React Query calls per card | Familiar pattern | N+1 queries degrade at scale; visible at ~30 feed cards | Only for MVP with < 20 items and a committed optimization plan |
-| Keeping the local `users` table password hash after external auth | Avoid migration complexity | Dead column confuses future developers, potential security liability | Never -- remove password hash column after auth migration |
+| No unique constraint on `globalItems(brand, model)` | Faster initial schema | Duplicate catalog entries after every re-seed or agent retry | Never — add the constraint before any seeding |
 ---
 ## Integration Gotchas
 | Integration | Common Mistake | Correct Approach |
 |-------------|----------------|------------------|
-| External auth provider | Removing the local `users` table entirely | Keep a local `users` table with `externalId` (from auth provider) + local fields (preferences, API keys). Foreign keys reference local `users.id`, not the external provider's UUID. |
+| Logto OIDC + public routes | `oidcAuthMiddleware()` throws or redirects when there is no session, breaking public routes | Use `getAuth(c)` which returns null gracefully for unauthenticated requests; only apply `oidcAuthMiddleware()` on login-gated routes |
-| External auth provider | Storing user profile data in the auth provider and querying it at runtime | Store only identity in auth provider. Sync user profile to local `users` table on login. Application queries local table only. |
+| MCP tools + catalog seeding | Using user-scoped tools (bound to API key owner's `userId`) to write global catalog entries | Build separate catalog admin tools or a REST endpoint that writes to `globalItems` without personal userId semantics |
-| External auth provider | Using auth provider's session tokens directly as API authentication | Auth provider handles login/logout. Application mints its own session after verifying the auth provider's token. This decouples session lifecycle from the provider. |
+| MinIO/S3 + public catalog | Using presigned URLs (which expire) for catalog image delivery | Catalog item images need stable public paths or a CDN URL; presigned URLs are for user-private content only |
-| S3-compatible object storage | Using the S3 SDK directly in route handlers | Create an image storage abstraction (interface with `upload`, `getUrl`, `delete`). Swap implementations (local filesystem for dev, S3 for production) via environment config. |
+| TanStack Router `beforeLoad` + auth check | `beforeLoad` that re-fetches auth on every navigation creates a waterfall | Read from React Query cache (already has 5-min `staleTime` in `useAuth`); `beforeLoad` should read cached auth state, not re-fetch |
-| Postgres driver | Assuming `bun:sqlite` patterns work with Postgres | Postgres uses `postgres` (postgres.js) or `pg`. Connection pooling, async queries, and error handling differ. SQLite is synchronous; Postgres is async. Service functions may need to become async. |
+| PostgreSQL + public feed queries | Missing indexes on `is_public` and `created_at` cause full-table scans | Add a composite index on `(is_public, created_at)` to the `setups` table before the feed goes live |
-| Postgres | Assuming SQLite PRAGMA behaviors exist | Postgres has no PRAGMAs. Foreign keys are always on. WAL is always on. Remove all PRAGMA code. |
+
-| Drizzle ORM Postgres driver | Using synchronous `.get()` and `.all()` query methods | SQLite Drizzle uses `.get()` (sync). Postgres Drizzle uses `.execute()` or `await` on queries. Every service function that calls `.get()` or `.all()` must be updated. |
+---
 ## Performance Traps
 | Trap | Symptoms | Prevention | When It Breaks |
 |------|----------|------------|----------------|
-| N+1 queries in discovery feed | Feed page takes 2+ seconds | Use joins or batch queries for setups with items and categories | 50+ setups in feed, each with 10+ items |
+| Per-card queries in discovery feed | Feed loads in > 2s; each section multiplies DB time | Single JOIN query returning all feed card data with aggregates | At ~30 items in feed |
-| Unindexed `userId` columns | All queries slow after adding userId filtering | Add indexes on `userId` for every table. Composite indexes for `(userId, categoryId)` on items. | 1000+ items across 50+ users |
+| Auth check blocking public FCP | Blank page and spinner visible on first load; LCP degraded | Render public content immediately; auth state hydrates progressively | Immediately on first deploy; visible in Lighthouse |
-| Full-table scans for aggregates | Dashboard slow for large collections | Current aggregates are computed via SQL on read. Add materialized views or cache for public setup totals. | 100+ items per user, or public setups viewed by 100+ visitors |
+| Full-table scan on `globalItems` text search | Search feels fine at 18 items; slows visibly at 500+ | Add `pg_trgm` trigram index or `tsvector` GIN index before catalog grows | At ~200 catalog items |
-| Image serving from app server | Server CPU/bandwidth saturated | Serve images from S3/CDN. Current `serveStatic` for uploads hits the app server for every request. | 100+ concurrent users browsing image-heavy pages |
+| Image egress costs without CDN | MinIO egress scales with public traffic | CDN in front of public catalog images, or store external `imageUrl` references | Once catalog is publicly discoverable |
-| Global product search without full-text index | Product search slow or inaccurate | Use Postgres full-text search (`tsvector`/`tsquery`) or `pg_trgm` trigram index. | 10,000+ products |
+| React Query refetching public feed on every window focus | Unnecessary server load for anonymous browsing | Set appropriate `staleTime` (5–10 min) on public catalog/feed queries | At moderate traffic |
-| Synchronous service functions on Postgres | Request timeouts, connection pool exhaustion | SQLite Drizzle is sync. Postgres Drizzle is async. Service functions that were sync must become async. | Any usage under load |
+
 ---
 ## Security Mistakes
 | Mistake | Risk | Prevention |
 |---------|------|------------|
-| No RLS, relying only on app-level userId filtering | Single missed WHERE clause exposes all user data | Enable Postgres RLS on all user-owned tables. App filtering is primary; RLS is safety net. |
+| Regular user API key authorized to write global catalog items | Any user with an API key can pollute the shared catalog | Catalog write operations require admin scope or a designated system API key; regular user keys are read-only on globalItems |
-| Public setup exposes private item details | Users share a setup but private notes/pricing leak | Public setup views project only public fields (name, weight, category). Define a "public item projection" and enforce it. |
+| Public setup pages exposing private item fields | Public setup view leaks item notes, threads, or product URLs not intended for sharing | Audit `getPublicSetupWithItems` — return only explicitly public fields (name, weight, image); strip notes and thread data |
-| API keys not scoped to users after auth migration | API key created by User A operates on User B's data | API keys must associate with a userId. After validation, the key's userId scopes all operations. |
+| No rate limiting on public catalog search endpoint | `GET /api/global-items?q=...` is unauthenticated; bots can enumerate or abuse it | Add basic rate limiting middleware to unauthenticated GET endpoints before making them discoverable |
-| Auth provider misconfigured for open self-registration | Random users create accounts without approval | Configure auth provider for admin-approval or invite-only registration. Test explicitly. |
+| `imageSourceUrl` storing retailer order URLs with auth tokens in query params | Private session or order data in stored URLs | Normalize and validate `imageSourceUrl` before storage; strip query params that resemble auth or session tokens |
-| Image upload accepts any file type | Stored XSS via SVG uploads, executable content | Validate MIME type on upload (JPEG, PNG, WebP only). Set `Content-Type` and `Content-Disposition` headers. Strip EXIF metadata. |
+
-| External auth provider callback URL not validated | OAuth redirect attack | Whitelist exact callback URLs in auth provider config. Never use wildcard redirect URIs. |
+---
 ## UX Pitfalls
 | Pitfall | User Impact | Better Approach |
 |---------|-------------|-----------------|
-| Forcing existing single user to re-register via external auth | User loses access to their own data until they figure out new login | Migration path: on first visit after upgrade, guide user to create auth provider account and automatically link to existing data. |
+| Hard login wall immediately after discovery | Anonymous users discover value, click a setup, hit a login wall — they leave | Show full public setup/item detail to anonymous users; only prompt login at the point of a write action (add to collection) |
-| Public profiles default to showing everything | Users surprised their gear list is public | Default profile to private. Public is opt-in with clear preview of what others see. |
+| Empty state on catalog search with no query | Users expect to browse; zero results on open page is confusing | Return a curated/ranked set for empty queries (popular, recently added, or featured tags) |
-| Review system with only star ratings | Ratings without context are useless for gear decisions | Structured reviews with predefined fields (durability, weight accuracy, value) per category. "Weight is 15g heavier than listed" is actionable; a 4-star rating is not. |
+| Catalog feed with no images | Text-only cards look sparse and unfinished | Ensure most catalog items have images before the feed is public; add a styled placeholder with brand initial |
-| Discovery feed dominated by one hobby | Users in other hobbies see irrelevant content | Category-based feed filtering. Show content relevant to user's categories. |
+| Replacing dashboard for logged-in users | Existing users lose their familiar personal dashboard entry point | Discovery page is the anonymous entry point; authenticated users see a hybrid or a personal dashboard — do not remove the existing dashboard |
-| No indication of data ownership when browsing others' setups | User tries to edit someone else's setup and gets error | Clear visual distinction between "my setup" and "someone else's setup." Read-only view with "copy to my setups" action. |
+| Agent-seeded content displayed raw without quality review | Inconsistent formatting, wrong weights, or invalid product links visible publicly | Implement a `status` field (`draft` or `published`) on catalog items; agents create drafts, a review step publishes them |
-| Settings lost during migration | User's weight unit preference, onboarding state disappear | Migrate the `settings` table data alongside everything else. Map settings to the original user. |
+
 ---
 ## "Looks Done But Isn't" Checklist
- [ ] **Multi-user data model:** Often missing userId on the `settings` table -- verify settings are user-scoped (weight unit preference, onboarding state).
+- [ ] **Public route guard:** Routes `/`, `/global-items/`, `/global-items/:id`, and `/users/:id` render without redirect in a private browser window with no session cookies — verify manually before shipping
- [ ] **Multi-user data model:** Often missing userId filter on `threadCandidates` queries that join through `threads` -- verify candidates are not directly queryable across users.
+- [ ] **Root-level component suppression:** No 401 responses in the network tab when browsing public pages as an anonymous user — `TotalsBar`, `FabMenu`, and `OnboardingWizard` must not fire auth-required queries
- [ ] **Multi-user data model:** Often missing userId on thread resolution -- verify `resolveThread` propagates userId to the newly created item.
+- [ ] **Catalog deduplication:** Running the agent seed script twice does not increase the row count in `globalItems` — verify unique constraint exists and upsert behavior works
- [ ] **Auth migration:** Often missing MCP server auth update -- verify MCP tools operate in context of the authenticated user, not as global admin.
+- [ ] **Image attribution schema:** `globalItems` has `imageSourceType` column in the migration before any seeding starts — verify migration file exists and was applied
- [ ] **Auth migration:** Often missing E2E test auth update -- verify E2E tests authenticate against new auth system or use API keys.
+- [ ] **Feed query efficiency:** Discovery feed data loads from a single JOIN query — verify using `EXPLAIN ANALYZE` or query logging, not by eyeballing response time
- [ ] **Auth migration:** Often missing API key userId association -- verify API keys created after migration are scoped to the creating user.
+- [ ] **Public setup privacy:** `getPublicSetupWithItems` response does not include item `notes`, thread data, or private product URLs — verify the response shape manually
- [ ] **Database migration:** Often missing data migration script -- verify existing SQLite data is actually moved to Postgres, not just the schema.
+- [ ] **Catalog write authorization:** A regular user's API key cannot create or modify `globalItems` — verify the catalog tool/endpoint requires admin scope
- [ ] **Database migration:** Often missing timestamp conversion -- verify SQLite integer timestamps are correctly handled in Postgres schema.
+- [ ] **Image copyright policy:** Seeding instructions explicitly specify which image sources are permitted; no `upload_image_from_url` calls against brand/retailer URLs — verify in the agent prompt before any seeding run
- [ ] **Database migration:** Often missing weight precision check -- verify `real()` vs `doublePrecision()` does not lose decimal precision.
+
- [ ] **Database migration:** Often missing sync-to-async conversion -- verify all service functions are async after Postgres switch.
+---
 - [ ] **Image migration:** Often missing MCP tool update -- verify `upload_image_from_url` writes to S3, not local filesystem.
 - [ ] **Image migration:** Often missing `imageSourceUrl` field -- verify source URL metadata is preserved during migration.
 - [ ] **Public content:** Often missing visibility filtering on aggregate endpoints -- verify `/api/totals` only counts requesting user's items.
 - [ ] **Reviews:** Often missing rate limiting -- verify a user cannot submit 100 reviews in a minute.
 - [ ] **Discovery feed:** Often missing pagination -- verify feed does not load all public setups at once.
 - [ ] **Global items:** Often missing product-vs-item distinction -- verify adding a product to global database does not add it to anyone's collection.
 ## Recovery Strategies
 | Pitfall | Recovery Cost | Recovery Steps |
 |---------|---------------|----------------|
-| Data leaked between users (missing userId filter) | HIGH | Audit all queries, add RLS immediately, notify affected users, review access logs. Reputation damage is the real cost. |
+| Login redirect blocking public routes | LOW | Update `isPublicRoute` allowlist in `__root.tsx` and add server-side guard bypasses; redeploy; verify in incognito |
-| Broken images after storage migration | MEDIUM | Keep old uploads directory as fallback. Re-upload missing images. Update database references. |
+| Duplicate catalog items from agent seeding | MEDIUM | Write a deduplication migration to merge duplicates keeping owner links; add unique constraint post-merge; re-run seed in upsert mode |
-| Test suite broken for weeks during DB migration | MEDIUM | Pause feature work. Set up PGlite test infrastructure. Port tests one file at a time. |
+| Copyrighted images stored in S3 | HIGH | Identify affected items via `imageSourceType`; delete S3 objects; replace with permitted images or null `imageFilename`; legal review |
-| Auth migration breaks MCP server | LOW | MCP server can fall back to API key auth (already implemented). Fix isolated to MCP auth middleware. |
+| N+1 feed queries causing degraded response times | MEDIUM | Write optimized JOIN query; API response shape may change requiring frontend update; deploy together |
-| Category unique constraint failures | LOW | Drop old unique constraint, add composite unique. Single transaction. |
+| Auth-scoped queries firing for anonymous users | LOW | Add `enabled: isAuthenticated` to each affected query; guard root-level components with auth check |
-| Weight precision loss (SQLite real to Postgres real) | LOW | Alter column to `doublePrecision`. One-time verification script. |
+| Catalog items created with seeding user's userId | MEDIUM | Migration to null out `userId` on globalItems created during seeding; update catalog write path to not accept userId |
-| Public data exposure before visibility controls | HIGH | Emergency: set all entities to private, deploy, then build visibility controls properly. Cannot undo exposure. |
+
-| Existing data orphaned after migration | MEDIUM | Re-run data migration script with correct userId assignment. Verify row counts. |
+---
 | Service functions still sync after Postgres switch | MEDIUM | Systematic conversion of all service functions to async. Update all callers. TypeScript will catch most issues. |
 ## Pitfall-to-Phase Mapping
 | Pitfall | Prevention Phase | Verification |
 |---------|------------------|--------------|
-| Missing userId filters (P1) | Multi-user data model | Integration tests: create as User A, query as User B, assert empty. RLS policies active. |
+| Frontend auth guard blocks public routes (P1) | Public access auth model | Load `/global-items/` and `/` in private window — no redirect |
-| Category uniqueness (P2) | Multi-user data model | Two users create identically-named categories without constraint violations. |
+| `useAuth()` spinner blocks public FCP (P2) | Public access auth model | Lighthouse FCP on landing page with cold cache — no full-screen spinner |
-| Drizzle schema rewrite (P3) | Database migration | Schema compiles with pg-core. drizzle-kit generates valid Postgres migrations. Weight values maintain precision. |
+| Root-level components 401 for anonymous users (P3) | Public access auth model | Zero 401 responses in network tab on public pages |
-| Test infrastructure collapse (P4) | Database migration | `bun test` passes with PGlite. E2E tests pass against Postgres. No SQLite imports in test code. |
+| Discovery feed N+1 queries (P4) | Discovery landing page | `EXPLAIN ANALYZE` on feed endpoint confirms single query, no per-row loops |
-| Auth provider breaks sessions/keys (P5) | Auth migration | Existing API keys work. MCP server authenticates. E2E tests pass. First-time setup works via external provider. |
+| Image attribution stored as free text (P5) | Catalog enrichment infrastructure | Schema review — `imageSourceType` column exists on `globalItems` before seeding |
-| Global item data model fork (P6) | Global item database | Separate `products` table exists. User items optionally reference a product. CRUD operations distinct. |
+| Agent seeding creates duplicates (P6) | Catalog enrichment infrastructure | Run seed script twice — row count unchanged on second run |
-| Image URL breakage (P7) | Infrastructure / Image storage | Existing images render. New uploads go to S3. MCP upload tool works. |
+| Copyrighted images in S3 (P7) | Catalog enrichment infrastructure | Seeding instructions reviewed — no calls to `upload_image_from_url` on brand URLs |
-| Thread resolution userId (P8) | Multi-user data model | Resolving a thread creates an item owned by the thread's owner. Tested with multiple users. |
+| Agent catalog tools carry personal userId (P8) | Catalog enrichment infrastructure | Seeded items have null userId or system userId; not in any user's collection |
-| Privacy/visibility (P9) | Multi-user data model + Discovery | Default is private. Public queries filter by visibility. No private data in discovery feed. |
+
-| SQLite-specific patterns (P10) | Database migration | No PRAGMAs in codebase. No bun:sqlite imports. All queries async. |
+---
 | Setup sync race conditions (P11) | Multi-user data model | Concurrent setup modifications do not produce empty setups or constraint violations. |
 | Existing data ownership (P12) | Database migration | All existing data assigned to original user. Row counts match. userId NOT NULL enforced. |
 ## Sources
- Direct codebase analysis of GearBox v1.4 (schema.ts, services, auth middleware, MCP server, test helpers, db/index.ts, E2E seed)
+- GearBox codebase: `src/client/routes/__root.tsx` — root auth guard and `isPublicRoute` allowlist (direct inspection)
- [Drizzle ORM PostgreSQL documentation](https://orm.drizzle.team/docs/get-started/postgresql-new)
+- GearBox codebase: `src/server/index.ts` — server-side public route bypass patterns (direct inspection)
- [Drizzle ORM SQLite column types](https://orm.drizzle.team/docs/column-types/sqlite)
+- GearBox codebase: `src/db/schema.ts` — `globalItems` table confirming no unique constraint on brand/model, no `imageSourceType` (direct inspection)
- [Drizzle ORM migrations documentation](https://orm.drizzle.team/docs/migrations)
+- GearBox codebase: `src/server/mcp/index.ts` — MCP userId binding per API key (direct inspection)
- [SQLite to PostgreSQL migration pitfalls (Open WebUI discussion)](https://github.com/open-webui/open-webui/discussions/21609)
+- [TanStack Router: Auth performance issue with recommended patterns (GitHub #3997)](https://github.com/TanStack/router/issues/3997)
- [How to migrate from SQLite to PostgreSQL (Render)](https://render.com/articles/how-to-migrate-from-sqlite-to-postgresql)
+- [TanStack Router: Authenticated Routes documentation](https://tanstack.com/router/v1/docs/guide/authenticated-routes)
- [Multi-tenant architecture guide (WorkOS)](https://workos.com/blog/developers-guide-saas-multi-tenant-architecture)
+- [Practical Ecommerce: Online Retailer's Guide to Photo Copyrights](https://www.practicalecommerce.com/Online-Retailers-Guide-to-Photo-Copyrights)
- [Multi-tenant vs single-tenant SaaS (Clerk)](https://clerk.com/blog/multi-tenant-vs-single-tenant)
+- [MCP Idempotency: Best Practices 2025 (BytePlus)](https://www.byteplus.com/en/topic/542207)
- [Migrating file storage to Amazon S3 (DZone)](https://dzone.com/articles/migrating-file-storage-to-amazon-s3)
+- [Six Fatal Flaws of MCP (Scalifiai, 2025)](https://www.scalifiai.com/blog/model-context-protocol-flaws-2025)
- [Drizzle ORM PostgreSQL best practices 2025 (GitHub Gist)](https://gist.github.com/productdevbook/7c9ce3bbeb96b3fabc3c7c2aa2abc717)
+- [Hostwinds: Hotlinking Pitfalls and How to Protect Yourself](https://www.hostwinds.com/blog/hotlinking-pitfalls-and-how-to-protect-yourself)
 ---
-*Pitfalls research for: GearBox v2.0 -- Single-user to multi-user platform migration*
+*Pitfalls research for: GearBox v2.1 — Public-first discovery platform with catalog enrichment*
-*Researched: 2026-04-03*
+*Researched: 2026-04-09*
--- a/.planning/research/STACK.md
+++ b/.planning/research/STACK.md
@@ -1,260 +1,333 @@
 # Stack Research
-**Domain:** Multi-user gear management platform (v2.0 platform additions)
+**Domain:** Public-first gear discovery platform — catalog enrichment, discovery feed, agent-powered seeding (v2.1)
-**Researched:** 2026-04-03
+**Researched:** 2026-04-09
-**Confidence:** MEDIUM-HIGH
+**Confidence:** HIGH (existing stack verified against package.json; additions verified against npm/official docs)
-This document covers ONLY the new stack additions for v2.0. The existing stack (React 19, Hono, Drizzle ORM, TanStack Router/Query, Tailwind CSS v4, Lucide React, Recharts, framer-motion, Zustand, Zod, Bun) is validated and unchanged.
+---
-## Recommended Stack
+## Context: What Already Exists (Do Not Re-Research)
-### Authentication -- Logto (Self-Hosted)
+The following are validated and in production at v2.0. This file covers ADDITIONS AND CHANGES only.
-| Technology | Version | Purpose | Why Recommended |
+| Layer | Current |
-|------------|---------|---------|-----------------|
+|-------|---------|
-| Logto OSS | v1.36+ | External OIDC/OAuth 2.1 auth provider | TypeScript-native, purpose-built for app auth (not enterprise IAM), requires Postgres (shared infra), beautiful pre-built sign-in UI, React SDK with hooks, lightweight JWT validation on backend. MIT-licensed core. |
+| Runtime | Bun |
-| @logto/react | ^4.0.13 | React SDK for auth flows | LogtoProvider wraps app, provides useLogto() hook for sign-in/sign-out/token access. Handles OIDC redirect flow, token refresh, and user info. |
+| Frontend | React 19, TanStack Router/Query v5, Tailwind CSS v4, Zustand, Zod 4.x, framer-motion, Recharts, Lucide React |
-| jose | ^6.2.2 | JWT validation on Hono backend | Zero-dependency, Bun-compatible, used to verify Logto-issued access tokens via JWKS. Recommended by Logto docs over heavier alternatives. |
+| Backend | Hono 4.12.x, Drizzle ORM 0.45.x, PostgreSQL (postgres.js 3.4.x driver) |
 | Auth | @hono/oidc-auth 1.8.x (Logto), API key auth, MCP OAuth 2.1 |
 | Storage | @aws-sdk/client-s3 3.x (MinIO) |
 | MCP | @modelcontextprotocol/sdk 1.29.x (19 tools) |
 | Rate limiting | Custom in-process Map (auth endpoints only, 5 req/15 min per IP) |
-**Why Logto over alternatives:**
+---
-| Provider | Why Not |
+## New Capability Areas
 |----------|---------|
 | Authentik | Python-based, heavyweight (designed for enterprise proxy/SSO), overkill for app-level auth. No React SDK -- requires raw OIDC integration. Better for infra-level SSO (Portainer, Grafana). |
 | Zitadel | Go-based, Kubernetes-first architecture, AGPL 3.0 license (copyleft since 2025). Stronger for multi-tenant B2B SaaS. Over-engineered for a single-product platform. |
 | SuperTokens | Session-based by default (not OIDC), requires embedding their middleware into your backend. Tighter coupling than external provider model. |
 | Keycloak | Java-based, heavy memory footprint (1-2GB RAM), complex admin UI. Industry standard but vastly over-scoped for this use case. |
-**Integration pattern:** Logto runs as a separate Docker container alongside Postgres. React app redirects to Logto's hosted sign-in page for auth flows. Hono backend validates JWT access tokens from the Authorization header using `jose` JWKS verification -- no Logto SDK needed on the backend, just standard OIDC token validation. User identity is the Logto `sub` claim (a stable string ID), stored as `userId` on all user-owned records.
+### 1. Public Access Auth Model
-**Backend middleware pattern (Hono):**
+**What's needed:** The `requireAuth` middleware in `src/server/middleware/auth.ts` already handles three auth paths (API key, OAuth Bearer, OIDC session). The skip-list pattern in `src/server/index.ts` already exempts public GETs on `/api/global-items`, `/api/tags`, `/api/users/:id/profile`, and `/api/setups/:id/public`.
 **This milestone extends the skip-list** to cover new discovery endpoints (`/api/discovery/*`). Additionally, a new `tryAuth` middleware variant is needed for endpoints that work for both anonymous and authenticated users — it resolves `userId` if credentials are present but does NOT 401 on absence. This enables auth-aware responses (e.g., annotating feed items with "in your collection" for logged-in users).
 **No new dependency.** Pure middleware logic — add `tryAuth` to `auth.ts`, update skip-list in `index.ts`.
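A minimal sketch of the `tryAuth` idea follows, under stated assumptions. `MiniContext`, `Next`, and `ResolveUser` are hypothetical stand-ins for Hono's context and the existing credential lookup in `auth.ts`, neither of which is shown in this document. Treating invalid credentials as anonymous (rather than rejecting them) is also an assumption, not a confirmed decision.

```typescript
// Hypothetical stand-ins for Hono's Context and next(); the real types come from "hono".
interface MiniContext {
  header(name: string): string | undefined;
  set(key: "userId", value: number | null): void;
}
type Next = () => Promise<void>;

// Hypothetical resolver: returns a userId for valid credentials, null when none are
// present, and may throw when credentials are present but invalid.
type ResolveUser = (authorization: string | undefined) => Promise<number | null>;

// Unlike requireAuth, this variant never short-circuits with a 401.
function tryAuth(resolveUser: ResolveUser) {
  return async (c: MiniContext, next: Next): Promise<void> => {
    let userId: number | null = null;
    try {
      userId = await resolveUser(c.header("Authorization"));
    } catch {
      // Assumption: bad credentials degrade to anonymous instead of failing the request.
      userId = null;
    }
    c.set("userId", userId);
    await next();
  };
}
```

Downstream handlers can then branch on whether `userId` is null, for example to decide whether to annotate feed items with "in your collection".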
 ---
 ### 2. Discovery Feed (Popular Setups, Trending Items)
 The feed requires: ranked/scored queries, cursor-based pagination, and cheap repeated reads by anonymous users.
 #### Trending Score
 Compute a hot score directly in a PostgreSQL query; no external search engine or materialized view is needed at this scale.
 ```sql
 -- Hacker News-style decay: engagement / time^gravity
 SELECT id, brand, model,
  (owner_count::float / power((extract(epoch from now()) - extract(epoch from created_at)) / 3600.0 + 2, 1.8)) AS hot_score
 FROM global_items
 ORDER BY hot_score DESC
 LIMIT 20;
 ```
 This requires `ownerCount` as a real column (not a JOIN-time COUNT) on `globalItems`. The column already logically exists via join — promote it to a denormalized integer that the collection add/remove service path updates. No trigger needed; update it in the same database transaction as the collection operation.
 **No new dependency.** Schema migration + service-layer update.
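To unit-test the ranking outside the database, the SQL expression above can be mirrored in TypeScript. This is an illustrative sketch only. The function and constant names are hypothetical; the gravity of 1.8 and the 2-hour offset are copied from the query.

```typescript
// Mirrors the SQL: owner_count / power(age_hours + 2, 1.8)
const GRAVITY = 1.8;
const AGE_OFFSET_HOURS = 2;
const MS_PER_HOUR = 3600 * 1000;

// Hypothetical helper; the production score is computed in SQL, not here.
function hotScore(ownerCount: number, createdAt: Date, now: Date = new Date()): number {
  const ageHours = (now.getTime() - createdAt.getTime()) / MS_PER_HOUR;
  return ownerCount / Math.pow(ageHours + AGE_OFFSET_HOURS, GRAVITY);
}
```

With equal owner counts, an older item always scores lower than a newer one, which is the decay behaviour the feed relies on.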
 #### Cursor-Based Pagination
 Drizzle ORM 0.45.x has documented cursor pagination support (two-column keyset). Use `(hotScore DESC, id DESC)` for the trending feed and `(createdAt DESC, id DESC)` for "recently added." Encode cursor as base64 JSON — opaque to the client.
 The Hono + Drizzle cursor pattern is documented and actively used in the ecosystem. No pagination library needed.
 **No new dependency.** Drizzle already supports this natively.
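The opaque cursor described above could look roughly like this for the "recently added" ordering `(createdAt DESC, id DESC)`. The `FeedCursor` shape and all function names are assumptions made for illustration. The sketch relies on the global `btoa`/`atob` functions, which are available in Bun and modern browsers.

```typescript
// Hypothetical cursor shape: the sort key of the last row on the previous page.
interface FeedCursor {
  createdAt: string; // ISO-8601 timestamp
  id: number;
}

// Encode as base64 JSON so clients treat the cursor as an opaque token.
function encodeCursor(cursor: FeedCursor): string {
  return btoa(JSON.stringify(cursor));
}

// Decode defensively: any malformed or tampered cursor yields null.
function decodeCursor(raw: string): FeedCursor | null {
  try {
    const parsed = JSON.parse(atob(raw)) as Partial<FeedCursor>;
    if (typeof parsed.createdAt !== "string" || typeof parsed.id !== "number") return null;
    if (Number.isNaN(Date.parse(parsed.createdAt))) return null;
    return { createdAt: parsed.createdAt, id: parsed.id };
  } catch {
    return null;
  }
}

// Keyset predicate: true when `row` sorts strictly after `cursor`
// under (createdAt DESC, id DESC). The SQL WHERE clause expresses the same rule.
function isAfterCursor(row: FeedCursor, cursor: FeedCursor): boolean {
  const rowTime = Date.parse(row.createdAt);
  const cursorTime = Date.parse(cursor.createdAt);
  return rowTime < cursorTime || (rowTime === cursorTime && row.id < cursor.id);
}
```

A route handler would reject a request whose cursor decodes to null rather than silently restarting from the first page.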
 #### Full-Text Catalog Search
 `globalItems` needs fast free-text search across `brand + model + description`. Use PostgreSQL native `tsvector` with a GIN index.
 Drizzle 0.45.x does not generate `GENERATED ALWAYS AS ... STORED` syntax for tsvector columns in drizzle-kit. Add the `searchVector` column and GIN index via a raw SQL migration file (create via `drizzle-kit generate` then manually add the ALTER TABLE and CREATE INDEX statements to the generated file).
 For the Hono route, use Drizzle's `sql` template tag with `plainto_tsquery`:
 ```typescript
-import { createRemoteJWKSet, jwtVerify } from "jose";
+.where(sql`search_vector @@ plainto_tsquery('english', ${q})`)
 .orderBy(sql`ts_rank(search_vector, plainto_tsquery('english', ${q})) DESC`)
 ```
-const jwks = createRemoteJWKSet(
+**No new dependency.** Schema migration + raw SQL in service layer.
  new URL("https://logto.example.com/oidc/jwks")
 );
-const authMiddleware = createMiddleware(async (c, next) => {
+#### Feed Client (TanStack Query + IntersectionObserver)
  const token = c.req.header("Authorization")?.replace("Bearer ", "");
  if (!token) return c.json({ error: "Unauthorized" }, 401);
-  const { payload } = await jwtVerify(token, jwks, {
+`useInfiniteQuery` from `@tanstack/react-query` (already at 5.90.x) handles cursor pagination natively via `getNextPageParam`. The scroll trigger uses the browser-native IntersectionObserver API — implement a `useIntersectionObserver(ref, callback)` hook (~12 lines) rather than adding a scroll library. This matches the existing GearBox pattern of minimal third-party UI dependencies.
-    issuer: "https://logto.example.com/oidc",
+
-    audience: "your-api-resource-indicator",
+**No new dependency.**
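On the client, the only feed-specific logic `useInfiniteQuery` needs is a way to read the next cursor from the last page. The sketch below assumes a hypothetical `FeedPage` response shape with a nullable `nextCursor`; the real API response shape is not defined in this document.

```typescript
// Hypothetical response shape for one page of a discovery feed.
interface FeedPage<T> {
  items: T[];
  nextCursor: string | null;
}

// Passed as getNextPageParam: returning undefined tells TanStack Query there are no more pages.
function getNextPageParam<T>(lastPage: FeedPage<T>): string | undefined {
  return lastPage.nextCursor ?? undefined;
}

// Flattens the accumulated pages into one list for rendering.
function flattenPages<T>(pages: FeedPage<T>[]): T[] {
  return pages.flatMap((page) => page.items);
}
```

The IntersectionObserver hook then only has to call `fetchNextPage()` when the sentinel element becomes visible and `hasNextPage` is true.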
 ---
 ### 3. Catalog Enrichment Infrastructure
 #### Schema Additions to `globalItems`
 New fields for attribution, source tracking, and feed ranking:
 | Field | Type | Purpose |
 |-------|------|---------|
 | `sourceUrl` | `text` | Canonical product page (retailer or manufacturer) |
 | `sourceAttribution` | `text` | Human-readable credit ("via REI", "via manufacturer") |
 | `imageAttributionUrl` | `text` | URL where product image was originally sourced |
 | `imageAttributionText` | `text` | License or credit line for the image |
 | `submittedByUserId` | `integer FK → users` | Who submitted this catalog entry (null = seeded by admin/agent) |
 | `verifiedAt` | `timestamp` | When an admin approved the entry (null = unverified) |
 | `ownerCount` | `integer NOT NULL DEFAULT 0` | Denormalized count of collection items referencing this |
 | `productUrl` | `text` | Retailer/manufacturer product link (duplicates item-level, but catalog-owned) |
 These are Drizzle schema additions. **No new dependency.**
 #### Zod Schemas for Enriched Catalog
 Add `CreateCatalogItemSchema` in `src/shared/schemas.ts` with attribution fields. Zod 4.3.x handles this natively. The schema feeds the new `POST /api/global-items` route (currently only GET is public; writes will require authentication but will be open to non-admin users for catalog submissions).
 ---
 ### 4. Agent-Powered Catalog Seeding via MCP
 The existing MCP server (`@modelcontextprotocol/sdk` 1.29.x, 19 tools) already provides the infrastructure. The agent workflow:
 1. Claude agent receives a category or brand as a prompt
 2. Uses a new `create_catalog_item` MCP tool — purpose-built for `globalItems` insertion with full attribution fields
 3. Server validates via Zod, inserts into `globalItems`, updates `ownerCount` denormalization
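 4. Agent uses the existing `upload_image_from_url` tool to fetch and store product images, limited to sources whose licensing permits reuse (the pitfalls research flags copyrighted brand images stored in S3 as a HIGH-cost recovery)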
 The new tool registers identically to existing tools in `src/server/mcp/index.ts`. Batch seeding sessions: the agent runs N `create_catalog_item` calls in sequence within one MCP session — no parallel execution framework needed at catalog bootstrap scale.
 For standalone seed scripts (`bun run src/db/dev-seed.ts` extensions), use the Drizzle db instance directly. No external seeding framework.
 **No new dependency.**
 ---
 ### 5. HTTP Caching for Public Endpoints
 Public GET endpoints (discovery feed, catalog detail pages) will be hit by anonymous users repeatedly. Add HTTP-level cache hints to reduce DB round-trips.
 - **Catalog item detail pages** (`GET /api/global-items/:id`): Use Hono's built-in `etag()` middleware. Content-addressed — returns 304 Not Modified when item hasn't changed.
 - **Discovery feed endpoints** (`GET /api/discovery/*`): Set `Cache-Control: public, max-age=60, stale-while-revalidate=300` manually in route handlers. Feed data tolerates 60s staleness.
 **Do NOT use Hono's `cache()` middleware.** It is platform-specific to Cloudflare Workers and Deno, and silently does nothing on Bun. This is a documented limitation. Known issue #4401 in the Hono repo also reports that the `etag()` middleware can generate inconsistent ETags when combined with other middleware, so cover it with integration tests before shipping.
 **No new dependency.** `etag` is built into Hono 4.12.x.
 ---
 ### 6. Rate Limiting for Public Traffic
 The existing `rateLimit.ts` in-process Map handles auth endpoints correctly (5 req/15 min per IP). It is inappropriate for public discovery traffic because:
 - 5 req/15 min is far too strict for anonymous browsing
 - In-process state resets on server restart (tolerable for auth, wrong for general rate limiting)
 - No way to differentiate authenticated vs anonymous callers in the current implementation
 **Recommendation:** Keep the existing `rateLimit.ts` for auth endpoints only. Add `hono-rate-limiter` for discovery/catalog public endpoints with a permissive anonymous limit (e.g., 100 req/min per IP) and no limit for authenticated callers.
 ```typescript
 import { rateLimiter } from "hono-rate-limiter";
 const discoveryLimiter = rateLimiter({
  windowMs: 60 * 1000,  // 1 minute
  limit: 100,
   // First entry of X-Forwarded-For is the client IP; trim whitespace so keys are stable
   keyGenerator: (c) => c.req.header("x-forwarded-for")?.split(",")[0]?.trim() ?? "unknown",
 });
-  c.set("userId", payload.sub);
+app.use("/api/discovery/*", discoveryLimiter);
  await next();
 });
 ```
-**React provider pattern:**
+The in-process storage adapter (default in `hono-rate-limiter`) is sufficient for single-instance deployment. If the app scales horizontally, swap to `@hono-rate-limiter/redis` — but that is a future decision, not a v2.1 concern.
-```typescript
+**New dependency:**
 import { LogtoProvider, LogtoConfig } from "@logto/react";
-const config: LogtoConfig = {
+| Library | Version | Purpose |
-  endpoint: "https://logto.example.com",
+|---------|---------|---------|
-  appId: "<your-app-id>",
+| `hono-rate-limiter` | `^0.5.3` | Per-route rate limiting with configurable windows for public endpoints |
  resources: ["https://api.gearbox.example.com"],
 };
 // Wrap app root
 <LogtoProvider config={config}>
  <App />
 </LogtoProvider>
 ```
 ### Database -- PostgreSQL via Bun Native Driver
 | Technology | Version | Purpose | Why Recommended |
 |------------|---------|---------|-----------------|
 | PostgreSQL | 16+ | Primary database | Required by Logto anyway, proper concurrent access for multi-user, JSONB for flexible spec fields, full-text search for discovery feed. |
 | drizzle-orm | ^0.45.1 (existing) | Type-safe ORM | Already in use. Switch from `drizzle-orm/bun-sqlite` to `drizzle-orm/bun-sql` for Postgres. Schema definitions move from `sqlite-core` to `pg-core`. |
 | Bun native SQL | built-in | Postgres driver | Zero additional dependencies. `import { SQL } from "bun"` provides native Postgres bindings. Drizzle ORM supports it via `drizzle-orm/bun-sql`. |
 | postgres (postgres.js) | ^3.4.8 | Fallback Postgres driver | Only needed if Bun native SQL has issues with drizzle-kit CLI tooling (known issue #4122). More mature ecosystem, proven with Drizzle. Install as dev dependency for drizzle-kit. |
 **Schema migration approach:**
 1. Rewrite `src/db/schema.ts` imports from `drizzle-orm/sqlite-core` to `drizzle-orm/pg-core`
 2. Replace `sqliteTable` with `pgTable`
 3. Replace `integer().primaryKey({ autoIncrement: true })` with `integer().primaryKey().generatedAlwaysAsIdentity()` for PKs
 4. Replace `integer("created_at", { mode: "timestamp" })` with `timestamp("created_at").defaultNow().notNull()`
 5. Add `userId text("user_id").notNull()` to all user-owned tables (items, threads, setups, categories)
 6. Add `visibility text("visibility").notNull().default("private")` to setups and profiles
 7. Generate fresh Postgres migration with `drizzle-kit generate`
 8. Write a one-time data migration script (SQLite read -> Postgres insert) for existing data
 **drizzle.config.ts change:**
 ```typescript
 // Before
 { dialect: "sqlite", dbCredentials: { url: "./gearbox.db" } }
 // After
 { dialect: "postgresql", dbCredentials: { url: process.env.DATABASE_URL } }
 ```
 **Known issue:** drizzle-kit CLI does not use the Bun SQL driver for `push`/`generate` commands (GitHub issue #4122). Workaround: install `postgres` (postgres.js) as a dev dependency for drizzle-kit, while the app runtime uses Bun native SQL.
 ### Image Storage -- Bun Native S3 + MinIO
 | Technology | Version | Purpose | Why Recommended |
 |------------|---------|---------|-----------------|
 | Bun S3Client | built-in | S3 API client | Zero dependencies, native Bun bindings, extends Blob interface. Supports presigned URLs, streaming uploads. Built-in MinIO compatibility. |
 | MinIO | latest | Self-hosted S3-compatible object storage | Replaces local `./uploads/` directory. Single Go binary, Docker-friendly, S3 API compatible. Handles multi-user image scaling without cloud vendor lock-in. |
 **Why Bun native S3 over @aws-sdk/client-s3:**
 - Zero additional dependencies (Bun ships with it)
 - Simpler API (extends Blob, web-standard patterns)
 - Native performance bindings
 - Full MinIO compatibility documented by Bun team
 **Migration from ./uploads/:**
 1. Deploy MinIO container alongside app
 2. Create `gearbox-images` bucket
 3. Write migration script to upload existing files from `./uploads/` to MinIO
 4. Update image service to use S3Client for reads/writes
 5. Serve images via presigned URLs or a proxy route on Hono
 **Configuration:**
 ```typescript
 import { S3Client } from "bun";
 const storage = new S3Client({
  accessKeyId: process.env.S3_ACCESS_KEY!,
  secretAccessKey: process.env.S3_SECRET_KEY!,
  bucket: "gearbox-images",
  endpoint: process.env.S3_ENDPOINT!, // e.g., http://minio:9000
 });
 ```
 ### Supporting Libraries
 | Library | Version | Purpose | When to Use |
 |---------|---------|---------|-------------|
 | jose | ^6.2.2 | JWKS-based JWT verification | Every authenticated API request -- validate Logto access tokens on Hono middleware |
 | @logto/react | ^4.0.13 | React auth provider + hooks | Wrap app root, sign-in/sign-out flows, access token retrieval for API calls |
 ### Development / Infrastructure
 | Tool | Purpose | Notes |
 |------|---------|-------|
 | Docker Compose | Local dev environment | Postgres + Logto + MinIO containers. App still runs on bare Bun for HMR. |
 | drizzle-kit | Schema management | Same tool, different dialect config. `bun run db:generate` and `bun run db:push` still work. |
 ## Installation
 ```bash
-# New production dependencies
+bun add hono-rate-limiter
 bun add @logto/react jose
 # New dev dependencies (for drizzle-kit Postgres support)
 bun add -D postgres
 # No install needed for:
 # - Bun native S3 (built-in)
 # - Bun native SQL/Postgres (built-in)
 # - drizzle-orm (already installed, just change imports)
 ```
 ---
 ## Full Stack Additions Summary
 ### New Dependencies (v2.1 only)
 | Library | Version | Purpose | Why |
 |---------|---------|---------|-----|
 | `hono-rate-limiter` | `^0.5.3` | Configurable rate limits for public discovery routes | Existing in-process limiter is auth-only with a 5-req cap; public browse traffic needs separate, permissive limits |
 ### No New Dependencies Needed For
 | Capability | Existing Stack Component Used |
 |------------|------------------------------|
 | Public auth model (`tryAuth` variant) | Hono middleware — no library |
 | Discovery feed cursor pagination | Drizzle 0.45.x cursor pagination docs |
 | Full-text catalog search (tsvector GIN) | PostgreSQL native + Drizzle `sql` template |
 | Trending score computation | PostgreSQL SQL expression — no extension |
 | Infinite scroll client | TanStack Query `useInfiniteQuery` + native IntersectionObserver |
 | Catalog attribution fields | Drizzle schema migration |
 | Agent catalog seeding | Existing MCP SDK + new `create_catalog_item` tool |
 | HTTP cache headers | Hono built-in `etag()` + manual `Cache-Control` |
 | Feed ranking denormalization | Service-layer transaction update (no trigger, no cron) |
 ---
 ## Schema Changes Required (Not Library Changes)
 These are Drizzle schema additions generating migrations:
 ### `globalItems` additions
 ```typescript
 // In src/db/schema.ts — globalItems table additions
 sourceUrl: text("source_url"),
 sourceAttribution: text("source_attribution"),
 imageAttributionUrl: text("image_attribution_url"),
 imageAttributionText: text("image_attribution_text"),
 submittedByUserId: integer("submitted_by_user_id").references(() => users.id),
 verifiedAt: timestamp("verified_at"),
 ownerCount: integer("owner_count").notNull().default(0),
 productUrl: text("product_url"),
 ```
 ### Raw SQL migration additions (cannot be expressed in Drizzle schema)
 ```sql
 -- Add after Drizzle-generated migration runs:
 -- Generated tsvector column for full-text search
 ALTER TABLE global_items
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    to_tsvector('english',
      coalesce(brand, '') || ' ' ||
      coalesce(model, '') || ' ' ||
      coalesce(description, '')
    )
  ) STORED;
 CREATE INDEX global_items_search_vector_idx ON global_items USING GIN(search_vector);
 -- Partial index for public setup discovery feed
 CREATE INDEX setups_public_updated_idx ON setups (updated_at DESC) WHERE is_public = true;
 -- Trending feed index
 CREATE INDEX global_items_owner_count_id_idx ON global_items (owner_count DESC, id DESC);
 ```
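 > **Note:** Drizzle Kit does not generate `GENERATED ALWAYS AS ... STORED` for tsvector. Add these statements as raw SQL appended to the Drizzle-generated migration, or as a separate custom migration file in the migrations folder, and apply them after the Drizzle migration runs. If `bun run db:push` wraps `drizzle-kit push`, it syncs the schema directly and does not execute SQL migration files, so use the project's migrate command for this step.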
 ### `setups` additions
 ```typescript
 // In src/db/schema.ts — setups table additions
 viewCount: integer("view_count").notNull().default(0),
 ```
 ---
 ## Alternatives Considered
-### Authentication Provider
+| Recommended | Alternative | Why Not |
 |-------------|-------------|---------|
 | PostgreSQL tsvector + GIN | Meilisearch / Typesense | Separate search service adds infra ops complexity; tsvector covers structured gear catalog search at GearBox scale without additional containers |
 | PostgreSQL tsvector + GIN | pg_textsearch (BM25 extension) | Requires installing a PostgreSQL extension in production; BM25 ranking is unnecessary for a catalog of branded products where exact brand/model matches dominate |
 | Denormalized `ownerCount` column | COUNT JOIN per feed request | Feed queries fire on every anonymous page load; a JOIN COUNT becomes a bottleneck before any other part of the stack does |
 | Native IntersectionObserver hook | react-infinite-scroll-component | Zero-dependency — 12-line hook replaces an 8KB library; consistent with GearBox's minimal-external-dependency UI philosophy |
 | Manual `Cache-Control` headers | Hono `cache()` middleware | Hono `cache()` is Cloudflare Workers/Deno only — silently does nothing on Bun |
 | `hono-rate-limiter` in-process | Redis-backed rate limiter | Single-instance deployment — Redis adds an infra dependency not justified at current scale |
 | Extend existing MCP toolset | Separate seeding CLI script | MCP agents already have auth and structured tool calling; a dedicated `create_catalog_item` tool is cleaner than a one-off script that bypasses the service layer |
 | Service-layer `ownerCount` update | PostgreSQL trigger | Triggers are invisible to the TypeScript codebase, harder to test, and prone to silent failures in complex transactions |
-| Recommended | Alternative | When to Use Alternative |
+---
 |-------------|-------------|-------------------------|
 | Logto | Authentik | If you need proxy-mode SSO for non-OIDC apps (Portainer, legacy tools) |
 | Logto | Zitadel | If building multi-tenant B2B SaaS with organization-level isolation |
 | Logto | Keycloak | If enterprise LDAP/AD integration is mandatory |
 ### Database Driver
 | Recommended | Alternative | When to Use Alternative |
 |-------------|-------------|-------------------------|
 | Bun native SQL (`bun:sql`) | postgres.js | If Bun native SQL has concurrency bugs (known issue in Bun 1.2.0 with concurrent statements) |
 | Bun native SQL (`bun:sql`) | @neondatabase/serverless | If deploying to serverless/edge where persistent connections are not possible |
 ### Image Storage
 | Recommended | Alternative | When to Use Alternative |
 |-------------|-------------|-------------------------|
 | MinIO (self-hosted) | Cloudflare R2 | If you want zero-ops storage with no egress fees and don't mind cloud dependency |
 | MinIO (self-hosted) | Local filesystem (current) | For development/testing only. Not viable for multi-user at scale. |
 ## What NOT to Add
 | Avoid | Why | Use Instead |
 |-------|-----|-------------|
-| @aws-sdk/client-s3 | 60+ transitive dependencies, Bun has native S3 support | Bun built-in S3Client |
+| Elasticsearch / OpenSearch | Separate cluster, ops overhead, overkill for a structured product catalog | PostgreSQL tsvector with GIN index |
-| passport.js / express-session | Wrong paradigm -- we want external OIDC, not embedded session auth | Logto + jose JWT validation |
+| pg_textsearch / VectorChord-BM25 | PostgreSQL extension install required in prod; BM25 precision unnecessary for brand+model search | PostgreSQL native `ts_rank` |
-| next-auth / auth.js | Designed for Next.js, assumes framework integration we don't have | Logto (external provider) |
+| Hono `cache()` middleware | Platform-specific to Cloudflare/Deno; does nothing on Bun | Manual `Cache-Control` headers in route handlers |
-| better-auth | Embedded auth library, opposite of external provider model | Logto (external provider) |
+| react-virtual / windowing | Feed is paginated, not a virtual list; items per page (~20) never hit DOM performance limits | Standard DOM list with cursor pagination |
-| pg (node-postgres) | Callback-based API, Bun has native Postgres bindings | Bun native SQL or postgres.js |
+| Prisma | Already using Drizzle ORM; two ORMs in one codebase is a maintenance trap | drizzle-orm (existing) |
-| sharp / image processing libs | Premature optimization -- serve originals first, add resizing later if needed | Direct S3 storage of originals |
+| Materialized views for feed caching | drizzle-kit does not fully support materialized view migrations; manual REFRESH logic is brittle | Denormalized score columns + partial indexes |
-| Redis | Not needed at this scale. Postgres handles sessions (via Logto), caching is premature | Postgres for everything |
+| Separate seeding framework (Faker, etc.) | Catalog data is real product data, not fake; agent seeding produces real structured records | MCP `create_catalog_item` tool |
 | Prisma | Already using Drizzle ORM, no reason to add a second ORM | drizzle-orm (existing) |
 | nanoid / cuid2 | Postgres `gen_random_uuid()` is built-in for public-facing IDs if needed | Postgres native UUID generation |
 | TypeORM / Sequelize | Legacy ORMs with worse TypeScript support than Drizzle | drizzle-orm (existing) |
-## Infrastructure Architecture
+---
 ```
 Docker Compose (dev) / Docker (prod)
 +-- gearbox-app        (Bun, port 3000)
 +-- gearbox-postgres   (PostgreSQL 16, port 5432)
 |   +-- gearbox DB     (app data)
 |   +-- logto DB       (Logto data, separate database same instance)
 +-- gearbox-logto      (Logto OSS, port 3001 app / 3002 admin)
 +-- gearbox-minio      (MinIO, port 9000 API / 9001 console)
 ```
 Logto and the app share a single Postgres instance (different databases). This keeps infrastructure simple -- one Postgres to back up, one to monitor. Logto requires PostgreSQL 14+; using 16 covers both.
 ## Version Compatibility
-| Package | Compatible With | Notes |
+| Package | Current Version | v2.1 Notes |
-|---------|-----------------|-------|
+|---------|----------------|------------|
-| drizzle-orm@0.45.x | Bun native SQL | Supported via `drizzle-orm/bun-sql` driver |
+| `hono` | 4.12.x (4.12.12 latest) | `etag()` built-in available; `cache()` is NOT compatible with Bun — do not use |
-| drizzle-orm@0.45.x | postgres.js@3.4.x | Supported via `drizzle-orm/postgres-js` driver (fallback) |
+| `drizzle-orm` | 0.45.x (0.45.2 latest stable) | Cursor pagination confirmed; generated tsvector column requires raw SQL migration appended to drizzle-kit output |
-| drizzle-kit@0.31.x | PostgreSQL 16 | Generates Postgres-dialect migrations |
+| `@tanstack/react-query` | 5.90.x | `useInfiniteQuery` with `getNextPageParam` fully supports cursor pattern natively |
-| @logto/react@4.x | React 19 | Uses React context/hooks, compatible |
+| `hono-rate-limiter` | 0.5.3 (latest, published ~16 days before the 2026-04-09 research date) | In-process storage adapter works on Bun; actively maintained |
-| jose@6.x | Bun runtime | Explicitly lists Bun support in docs |
+| `@modelcontextprotocol/sdk` | 1.29.x | Existing MCP tooling is sufficient for adding `create_catalog_item` |
-| Logto OSS v1.36 | PostgreSQL 14+ | Logto requires PG 14 minimum; use PG 16 for both app and Logto |
+| `zod` | 4.3.x | New catalog attribution schemas are straightforward additions to existing `schemas.ts` |
-| Bun S3Client | MinIO latest | Documented compatibility with endpoint configuration |
+| `@hono/zod-validator` | 0.7.x | Already used for all routes; covers new discovery/catalog endpoints |
-## Migration Checklist (SQLite to Postgres)
+---
-1. **Schema rewrite**: `sqlite-core` -> `pg-core` imports, adjust column types
+## Installation
-2. **Driver swap**: `drizzle-orm/bun-sqlite` -> `drizzle-orm/bun-sql`
+
-3. **Config update**: `drizzle.config.ts` dialect and credentials
+```bash
-4. **Fresh migrations**: Generate from scratch for Postgres (do not try to convert SQLite migrations)
+# Only one new package for v2.1
-5. **Data migration**: One-time script reads SQLite, writes to Postgres
+bun add hono-rate-limiter
-6. **Test infrastructure**: Update `createTestDb()` helper to use Postgres test database (or pg-mem for in-memory testing)
+```
-7. **CI pipeline**: Add Postgres service container for test runs
+
-8. **Remove SQLite deps**: Remove `better-sqlite3` from devDependencies after migration confirmed
+Everything else is schema migrations, new service/route/middleware code, and one new MCP tool — all on the existing stack.
 ---
 ## Sources
- [Logto official docs -- React quickstart](https://docs.logto.io/quick-starts/react) -- SDK setup, LogtoProvider config (HIGH confidence)
+- [Drizzle ORM cursor-based pagination](https://orm.drizzle.team/docs/guides/cursor-based-pagination) — two-column keyset pattern, v0.45.x confirmed (HIGH)
- [Logto API protection -- JWT validation](https://docs.logto.io/api-protection/nodejs/express) -- jose-based middleware pattern (HIGH confidence)
+- [Drizzle ORM PostgreSQL full-text search](https://orm.drizzle.team/docs/guides/postgresql-full-text-search) — tsvector approach confirmed (HIGH)
- [Logto OSS getting started](https://docs.logto.io/logto-oss/get-started-with-oss) -- Docker deployment, Postgres requirements (HIGH confidence)
+- [Drizzle ORM full-text search with generated columns](https://orm.drizzle.team/docs/guides/full-text-search-with-generated-columns) — generated column pattern for tsvector (HIGH)
- [Logto @logto/react npm](https://www.npmjs.com/package/@logto/react) -- Version 4.0.13 confirmed (HIGH confidence)
+- [Hono ETag middleware](https://hono.dev/docs/middleware/builtin/etag) — built-in, no install required (HIGH)
- [Drizzle ORM -- Bun SQL driver](https://orm.drizzle.team/docs/connect-bun-sql) -- Native Postgres via Bun (HIGH confidence)
+- [Hono Cache middleware](https://hono.dev/docs/middleware/builtin/cache) — explicitly listed as Cloudflare/Deno only, not Bun (HIGH)
- [Drizzle ORM -- PostgreSQL column types](https://orm.drizzle.team/docs/column-types/pg) -- pg-core schema definitions (HIGH confidence)
+- [Hono ETag issue #4401](https://github.com/honojs/hono/issues/4401) — known inconsistency bug in etag middleware (MEDIUM)
- [drizzle-kit Bun SQL issue #4122](https://github.com/drizzle-team/drizzle-orm/issues/4122) -- Known CLI limitation with Bun driver (MEDIUM confidence)
+- [hono-rate-limiter GitHub](https://github.com/rhinobase/hono-rate-limiter) — v0.5.3, active, Bun compatible (HIGH)
- [Bun S3 documentation](https://bun.com/docs/runtime/s3) -- Native S3 client, MinIO config (HIGH confidence)
+- [hono-rate-limiter npm](https://www.npmjs.com/package/hono-rate-limiter) — version 0.5.3 confirmed (HIGH)
- [MinIO GitHub](https://github.com/minio/minio) -- S3-compatible self-hosted storage (HIGH confidence)
+- [TanStack Query infinite queries](https://tanstack.com/query/latest/docs/framework/react/guides/infinite-queries) — `useInfiniteQuery` cursor pattern (HIGH)
- [jose GitHub](https://github.com/panva/jose) -- JWT library v6.2.2, explicit Bun support (HIGH confidence)
+- [Drizzle ORM materialized views issue #2653](https://github.com/drizzle-team/drizzle-orm/issues/2653) — confirmed drizzle-kit does not fully support materialized view migrations (MEDIUM)
- [Authentik vs Zitadel comparison](https://wz-it.com/en/blog/authentik-vs-zitadel-identity-provider-comparison/) -- Auth provider analysis (MEDIUM confidence)
+- [Hono middleware docs](https://hono.dev/docs/guides/middleware) — selective auth middleware pattern (HIGH)
- [Keycloak vs Authentik vs Zitadel 2026](https://blog.houseoffoss.com/post/keycloak-vs-authentik-vs-zitadel-2026-which-open-source-login-tool-should-you-use) -- Ecosystem overview (MEDIUM confidence)
+- GearBox `package.json` — all existing dependency versions verified directly (HIGH)
- [postgres.js npm](https://www.npmjs.com/package/postgres) -- Version 3.4.8, fallback driver (HIGH confidence)
+- GearBox `src/server/index.ts` — existing skip-list pattern verified directly (HIGH)
 - GearBox `src/server/middleware/auth.ts` — existing three-way auth verified directly (HIGH)
 - GearBox `src/db/schema.ts` — existing `globalItems` table columns verified directly (HIGH)
 ---
-*Stack research for: GearBox v2.0 Platform Foundation*
+
-*Researched: 2026-04-03*
+*Stack research for: GearBox v2.1 Public Discovery milestone*
 *Researched: 2026-04-09*