Stack Research

Domain: Public-first gear discovery platform with catalog enrichment, a discovery feed, and agent-powered seeding (v2.1)
Researched: 2026-04-09
Confidence: HIGH (existing stack verified against package.json; additions verified against npm/official docs)


Context: What Already Exists (Do Not Re-Research)

The following are validated and in production at v2.0. This file covers ADDITIONS AND CHANGES only.

| Layer | Current |
|---|---|
| Runtime | Bun |
| Frontend | React 19, TanStack Router/Query v5, Tailwind CSS v4, Zustand, Zod 4.x, framer-motion, Recharts, Lucide React |
| Backend | Hono 4.12.x, Drizzle ORM 0.45.x, PostgreSQL (postgres.js 3.4.x driver) |
| Auth | @hono/oidc-auth 1.8.x (Logto), API key auth, MCP OAuth 2.1 |
| Storage | @aws-sdk/client-s3 3.x (MinIO) |
| MCP | @modelcontextprotocol/sdk 1.29.x (19 tools) |
| Rate limiting | Custom in-process Map (auth endpoints only, 5 req/15 min per IP) |

New Capability Areas

1. Public Access Auth Model

What's needed: The requireAuth middleware in src/server/middleware/auth.ts already handles three auth paths (API key, OAuth Bearer, OIDC session). The skip-list pattern in src/server/index.ts already exempts public GETs on /api/global-items, /api/tags, /api/users/:id/profile, and /api/setups/:id/public.

This milestone extends the skip-list to cover new discovery endpoints (/api/discovery/*). Additionally, a new tryAuth middleware variant is needed for endpoints that work for both anonymous and authenticated users — it resolves userId if credentials are present but does NOT 401 on absence. This enables auth-aware responses (e.g., annotating feed items with "in your collection" for logged-in users).

No new dependency. Pure middleware logic — add tryAuth to auth.ts, update skip-list in index.ts.
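
The difference between requireAuth and the proposed tryAuth can be sketched without any framework code. The snippet below is a minimal illustration, not the project's implementation: `HeaderMap`, `Resolver`, `tryResolveUser`, and `demoApiKeyResolver` are hypothetical names, and the real credential checks (API key, OAuth Bearer, OIDC session) are assumed to live in src/server/middleware/auth.ts.

```typescript
// Hypothetical shapes; the real resolvers are assumed to live in auth.ts.
type HeaderMap = Record<string, string | undefined>;
type Resolver = (headers: HeaderMap) => Promise<number | null>;

// Runs each credential resolver in order and returns the first userId found.
// Unlike requireAuth, a miss is not an error: the caller simply gets null.
async function tryResolveUser(
  headers: HeaderMap,
  resolvers: Resolver[],
): Promise<number | null> {
  for (const resolve of resolvers) {
    try {
      const userId = await resolve(headers);
      if (userId !== null) return userId;
    } catch {
      // Assumption: on public routes, invalid or expired credentials are
      // treated as anonymous rather than rejected with a 401.
    }
  }
  return null;
}

// Stand-in resolver for illustration only (not the real API key lookup).
const demoApiKeyResolver: Resolver = async (headers) =>
  headers["x-api-key"] === "demo-key" ? 42 : null;
```

In a Hono middleware, the resolved value would typically be stored on the request context and the next handler called unconditionally, so route handlers can branch on whether a userId is present.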


2. Discovery Feed

The feed requires: ranked/scored queries, cursor-based pagination, and cheap repeated reads by anonymous users.

Use a hot-score computed in PostgreSQL SQL — no external search engine or materialized view needed at this scale.

-- Hacker News-style decay: engagement / time^gravity
SELECT id, brand, model,
  (owner_count::float / power((extract(epoch from now()) - extract(epoch from created_at)) / 3600.0 + 2, 1.8)) AS hot_score
FROM global_items
ORDER BY hot_score DESC
LIMIT 20;

This requires ownerCount as a real column (not a JOIN-time COUNT) on globalItems. The column already logically exists via join — promote it to a denormalized integer that the collection add/remove service path updates. No trigger needed; update it in the same database transaction as the collection operation.

No new dependency. Schema migration + service-layer update.
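
To sanity-check the decay curve outside the database, the SQL expression above can be mirrored in a few lines of TypeScript. This is a sketch for reasoning and unit tests only; `hotScore`, `GRAVITY`, and `AGE_OFFSET_HOURS` are illustrative names, and the feed itself is assumed to keep computing the score in SQL.

```typescript
const GRAVITY = 1.8; // same exponent as the SQL expression
const AGE_OFFSET_HOURS = 2; // same "+ 2" offset as the SQL expression

// Mirrors the SQL: ownerCount / (ageInHours + 2) ^ 1.8
function hotScore(ownerCount: number, createdAt: Date, now: Date): number {
  const ageHours = (now.getTime() - createdAt.getTime()) / (3600 * 1000);
  return ownerCount / Math.pow(ageHours + AGE_OFFSET_HOURS, GRAVITY);
}
```

With equal owner counts, an older item always scores lower than a newer one, and an item with no owners scores zero regardless of age, so brand-new catalog entries only surface in the trending feed once someone adds them to a collection.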

Cursor-Based Pagination

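Drizzle ORM 0.45.x has documented cursor pagination support (two-column keyset). Use (hotScore DESC, id DESC) for the trending feed and (createdAt DESC, id DESC) for "recently added." Encode the cursor as base64 JSON so it stays opaque to the client. Because hot_score is computed from now() at query time, its values shift between requests; a keyset cursor over it is only approximately stable, so an item may occasionally repeat or be skipped across pages.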

The Hono + Drizzle cursor pattern is documented and actively used in the ecosystem. No pagination library needed.

No new dependency. Drizzle already supports this natively.
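
A minimal sketch of the opaque cursor for the "recently added" ordering follows. The `FeedCursor` and `FeedRow` shapes and the function names are assumptions made for illustration; in the real query the `isAfterCursor` check would be expressed in SQL as a row comparison such as (created_at, id) < (cursor.createdAt, cursor.id).

```typescript
// Hypothetical cursor shape for the "recently added" feed.
interface FeedCursor {
  createdAt: string; // ISO-8601 UTC timestamp of the last item on the page
  id: number; // tie-breaker so rows with equal timestamps stay ordered
}

interface FeedRow {
  id: number;
  createdAt: string;
}

// Encode as base64 JSON so clients treat the cursor as an opaque token.
function encodeCursor(cursor: FeedCursor): string {
  return btoa(JSON.stringify(cursor));
}

// Decode and validate; returns null for anything malformed.
function decodeCursor(token: string): FeedCursor | null {
  try {
    const parsed = JSON.parse(atob(token));
    if (typeof parsed?.createdAt === "string" && typeof parsed?.id === "number") {
      return { createdAt: parsed.createdAt, id: parsed.id };
    }
    return null;
  } catch {
    return null;
  }
}

// Keyset predicate for ORDER BY created_at DESC, id DESC: a row belongs to
// the next page if it sorts strictly after the cursor. String comparison is
// valid only because both timestamps use the same ISO-8601 UTC format.
function isAfterCursor(row: FeedRow, cursor: FeedCursor): boolean {
  if (row.createdAt !== cursor.createdAt) return row.createdAt < cursor.createdAt;
  return row.id < cursor.id;
}
```

Treating a malformed cursor as null lets the route decide whether to return a 400 or fall back to the first page.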

Full-Text Catalog Search

globalItems needs fast free-text search across brand + model + description. Use PostgreSQL native tsvector with a GIN index.

Drizzle 0.45.x does not generate GENERATED ALWAYS AS ... STORED syntax for tsvector columns in drizzle-kit. Add the searchVector column and GIN index via a raw SQL migration file (create via drizzle-kit generate then manually add the ALTER TABLE and CREATE INDEX statements to the generated file).

For the Hono route, use Drizzle's sql template tag with plainto_tsquery, which accepts raw user input without requiring tsquery operator syntax:

.where(sql`search_vector @@ plainto_tsquery('english', ${q})`)
.orderBy(sql`ts_rank(search_vector, plainto_tsquery('english', ${q})) DESC`)

No new dependency. Schema migration + raw SQL in service layer.

Feed Client (TanStack Query + IntersectionObserver)

useInfiniteQuery from @tanstack/react-query (already at 5.90.x) handles cursor pagination natively via getNextPageParam. The scroll trigger uses the browser-native IntersectionObserver API — implement a useIntersectionObserver(ref, callback) hook (~12 lines) rather than adding a scroll library. This matches the existing GearBox pattern of minimal third-party UI dependencies.

No new dependency.
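
The contract between the feed API and useInfiniteQuery can be shown with plain functions. This is a sketch under assumptions: `FeedPage`, `ScrollState`, and `shouldFetchNextPage` are hypothetical names, and the response shape with a nullable `nextCursor` is assumed rather than taken from the existing API. `hasNextPage` and `isFetchingNextPage` are the fields useInfiniteQuery returns; `sentinelVisible` stands for the `isIntersecting` flag of the observed sentinel element.

```typescript
// Hypothetical response shape for one page of GET /api/discovery/*.
interface FeedPage<T> {
  items: T[];
  nextCursor: string | null; // null signals the last page
}

// Passed to useInfiniteQuery as getNextPageParam. Returning undefined
// tells TanStack Query that there is no further page to fetch.
function getNextPageParam<T>(lastPage: FeedPage<T>): string | undefined {
  return lastPage.nextCursor ?? undefined;
}

// State the IntersectionObserver callback needs in order to decide.
interface ScrollState {
  sentinelVisible: boolean; // IntersectionObserverEntry.isIntersecting
  hasNextPage: boolean; // from useInfiniteQuery
  isFetchingNextPage: boolean; // from useInfiniteQuery
}

// Guard for the observer callback: only request the next page when the
// sentinel is on screen, more data exists, and no fetch is in flight.
function shouldFetchNextPage(state: ScrollState): boolean {
  return state.sentinelVisible && state.hasNextPage && !state.isFetchingNextPage;
}
```

Keeping the guard as a pure function makes the scroll behavior testable without a DOM; the hook itself then only wires the observer to fetchNextPage.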


3. Catalog Enrichment Infrastructure

Schema Additions to globalItems

New fields for attribution, source tracking, and feed ranking:

| Field | Type | Purpose |
|---|---|---|
| sourceUrl | text | Canonical product page (retailer or manufacturer) |
| sourceAttribution | text | Human-readable credit ("via REI", "via manufacturer") |
| imageAttributionUrl | text | URL where product image was originally sourced |
| imageAttributionText | text | License or credit line for the image |
| submittedByUserId | integer FK → users | Who submitted this catalog entry (null = seeded by admin/agent) |
| verifiedAt | timestamp | When an admin approved the entry (null = unverified) |
| ownerCount | integer NOT NULL DEFAULT 0 | Denormalized count of collection items referencing this |
| productUrl | text | Retailer/manufacturer product link (duplicates item-level, but catalog-owned) |

These are Drizzle schema additions. No new dependency.

Zod Schemas for Enriched Catalog

Add CreateCatalogItemSchema in src/shared/schemas.ts with attribution fields. Zod 4.3.x handles this natively. The schema feeds the new POST /api/global-items route (today only the GET is public; writes will require auth but will be open to non-admins, so any signed-in user can submit catalog entries).


4. Agent-Powered Catalog Seeding via MCP

The existing MCP server (@modelcontextprotocol/sdk 1.29.x, 19 tools) already provides the infrastructure. The agent workflow:

  1. Claude agent receives a category or brand as a prompt
  2. Uses a new create_catalog_item MCP tool — purpose-built for globalItems insertion with full attribution fields
  3. Server validates via Zod and inserts into globalItems; ownerCount starts at its default of 0 and changes only when users add the item to a collection
  4. Agent uses the existing upload_image_from_url tool to fetch and store product images

The new tool registers identically to existing tools in src/server/mcp/index.ts. Batch seeding sessions: the agent runs N create_catalog_item calls in sequence within one MCP session — no parallel execution framework needed at catalog bootstrap scale.

For standalone seed scripts (bun run src/db/dev-seed.ts extensions), use the Drizzle db instance directly. No external seeding framework.

No new dependency.


5. HTTP Caching for Public Endpoints

Public GET endpoints (discovery feed, catalog detail pages) will be hit by anonymous users repeatedly. Add HTTP-level cache hints to reduce DB round-trips.

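  • Catalog item detail pages (GET /api/global-items/:id): Use Hono's built-in etag() middleware. The ETag is derived from the response body, so the server returns 304 Not Modified when the item hasn't changed. The handler still runs to produce the body, so the saving here is response bandwidth rather than the DB query itself.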
  • Discovery feed endpoints (GET /api/discovery/*): Set Cache-Control: public, max-age=60, stale-while-revalidate=300 manually in route handlers. Feed data tolerates 60s staleness.

Do NOT use Hono's cache() middleware: it is platform-specific to Cloudflare Workers and Deno, and silently does nothing on Bun. This is a documented limitation. Separately, known issue #4401 in the Hono repo reports that the etag() middleware can generate inconsistent ETags when combined with other middleware, so cover it with integration tests before shipping.

No new dependency. etag is built into Hono 4.12.x.
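
For integration tests, it helps to have the conditional-request rule written down explicitly. The sketch below illustrates the kind of If-None-Match comparison an ETag middleware performs; it is not Hono's source, and `normalizeEtag`, `isNotModified`, and `FEED_CACHE_CONTROL` are names introduced here for illustration. Splitting the header on commas is a simplification that assumes entity tags contain no commas.

```typescript
// Strips the weak-validator prefix so W/"abc" and "abc" compare equal,
// since If-None-Match uses weak comparison.
function normalizeEtag(tag: string): string {
  return tag.trim().replace(/^W\//, "");
}

// Returns true when the client's If-None-Match header matches the current
// ETag, meaning the server can answer 304 Not Modified with no body.
function isNotModified(ifNoneMatch: string | undefined, currentEtag: string): boolean {
  if (!ifNoneMatch) return false;
  if (ifNoneMatch.trim() === "*") return true; // wildcard matches any current representation
  const current = normalizeEtag(currentEtag);
  return ifNoneMatch
    .split(",")
    .some((candidate) => normalizeEtag(candidate) === current);
}

// Header value proposed above for discovery feed responses.
const FEED_CACHE_CONTROL = "public, max-age=60, stale-while-revalidate=300";
```

An integration test can request a catalog item twice, replay the first response's ETag as If-None-Match, and assert that the second response is a 304 with the same ETag.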


6. Rate Limiting for Public Traffic

The existing rateLimit.ts in-process Map handles auth endpoints correctly (5 req/15 min per IP). It is inappropriate for public discovery traffic because:

  • 5 req/15 min is far too strict for anonymous browsing
  • In-process state resets on server restart (tolerable for auth, wrong for general rate limiting)
  • No way to differentiate authenticated vs anonymous callers in the current implementation

Recommendation: Keep the existing rateLimit.ts for auth endpoints only. Add hono-rate-limiter for discovery/catalog public endpoints with a permissive anonymous limit (e.g., 100 req/min per IP) and no limit for authenticated callers.

import { rateLimiter } from "hono-rate-limiter";

const discoveryLimiter = rateLimiter({
  windowMs: 60 * 1000,  // 1 minute
  limit: 100,
  keyGenerator: (c) => c.req.header("x-forwarded-for")?.split(",")[0] ?? "unknown",
});

app.use("/api/discovery/*", discoveryLimiter);

The in-process storage adapter (default in hono-rate-limiter) is sufficient for single-instance deployment. If the app scales horizontally, swap to @hono-rate-limiter/redis — but that is a future decision, not a v2.1 concern.
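
To make the trade-off concrete, here is roughly what an in-process fixed-window limiter does. This is an illustrative sketch, not hono-rate-limiter's implementation; `clientKey` and `FixedWindowLimiter` are hypothetical names. It also shows why the state is lost on restart and not shared across instances: everything lives in one Map.

```typescript
// First hop of X-Forwarded-For, trimmed; same idea as the keyGenerator above.
function clientKey(forwardedFor: string | undefined): string {
  return forwardedFor?.split(",")[0]?.trim() || "unknown";
}

// Minimal fixed-window counter kept in process memory.
class FixedWindowLimiter {
  private readonly hits = new Map<string, { count: number; resetAt: number }>();
  private readonly windowMs: number;
  private readonly limit: number;

  constructor(windowMs: number, limit: number) {
    this.windowMs = windowMs;
    this.limit = limit;
  }

  // Returns true if the request is allowed; `now` is injectable for tests.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now >= entry.resetAt) {
      // First request in a new window: start counting from 1.
      this.hits.set(key, { count: 1, resetAt: now + this.windowMs });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}
```

Note that every request without an X-Forwarded-For header falls into the single "unknown" bucket and shares one limit, which is worth checking against the actual proxy setup.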

New dependency:

| Library | Version | Purpose |
|---|---|---|
| hono-rate-limiter | ^0.5.3 | Per-route rate limiting with configurable windows for public endpoints |

bun add hono-rate-limiter

Full Stack Additions Summary

New Dependencies (v2.1 only)

| Library | Version | Purpose | Why |
|---|---|---|---|
| hono-rate-limiter | ^0.5.3 | Configurable rate limits for public discovery routes | Existing in-process limiter is auth-only with a 5-req cap; public browse traffic needs separate, permissive limits |

No New Dependencies Needed For

| Capability | Existing Stack Component Used |
|---|---|
| Public auth model (tryAuth variant) | Hono middleware, no library |
| Discovery feed cursor pagination | Drizzle 0.45.x cursor pagination docs |
| Full-text catalog search (tsvector GIN) | PostgreSQL native + Drizzle sql template |
| Trending score computation | PostgreSQL SQL expression, no extension |
| Infinite scroll client | TanStack Query useInfiniteQuery + native IntersectionObserver |
| Catalog attribution fields | Drizzle schema migration |
| Agent catalog seeding | Existing MCP SDK + new create_catalog_item tool |
| HTTP cache headers | Hono built-in etag() + manual Cache-Control |
| Feed ranking denormalization | Service-layer transaction update (no trigger, no cron) |

Schema Changes Required (Not Library Changes)

These are Drizzle schema additions generating migrations:

globalItems additions

// In src/db/schema.ts — globalItems table additions
sourceUrl: text("source_url"),
sourceAttribution: text("source_attribution"),
imageAttributionUrl: text("image_attribution_url"),
imageAttributionText: text("image_attribution_text"),
submittedByUserId: integer("submitted_by_user_id").references(() => users.id),
verifiedAt: timestamp("verified_at"),
ownerCount: integer("owner_count").notNull().default(0),
productUrl: text("product_url"),

Raw SQL migration additions (cannot be expressed in Drizzle schema)

-- Add after Drizzle-generated migration runs:

-- Generated tsvector column for full-text search
ALTER TABLE global_items
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    to_tsvector('english',
      coalesce(brand, '') || ' ' ||
      coalesce(model, '') || ' ' ||
      coalesce(description, '')
    )
  ) STORED;

CREATE INDEX global_items_search_vector_idx ON global_items USING GIN(search_vector);

-- Partial index for public setup discovery feed
CREATE INDEX setups_public_updated_idx ON setups (updated_at DESC) WHERE is_public = true;

-- Trending feed index
CREATE INDEX global_items_owner_count_id_idx ON global_items (owner_count DESC, id DESC);

Note: Drizzle Kit does not generate GENERATED ALWAYS AS ... STORED for tsvector. Add these as a separate raw SQL file appended to the Drizzle migration or as a separate customMigration file in the migrations folder. Run via bun run db:push after the Drizzle migration applies.

setups additions

// In src/db/schema.ts — setups table additions
viewCount: integer("view_count").notNull().default(0),

Alternatives Considered

| Recommended | Alternative | Why Not |
|---|---|---|
| PostgreSQL tsvector + GIN | Meilisearch / Typesense | Separate search service adds infra ops complexity; tsvector covers structured gear catalog search at GearBox scale without additional containers |
| PostgreSQL tsvector + GIN | pg_textsearch (BM25 extension) | Requires installing a PostgreSQL extension in production; BM25 ranking is unnecessary for a catalog of branded products where exact brand/model matches dominate |
| Denormalized ownerCount column | COUNT JOIN per feed request | Feed queries fire on every anonymous page load; a JOIN COUNT becomes a bottleneck before any other part of the stack does |
| Native IntersectionObserver hook | react-infinite-scroll-component | Zero-dependency: a 12-line hook replaces an 8KB library; consistent with GearBox's minimal-external-dependency UI philosophy |
| Manual Cache-Control headers | Hono cache() middleware | Hono cache() is Cloudflare Workers/Deno only; silently does nothing on Bun |
| hono-rate-limiter in-process | Redis-backed rate limiter | Single-instance deployment; Redis adds an infra dependency not justified at current scale |
| Extend existing MCP toolset | Separate seeding CLI script | MCP agents already have auth and structured tool calling; a dedicated create_catalog_item tool is cleaner than a one-off script that bypasses the service layer |
| Service-layer ownerCount update | PostgreSQL trigger | Triggers are invisible to the TypeScript codebase, harder to test, and prone to silent failures in complex transactions |

What NOT to Add

| Avoid | Why | Use Instead |
|---|---|---|
| Elasticsearch / OpenSearch | Separate cluster, ops overhead, overkill for a structured product catalog | PostgreSQL tsvector with GIN index |
| pg_textsearch / VectorChord-BM25 | PostgreSQL extension install required in prod; BM25 precision unnecessary for brand+model search | PostgreSQL native ts_rank |
| Hono cache() middleware | Platform-specific to Cloudflare/Deno; does nothing on Bun | Manual Cache-Control headers in route handlers |
| react-virtual / windowing | Feed is paginated, not a virtual list; items per page (~20) never hit DOM performance limits | Standard DOM list with cursor pagination |
| Prisma | Already using Drizzle ORM; two ORMs in one codebase is a maintenance trap | drizzle-orm (existing) |
| Materialized views for feed caching | drizzle-kit does not fully support materialized view migrations; manual REFRESH logic is brittle | Denormalized score columns + partial indexes |
| Separate seeding framework (Faker, etc.) | Catalog data is real product data, not fake; agent seeding produces real structured records | MCP create_catalog_item tool |

Version Compatibility

| Package | Current Version | v2.1 Notes |
|---|---|---|
| hono | 4.12.x (4.12.12 latest) | etag() built-in available; cache() is NOT compatible with Bun, do not use |
| drizzle-orm | 0.45.x (0.45.2 latest stable) | Cursor pagination confirmed; generated tsvector column requires raw SQL migration appended to drizzle-kit output |
| @tanstack/react-query | 5.90.x | useInfiniteQuery with getNextPageParam fully supports cursor pattern natively |
| hono-rate-limiter | 0.5.3 (latest, published ~16 days ago) | In-process storage adapter works on Bun; actively maintained |
| @modelcontextprotocol/sdk | 1.29.x | Existing MCP tooling is sufficient for adding create_catalog_item |
| zod | 4.3.x | New catalog attribution schemas are straightforward additions to existing schemas.ts |
| @hono/zod-validator | 0.7.x | Already used for all routes; covers new discovery/catalog endpoints |

Installation

# Only one new package for v2.1
bun add hono-rate-limiter

Everything else is schema migrations, new service/route/middleware code, and one new MCP tool — all on the existing stack.


Stack research for: GearBox v2.1 Public Discovery milestone Researched: 2026-04-09