# Stack Research

**Domain:** Public-first gear discovery platform: catalog enrichment, discovery feed, agent-powered seeding (v2.1)
**Researched:** 2026-04-09
**Confidence:** HIGH (existing stack verified against package.json; additions verified against npm/official docs)

---

## Context: What Already Exists (Do Not Re-Research)

The following are validated and in production at v2.0. This file covers ADDITIONS AND CHANGES only.

| Layer | Current |
|-------|---------|
| Runtime | Bun |
| Frontend | React 19, TanStack Router/Query v5, Tailwind CSS v4, Zustand, Zod 4.x, framer-motion, Recharts, Lucide React |
| Backend | Hono 4.12.x, Drizzle ORM 0.45.x, PostgreSQL (postgres.js 3.4.x driver) |
| Auth | @hono/oidc-auth 1.8.x (Logto), API key auth, MCP OAuth 2.1 |
| Storage | @aws-sdk/client-s3 3.x (MinIO) |
| MCP | @modelcontextprotocol/sdk 1.29.x (19 tools) |
| Rate limiting | Custom in-process Map (auth endpoints only, 5 req/15 min per IP) |

---

## New Capability Areas

### 1. Public Access Auth Model

**What exists:** The `requireAuth` middleware in `src/server/middleware/auth.ts` already handles three auth paths (API key, OAuth Bearer, OIDC session). The skip-list pattern in `src/server/index.ts` already exempts public GETs on `/api/global-items`, `/api/tags`, `/api/users/:id/profile`, and `/api/setups/:id/public`.

**What's needed:** This milestone extends the skip-list to cover new discovery endpoints (`/api/discovery/*`). Additionally, a new `tryAuth` middleware variant is needed for endpoints that work for both anonymous and authenticated users: it resolves `userId` if credentials are present but does NOT return 401 when they are absent. This enables auth-aware responses (e.g., annotating feed items with "in your collection" for logged-in users).

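
The difference between `requireAuth` and the proposed `tryAuth` can be sketched without any framework. This is a minimal illustration, not the GearBox implementation: the `Credentials`, `Resolver`, and `AuthState` types, the example resolvers, and `annotateFeed` are all hypothetical stand-ins, and real resolvers would be async lookups. Treating a malformed credential the same as an absent one is an assumption of the sketch; the text above only specifies the behaviour for absent credentials.

```typescript
// Hypothetical stand-ins for illustration; not Hono's or GearBox's actual types.
type Credentials = { apiKey?: string; bearerToken?: string; sessionId?: string };
// Returns a userId when this credential type is present and valid, else null.
// Real resolvers would be async (DB or token lookups); kept sync for brevity.
type Resolver = (creds: Credentials) => number | null;
type AuthState = { userId: number | null };

// tryAuth: resolve the caller if possible, never reject the request.
function tryAuth(creds: Credentials, resolvers: Resolver[]): AuthState {
  for (const resolve of resolvers) {
    try {
      const userId = resolve(creds);
      if (userId !== null) return { userId };
    } catch {
      // Assumption: a malformed credential is treated like no credential.
    }
  }
  return { userId: null };
}

// requireAuth-style check for comparison: same resolution, but absence is a 401.
function requireAuthStatus(creds: Credentials, resolvers: Resolver[]): 200 | 401 {
  return tryAuth(creds, resolvers).userId === null ? 401 : 200;
}

// Auth-aware annotation: anonymous callers simply get inCollection = false.
function annotateFeed(
  itemIds: number[],
  auth: AuthState,
  collections: Map<number, Set<number>>, // userId -> owned global item ids
): { id: number; inCollection: boolean }[] {
  const owned = auth.userId === null ? undefined : collections.get(auth.userId);
  return itemIds.map((id) => ({ id, inCollection: owned?.has(id) ?? false }));
}

// Example resolvers (hypothetical lookups).
const byApiKey: Resolver = (c) => (c.apiKey === "key-123" ? 42 : null);
const bySession: Resolver = (c) => (c.sessionId === "sess-abc" ? 7 : null);
```
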
**No new dependency.** Pure middleware logic: add `tryAuth` to `auth.ts`, update the skip-list in `index.ts`.

---

### 2. Discovery Feed (Popular Setups, Trending Items)

The feed requires: ranked/scored queries, cursor-based pagination, and cheap repeated reads by anonymous users.

#### Trending Score

Use a hot score computed in PostgreSQL SQL; no external search engine or materialized view is needed at this scale.

```sql
-- Hacker News-style decay: engagement / time^gravity
-- (age in hours, offset by 2, raised to a gravity of 1.8)
SELECT id, brand, model,
  (owner_count::float
    / power((extract(epoch from now()) - extract(epoch from created_at)) / 3600.0 + 2, 1.8)
  ) AS hot_score
FROM global_items
ORDER BY hot_score DESC
LIMIT 20;
```

This requires `ownerCount` to be a real column on `globalItems` rather than a JOIN-time COUNT. The count is currently derived at query time via a join; promote it to a denormalized integer that the collection add/remove service path updates. No trigger is needed: update it in the same database transaction as the collection operation.

**No new dependency.** Schema migration + service-layer update.

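
To sanity-check how the score behaves, the SQL expression can be mirrored in TypeScript. This is only an illustrative sketch: the function name `hotScore` and the sample dates are made up, while the formula (owner count divided by age in hours plus 2, raised to 1.8) is taken from the query above.

```typescript
// Mirrors the SQL expression: owner_count / (age_hours + 2) ^ 1.8
const GRAVITY = 1.8;
const AGE_OFFSET_HOURS = 2;

function hotScore(ownerCount: number, createdAt: Date, now: Date, gravity: number = GRAVITY): number {
  const ageHours = (now.getTime() - createdAt.getTime()) / 3_600_000;
  return ownerCount / Math.pow(ageHours + AGE_OFFSET_HOURS, gravity);
}

const referenceTime = new Date("2026-04-09T12:00:00Z");
// 10 owners, 2 hours old: 10 / 4^1.8, roughly 0.82
const freshScore = hotScore(10, new Date("2026-04-09T10:00:00Z"), referenceTime);
// 40 owners, 24 hours old: 40 / 26^1.8, roughly 0.11
const olderScore = hotScore(40, new Date("2026-04-08T12:00:00Z"), referenceTime);
```

With a gravity of 1.8, a day-old item needs many times the owners of a two-hour-old item to rank above it, which is the intended recency bias.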
#### Cursor-Based Pagination

Drizzle ORM 0.45.x has a documented cursor pagination pattern (two-column keyset). Use `(hotScore DESC, id DESC)` for the trending feed and `(createdAt DESC, id DESC)` for "recently added." Encode the cursor as base64 JSON so that it is opaque to the client.

In a Hono route, the pattern needs only `where`, `orderBy`, and `limit` clauses on the Drizzle query, so no pagination library is needed.

**No new dependency.** Drizzle already supports this natively.

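
The keyset logic can be illustrated with an in-memory array standing in for the database. This is a sketch under stated assumptions: the `Row` and `Cursor` shapes and the `getPage` helper are hypothetical, timestamps are ISO strings in a single format so they compare lexicographically, and the example uses the `(createdAt DESC, id DESC)` key. A cursor over the hot score would presumably also need to pin the reference time, since the score changes with `now()`; that is not covered here.

```typescript
type Cursor = { createdAt: string; id: number }; // ISO timestamp + tiebreaker id
type Row = { id: number; createdAt: string };

// Opaque cursor: base64-encoded JSON (ASCII-only payload, so btoa is safe).
const encodeCursor = (c: Cursor): string => btoa(JSON.stringify(c));
const decodeCursor = (s: string): Cursor => JSON.parse(atob(s)) as Cursor;

// Keyset predicate for ORDER BY created_at DESC, id DESC:
// a row follows the cursor if (createdAt, id) < (cursor.createdAt, cursor.id).
function isAfterCursor(row: Row, c: Cursor): boolean {
  return row.createdAt < c.createdAt || (row.createdAt === c.createdAt && row.id < c.id);
}

// In-memory stand-in for the SQL query (WHERE keyset ORDER BY ... LIMIT n).
function getPage(rows: Row[], limit: number, cursor?: string): { items: Row[]; nextCursor: string | null } {
  const c = cursor ? decodeCursor(cursor) : undefined;
  const sorted = [...rows].sort((a, b) =>
    a.createdAt === b.createdAt ? b.id - a.id : a.createdAt < b.createdAt ? 1 : -1,
  );
  const remaining = c ? sorted.filter((r) => isAfterCursor(r, c)) : sorted;
  const items = remaining.slice(0, limit);
  const last = items[items.length - 1];
  const hasMore = remaining.length > limit;
  const nextCursor = hasMore && last ? encodeCursor({ createdAt: last.createdAt, id: last.id }) : null;
  return { items, nextCursor };
}
```

The `id` tiebreaker is what keeps pages stable when several rows share the same timestamp.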
#### Full-Text Catalog Search

`globalItems` needs fast free-text search across `brand + model + description`. Use PostgreSQL native `tsvector` with a GIN index.

drizzle-kit (with Drizzle 0.45.x) does not generate `GENERATED ALWAYS AS ... STORED` syntax for tsvector columns. Add the `searchVector` column and GIN index via a raw SQL migration file: create it via `drizzle-kit generate`, then manually add the ALTER TABLE and CREATE INDEX statements to the generated file.

For the Hono route, use Drizzle's `sql` template tag with `plainto_tsquery`:

```typescript
// Appended to the existing query chain; `q` is the raw user search string.
.where(sql`search_vector @@ plainto_tsquery('english', ${q})`)
.orderBy(sql`ts_rank(search_vector, plainto_tsquery('english', ${q})) DESC`)
```

**No new dependency.** Schema migration + raw SQL in service layer.

#### Feed Client (TanStack Query + IntersectionObserver)

`useInfiniteQuery` from `@tanstack/react-query` (already at 5.90.x) handles cursor pagination natively via `getNextPageParam`. The scroll trigger uses the browser-native IntersectionObserver API: implement a `useIntersectionObserver(ref, callback)` hook (~12 lines) rather than adding a scroll library. This matches the existing GearBox pattern of minimal third-party UI dependencies.

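
The pure pieces of that client wiring can be sketched on their own, leaving out React and the hook itself. The `FeedPage` response shape and the helper names below are assumptions for illustration; `hasNextPage` and `isFetchingNextPage` refer to the fields that `useInfiniteQuery` returns.

```typescript
// Hypothetical response shape for a discovery feed page.
type FeedPage<T> = { items: T[]; nextCursor: string | null };

// Passed as getNextPageParam: returning undefined tells TanStack Query
// that there is no further page to fetch.
function getNextPageParam<T>(lastPage: FeedPage<T>): string | undefined {
  return lastPage.nextCursor ?? undefined;
}

// useInfiniteQuery exposes data.pages; flatten them into one list for rendering.
function flattenPages<T>(pages: FeedPage<T>[]): T[] {
  return pages.flatMap((p) => p.items);
}

// Guard for the IntersectionObserver callback: fetch only when the sentinel is
// visible, another page exists, and no fetch is already in flight.
function shouldFetchNext(
  isIntersecting: boolean,
  hasNextPage: boolean,
  isFetchingNextPage: boolean,
): boolean {
  return isIntersecting && hasNextPage && !isFetchingNextPage;
}
```
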
**No new dependency.**

---

### 3. Catalog Enrichment Infrastructure

#### Schema Additions to `globalItems`

New fields for attribution, source tracking, and feed ranking:

| Field | Type | Purpose |
|-------|------|---------|
| `sourceUrl` | `text` | Canonical product page (retailer or manufacturer) |
| `sourceAttribution` | `text` | Human-readable credit ("via REI", "via manufacturer") |
| `imageAttributionUrl` | `text` | URL where product image was originally sourced |
| `imageAttributionText` | `text` | License or credit line for the image |
| `submittedByUserId` | `integer FK → users` | Who submitted this catalog entry (null = seeded by admin/agent) |
| `verifiedAt` | `timestamp` | When an admin approved the entry (null = unverified) |
| `ownerCount` | `integer NOT NULL DEFAULT 0` | Denormalized count of collection items referencing this |
| `productUrl` | `text` | Retailer/manufacturer product link (duplicates item-level, but catalog-owned) |

These are Drizzle schema additions. **No new dependency.**

#### Zod Schemas for Enriched Catalog

Add `CreateCatalogItemSchema` in `src/shared/schemas.ts` with attribution fields. Zod 4.3.x handles this natively. The schema feeds the new `POST /api/global-items` route. Currently only the public GET exists; writes will require auth but will be open to non-admins so that users can submit catalog entries.

---

### 4. Agent-Powered Catalog Seeding via MCP

The existing MCP server (`@modelcontextprotocol/sdk` 1.29.x, 19 tools) already provides the infrastructure. The agent workflow:

1. Claude agent receives a category or brand as a prompt
2. Uses a new `create_catalog_item` MCP tool, purpose-built for `globalItems` insertion with full attribution fields
3. Server validates via Zod and inserts into `globalItems` (a new entry starts with `ownerCount` at its default of 0)
4. Agent uses the existing `upload_image_from_url` tool to fetch and store product images

The new tool registers identically to existing tools in `src/server/mcp/index.ts`. For batch seeding sessions, the agent runs N `create_catalog_item` calls in sequence within one MCP session; no parallel execution framework is needed at catalog bootstrap scale.

For standalone seed scripts (`bun run src/db/dev-seed.ts` extensions), use the Drizzle db instance directly. No external seeding framework.

**No new dependency.**

---

### 5. HTTP Caching for Public Endpoints

Public GET endpoints (discovery feed, catalog detail pages) will be hit by anonymous users repeatedly. Add HTTP-level cache hints to reduce repeated work.

- **Catalog item detail pages** (`GET /api/global-items/:id`): Use Hono's built-in `etag()` middleware. The ETag is derived from the response body, so the server returns 304 Not Modified when the item has not changed. The handler still runs to produce the body, so this saves bandwidth rather than the database query.
- **Discovery feed endpoints** (`GET /api/discovery/*`): Set `Cache-Control: public, max-age=60, stale-while-revalidate=300` manually in route handlers. Feed data tolerates 60s staleness, and cached responses avoid the DB round-trip entirely.

**Do NOT use Hono's `cache()` middleware.** It is platform-specific to Cloudflare Workers and Deno, and silently does nothing on Bun. This is a documented limitation. Known issue #4401 in the Hono repo also shows that the `etag()` middleware can generate inconsistent ETags when combined with other middleware, so cover it with integration tests before shipping.

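
The conditional-request flow behind the ETag bullet can be sketched in a few lines. This is an illustration of the HTTP mechanics only, not of Hono's `etag()` internals: the FNV-1a hash is a stand-in chosen to keep the example dependency-free, and `respondWithEtag` and `CachedResponse` are hypothetical names.

```typescript
// FNV-1a over UTF-16 code units: a stand-in hash for this sketch only.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // 32-bit multiply, kept unsigned
  }
  return hash.toString(16).padStart(8, "0");
}

type CachedResponse = {
  status: 200 | 304;
  headers: Record<string, string>;
  body: string | null;
};

// Conditional GET: compare the client's If-None-Match with the body-derived ETag.
function respondWithEtag(body: string, ifNoneMatch?: string): CachedResponse {
  const etag = `"${fnv1a(body)}"`;
  if (ifNoneMatch === etag) {
    return { status: 304, headers: { ETag: etag }, body: null }; // client copy is current
  }
  return { status: 200, headers: { ETag: etag }, body };
}

// Feed responses use a fixed freshness window instead (values from the text above).
const FEED_CACHE_CONTROL = "public, max-age=60, stale-while-revalidate=300";
```
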
**No new dependency.** `etag` is built into Hono 4.12.x.

---

### 6. Rate Limiting for Public Traffic

The existing `rateLimit.ts` in-process Map handles auth endpoints correctly (5 req/15 min per IP). It is inappropriate for public discovery traffic because:

- 5 req/15 min is far too strict for anonymous browsing
- In-process state resets on server restart (tolerable for auth, wrong for general rate limiting)
- No way to differentiate authenticated vs anonymous callers in the current implementation

**Recommendation:** Keep the existing `rateLimit.ts` for auth endpoints only. Add `hono-rate-limiter` for discovery/catalog public endpoints with a permissive anonymous limit (e.g., 100 req/min per IP) and no limit for authenticated callers. The snippet shows only the anonymous limit; exempting authenticated callers still has to be wired in.

```typescript
import { rateLimiter } from "hono-rate-limiter";

// `app` is the existing Hono instance in src/server/index.ts.
const discoveryLimiter = rateLimiter({
  windowMs: 60 * 1000, // 1 minute
  limit: 100, // max requests per window per key
  // Key on the first X-Forwarded-For hop; only meaningful behind a trusted proxy.
  // Requests without the header all share the "unknown" bucket.
  keyGenerator: (c) => c.req.header("x-forwarded-for")?.split(",")[0] ?? "unknown",
});

app.use("/api/discovery/*", discoveryLimiter);
```

The in-process storage adapter (default in `hono-rate-limiter`) is sufficient for single-instance deployment. Its counters also reset on restart, which matters little with a 1-minute window. If the app scales horizontally, swap to `@hono-rate-limiter/redis`, but that is a future decision, not a v2.1 concern.

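
To make the intended policy concrete (a per-IP limit for anonymous callers, none for authenticated ones), here is a small fixed-window counter. It is a sketch of the policy only and is not how `hono-rate-limiter` is implemented; `FixedWindowLimiter`, `Caller`, and the injectable clock are hypothetical names introduced for the example.

```typescript
type Caller = { ip: string; userId: number | null };

// Fixed-window counter keyed by IP; illustrative stand-in only.
class FixedWindowLimiter {
  private hits = new Map<string, { count: number; windowStart: number }>();
  private limit: number;
  private windowMs: number;
  private now: () => number;

  constructor(limit: number, windowMs: number, now: () => number = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now; // injectable clock so the behaviour can be tested
  }

  allow(caller: Caller): boolean {
    if (caller.userId !== null) return true; // authenticated callers are not limited
    const t = this.now();
    const entry = this.hits.get(caller.ip);
    if (!entry || t - entry.windowStart >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.hits.set(caller.ip, { count: 1, windowStart: t });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}

// 100 requests per minute per anonymous IP, matching the recommendation above.
const discoveryPolicy = new FixedWindowLimiter(100, 60 * 1000);
```
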
**New dependency:**

| Library | Version | Purpose |
|---------|---------|---------|
| `hono-rate-limiter` | `^0.5.3` | Per-route rate limiting with configurable windows for public endpoints |

```bash
bun add hono-rate-limiter
```

---

## Full Stack Additions Summary

### New Dependencies (v2.1 only)

| Library | Version | Purpose | Why |
|---------|---------|---------|-----|
| `hono-rate-limiter` | `^0.5.3` | Configurable rate limits for public discovery routes | Existing in-process limiter is auth-only with a 5-req cap; public browse traffic needs separate, permissive limits |

### No New Dependencies Needed For

| Capability | Existing Stack Component Used |
|------------|------------------------------|
| Public auth model (`tryAuth` variant) | Hono middleware, no library |
| Discovery feed cursor pagination | Drizzle 0.45.x keyset queries (documented cursor pagination pattern) |
| Full-text catalog search (tsvector GIN) | PostgreSQL native + Drizzle `sql` template |
| Trending score computation | PostgreSQL SQL expression, no extension |
| Infinite scroll client | TanStack Query `useInfiniteQuery` + native IntersectionObserver |
| Catalog attribution fields | Drizzle schema migration |
| Agent catalog seeding | Existing MCP SDK + new `create_catalog_item` tool |
| HTTP cache headers | Hono built-in `etag()` + manual `Cache-Control` |
| Feed ranking denormalization | Service-layer transaction update (no trigger, no cron) |

---

## Schema Changes Required (Not Library Changes)

These are Drizzle schema additions generating migrations:

### `globalItems` additions

```typescript
// In src/db/schema.ts — globalItems table additions
sourceUrl: text("source_url"),
sourceAttribution: text("source_attribution"),
imageAttributionUrl: text("image_attribution_url"),
imageAttributionText: text("image_attribution_text"),
submittedByUserId: integer("submitted_by_user_id").references(() => users.id),
verifiedAt: timestamp("verified_at"),
ownerCount: integer("owner_count").notNull().default(0),
productUrl: text("product_url"),
```

### Raw SQL migration additions (cannot be expressed in Drizzle schema)

```sql
-- Add after Drizzle-generated migration runs:

-- Generated tsvector column for full-text search
ALTER TABLE global_items
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    to_tsvector('english',
      coalesce(brand, '') || ' ' ||
      coalesce(model, '') || ' ' ||
      coalesce(description, '')
    )
  ) STORED;

CREATE INDEX global_items_search_vector_idx ON global_items USING GIN(search_vector);

-- Partial index for public setup discovery feed
CREATE INDEX setups_public_updated_idx ON setups (updated_at DESC) WHERE is_public = true;

-- Trending feed index
CREATE INDEX global_items_owner_count_id_idx ON global_items (owner_count DESC, id DESC);
```

> **Note:** Drizzle Kit does not generate `GENERATED ALWAYS AS ... STORED` for tsvector. Add these as a separate raw SQL file appended to the Drizzle migration or as a separate `customMigration` file in the migrations folder. Run via `bun run db:push` after the Drizzle migration applies.

### `setups` additions

```typescript
// In src/db/schema.ts — setups table additions
viewCount: integer("view_count").notNull().default(0),
```

---

## Alternatives Considered

| Recommended | Alternative | Why Not |
|-------------|-------------|---------|
| PostgreSQL tsvector + GIN | Meilisearch / Typesense | Separate search service adds infra ops complexity; tsvector covers structured gear catalog search at GearBox scale without additional containers |
| PostgreSQL tsvector + GIN | pg_textsearch (BM25 extension) | Requires installing a PostgreSQL extension in production; BM25 ranking is unnecessary for a catalog of branded products where exact brand/model matches dominate |
| Denormalized `ownerCount` column | COUNT JOIN per feed request | Feed queries fire on every anonymous page load; a JOIN COUNT becomes a bottleneck before any other part of the stack does |
| Native IntersectionObserver hook | react-infinite-scroll-component | Zero-dependency: a 12-line hook replaces an 8KB library; consistent with GearBox's minimal-external-dependency UI philosophy |
| Manual `Cache-Control` headers | Hono `cache()` middleware | Hono `cache()` is Cloudflare Workers/Deno only; it silently does nothing on Bun |
| `hono-rate-limiter` in-process | Redis-backed rate limiter | Single-instance deployment; Redis adds an infra dependency not justified at current scale |
| Extend existing MCP toolset | Separate seeding CLI script | MCP agents already have auth and structured tool calling; a dedicated `create_catalog_item` tool is cleaner than a one-off script that bypasses the service layer |
| Service-layer `ownerCount` update | PostgreSQL trigger | Triggers are invisible to the TypeScript codebase, harder to test, and prone to silent failures in complex transactions |

---

## What NOT to Add

| Avoid | Why | Use Instead |
|-------|-----|-------------|
| Elasticsearch / OpenSearch | Separate cluster, ops overhead, overkill for a structured product catalog | PostgreSQL tsvector with GIN index |
| pg_textsearch / VectorChord-BM25 | PostgreSQL extension install required in prod; BM25 precision unnecessary for brand+model search | PostgreSQL native `ts_rank` |
| Hono `cache()` middleware | Platform-specific to Cloudflare/Deno; does nothing on Bun | Manual `Cache-Control` headers in route handlers |
| react-virtual / windowing | Feed is paginated, not a virtual list; items per page (~20) never hit DOM performance limits | Standard DOM list with cursor pagination |
| Prisma | Already using Drizzle ORM; two ORMs in one codebase is a maintenance trap | drizzle-orm (existing) |
| Materialized views for feed caching | drizzle-kit does not fully support materialized view migrations; manual REFRESH logic is brittle | Denormalized score columns + partial indexes |
| Separate seeding framework (Faker, etc.) | Catalog data is real product data, not fake; agent seeding produces real structured records | MCP `create_catalog_item` tool |

---

## Version Compatibility

| Package | Current Version | v2.1 Notes |
|---------|----------------|------------|
| `hono` | 4.12.x (4.12.12 latest) | `etag()` built-in available; `cache()` is NOT compatible with Bun, do not use |
| `drizzle-orm` | 0.45.x (0.45.2 latest stable) | Cursor pagination confirmed; generated tsvector column requires raw SQL migration appended to drizzle-kit output |
| `@tanstack/react-query` | 5.90.x | `useInfiniteQuery` with `getNextPageParam` fully supports cursor pattern natively |
| `hono-rate-limiter` | 0.5.3 (latest as of 2026-04-09, published roughly 16 days earlier) | In-process storage adapter works on Bun; actively maintained |
| `@modelcontextprotocol/sdk` | 1.29.x | Existing MCP tooling is sufficient for adding `create_catalog_item` |
| `zod` | 4.3.x | New catalog attribution schemas are straightforward additions to existing `schemas.ts` |
| `@hono/zod-validator` | 0.7.x | Already used for all routes; covers new discovery/catalog endpoints |

---

## Installation

```bash
# Only one new package for v2.1
bun add hono-rate-limiter
```

Everything else is schema migrations, new service/route/middleware code, and one new MCP tool, all on the existing stack.

---

## Sources

- [Drizzle ORM cursor-based pagination](https://orm.drizzle.team/docs/guides/cursor-based-pagination): two-column keyset pattern, v0.45.x confirmed (HIGH)
- [Drizzle ORM PostgreSQL full-text search](https://orm.drizzle.team/docs/guides/postgresql-full-text-search): tsvector approach confirmed (HIGH)
- [Drizzle ORM full-text search with generated columns](https://orm.drizzle.team/docs/guides/full-text-search-with-generated-columns): generated column pattern for tsvector (HIGH)
- [Hono ETag middleware](https://hono.dev/docs/middleware/builtin/etag): built-in, no install required (HIGH)
- [Hono Cache middleware](https://hono.dev/docs/middleware/builtin/cache): explicitly listed as Cloudflare/Deno only, not Bun (HIGH)
- [Hono ETag issue #4401](https://github.com/honojs/hono/issues/4401): known inconsistency bug in etag middleware (MEDIUM)
- [hono-rate-limiter GitHub](https://github.com/rhinobase/hono-rate-limiter): v0.5.3, active, Bun compatible (HIGH)
- [hono-rate-limiter npm](https://www.npmjs.com/package/hono-rate-limiter): version 0.5.3 confirmed (HIGH)
- [TanStack Query infinite queries](https://tanstack.com/query/latest/docs/framework/react/guides/infinite-queries): `useInfiniteQuery` cursor pattern (HIGH)
- [Drizzle ORM materialized views issue #2653](https://github.com/drizzle-team/drizzle-orm/issues/2653): confirmed drizzle-kit does not fully support materialized view migrations (MEDIUM)
- [Hono middleware docs](https://hono.dev/docs/guides/middleware): selective auth middleware pattern (HIGH)
- GearBox `package.json`: all existing dependency versions verified directly (HIGH)
- GearBox `src/server/index.ts`: existing skip-list pattern verified directly (HIGH)
- GearBox `src/server/middleware/auth.ts`: existing three-way auth verified directly (HIGH)
- GearBox `src/db/schema.ts`: existing `globalItems` table columns verified directly (HIGH)

---

*Stack research for: GearBox v2.1 Public Discovery milestone*
*Researched: 2026-04-09*