Project Research Summary

Project: GearBox
Domain: Single-user gear management and purchase planning web app
Researched: 2026-03-14
Confidence: HIGH

Executive Summary

GearBox is a single-user personal gear management app with a critical differentiator: purchase planning threads. Every competitor (LighterPack, GearGrams, Packstack, Hikt) is a post-purchase inventory tool — they help you track what you own. GearBox closes the loop by adding a structured pre-purchase research workflow where users compare candidates, track research status, and resolve threads by promoting winners into their collection. This is the entire reason to build the product; the collection management side is table stakes, and the purchase planning threads are the moat. Research strongly recommends building both together in the v1 scope, not sequencing them separately, because the thread resolution workflow only becomes compelling once a real collection exists to reference.

The recommended architecture is a single-process Bun fullstack monolith: Hono for the API layer, React 19 + Vite 8 for the frontend, Drizzle ORM + bun:sqlite for the database, TanStack Router + TanStack Query for client navigation and server state, and Tailwind CSS v4 for styling. This stack is purpose-built for the constraints: Bun is a project requirement, SQLite is optimal for single-user, and every tool in the list has zero or near-zero runtime overhead. Zustand handles the small amount of client-only UI state. The entire stack is type-safe end-to-end through Zod schemas shared between client and server.

The biggest risks are front-loaded in Phase 1: unit handling (weights must be canonicalized to grams from day one), currency precision (prices must be stored as integer cents), category flexibility (must use user-defined tags, not a hardcoded hierarchy), and image storage strategy (relative paths to a local directory, never BLOBs for full-size, never absolute paths). Getting these wrong requires painful data migrations later. The second major risk is the thread state machine in Phase 2 — the combination of candidate status, thread lifecycle, and "move winner to collection" creates a stateful flow that must be modeled as an explicit state machine with transactional resolution, not assembled incrementally.

Key Findings

The stack is a tightly integrated Bun-native toolchain with no redundant tools. Bun serves as runtime, package manager, test runner, and provides built-in SQLite — eliminating entire categories of infrastructure. Vite 8 (Rolldown-based, 5-30x faster than Vite 7) handles the dev server and production frontend builds. The client-server boundary is clean: Hono serves the API, React handles the UI, and Zod schemas in a shared/ directory provide a single source of truth for data shapes on both sides.

ARCHITECTURE.md suggests Bun's fullstack HTML-based routing rather than the Vite dev server proxy pattern that STACK.md describes. Under the Bun approach, each page is a separate HTML entrypoint imported into Bun.serve(), and TanStack Router handles in-page client-side navigation only. This reduces the development setup to a single bun run command with no proxy configuration. Which of the two patterns to follow is still open (see Gaps to Address).

Core technologies:

  • Bun 1.3.x: Runtime, package manager, test runner, bundler — eliminates Node.js and npm
  • React 19.2.x + Vite 8.x: SPA framework + dev server — stable, large ecosystem, HMR out of the box
  • Hono 4.12.x: API layer — Web Standards based, first-class Bun support, ~12kB, faster than Express on Bun
  • SQLite (bun:sqlite) + Drizzle ORM 0.45.x: Database — zero-dependency, built into Bun, type-safe queries and migrations
  • TanStack Router 1.167.x + TanStack Query 5.93.x: Routing + server state — full type-safe routing, automatic cache invalidation
  • Tailwind CSS 4.2.x: Styling — CSS-native config, no JS file, microsecond incremental builds
  • Zustand 5.x: Client UI state — minimal boilerplate for filter state, modals, theme
  • Zod 4.3.x: Schema validation — shared between client and server as single source of truth for types
  • Biome: Linting + formatting — replaces ESLint + Prettier, Rust-based, near-zero config

Version flag: Verify that @hono/zod-validator supports Zod 4.x before starting. If not, pin Zod 3.23.x until the validator is updated.

Expected Features

The feature research distinguishes cleanly between what every gear app does (table stakes) and what GearBox uniquely does (purchase planning threads). No competitor has threads, candidate comparison, or thread resolution. This is the entire competitive surface. Everything else is hygiene.

Must have (table stakes) — v1 launch:

  • Item CRUD with weight, price, category, notes, product URL — minimum unit of value
  • User-defined categories/tags — must be flexible, not a hardcoded hierarchy
  • Weight unit support (g, oz, lb, kg) — gear community requires this; store canonical grams internally
  • Automatic weight/cost totals by category and setup — the reason to use an app over a text file
  • Named setups composed from collection items — compose loadouts, get aggregate totals
  • Planning threads with candidate items — the core differentiator
  • Side-by-side candidate comparison with deltas (not just raw values) — the payoff of threads
  • Thread resolution: pick winner, move to collection — closes the purchase research loop
  • Search and filter on collection — essential at 30+ items
  • Dashboard home page — clean entry point per project constraints

Should have (competitive) — v1.x after validation:

  • Impact preview: how a thread candidate changes a specific setup's weight and cost
  • Status tracking on thread items (researching / ordered / arrived)
  • Priority/ranking within threads
  • Photos per item (one photo per item initially)
  • CSV import/export — migration path from spreadsheets, data portability
  • Weight distribution visualization (pie/bar chart by category)

Defer — v2+:

  • Multi-photo gallery per item
  • Shareable read-only links for setups
  • Drag-and-drop reordering
  • Bulk operations (multi-select, bulk delete)
  • Dark mode
  • Item history/changelog

Architecture Approach

The architecture is a monolithic Bun process with a clear 4-layer structure: API routes (HTTP concerns), service layer (business logic and calculations), Drizzle ORM (type-safe data access), and bun:sqlite (embedded storage). There are no microservices, no Docker, no external database server. The client is a React SPA served as static files by the same Bun process. Internal communication is REST + JSON; no WebSockets needed. The data model has three primary entities — items, threads (with candidates), and setups — connected by explicit foreign keys and a junction table for the many-to-many setup-to-items relationship.

Major components:

  1. Collection (items): Core entity. Source of truth for owned gear. Every other feature references items.
  2. Planning Threads (threads + candidates): Pre-purchase research. Thread lifecycle is a state machine; resolution is transactional.
  3. Setups: Named loadouts composed from collection items. Totals are always computed live from item data, never cached.
  4. Service Layer: Business logic isolated from HTTP concerns. Enables testing without HTTP mocking. Key: calculateSetupTotals(), computeCandidateImpact().
  5. Dashboard: Read-only aggregation. Built last since it reads from all other entities.
  6. Image Storage: Filesystem (./uploads/ or data/images/{item-id}/) with relative paths in DB. Thumbnails on upload.

Build order from ARCHITECTURE.md (follow this):

  1. Database schema (Drizzle) — everything depends on this
  2. Items API (CRUD) — the core entity
  3. Collection UI — first visible feature, validates end-to-end
  4. Threads + candidates API and UI — depends on items for resolution
  5. Setups API and UI — depends on items for composition
  6. Dashboard — aggregates from all entities, build last
  7. Polish: image upload, impact calculations, status tracking

Critical Pitfalls

  1. Unit handling treated as display-only — Store all weights as canonical grams at write time. Accept any unit as input, convert on save. Build a weightToGrams(value, unit) utility on day one. A bare number field with no unit tracking will silently corrupt all aggregates when users paste specs in mixed units.

  2. Rigid category hierarchy — Use user-defined flat tags, not a hardcoded category tree. A categories table with parent_id foreign keys will fail the moment a user tries to track sim racing gear or photography equipment. Tags allow many-to-many, support any hobby, and do not require schema changes to add a new domain.

  3. Thread state machine complexity — Model the thread lifecycle as an explicit state machine before writing any code. Document valid transitions. The "resolve thread" action must be a single atomic transaction: validate winner exists, create collection item, mark thread resolved, update candidate statuses. Without this, impossible states (resolved thread with active candidates, ghost items in collection) accumulate silently.

  4. Setup totals cached in the database — Never store totalWeight or totalCost on a setup record. Always compute from live item data via SUM(). Cached totals go stale the moment any member item is edited, and the bugs are subtle (the UI shows a total that doesn't match the items).

  5. Comparison view that displays data but doesn't aid decisions — The comparison view must show deltas between candidates and against the item being replaced from the collection, not just raw values side by side. Color-code lighter/heavier, cheaper/more expensive. A comparison table with no computed differences is worse than a spreadsheet.
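
The unit-handling pitfall (item 1 above) can be illustrated with a minimal sketch of the weightToGrams(value, unit) utility. The conversion factors are the standard definitions; the WeightUnit type, the GRAMS_PER_UNIT table, the gramsToUnit helper, and the choice to round to whole grams are assumptions made for illustration, not decisions recorded in the research.

```typescript
// Hypothetical unit type covering the four units named in the feature list.
type WeightUnit = "g" | "kg" | "oz" | "lb";

// Standard conversion factors (avoirdupois ounce and pound).
const GRAMS_PER_UNIT: Record<WeightUnit, number> = {
  g: 1,
  kg: 1000,
  oz: 28.349523125,
  lb: 453.59237,
};

// Convert user input in any supported unit to canonical grams at write time.
// Rounding to whole grams is an assumption; the research only says "canonical grams".
function weightToGrams(value: number, unit: WeightUnit): number {
  if (!isFinite(value) || value < 0) {
    throw new RangeError(`invalid weight: ${value}`);
  }
  return Math.round(value * GRAMS_PER_UNIT[unit]);
}

// Convert stored grams back to a display unit. Display only, never persisted.
function gramsToUnit(grams: number, unit: WeightUnit): number {
  return grams / GRAMS_PER_UNIT[unit];
}
```

With this shape, every aggregate sums one integer column, and the user's preferred unit only matters at render time.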

Additional high-priority pitfalls to address per phase:

  • Currency stored as floats (use integer cents always)
  • Image paths stored as absolute paths or as BLOBs for full-size images
  • Thread resolution is destructive (archive threads, don't delete them — users need to reference why they chose X over Y)
  • Item deletion without setup impact warning
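
For the currency item in the list above, a minimal sketch of integer-cent handling follows. The function names parsePriceToCents and formatCents are hypothetical, and the code assumes a two-decimal currency, which the research does not specify.

```typescript
// Parse a user-entered price such as "129.99" into integer cents.
// Assumes a currency with two decimal places (an assumption, not a project decision).
function parsePriceToCents(input: string): number {
  const match = /^\s*(\d+)(?:\.(\d{1,2}))?\s*$/.exec(input);
  if (match === null) {
    throw new Error(`unparseable price: "${input}"`);
  }
  const whole = Number(match[1]);
  const fractionText = match[2] ?? "";
  // "5" after the decimal point means 50 cents, "05" means 5 cents.
  const fraction = fractionText.length === 1 ? Number(fractionText) * 10 : Number(fractionText);
  return whole * 100 + fraction;
}

// Format integer cents for display; no floating-point arithmetic on stored values.
function formatCents(cents: number): string {
  const sign = cents < 0 ? "-" : "";
  const abs = Math.abs(cents);
  const whole = Math.floor(abs / 100);
  const rem = abs % 100;
  return `${sign}${whole}.${rem < 10 ? "0" : ""}${rem}`;
}
```

The motivation is that sums of floats drift (0.1 + 0.2 is not exactly 0.3 in IEEE 754 arithmetic), while sums of integer cents stay exact.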

Implications for Roadmap

Based on the combined research, a 5-phase structure is recommended. Phases 1-3 deliver the v1 MVP; Phases 4-5 deliver the v1.x feature set.

Phase 1: Foundation — Data Model, Infrastructure, Core Item CRUD

Rationale: Everything depends on getting the data model right. Unit handling, currency precision, category flexibility, image storage strategy, and the items schema are all Phase 1 decisions. Getting these wrong requires expensive data migrations. The architecture research explicitly states: "Database schema + Drizzle setup — Everything depends on the data model." The pitfalls research agrees: 6 of 9 pitfalls have "Phase 1" as their prevention phase.

Delivers: Working gear catalog: users can add, edit, delete, and browse their collection. Item CRUD with all core fields. Weight unit conversion. User-defined categories. Image storage conventions settled in the schema (relative paths, thumbnail location), with the photo upload flow itself deferred to Phase 5. SQLite database with WAL mode enabled, automatic backup mechanism, and all schemas finalized.

Features from FEATURES.md: Item CRUD with core fields, user-defined categories, weight unit support (g/oz/lb/kg), notes and product URL fields, search and filter.

Pitfalls to prevent: Unit handling (canonical grams), currency precision (integer cents), category flexibility (user-defined tags, no hierarchy), image storage (relative paths, thumbnails), data loss prevention (WAL mode, auto-backup mechanism).

Research flag: Standard patterns. Schema design for inventory apps is well-documented. No research phase needed.


Phase 2: Planning Threads — The Core Differentiator

Rationale: Threads are why GearBox exists. The feature dependency graph in FEATURES.md shows threads require items to exist (to resolve candidates into the collection), which is why Phase 1 must complete first. The thread state machine is the most complex feature in the product and gets its own phase to ensure the state transitions are modeled correctly before any UI is built.

Delivers: Complete purchase planning workflow — create threads, add candidates with weight/price/notes, compare candidates side-by-side with weight/cost deltas (not just raw values), resolve threads by selecting a winner and moving it to the collection, archive resolved threads.

Features from FEATURES.md: Planning threads, side-by-side candidate comparison (with deltas), thread resolution workflow. Does not include status tracking (researching/ordered/arrived) or priority/ranking — those are v1.x.

Pitfalls to prevent: Thread state machine complexity (model transitions explicitly, transactional resolution), comparison usefulness (show deltas and impact, not just raw data), thread archiving (never destructive resolution).
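
To make "deltas, not just raw values" concrete, here is a minimal sketch of the computation behind a comparison row. The Comparable and ComparisonRow types and the compareCandidates function are hypothetical names; the baseline is assumed to be the owned item being replaced, which is one of the comparisons the pitfalls research asks for.

```typescript
interface Comparable {
  name: string;
  weightGrams: number;
  priceCents: number;
}

interface ComparisonRow extends Comparable {
  weightDeltaGrams: number; // negative means lighter than the baseline
  priceDeltaCents: number;  // negative means cheaper than the baseline
}

// Compare each candidate against a baseline, e.g. the owned item being replaced.
function compareCandidates(baseline: Comparable, candidates: Comparable[]): ComparisonRow[] {
  return candidates.map((c): ComparisonRow => ({
    name: c.name,
    weightGrams: c.weightGrams,
    priceCents: c.priceCents,
    weightDeltaGrams: c.weightGrams - baseline.weightGrams,
    priceDeltaCents: c.priceCents - baseline.priceCents,
  }));
}
```

The sign of each delta is what the UI would color-code: lighter or heavier, cheaper or more expensive.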

Research flag: Needs careful design work before coding. The state machine for thread lifecycle (open -> in-progress -> resolved/cancelled) combined with candidate status (researching / ordered / arrived) and the resolution side-effect (create collection item) has no off-the-shelf reference implementation. Design the state diagram first.
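
As a starting point for that design work, the sketch below models the lifecycle as an explicit transition table plus a pure resolution step. The state names come from the lifecycle above; the specific edges in TRANSITIONS, the CandidateOutcome field, and the resolveThread signature are assumptions for illustration, and the real state diagram may differ.

```typescript
type ThreadStatus = "open" | "in-progress" | "resolved" | "cancelled";

// Assumed transition table: the research names the states but not every edge.
const TRANSITIONS: Record<ThreadStatus, ThreadStatus[]> = {
  "open": ["in-progress", "cancelled"],
  "in-progress": ["resolved", "cancelled"],
  "resolved": [],
  "cancelled": [],
};

function canTransition(from: ThreadStatus, to: ThreadStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Hypothetical per-candidate outcome, separate from researching/ordered/arrived.
type CandidateOutcome = "active" | "winner" | "rejected";

interface Candidate {
  id: number;
  name: string;
  weightGrams: number;
  priceCents: number;
  outcome: CandidateOutcome;
}

interface Thread {
  id: number;
  status: ThreadStatus;
  candidates: Candidate[];
}

interface NewItem {
  name: string;
  weightGrams: number;
  priceCents: number;
}

// Validate everything first, then return the resolved thread and the new
// collection item together. Nothing is mutated, so a failed validation leaves
// no partial state; a caller would persist both results in one DB transaction.
function resolveThread(thread: Thread, winnerId: number): { thread: Thread; newItem: NewItem } {
  if (!canTransition(thread.status, "resolved")) {
    throw new Error(`cannot resolve a thread in status "${thread.status}"`);
  }
  const winner = thread.candidates.filter((c) => c.id === winnerId)[0];
  if (winner === undefined) {
    throw new Error(`candidate ${winnerId} is not part of thread ${thread.id}`);
  }
  const candidates = thread.candidates.map((c): Candidate => ({
    ...c,
    outcome: c.id === winnerId ? "winner" : "rejected",
  }));
  return {
    thread: { ...thread, status: "resolved", candidates },
    newItem: { name: winner.name, weightGrams: winner.weightGrams, priceCents: winner.priceCents },
  };
}
```

Because the thread is marked resolved rather than removed, this shape also fits the archiving requirement: the rejected candidates stay available for later reference.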


Phase 3: Setups — Named Loadouts and Composition

Rationale: Setups require items to exist (Phase 1) and benefit from threads being stable (Phase 2), because thread resolution can affect setup membership: when a winner replaces an owned item, the setups that contain the old item should be easy to update. The many-to-many setup-items relationship and the setup integrity pitfall require careful foreign key design.

Delivers: Named setups composed from collection items. Weight and cost totals computed live (never cached). Base/worn/consumable weight classification per item per setup. Category weight breakdown. Item deletion warns about setup membership. Visual indicator when a setup item is no longer in the collection.

Features from FEATURES.md: Named setups with item selection and totals, setup weight/cost breakdown by category, automatic totals.

Pitfalls to prevent: Setup totals cached in DB (always compute live), setup composition breaks on collection changes (explicit ON DELETE behavior, visual indicators for missing items, no silent CASCADE).

Research flag: Standard patterns for junction table composition. No research phase needed for the setup-items relationship. The weight classification (base/worn/consumable) per setup entry is worth a design session — this is per-setup metadata on the junction, not a property of the item itself.
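
A minimal sketch of live totals with per-setup classification follows. calculateSetupTotals() is named in the architecture research, but its signature here, the SetupEntry and SetupTotals types, and the missingItemIds field are assumptions. In the real app the sums would more likely be a SQL SUM() over the junction table; the in-memory version only illustrates the shape.

```typescript
type WeightClass = "base" | "worn" | "consumable";

interface Item {
  id: number;
  weightGrams: number;
  priceCents: number;
}

// One row of the setup-items junction for a single setup. The classification
// lives here, so the same item can be "base" in one setup and "worn" in another.
interface SetupEntry {
  itemId: number;
  weightClass: WeightClass;
}

interface SetupTotals {
  totalWeightGrams: number;
  totalCostCents: number;
  weightByClass: Record<WeightClass, number>;
  missingItemIds: number[]; // entries whose item is no longer in the collection
}

// Compute totals from current item data on every call; nothing is cached.
function calculateSetupTotals(entries: SetupEntry[], items: Item[]): SetupTotals {
  const byId: { [id: number]: Item | undefined } = {};
  for (const item of items) {
    byId[item.id] = item;
  }
  const totals: SetupTotals = {
    totalWeightGrams: 0,
    totalCostCents: 0,
    weightByClass: { base: 0, worn: 0, consumable: 0 },
    missingItemIds: [],
  };
  for (const entry of entries) {
    const item = byId[entry.itemId];
    if (item === undefined) {
      totals.missingItemIds.push(entry.itemId);
      continue;
    }
    totals.totalWeightGrams += item.weightGrams;
    totals.totalCostCents += item.priceCents;
    totals.weightByClass[entry.weightClass] += item.weightGrams;
  }
  return totals;
}
```

Returning the missing ids alongside the sums gives the UI what it needs for the "no longer in the collection" indicator without a second query.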


Phase 4: Dashboard and Polish

Rationale: The architecture research explicitly states "Dashboard — aggregates stats from all other entities. Build last since it reads from everything." Dashboard requires all prior phases to be stable since it reads from items, threads, and setups simultaneously. This phase also adds the weight visualization chart that requires a full dataset to be meaningful.

Delivers: Dashboard home page with summary cards (item count, active threads, setup count, collection value). Weight distribution visualization (pie/bar chart by category). Dashboard stats endpoint (/api/stats) as a read-only aggregation. General UI polish for the "light, airy, minimalist" aesthetic.

Features from FEATURES.md: Dashboard home page, weight distribution visualization.

Research flag: Standard patterns. Dashboard aggregation is a straightforward read-only endpoint. Charting is well-documented. No research phase needed.


Phase 5: v1.x Enhancements

Rationale: These features add significant value but depend on the core (Phases 1-3) being proven out. Impact preview requires both stable setups and stable threads. CSV import/export validates the data model is clean (if import is buggy, the model has problems). Photos add storage complexity that is easier to handle once the core CRUD flow is solid.

Delivers: Impact preview (how a thread candidate changes a specific setup's weight/cost). Thread item status tracking (researching / ordered / arrived). Priority/ranking within threads. Photos per item (upload, display, cleanup). CSV import/export with unit detection.
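
The impact preview reduces to a small pure function once setup totals are available. computeCandidateImpact() is named in the architecture research; the signature, the Weighed, Totals, and Impact types, and the optional "replaced" parameter are assumptions for illustration.

```typescript
interface Weighed {
  weightGrams: number;
  priceCents: number;
}

interface Totals {
  weightGrams: number;
  costCents: number;
}

interface Impact {
  before: Totals;
  after: Totals;
  weightDeltaGrams: number;
  costDeltaCents: number;
}

// Preview how a candidate would change a setup. If `replaced` is given, the
// candidate swaps out that item; otherwise it is added on top of the setup.
function computeCandidateImpact(current: Totals, candidate: Weighed, replaced?: Weighed): Impact {
  const removedWeight = replaced ? replaced.weightGrams : 0;
  const removedCost = replaced ? replaced.priceCents : 0;
  const weightDeltaGrams = candidate.weightGrams - removedWeight;
  const costDeltaCents = candidate.priceCents - removedCost;
  return {
    before: current,
    after: {
      weightGrams: current.weightGrams + weightDeltaGrams,
      costCents: current.costCents + costDeltaCents,
    },
    weightDeltaGrams,
    costDeltaCents,
  };
}
```

Keeping this free of database access means it can be unit-tested directly, which matches the stated purpose of the service layer.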

Features from FEATURES.md: Impact preview, status tracking, priority/ranking, photos per item, CSV import/export.

Pitfalls to prevent: CSV import missing unit conversion (must detect and convert oz/lb/kg to grams on import). Image uploads without size/type validation. Product URLs not sanitized (validate http/https protocol, render with rel="noopener noreferrer").
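
The product URL check can be sketched with the standard URL class, which is available as a global in Bun, Node, and browsers. The function name sanitizeProductUrl and the decision to return null for rejected input are assumptions.

```typescript
// Return a normalized URL string if it is safe to render as a link, else null.
// Only http and https are accepted, so javascript:, data:, and similar schemes are rejected.
function sanitizeProductUrl(raw: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(raw.trim());
  } catch {
    return null; // not an absolute URL at all
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return null;
  }
  return parsed.href;
}
```

Validation on save does not replace the rendering rule: the link element should still carry rel="noopener noreferrer".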

Research flag: CSV import with unit detection may need a design pass — handling "5 oz", "142g", "0.3 lb" in the same weight column requires a parsing strategy. Worth a short research spike before implementation.
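
One possible parsing strategy for such a column is sketched below. The parseWeightCell name, the accepted unit spellings, the defaultUnit parameter for bare numbers, and rounding to whole grams are all assumptions; the research spike may well choose differently.

```typescript
type WeightUnit = "g" | "kg" | "oz" | "lb";

const GRAMS_PER_UNIT: Record<WeightUnit, number> = {
  g: 1,
  kg: 1000,
  oz: 28.349523125,
  lb: 453.59237,
};

// Parse one weight cell such as "5 oz", "142g", or "0.3 lb" into whole grams.
// Returns null for cells that cannot be parsed, so the importer can flag the
// row for review instead of guessing. Bare numbers fall back to defaultUnit,
// which the column-mapping step would supply.
function parseWeightCell(cell: string, defaultUnit: WeightUnit = "g"): number | null {
  const match = /^\s*(\d+(?:\.\d+)?)\s*(g|kg|oz|lbs?)?\s*$/i.exec(cell);
  if (match === null) {
    return null;
  }
  const value = Number(match[1]);
  const rawUnit = (match[2] ?? defaultUnit).toLowerCase();
  const unit: WeightUnit = rawUnit === "lbs" ? "lb" : (rawUnit as WeightUnit);
  return Math.round(value * GRAMS_PER_UNIT[unit]);
}
```

Returning null rather than throwing lets an import run to completion and report every problem row in one pass.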


Phase Ordering Rationale

  • Data model first: Six of nine pitfalls identified are Phase 1 prevention items. The schema is the hardest thing to change later and the most consequential.
  • Threads before setups: Thread resolution creates collection items; setup composition consumes them. But more importantly, threads are the differentiating feature — proving the thread workflow works is more valuable than setups.
  • Dashboard last: Explicitly recommended by architecture research. Aggregating from incomplete entities produces misleading data and masks bugs.
  • Impact preview in Phase 5: This feature requires both stable setups (Phase 3) and stable threads (Phase 2). Building it before both are solid means rebuilding it when either changes.
  • Photos deferred to Phase 5: The core value proposition is weight/cost tracking and purchase planning, not a photo gallery. Adding photo infrastructure in Phase 1 increases scope without validating the core concept.

Research Flags

Needs design/research before coding:

  • Phase 2 (Thread State Machine): Design the state diagram for the combination of thread lifecycle and candidate status before writing any code. Define all valid transitions and invalid states explicitly. This is the most stateful feature in the product and has no off-the-shelf pattern to follow.
  • Phase 5 (CSV Import): Design the column-mapping and unit-detection strategy before implementation. The spreadsheet-to-app migration workflow is critical for the target audience (users migrating from gear spreadsheets).

Standard patterns — no research phase needed:

  • Phase 1 (Data model + CRUD): Schema design for inventory apps is well-documented. Drizzle + bun:sqlite patterns are covered in official docs.
  • Phase 3 (Setups): Junction table composition is a standard relational pattern. Foreign key behavior for integrity is documented.
  • Phase 4 (Dashboard): Aggregation endpoints and charting are standard. No novel patterns.

Confidence Assessment

| Area | Confidence | Notes |
|---|---|---|
| Stack | HIGH | All technologies verified against official docs. Version compatibility confirmed. One flag: verify @hono/zod-validator supports Zod 4.x before starting. |
| Features | HIGH | Competitor analysis is thorough (LighterPack, GearGrams, Packstack, Hikt all compared). Feature gaps and differentiators are clearly identified. |
| Architecture | HIGH | Bun fullstack monolith pattern is official and well-documented. Service layer and data flow patterns are standard. |
| Pitfalls | HIGH | Pitfalls are domain-specific and well-sourced. SQLite BLOB guidance from official SQLite docs. Comparison UX from NN/g. Unit conversion antipatterns documented. |

Overall confidence: HIGH

Gaps to Address

  • Zod 4 / @hono/zod-validator compatibility: STACK.md flags this explicitly. Verify before starting. If incompatible, pin Zod 3.23.x. This is a quick check, not a blocker.

  • Bun fullstack vs. Vite proxy setup: STACK.md describes the Vite dev server proxy pattern (standard approach), while ARCHITECTURE.md describes Bun's HTML-based routing with Bun.serve() (newer approach). These are two valid patterns. The architecture file's approach (Bun fullstack) is simpler for production deployment. Confirm which pattern to follow before project setup — they require different vite.config.ts and entry point structures.

  • Weight classification (base/worn/consumable) data model: Where does this live? On the setup_items junction table (per-setup classification, same item can be "base" in one setup and "worn" in another) or on the item itself (one classification for all setups)? The per-setup model is more flexible but more complex. Decide in Phase 1 schema design, not Phase 3 when setups are built.

  • Tag vs. single-category field: PITFALLS.md recommends a flat tag system. FEATURES.md implies a single "category" field. The right answer is probably a single optional category field (for broad grouping, e.g., "clothing") plus user-defined tags for fine-grained organization. Confirm the data model in Phase 1.

Sources

Primary (HIGH confidence)

Secondary (MEDIUM confidence)

Tertiary (LOW confidence / needs validation)

  • Zod v4 release notes — @hono/zod-validator compatibility with Zod 4 unconfirmed, verify before use

Research completed: 2026-03-14
Ready for roadmap: yes