GearBox/.planning/RETROSPECTIVE.md
Jean-Luc Makiola 1733fe8cfb chore: archive v2.3 milestone files
Archive setup sharing, currency system, and i18n foundation milestone.
Reorganize ROADMAP.md with v2.3 details block, update PROJECT.md,
MILESTONES.md, STATE.md deferred items, and RETROSPECTIVE.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 17:05:21 +02:00

Project Retrospective

A living document updated after each milestone. Lessons feed forward into future planning.

Milestone: v2.3 — Global & Social Ready

Shipped: 2026-04-19 | Phases: 3 (32-34) | Plans: 18 | Commits: 99

What Was Built

  • Setup visibility system replacing boolean isPublic with private/link/public, share tokens with 128-bit entropy, and visibility-transition side effects
  • ShareModal with Google Docs-style UX — visibility picker, link creation/expiry, revoke, deactivation warning
  • Shared setup viewer with short URL redirect, read-only mode, and three-way data source logic
  • Multi-currency pricing: ECB exchange rates with 24h cache, market_prices and community_prices tables, ownership-validated submissions, median aggregation
  • Market-aware MSRP on catalog detail pages with collapsible "Other Markets" section
  • i18n framework: react-i18next, 7 namespaces, English + German translations, language detection, language picker

What Worked

  • Phased schema approach: do the migration first (32-01), service layer next, UI last — no mid-phase schema surprises
  • Dynamic import to break circular dependency (setup.service.ts → share.service.ts) was clean and discovered quickly
  • ECB exchange rate module-level cache is dead simple and effective for a single-process Bun app
  • Namespace-per-feature for i18n matches the existing file-based routing structure naturally
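
The module-level exchange-rate cache mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual module: `getRates`, the `Rates` shape, and the injected `fetchRates` loader are hypothetical names; only the 24h lifetime comes from the notes above.

```typescript
// Hypothetical shape: rates keyed by ISO currency code.
type Rates = Record<string, number>;

const TTL_MS = 24 * 60 * 60 * 1000; // 24h cache lifetime, as stated above

// Module-level state: it lives exactly as long as the process does,
// which is why this only suits a single-process deployment.
let cached: { rates: Rates; fetchedAt: number } | null = null;

// `fetchRates` is an injected, hypothetical loader (for example a call to the ECB feed).
// `now` is a parameter so the expiry logic can be tested without waiting.
async function getRates(
  fetchRates: () => Promise<Rates>,
  now: number = Date.now(),
): Promise<Rates> {
  if (cached !== null && now - cached.fetchedAt < TTL_MS) {
    return cached.rates; // still fresh, skip the network
  }
  const rates = await fetchRates();
  cached = { rates, fetchedAt: now };
  return rates;
}
```

Because the state is a plain module variable, a second process would hold its own copy and fetch independently; that trade-off is acceptable only under the single-process assumption stated above.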

What Was Inefficient

  • Phase 32 progress table in ROADMAP.md showed 0/4 Planned despite all plans being complete — tracking drift not caught until milestone close
  • Several todos from early in the milestone (April 10) accumulated and weren't cleared before close — 6 deferred items
  • REQUIREMENTS.md was never refreshed for v2.2 or v2.3; requirements were tracked informally in STATE.md decisions

Patterns Established

  • visibility text enum over boolean flags for any future toggle-able states (shareable, public, featured)
  • Shares as a separate table with revocation semantics — reusable pattern for future permission systems
  • Community aggregation floor (3 reports minimum) before surfacing median — prevents single-user stat manipulation
  • i18n namespace per feature domain matches the codebase's existing routing and component organization
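
The community aggregation floor can be illustrated with a small pure function. This is a sketch under assumptions: the function name and the choice to return `null` below the floor are hypothetical, and prices are assumed to be integer cents, in line with the "prices in cents" convention from v1.0.

```typescript
const MIN_REPORTS = 3; // aggregation floor named in the pattern above

// Returns the median price in cents, or null when there are too few reports
// to surface a value at all.
function communityMedian(pricesInCents: number[]): number | null {
  if (pricesInCents.length < MIN_REPORTS) return null;
  const sorted = [...pricesInCents].sort((a, b) => a - b); // numeric sort on a copy
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : Math.round((sorted[mid - 1] + sorted[mid]) / 2); // keep the result in whole cents
}
```

With fewer than three reports the function yields `null`, so a single user cannot move the displayed figure alone.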

Key Lessons

  1. Keep REQUIREMENTS.md current across milestones — informal tracking in STATE.md decisions is not a substitute
  2. Todo triage at milestone close works, but earlier triage (mid-milestone) would reduce the deferred backlog
  3. The shares deactivate/reactivate pattern (not destroy) gives users a better experience at near-zero complexity cost
  4. Language detection: localStorage-first is the right call — user preference must win over browser default
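
Lesson 4 can be sketched as a pure resolution function. The names below (`resolveLanguage`, `SUPPORTED`) are hypothetical and this is not the project's detector configuration; the caller is assumed to pass in the stored picker value and the browser's language list.

```typescript
const SUPPORTED = ["en", "de"] as const; // English + German, per the milestone notes
type Lang = (typeof SUPPORTED)[number];

function isSupported(code: string): code is Lang {
  return (SUPPORTED as readonly string[]).indexOf(code) !== -1;
}

// stored: value previously saved by the language picker (e.g. read from localStorage)
// browserLanguages: e.g. the contents of navigator.languages
function resolveLanguage(
  stored: string | null,
  browserLanguages: readonly string[],
  fallback: Lang = "en",
): Lang {
  if (stored !== null && isSupported(stored)) return stored; // explicit user choice wins
  for (const tag of browserLanguages) {
    const base = tag.toLowerCase().split("-")[0]; // "de-AT" -> "de"
    if (isSupported(base)) return base;
  }
  return fallback;
}
```

Keeping the decision in a pure function makes the precedence (stored preference, then browser, then fallback) easy to unit test without a DOM.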

Cost Observations

  • Model mix: sonnet throughout
  • Sessions: ~18 plan executions across 6 days
  • Notable: Phase 34 (i18n) was the heaviest at 8 plans — string extraction across the full app touches every component

Milestone: v1.0 — MVP

Shipped: 2026-03-15 | Phases: 3 | Plans: 10 | Commits: 53

What Was Built

  • Full gear collection with item CRUD, categories, weight/cost totals, and image uploads
  • Planning threads with candidate comparison and thread resolution into collection
  • Named setups (loadouts) composed from collection items with live totals
  • Dashboard home page with summary cards
  • Onboarding wizard for first-time user experience
  • Service-level and route-level integration tests

What Worked

  • Coarse 3-phase structure kept momentum high — no planning overhead between tiny phases
  • TDD approach for backend (service tests first) caught issues early and made frontend integration smooth
  • Service layer with DI (db as first param) made testing trivial with in-memory SQLite
  • Visual verification checkpoints at end of each phase caught UI issues before moving on
  • Bun + Vite + Hono stack had zero friction — everything worked together cleanly

What Was Inefficient

  • Verification plans (XX-03) were mostly rubber-stamp auto-approvals in yolo mode — could skip for v2
  • Some ROADMAP plan checkboxes never got checked off (cosmetic, didn't affect tracking)
  • Performance metrics in STATE.md had stale placeholder data alongside real data

Patterns Established

  • Service functions: (db, params) => result with production db default
  • Route-level integration tests using Hono context variables for db injection
  • Prices in cents everywhere, display conversion in UI only
  • Tab navigation via URL search params for shareability
  • Atomic sync pattern: delete-all + re-insert in transaction
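
The service-function and prices-in-cents patterns above can be sketched together. Everything here is illustrative: the `Db` interface is a hypothetical in-memory stand-in for the real database handle, and `createItem`, `totalCents`, and `formatCents` are assumed names.

```typescript
type Item = { id: number; name: string; priceCents: number };

// Hypothetical minimal database handle; the real project uses a SQL database.
interface Db {
  items: Item[];
}

const productionDb: Db = { items: [] }; // stand-in for the production connection

// Service functions take the db first and default to the production handle,
// so tests can pass a throwaway instance instead of mocking.
function createItem(
  db: Db = productionDb,
  params: { name: string; priceCents: number },
): Item {
  const item: Item = { id: db.items.length + 1, ...params };
  db.items.push(item);
  return item;
}

function totalCents(db: Db = productionDb): number {
  return db.items.reduce((sum, item) => sum + item.priceCents, 0);
}

// Display conversion happens only at the UI edge; storage and math stay in integer cents.
function formatCents(cents: number): string {
  return (cents / 100).toFixed(2);
}
```

A test would build its own `Db`, call `createItem(testDb, ...)`, and assert on `totalCents(testDb)` without touching the production handle.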

Key Lessons

  1. Coarse granularity (3 phases for an MVP) is the right call for a greenfield app — avoids over-planning
  2. The Vite proxy pattern is required when using the TanStack Router plugin; Bun fullstack serving is not an option
  3. drizzle-kit needs better-sqlite3 even on Bun — can't use bun:sqlite for migrations
  4. Onboarding state belongs in the database (settings table), not in client-side stores

Cost Observations

  • Model mix: quality profile throughout
  • Sessions: ~10 plan executions across 2 days
  • Notable: Most plans completed in 3-5 minutes, total wall time under 1 hour

Milestone: v1.1 — Fixes & Polish

Shipped: 2026-03-15 | Phases: 3 | Plans: 7 | Files changed: 65

What Was Built

  • Fixed threads table and thread creation with categoryId support and modal dialog
  • Overhauled planning tab with educational empty state, pill tabs, and category filter
  • Fixed image display bug (Zod schema missing imageFilename)
  • Redesigned image upload as 4:3 hero preview area with placeholders on all cards
  • Migrated categories from emoji to Lucide icons with 119-icon curated picker
  • Built IconPicker with search, 8 group tabs, and portal popover

What Worked

  • Auto-advance pipeline (discuss → plan → execute) completed both phases end-to-end without manual intervention
  • Wave-based parallel execution in Phase 6 — plans 06-02 and 06-03 ran concurrently with no conflicts
  • Executor auto-fix deviations handled cascading renames gracefully (emoji→icon required touching hooks/routes beyond plan scope)
  • Context discussion upfront captured clear decisions — no ambiguity during execution
  • Verifier caught real issues (Zod schema root cause) and confirmed all must-haves

What Was Inefficient

  • Schema renames cascade through many files (12 in 06-01) — executors had to auto-fix downstream references not in the plan
  • Some ROADMAP.md plan checkboxes remained unchecked despite plans completing (cosmetic tracking drift)
  • Phase 5 executor installed inline SVGs for ImageUpload icons, then Phase 6 added lucide-react anyway — could have coordinated

Patterns Established

  • Portal-based popover pattern: reused from EmojiPicker → IconPicker (click-outside, escape, portal rendering)
  • LucideIcon dynamic lookup component: icons[name] from lucide-react for runtime icon resolution
  • Curated icon data file pattern: static data organized by groups for picker UIs
  • Hero image area: full-width 4:3 preview at top of forms with placeholder/upload/preview states

Key Lessons

  1. Zod validation middleware silently strips unknown fields — always add new schema fields to Zod schemas, not just DB schema
  2. Auto-fix deviations are a feature, not a bug — executors that fix cascading renames save manual replanning
  3. Auto-advance pipeline works well for straightforward phases — interactive discussion ensures decisions are clear before autonomous execution
  4. Parallel Wave 2 execution with no file overlap is safe and efficient

Cost Observations

  • Model mix: opus for execution, sonnet for verification/checking
  • Sessions: 1 continuous auto-advance pipeline for both phases
  • Notable: Full milestone (discuss + plan + execute × 2 phases) completed in a single session

Milestone: v1.2 — Collection Power-Ups

Shipped: 2026-03-16 | Phases: 3 | Plans: 6 | Files changed: 66

What Was Built

  • Weight unit conversion (g/oz/lb/kg) with segmented toggle wired across all weight display call sites
  • Candidate status tracking (researching/ordered/arrived) with clickable StatusBadge popup
  • Sticky search/filter toolbar with text search and icon-aware CategoryFilterDropdown
  • Per-setup item classification (base/worn/consumable) with click-to-cycle ClassificationBadge
  • Recharts donut chart with category/classification toggle and hover tooltips
  • Classification-preserving sync that maintains metadata across atomic setup item re-sync
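
The weight unit conversion can be sketched as a single formatting helper. This assumes weights are stored in grams and converted only for display, which the notes above do not state explicitly; `formatWeight` and the default precision are hypothetical.

```typescript
type WeightUnit = "g" | "oz" | "lb" | "kg";

// Exact definitions: 1 lb = 453.59237 g, 1 oz = 1/16 lb.
const GRAMS_PER_UNIT: Record<WeightUnit, number> = {
  g: 1,
  oz: 28.349523125,
  lb: 453.59237,
  kg: 1000,
};

// Assumption: the stored value is always grams; only the display changes.
function formatWeight(
  grams: number,
  unit: WeightUnit,
  digits: number = unit === "g" ? 0 : 2,
): string {
  return `${(grams / GRAMS_PER_UNIT[unit]).toFixed(digits)} ${unit}`;
}
```

Routing every weight display through one helper like this is what makes a global unit toggle cheap to wire across call sites.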

What Worked

  • Coarse 3-phase structure again — 19 requirements compressed into 3 phases with clear dependency ordering
  • TDD red/green commits for schema migrations (status, classification) caught edge cases early
  • Vertical slice pattern (schema → service → tests → API → UI in one plan) kept each deliverable self-contained
  • Click-outside dismiss pattern established in v1.1 was reused cleanly in StatusBadge and CategoryFilterDropdown
  • All 6 plans executed with zero deviations from plan — evidence of mature planning process

What Was Inefficient

  • Some ROADMAP.md plan checkboxes remained unchecked despite summaries existing (persistent cosmetic drift)
  • Recharts v3 marks the Cell component as deprecated ahead of v4, so a migration will be needed eventually
  • Phase 8 bundled search/filter with candidate status (different concerns) — could have been separate phases for cleaner scope

Patterns Established

  • Click-to-cycle badge: for small enums (3 values), direct click cycling is simpler than popup menus
  • Join table metadata preservation: save metadata to Map before atomic sync, restore after re-insert
  • CategoryFilterDropdown: reusable filter dropdown (separate from form-based CategoryPicker)
  • Chart data transformation: group items by key, sum weights, compute percentages, filter zeroes
  • apiPatch helper: PATCH method now available in client API library for partial updates
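
The join table metadata preservation pattern can be sketched on plain arrays. This is a simplified model: the real sync presumably runs as delete-all plus re-insert inside a database transaction, and the `"base"` default for newly added items is an assumption.

```typescript
type Classification = "base" | "worn" | "consumable";
type SetupItemRow = { itemId: number; classification: Classification };

// Models "delete-all + re-insert" on in-memory rows.
function syncSetupItems(
  existing: SetupItemRow[],
  nextItemIds: number[],
): SetupItemRow[] {
  // 1. Save metadata to a Map before the rows are wiped.
  const saved = new Map<number, Classification>();
  for (const row of existing) saved.set(row.itemId, row.classification);

  // 2. Rebuild the rows from the new id list, restoring saved metadata.
  //    Items that were not in the setup before get the assumed default.
  return nextItemIds.map((itemId) => ({
    itemId,
    classification: saved.get(itemId) ?? "base",
  }));
}
```

Items dropped from the id list simply lose their row, while surviving items keep their classification across the re-sync.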

Key Lessons

  1. Classification belongs on join tables (setupItems), not entity tables (items) — same item has different roles in different contexts
  2. Vertical slice delivery (schema → service → test → API → UI) is the optimal plan structure for feature additions
  3. Search complexity should match data scale — no debounce needed for <1000 items
  4. Recharts composable API (PieChart + Pie + Cell + Tooltip + Label) gives fine-grained chart control with minimal wrapper code

Cost Observations

  • Model mix: quality profile throughout (opus for execution)
  • Sessions: 3 continuous auto-advance sessions (one per phase)
  • Notable: All plans completed with zero deviations, execution faster than v1.0/v1.1

Milestone: v1.3 — Research & Decision Tools

Shipped: 2026-04-08 | Phases: 4 | Plans: 6 | Files changed: 52 (+3,106 / -158)

What Was Built

  • Pros/cons text annotation on candidates with visual indicator badges
  • Candidate ranking with sortOrder REAL column, drag-to-reorder via Reorder.Group, and gold/silver/bronze badges
  • Side-by-side comparison table with sticky attribute labels, weight/price delta highlighting, and winner marking
  • Setup impact preview with per-candidate weight/cost deltas, replacement detection, and "no weight data" indicator

What Worked

  • TDD for impact delta computation (Phase 13) — pure function tested in isolation before any UI work
  • Vertical slice pattern continued from v1.2 — each plan delivered end-to-end from schema to UI
  • framer-motion Reorder.Group provided drag-to-reorder with minimal code vs building from scratch
  • candidateViewMode pattern in UIStore cleanly separates grid/list/compare views without route complexity

What Was Inefficient

  • Phase 13 had a 3-week gap between research (2026-03-17) and execution (2026-04-08) — v2.0 work interleaved
  • Comparison table required careful horizontal scroll CSS that took iteration to get right
  • The 11-02 summary extraction failed (garbled output) — plan summaries should always have clean one-liners

Patterns Established

  • candidateViewMode (grid/list/compare): UIStore enum for toggling candidate presentation
  • Impact delta computation as pure function: computeImpactDeltas(candidates, setup) — no side effects
  • SetupImpactSelector: dropdown component for setup selection in thread context
  • ImpactDeltaBadge: reusable delta display component with replace/add/no-data states
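
A pure impact-delta function in the spirit of `computeImpactDeltas(candidates, setup)` might look like the sketch below. The type shapes, the category-match rule for replacement, and the use of `null` for missing data are assumptions made for illustration; the project's actual signature and fields are not shown in this document.

```typescript
type Candidate = {
  id: number;
  categoryId: number;
  weightGrams: number | null;
  priceCents: number | null;
};
type SetupEntry = {
  categoryId: number;
  weightGrams: number | null;
  priceCents: number | null;
};
type ImpactDelta = {
  candidateId: number;
  mode: "replace" | "add";
  weightDelta: number | null; // null means "no weight data"
  costDelta: number | null;
};

// A delta is only meaningful when both sides are known.
const diff = (next: number | null, prev: number | null): number | null =>
  next === null || prev === null ? null : next - prev;

// Pure function: no DB, no HTTP, output depends only on the arguments.
function computeImpactDeltas(
  candidates: Candidate[],
  setup: SetupEntry[],
): ImpactDelta[] {
  return candidates.map((c): ImpactDelta => {
    // Assumed rule: a candidate replaces the first setup item in the same category.
    const replaced = setup.find((s) => s.categoryId === c.categoryId);
    return {
      candidateId: c.id,
      mode: replaced ? "replace" : "add",
      weightDelta: diff(c.weightGrams, replaced ? replaced.weightGrams : 0),
      costDelta: diff(c.priceCents, replaced ? replaced.priceCents : 0),
    };
  });
}
```

Because the function has no side effects, its tests need nothing beyond literal input arrays.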

Key Lessons

  1. Pure computation functions (no DB, no HTTP) are the fastest to TDD and most reliable to maintain
  2. Drag-to-reorder needs REAL (float) sort_order — integer ranks break on insert between existing items
  3. Comparison tables need both horizontal scroll and fixed first column — mobile-first means testing narrow viewports early
  4. Setup impact preview is most useful when it detects category-match replacement, not just addition
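
Lesson 2 can be made concrete with a midpoint helper. This is a minimal sketch with a hypothetical name; the edge-case values (`1` for an empty list, plus or minus `1` at the ends) are arbitrary choices, not the project's.

```typescript
// Returns a sort order strictly between two neighbours.
// null means "no neighbour on that side" (start or end of the list).
function sortOrderBetween(prev: number | null, next: number | null): number {
  if (prev !== null && next !== null) return (prev + next) / 2; // drop between two items
  if (prev !== null) return prev + 1; // append after the last item
  if (next !== null) return next - 1; // prepend before the first item
  return 1; // first item in an empty list
}
```

With integer ranks, inserting between 1 and 2 would force renumbering every later row; with a float column the moved row takes 1.5 and nothing else changes. Repeated halving of the same gap eventually exhausts floating-point precision, so a long-lived list may still need an occasional renumbering pass.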

Cost Observations

  • Model mix: quality profile for execution
  • Sessions: Split across v2.0 work — phases 10-12 in one burst, phase 13 after v2.0 infrastructure
  • Notable: Smallest milestone (4 phases, 6 plans) but high user value per plan

Milestone: v2.0 — Platform Foundation

Shipped: 2026-04-08 | Phases: 10 | Plans: 32 | Files changed: 210 (+47,370 / -2,244)

What Was Built

  • Full PostgreSQL migration: 13 pgTable definitions, async services, PGlite test infrastructure, Docker Compose
  • External OIDC auth via Logto: three-way middleware (browser sessions, API keys, MCP OAuth)
  • Multi-user data model: userId FK on 6 entity tables, cross-user isolation, composite constraints
  • S3 object storage via MinIO: upload/delete/presigned URL abstraction, image migration script
  • Global item catalog: search, owner count, tags, 18-item bikepacking seed
  • User profiles with public setup sharing and visibility toggle
  • Reference item model with COALESCE merge pattern
  • Full catalog-driven gear flow: FAB, search overlay, add-to-collection/thread modals, manual fallback
  • Item and catalog detail pages replacing all slide-out panels

What Worked

  • Infrastructure phases (14-17) done in one concentrated push — no mixing infra with features
  • COALESCE merge pattern allowed reference items to inherit global data without duplication
  • Three-way auth middleware cleanly separated browser, API key, and MCP OAuth concerns
  • PGlite for tests eliminated external Postgres dependency while keeping real SQL execution
  • Catalog-first add flow with modal confirmation provided good UX without losing flexibility
  • Phase-per-concern kept scope manageable despite 10 phases

What Was Inefficient

  • SQLite to Postgres migration touched every service, route, and test file — massive blast radius
  • E2E tests broke and had to be disabled (backlog 999.1) — OIDC auth incompatible with test auth flow
  • Some phases (14, 18) had many plans (5-6) — could have been split into smaller milestones
  • Auth middleware complexity (OIDC + API keys + OAuth) required multiple fix commits post-merge
  • Phase 18 plan count (5) was at the upper limit — more granular phases would have been cleaner

Patterns Established

  • PGlite test infrastructure: createTestDb() returns async in-memory Postgres
  • Three-way auth: OIDC cookie → API key header → OAuth bearer, resolved to userId
  • COALESCE merge: COALESCE(items.field, globalItems.field) for transparent reference data
  • Global FAB pattern: floating action button with animated mini menu on all authenticated routes
  • Catalog search overlay: full-screen modal with debounced search, tag chip AND-filtering
  • AddToCollectionModal / AddToThreadModal: confirmation step with category picker + personal fields
  • Detail page pattern: /items/:id and /global-items/:id replacing slide-out panels
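
The semantics of the COALESCE merge can be mirrored in application code with the nullish coalescing operator. The sketch below is illustrative only: the field names and the `mergeReferenceItem` helper are hypothetical, and in the project the merge is expressed in SQL as `COALESCE(items.field, globalItems.field)`.

```typescript
type CatalogItem = { name: string; weightGrams: number | null; brand: string | null };
// A reference item stores only the user's overrides; null means "inherit from the catalog".
type ItemOverrides = { name: string | null; weightGrams: number | null; brand: string | null };

function mergeReferenceItem(item: ItemOverrides, catalog: CatalogItem): CatalogItem {
  return {
    // Like COALESCE, `??` takes the first non-null value, so an explicit 0 is kept.
    name: item.name ?? catalog.name,
    weightGrams: item.weightGrams ?? catalog.weightGrams,
    brand: item.brand ?? catalog.brand,
  };
}
```

The important property is that only `null` falls through to the catalog value; a deliberate override of `0` or an empty string is preserved, which `||` would not do.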

Key Lessons

  1. Database migration milestones should be their own release — touching every file means high risk of regressions
  2. PGlite is excellent for test infrastructure — real SQL without external dependencies
  3. Auth should be designed for testability from day one — bolting on OIDC broke the E2E test model
  4. COALESCE merge for reference data is elegant but requires careful propagation to all read paths
  5. Catalog-first flow works when the catalog is pre-seeded — empty catalog defeats the purpose
  6. Slide-out panels don't scale — detail pages with edit mode toggle are better for complex data
  7. Three-way auth middleware is maintainable when each method resolves to the same userId shape
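
Lesson 7 can be sketched as an ordered chain of resolvers that all return the same `userId` shape. The request type, the cookie and header names, and the in-memory credential stores below are hypothetical; only the order (session cookie, then API key, then OAuth bearer) comes from the pattern list above.

```typescript
// Hypothetical request shape and credential stores, for illustration only.
type RequestLike = {
  cookies: Record<string, string>;
  headers: Record<string, string>;
};
type Resolver = (req: RequestLike) => string | null; // every method yields a userId or null

const sessions = new Map<string, string>([["sess-1", "user-a"]]);
const apiKeys = new Map<string, string>([["key-1", "user-b"]]);
const oauthTokens = new Map<string, string>([["tok-1", "user-c"]]);

const find = (store: Map<string, string>, key: string | undefined): string | null =>
  key === undefined ? null : (store.get(key) ?? null);

const fromSessionCookie: Resolver = (req) => find(sessions, req.cookies["session"]);
const fromApiKey: Resolver = (req) => find(apiKeys, req.headers["x-api-key"]);
const fromBearer: Resolver = (req) => {
  const header = req.headers["authorization"];
  const match = header === undefined ? null : /^Bearer (.+)$/.exec(header);
  return find(oauthTokens, match ? match[1] : undefined);
};

const RESOLVERS: Resolver[] = [fromSessionCookie, fromApiKey, fromBearer];

// The first resolver that recognises the request wins; downstream code sees only a userId.
function resolveUserId(req: RequestLike, resolvers: Resolver[] = RESOLVERS): string | null {
  for (const resolve of resolvers) {
    const userId = resolve(req);
    if (userId !== null) return userId;
  }
  return null;
}
```

Since each resolver has the same signature, adding or testing an auth method in isolation does not touch the others.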

Cost Observations

  • Model mix: quality profile throughout
  • Sessions: ~15 execution sessions across 22 days
  • Notable: Largest milestone by far (32 plans, 210 files) — v2.0 was effectively a rewrite of the backend

Process Evolution

| Milestone | Commits | Phases | Key Change |
|-----------|---------|--------|------------|
| v1.0 | 53 | 3 | Initial build, coarse granularity, TDD backend |
| v1.1 | ~30 | 3 | Auto-advance pipeline, parallel wave execution, auto-fix deviations |
| v1.2 | 25 | 3 | Zero-deviation execution, vertical slice pattern, join table metadata |
| v1.3 | ~15 | 4 | Pure function TDD, interleaved with v2.0, drag-to-reorder |
| v2.0 | ~350 | 10 | Full platform rewrite, Postgres + OIDC + multi-user + catalog |

Cumulative Quality

| Milestone | LOC | Files | Tests |
|-----------|-----|-------|-------|
| v1.0 | 5,742 | 114 | Service + route integration |
| v1.1 | 6,134 | ~130 | Service + route integration (updated for icon schema) |
| v1.2 | 7,310 | ~150 | 121 tests (service + route + classification) |
| v1.3 | ~8,300 | ~160 | +impact delta tests |
| v2.0 | 23,970 | 210+ | 161+ tests (PGlite, multi-user isolation, MCP) |

Top Lessons (Verified Across Milestones)

  1. Coarse phases with TDD backend → smooth frontend integration
  2. Service DI pattern enables fast, reliable testing without mocks
  3. Always update Zod schemas alongside DB schema — middleware silently strips unvalidated fields
  4. Auto-advance pipeline (discuss → plan → execute) works well for clear-scope phases
  5. Vertical slice delivery (schema → service → test → API → UI) is optimal for feature additions
  6. Join table metadata (not entity table) when same entity plays different roles in different contexts
  7. Database migrations are high-risk — isolate them from feature work
  8. Auth testability must be designed upfront — retrofitting breaks E2E tests
  9. COALESCE merge is powerful for reference data but must be propagated to all read paths
  10. Catalog-first flows need pre-seeded data to provide value on day one