From e9581490de77048c6e5bbc463a225c92e9178955 Mon Sep 17 00:00:00 2001
From: Jean-Luc Makiola
Date: Sun, 5 Apr 2026 12:03:14 +0200
Subject: [PATCH] docs(17): research phase domain

---
 .../phases/17-object-storage/17-RESEARCH.md | 454 ++++++++++++++++++
 1 file changed, 454 insertions(+)
 create mode 100644 .planning/phases/17-object-storage/17-RESEARCH.md

diff --git a/.planning/phases/17-object-storage/17-RESEARCH.md b/.planning/phases/17-object-storage/17-RESEARCH.md
new file mode 100644
index 0000000..dfed4f8
--- /dev/null
+++ b/.planning/phases/17-object-storage/17-RESEARCH.md
@@ -0,0 +1,454 @@
+# Phase 17: Object Storage - Research
+
+**Researched:** 2026-04-04
+**Domain:** S3-compatible object storage (MinIO), AWS SDK v3, image upload/serve refactoring
+**Confidence:** MEDIUM
+
+## Summary
+
+This phase replaces local filesystem image storage (`uploads/` directory) with S3-compatible object storage. The user has decided on MinIO with `@aws-sdk/client-s3` and `@aws-sdk/s3-request-presigner`. However, research uncovered a significant development: **MinIO's GitHub repository was archived on February 13, 2026**, and official Docker images are no longer published to Docker Hub or Quay.io as of October 2025. The last available pre-built Docker image on quay.io is `RELEASE.2025-09-07T16-13-09Z`, which is still pullable and functional for development use.
+
+For a development-only dependency (local Docker Compose), pinning to `quay.io/minio/minio:RELEASE.2025-09-07T16-13-09Z` is pragmatic; the image was verified as still available on the research date. The S3 API is a de facto standard, so a future migration to SeaweedFS, Garage, or AWS S3 itself should require configuration changes only (endpoint, credentials, path-style setting), because `@aws-sdk/client-s3` speaks the same API to all S3-compatible services.
+
+**Primary recommendation:** Proceed with MinIO using the pinned quay.io image for Docker Compose. The storage service abstraction via `@aws-sdk/client-s3` keeps the underlying S3 provider swappable without application code changes. Document the MinIO archival status and alternatives in a code comment.
+
+## User Constraints (from CONTEXT.md)
+
+### Locked Decisions
+- **D-01:** Use `@aws-sdk/client-s3` (AWS SDK v3) for MinIO communication
+- **D-02:** Use `@aws-sdk/s3-request-presigner` for generating presigned URLs
+- **D-03:** Create `src/server/services/storage.service.ts` with functions: `uploadImage(buffer, filename, contentType)`, `deleteImage(filename)`, `getImageUrl(filename)`
+- **D-04:** `getImageUrl()` returns a presigned URL with configurable expiry (default 1 hour)
+- **D-05:** Environment variables: `S3_ENDPOINT`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`, `S3_BUCKET` (default: `gearbox-images`), `S3_REGION` (default: `us-east-1`)
+- **D-06:** `POST /api/images` and `POST /api/images/from-url` upload to MinIO instead of local filesystem
+- **D-07:** `fetchImageFromUrl()` uploads fetched buffer to MinIO instead of writing to disk
+- **D-08:** Remove static file serving for `/uploads/*` from the server
+- **D-09:** API resolves `imageFilename` to a presigned MinIO URL. Add `imageUrl` field to API responses
+- **D-10:** Client components use presigned URL directly
+- **D-11:** Migration script `scripts/migrate-images-to-minio.ts`
+- **D-12:** No filename changes during migration -- existing `imageFilename` values become MinIO object keys
+- **D-13:** MinIO service in docker-compose.yml with automatic bucket creation on startup
+- **D-14:** Dev compose uses fixed credentials. Prod compose uses env vars.
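Decisions D-04 and D-05 fix the environment variable names and their defaults. The following is a minimal sketch of how they might be read and validated once at startup, not the project's actual implementation. `StorageConfig`, `requireVar`, and `loadStorageConfig` are hypothetical names, and `S3_PRESIGN_EXPIRY` is the optional variable proposed elsewhere in this document rather than part of D-05.

```typescript
// Sketch only: `StorageConfig`, `requireVar`, and `loadStorageConfig` are
// hypothetical names. Variable names and defaults follow D-04 and D-05;
// S3_PRESIGN_EXPIRY is an assumed optional variable (seconds).
interface StorageConfig {
  endpoint: string;
  accessKey: string;
  secretKey: string;
  bucket: string;
  region: string;
  presignExpirySeconds: number;
}

type Env = Record<string, string | undefined>;

// Fail fast with a readable message instead of passing `undefined` credentials on.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadStorageConfig(env: Env): StorageConfig {
  const expiry = Number.parseInt(env["S3_PRESIGN_EXPIRY"] ?? "3600", 10);
  if (!Number.isInteger(expiry) || expiry <= 0) {
    throw new Error("S3_PRESIGN_EXPIRY must be a positive integer (seconds)");
  }
  return {
    endpoint: requireVar(env, "S3_ENDPOINT"),
    accessKey: requireVar(env, "S3_ACCESS_KEY"),
    secretKey: requireVar(env, "S3_SECRET_KEY"),
    bucket: env["S3_BUCKET"] ?? "gearbox-images",
    region: env["S3_REGION"] ?? "us-east-1",
    presignExpirySeconds: expiry,
  };
}
```

In the server this would presumably be called as `loadStorageConfig(process.env)`. Taking the environment as a parameter keeps the function testable without mutating global state.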
+
+### Claude's Discretion
+- Presigned URL expiry duration (1h default, configurable)
+- Whether to add a GET /api/images/:filename proxy endpoint as fallback
+- MinIO Docker image version
+- Bucket policy (private with presigned URLs vs public-read)
+- Whether to delete local files after successful migration
+- Error handling strategy for upload failures
+
+### Deferred Ideas (OUT OF SCOPE)
+None
+
+## Phase Requirements
+
+| ID | Description | Research Support |
+|----|-------------|------------------|
+| IMG-01 | Images are stored in MinIO (S3-compatible) instead of local filesystem | Storage service wraps @aws-sdk/client-s3; upload routes refactored to call storage.uploadImage() |
+| IMG-02 | Existing uploaded images are migrated to MinIO | Migration script reads uploads/ dir, uploads each file to MinIO bucket |
+| IMG-03 | Image upload and retrieval work through the new storage layer | Upload endpoints use storage service; API responses include presigned URLs via getImageUrl() |
+| IMG-04 | Docker Compose provides MinIO for local development | MinIO + mc init container in docker-compose.dev.yml with auto bucket creation |
+
+## Standard Stack
+
+### Core
+| Library | Version | Purpose | Why Standard |
+|---------|---------|---------|--------------|
+| @aws-sdk/client-s3 | 3.1024.0 | S3 API operations (PutObject, DeleteObject, GetObject) | Official AWS SDK v3, tree-shakeable, works with any S3-compatible service |
+| @aws-sdk/s3-request-presigner | 3.1024.0 | Generate presigned URLs for direct client access | Official companion package for presigned URL generation |
+
+### Supporting
+| Library | Version | Purpose | When to Use |
+|---------|---------|---------|-------------|
+| minio/minio (Docker) | RELEASE.2025-09-07T16-13-09Z | S3-compatible object storage for dev/prod | Docker Compose only -- not an npm dependency |
+| minio/mc (Docker) | latest | MinIO client CLI for bucket initialization | Init container in Docker Compose |
+
+### Alternatives Considered
+| Instead of | Could Use | Tradeoff |
+|------------|-----------|----------|
+| MinIO (archived) | SeaweedFS | More complex Docker setup (master+volume+filer+s3 = 4 containers vs 1); better long-term viability |
+| MinIO (archived) | Garage | Lightweight, Rust-based; but complex configuration for single-node |
+| Presigned URLs | Proxy endpoint | Proxy adds server load but avoids CORS and presigned URL complexity |
+
+**Installation:**
+```bash
+bun add @aws-sdk/client-s3 @aws-sdk/s3-request-presigner
+```
+
+**Version verification:** Versions confirmed via `npm view` on 2026-04-04. Both packages at 3.1024.0.
+
+## Architecture Patterns
+
+### Recommended Project Structure
+```
+src/server/
+├── services/
+│   ├── storage.service.ts      # NEW: S3 storage abstraction
+│   └── image.service.ts        # MODIFIED: Uses storage service instead of Bun.write
+├── routes/
+│   └── images.ts               # MODIFIED: Uses storage service for uploads
+scripts/
+└── migrate-images-to-minio.ts  # NEW: One-time migration script
+```
+
+### Pattern 1: S3 Client Singleton
+**What:** Create the S3Client once at module level with configuration from env vars. Export functions that use it.
+**When to use:** All storage operations.
+**Example:**
+```typescript
+// src/server/services/storage.service.ts
+import { S3Client, PutObjectCommand, DeleteObjectCommand, GetObjectCommand } from "@aws-sdk/client-s3";
+import { getSignedUrl } from "@aws-sdk/s3-request-presigner";
+
+const s3 = new S3Client({
+  endpoint: process.env.S3_ENDPOINT,
+  region: process.env.S3_REGION ?? "us-east-1",
+  credentials: {
+    accessKeyId: process.env.S3_ACCESS_KEY!,
+    secretAccessKey: process.env.S3_SECRET_KEY!,
+  },
+  forcePathStyle: true, // REQUIRED for MinIO and most S3-compatible services
+});
+
+const bucket = process.env.S3_BUCKET ?? "gearbox-images";
+// Presigned URL lifetime in seconds (default: 1 hour)
+const presignExpiry = parseInt(process.env.S3_PRESIGN_EXPIRY ?? "3600", 10);
+
+export async function uploadImage(
+  buffer: Buffer | ArrayBuffer,
+  filename: string,
+  contentType: string,
+): Promise<void> {
+  await s3.send(new PutObjectCommand({
+    Bucket: bucket,
+    Key: filename,
+    Body: Buffer.from(buffer),
+    ContentType: contentType,
+  }));
+}
+
+export async function deleteImage(filename: string): Promise<void> {
+  await s3.send(new DeleteObjectCommand({
+    Bucket: bucket,
+    Key: filename,
+  }));
+}
+
+export async function getImageUrl(filename: string): Promise<string> {
+  const command = new GetObjectCommand({
+    Bucket: bucket,
+    Key: filename,
+  });
+  // Signing happens locally; no request is sent to the storage service here
+  return getSignedUrl(s3, command, { expiresIn: presignExpiry });
+}
+```
+
+### Pattern 2: Presigned URL Injection in API Responses
+**What:** When returning items/candidates with `imageFilename`, resolve to presigned URL and add `imageUrl` field.
+**When to use:** All GET endpoints that return records with `imageFilename`.
+**Example:**
+```typescript
+// Helper to enrich records with presigned URLs
+async function withImageUrl<T extends { imageFilename: string | null }>(
+  record: T,
+): Promise<T & { imageUrl: string | null }> {
+  return {
+    ...record,
+    imageUrl: record.imageFilename
+      ? await getImageUrl(record.imageFilename)
+      : null,
+  };
+}
+```
+
+### Pattern 3: Docker Compose Init Container for Bucket Creation
+**What:** Use a `minio/mc` container that waits for MinIO, then creates the bucket.
+**When to use:** Docker Compose dev and prod setups.
+**Example:**
+```yaml
+# Both entries belong under the top-level `services:` key.
+minio:
+  image: quay.io/minio/minio:RELEASE.2025-09-07T16-13-09Z
+  command: server /data --console-address ":9001"
+  environment:
+    MINIO_ROOT_USER: ${S3_ACCESS_KEY:-minioadmin}
+    MINIO_ROOT_PASSWORD: ${S3_SECRET_KEY:-minioadmin}
+  ports:
+    - "9000:9000"
+    - "9001:9001"
+  volumes:
+    - minio-data:/data # declare `minio-data` under the top-level `volumes:` key
+  healthcheck:
+    test: ["CMD", "mc", "ready", "local"]
+    interval: 5s
+    timeout: 3s
+    retries: 5
+
+minio-init:
+  image: quay.io/minio/mc:latest
+  depends_on:
+    minio:
+      condition: service_healthy
+  # Dev credentials are hardcoded here and must match MINIO_ROOT_USER/MINIO_ROOT_PASSWORD above.
+  entrypoint: >
+    /bin/sh -c "
+    mc alias set myminio http://minio:9000 minioadmin minioadmin;
+    mc mb --ignore-existing myminio/gearbox-images;
+    exit 0;
+    "
+```
+
+### Anti-Patterns to Avoid
+- **Storing presigned URLs in the database:** URLs expire. Always generate on read.
+- **Not setting `forcePathStyle: true`:** MinIO and most S3-compatible services expect path-style access. Virtual-hosted style will fail against a default MinIO setup.
+- **Using `minio/minio:latest` from Docker Hub:** Images are no longer updated. Pin to a specific quay.io release.
+- **Generating presigned URLs for every item in a list:** Batch operations can be slow. Consider caching or generating on demand.
+
+## Don't Hand-Roll
+
+| Problem | Don't Build | Use Instead | Why |
+|---------|-------------|-------------|-----|
+| S3 request signing | Custom HMAC signing | @aws-sdk/s3-request-presigner | Signature V4 is complex, error-prone |
+| Multipart upload | Custom chunked upload | `Upload` from @aws-sdk/lib-storage (separate package, only needed for large files) | Handles chunking, parallel part uploads, and progress events |
+| Content type detection | Custom magic byte checking | File extension mapping (already in codebase) | Existing validation is sufficient for jpeg/png/webp |
+
+**Key insight:** The entire value of this phase is replacing local filesystem calls (Bun.write, unlink) with S3 SDK calls. The business logic (validation, filename generation, content type checking) stays unchanged.
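The anti-patterns list above suggests caching as one way to keep list endpoints fast. A minimal sketch of that idea follows, assuming an in-memory cache is acceptable for a single server process. `createCachedSigner`, its parameters, and the 300-second safety margin are hypothetical and not part of the locked decisions. The `sign` argument stands in for `getImageUrl`.

```typescript
// Sketch only: `createCachedSigner` is a hypothetical helper, not project code.
// It reuses a presigned URL until it gets close to expiry, then signs again.
type Signer = (filename: string) => Promise<string>;

interface CacheEntry {
  url: string;
  expiresAtMs: number;
}

function createCachedSigner(
  sign: Signer,
  expirySeconds: number,
  marginSeconds = 300, // assumed margin so clients never receive a nearly expired URL
  now: () => number = Date.now,
): Signer {
  // Unbounded map: acceptable for a sketch, but a real cache would need eviction.
  const cache = new Map<string, CacheEntry>();

  return async (filename: string): Promise<string> => {
    const nowMs = now();
    const entry = cache.get(filename);
    if (entry !== undefined && entry.expiresAtMs - marginSeconds * 1000 > nowMs) {
      return entry.url;
    }
    const url = await sign(filename);
    cache.set(filename, { url, expiresAtMs: nowMs + expirySeconds * 1000 });
    return url;
  };
}
```

Because signing is already a local operation, this is only worth adding if measurements show that list endpoints have slowed down. The cache must be keyed by object key and must never outlive the URL expiry.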
+
+## Common Pitfalls
+
+### Pitfall 1: CORS Issues with Presigned URLs
+**What goes wrong:** Browser blocks client-side reads of a MinIO presigned URL due to CORS.
+**Why it happens:** MinIO is a different origin (port 9000) from the app (port 3000). Plain `<img src>` rendering is not subject to CORS; the restriction applies when client code reads the response via `fetch`/XHR or draws it to a canvas.
+**How to avoid:** Configure MinIO CORS policy via environment or mc command. Alternatively, add a proxy endpoint as fallback.
+**Warning signs:** Images display as broken in dev but upload succeeds.
+
+### Pitfall 2: Presigned URL Expiry in Long-Lived Pages
+**What goes wrong:** User opens page, leaves it open > 1 hour, images stop loading.
+**Why it happens:** Presigned URLs expire after the configured duration.
+**How to avoid:** 1-hour default is generous. For extra safety, re-fetch image URLs on focus/visibility change, or use a longer expiry for GET operations.
+**Warning signs:** Intermittent broken images in production.
+
+### Pitfall 3: MinIO Health Check Timing
+**What goes wrong:** App container starts before MinIO is ready, first uploads fail.
+**Why it happens:** By default, Docker Compose `depends_on` only waits for container start, not readiness.
+**How to avoid:** Use health checks with `condition: service_healthy` in Docker Compose.
+**Warning signs:** Startup failures in CI or fresh dev environments.
+
+### Pitfall 4: Missing forcePathStyle Configuration
+**What goes wrong:** SDK tries virtual-hosted style URLs (`bucket.endpoint`) which don't resolve for MinIO.
+**Why it happens:** AWS SDK v3 defaults to virtual-hosted style for AWS S3.
+**How to avoid:** Always set `forcePathStyle: true` in S3Client config for non-AWS S3 services.
+**Warning signs:** DNS resolution errors or "bucket not found" errors.
+
+### Pitfall 5: Performance Impact of Presigned URL Generation
+**What goes wrong:** List endpoints become slow because each item needs a presigned URL.
+**Why it happens:** `getSignedUrl` is a crypto operation per URL.
+**How to avoid:** `getSignedUrl` from @aws-sdk/s3-request-presigner is a local crypto operation (no network call), so it should be fast. For lists of 100+ items, use `Promise.all` rather than awaiting each URL in sequence. If still slow, consider generating URLs lazily on the client.
+**Warning signs:** GET /api/items response time increases noticeably.
+
+## Code Examples
+
+### Current Upload Flow (to be replaced)
+```typescript
+// src/server/routes/images.ts - current
+await mkdir("uploads", { recursive: true });
+await Bun.write(join("uploads", filename), buffer);
+return c.json({ filename }, 201);
+```
+
+### New Upload Flow
+```typescript
+// After refactoring
+import { uploadImage } from "../services/storage.service";
+await uploadImage(buffer, filename, file.type);
+return c.json({ filename }, 201);
+```
+
+### Current Image Deletion (to be replaced)
+```typescript
+// src/server/routes/items.ts - current
+if (deleted.imageFilename) {
+  try {
+    await unlink(join("uploads", deleted.imageFilename));
+  } catch { /* File missing is not an error */ }
+}
+```
+
+### New Image Deletion
+```typescript
+// After refactoring
+if (deleted.imageFilename) {
+  try {
+    await deleteImage(deleted.imageFilename);
+  } catch { /* Object missing is not an error */ }
+}
+```
+
+### Current Client Image Display (to be changed)
+```typescript
+// Multiple components currently use:
+src={`/uploads/${imageFilename}`}
+```
+
+### New Client Image Display
+```typescript
+// Components will use the presigned URL from API response:
+src={imageUrl}
+```
+
+### Files That Reference `/uploads/` (must all be updated)
+
+**Server-side and deployment (6 entries):**
+1. `src/server/services/image.service.ts` -- `Bun.write(join(uploadsDir, filename), buffer)`
+2. `src/server/routes/images.ts` -- `Bun.write(join("uploads", filename), buffer)` + `mkdir("uploads")`
+3. `src/server/routes/items.ts` -- `unlink(join("uploads", deleted.imageFilename))`
+4. `src/server/routes/threads.ts` -- `unlink(join("uploads", filename))` (2 locations)
+5. `src/server/index.ts` -- `app.use("/uploads/*", serveStatic({ root: "./" }))`
+6. `docker-compose.yml` -- `volumes: - uploads:/app/uploads`
+
+**Client-side (6 components):**
+1. `src/client/components/ImageUpload.tsx` -- ``src={`/uploads/${value}`}``
+2. `src/client/components/ItemCard.tsx` -- ``src={`/uploads/${imageFilename}`}``
+3. `src/client/components/CandidateCard.tsx` -- ``src={`/uploads/${imageFilename}`}``
+4. `src/client/components/CandidateListItem.tsx` -- ``src={`/uploads/${candidate.imageFilename}`}``
+5. `src/client/components/ComparisonTable.tsx` -- ``src={`/uploads/${c.imageFilename}`}``
+6. `src/client/routes/setups/$setupId.tsx` -- `imageFilename={item.imageFilename}`
+
+**MCP tools (1 location):**
+1. `src/server/mcp/tools/images.ts` -- calls `fetchImageFromUrl()` which writes to local fs
+
+## Discretion Recommendations
+
+### Presigned URL Expiry
+**Recommendation:** 1 hour default, configurable via `S3_PRESIGN_EXPIRY` env var. 1 hour balances security with usability. For GET-only presigned URLs, there is minimal security risk even with longer expiry.
+
+### Proxy Endpoint Fallback
+**Recommendation:** Do NOT add a proxy endpoint. Presigned URLs are the standard pattern. Adding a proxy creates two code paths to maintain and defeats the purpose of offloading image serving to the storage service. If CORS is an issue in dev, configure MinIO CORS instead.
+
+### MinIO Docker Image Version
+**Recommendation:** Use `quay.io/minio/minio:RELEASE.2025-09-07T16-13-09Z` (last stable release before project archival). Pin explicitly -- do not use `latest` tag. Add a comment noting the archival status and that the S3 API abstraction makes the provider swappable.
+
+### Bucket Policy
+**Recommendation:** Private bucket with presigned URLs. This is the standard secure approach. Public-read would work but is less secure and unnecessary since presigned URL generation is a local operation with negligible overhead.
+
+### Delete Local Files After Migration
+**Recommendation:** Do NOT auto-delete. The migration script should log success per file but leave originals intact. Add a manual cleanup step documented in the script output: "Run `rm -rf uploads/` after verifying all images load correctly from MinIO."
+
+### Error Handling for Upload Failures
+**Recommendation:** Let S3 SDK errors propagate. Wrap in try/catch at the route level and return 500 with a generic error message. Log the full error server-side. No retry logic needed -- uploads are user-initiated and can be retried manually.
+
+## State of the Art
+
+| Old Approach | Current Approach | When Changed | Impact |
+|--------------|------------------|--------------|--------|
+| MinIO Docker Hub images | quay.io pinned release or build from source | Oct 2025 | Must use quay.io registry or alternative S3 provider |
+| MinIO community edition | MinIO archived, AIStor commercial | Feb 2026 | No new features/security patches; S3 API is stable so existing images work |
+| AWS SDK v2 (monolithic) | AWS SDK v3 (modular) | 2021+ | Tree-shakeable, smaller bundles, per-service packages |
+
+**Deprecated/outdated:**
+- MinIO community Docker images on Docker Hub: No longer updated as of Oct 2025
+- MinIO GitHub repository: Archived Feb 2026, read-only
+- AWS SDK v2 (the monolithic `aws-sdk` package): Use the v3 modular packages such as `@aws-sdk/client-s3`
+
+## Open Questions
+
+1. **MinIO CORS Configuration for Dev**
+   - What we know: Presigned URLs from MinIO (port 9000) will be requested by the browser app (port 5173/3000), creating a cross-origin request.
+   - What's unclear: Whether MinIO's default CORS settings allow this, or if explicit configuration is needed.
+   - Recommendation: Test in dev. If CORS blocks requests, set a CORS policy on the MinIO server via mc. Note that `mc anonymous set download myminio/gearbox-images` makes the bucket publicly readable rather than changing CORS, which conflicts with the private-bucket recommendation above. The Vite dev server proxy could also be used as a workaround.
+
+2. **Presigned URL Performance at Scale**
+   - What we know: `getSignedUrl` is a local crypto operation (no network call). For small collections (< 100 items), overhead is negligible.
+   - What's unclear: Performance impact when listing 500+ items with images.
+   - Recommendation: Implement with `Promise.all` for list endpoints. Monitor and optimize only if measurable slowdown occurs.
+
+## Environment Availability
+
+| Dependency | Required By | Available | Version | Fallback |
+|------------|------------|-----------|---------|----------|
+| Docker | MinIO container | Yes | 29.0.0 | -- |
+| Docker Compose | Multi-container setup | Yes | v2.40.3 | -- |
+| Bun | Runtime | Yes | (project runtime) | -- |
+| MinIO (quay.io) | S3 storage | Yes (verified pullable) | RELEASE.2025-09-07T16-13-09Z | SeaweedFS, Garage, or any S3-compatible service |
+
+**Missing dependencies with no fallback:** None
+
+**Missing dependencies with fallback:** None
+
+## Validation Architecture
+
+### Test Framework
+| Property | Value |
+|----------|-------|
+| Framework | Bun test runner |
+| Config file | bunfig.toml (if exists) or default |
+| Quick run command | `bun test tests/services/image.service.test.ts` |
+| Full suite command | `bun test` |
+
+### Phase Requirements to Test Map
+| Req ID | Behavior | Test Type | Automated Command | File Exists? |
+|--------|----------|-----------|-------------------|-------------|
+| IMG-01 | Upload stores in S3 instead of filesystem | unit | `bun test tests/services/storage.service.test.ts` | No -- Wave 0 |
+| IMG-01 | Image routes call storage service | integration | `bun test tests/routes/images.test.ts` | Yes -- needs update |
+| IMG-02 | Migration script uploads all files from uploads/ to MinIO | integration | `bun test tests/scripts/migrate-images.test.ts` | No -- Wave 0 |
+| IMG-03 | getImageUrl returns presigned URL | unit | `bun test tests/services/storage.service.test.ts` | No -- Wave 0 |
+| IMG-03 | API responses include imageUrl field | integration | `bun test tests/routes/items.test.ts` | Yes -- needs update |
+| IMG-04 | Docker Compose MinIO starts and bucket is created | manual | `docker compose -f docker-compose.dev.yml up -d && mc alias set ...` | N/A -- manual |
+
+### Sampling Rate
+- **Per task commit:** `bun test tests/services/storage.service.test.ts tests/routes/images.test.ts`
+- **Per wave merge:** `bun test`
+- **Phase gate:** Full suite green before `/gsd:verify-work`
+
+### Wave 0 Gaps
+- [ ] `tests/services/storage.service.test.ts` -- covers IMG-01, IMG-03 (mock S3Client)
+- [ ] Update `tests/services/image.service.test.ts` -- refactor to use mocked storage service
+- [ ] Update `tests/routes/images.test.ts` -- verify routes call storage service
+
+### Testing Strategy for S3 Operations
+Storage service tests should mock the S3Client. The `aws-sdk-client-mock` library can mock `@aws-sdk/client-s3` clients for unit testing, but for this project's scope, simple mock functions injected via a factory pattern or module-level mocking with `bun:test`'s `mock` are sufficient. Do NOT require a running MinIO instance for unit tests.
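The factory-pattern option mentioned in the testing strategy above could look roughly like the sketch below. It is one possible shape, not the project's design: `ObjectClient`, `createStorageService`, and `createFakeClient` are hypothetical names. A production adapter would implement `ObjectClient` by calling `s3.send(...)` with `PutObjectCommand` and `DeleteObjectCommand`, and the module-level singleton from Pattern 1 could be built on top of it.

```typescript
// Sketch only: all names here are hypothetical. The narrow `ObjectClient`
// port is what lets tests run without MinIO or the AWS SDK.
interface ObjectClient {
  putObject(params: {
    Bucket: string;
    Key: string;
    Body: Uint8Array;
    ContentType: string;
  }): Promise<void>;
  deleteObject(params: { Bucket: string; Key: string }): Promise<void>;
}

function createStorageService(client: ObjectClient, bucket: string) {
  return {
    async uploadImage(data: Uint8Array, filename: string, contentType: string): Promise<void> {
      await client.putObject({ Bucket: bucket, Key: filename, Body: data, ContentType: contentType });
    },
    async deleteImage(filename: string): Promise<void> {
      await client.deleteObject({ Bucket: bucket, Key: filename });
    },
  };
}

// Recording fake: stores every call so a test can assert on bucket and key.
function createFakeClient() {
  const calls: Array<{ op: "put" | "delete"; bucket: string; key: string }> = [];
  const client: ObjectClient = {
    async putObject(p) {
      calls.push({ op: "put", bucket: p.Bucket, key: p.Key });
    },
    async deleteObject(p) {
      calls.push({ op: "delete", bucket: p.Bucket, key: p.Key });
    },
  };
  return { client, calls };
}
```

In a `bun:test` file the same fake would be created inside each `test(...)` and checked with `expect(calls)`, which keeps the unit tests independent of any running storage service.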
+
+## Project Constraints (from CLAUDE.md)
+
+- **Runtime:** Bun (not Node.js)
+- **Server framework:** Hono with Zod validation
+- **Service pattern:** Pure functions, no HTTP awareness -- storage.service.ts follows this (stateless, no db needed)
+- **Path alias:** `@/*` maps to `./src/*`
+- **Formatting:** Biome (tabs, double quotes, organized imports)
+- **Testing:** Bun test runner, service-level and route-level tests
+- **Branching:** Feature branch off Develop, merge back via PR
+- **Releases:** Via Gitea Actions pipeline only
+
+## Sources
+
+### Primary (HIGH confidence)
+- npm registry -- @aws-sdk/client-s3 version 3.1024.0 (verified via `npm view`)
+- npm registry -- @aws-sdk/s3-request-presigner version 3.1024.0 (verified via `npm view`)
+- quay.io -- minio/minio:RELEASE.2025-09-07T16-13-09Z image manifest (verified pullable)
+- Codebase analysis -- all 12+ locations referencing `/uploads/` or `imageFilename` identified
+
+### Secondary (MEDIUM confidence)
+- [AWS Developer Blog - Presigned URLs](https://aws.amazon.com/blogs/developer/generate-presigned-url-modular-aws-sdk-javascript/) -- presigned URL patterns
+- [MinIO Docker Compose bucket creation](https://banach.net.pl/posts/2025/creating-bucket-automatically-on-local-minio-with-docker-compose/) -- mc init container pattern
+- [Alternatives to MinIO for single-node local S3](https://rmoff.net/2026/01/14/alternatives-to-minio-for-single-node-local-s3/) -- post-archival alternatives
+
+### Tertiary (LOW confidence)
+- MinIO CORS configuration requirements -- not verified with current version, needs testing
+- Presigned URL performance at scale -- theoretical, not benchmarked
+
+## Metadata
+
+**Confidence breakdown:**
+- Standard stack: HIGH -- AWS SDK v3 versions verified, well-documented, stable API
+- Architecture: HIGH -- Pattern is straightforward S3 wrapper, codebase touchpoints fully mapped
+- Pitfalls: MEDIUM -- CORS and presigned URL expiry are known issues but specific MinIO behavior with current quay.io image not verified
+- MinIO availability: MEDIUM -- quay.io image verified pullable today, but no future updates expected
+
+**Research date:** 2026-04-04
+**Valid until:** 2026-05-04 (stable -- S3 API unlikely to change; MinIO image is pinned)