feat: MCP server for RAG retrieval + webhook hardening

app/mcp_server.py: FastMCP (mcp SDK), streamable-http on /mcp, static bearer token (constant-time ASGI middleware), fail-fast when RAG_MCP_TOKEN is unset. Tools: rag_search (with semester/fach/typ filters) and get_file_chunks. Runs from the same image as the ingestor and reuses its embed path, so query vectors are guaranteed to be compatible with the ingested ones (the official qdrant MCP server only supports fastembed, which would cause a dimension/schema mismatch).

app/qdrant_store.py: search_chunks (query_points + optional payload filter) and get_chunks_by_path (scroll, sorted by chunk_index).

app/bulk.py: amplification guard; /bulk-import responds with 409 while a previous bulk run is still working through its BackgroundTasks.

docker-compose.coolify.yml: rag-mcp service (not public, attached to the external metamcp-net instead of being coupled to the stack) + Traefik rate-limit middleware on the ingestor.

tests/conftest.py: neutralize the Settings env_file in tests (the dev .env must not contaminate the suite).

68 passed, ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:08:37 +02:00
parent a6a2175f8b
commit 9643011e64
12 changed files with 935 additions and 8 deletions
--- a/app/qdrant_store.py
+++ b/app/qdrant_store.py
@@ -61,3 +61,84 @@ def delete_by_path(client: QdrantClient, name: str, file_path: str) -> None:
        )
    )
    client.delete(collection_name=name, points_selector=selector)
+
+
+_RESULT_FIELDS = (
+    "text",
+    "file_path",
+    "file_name",
+    "semester",
+    "fach",
+    "typ",
+    "page",
+    "chunk_index",
+)
+
+
+def _payload_filter(
+    semester: str | None, fach: str | None, typ: str | None
+) -> qm.Filter | None:
+    """Build a Qdrant filter from optional metadata constraints, or None."""
+    conditions = [
+        qm.FieldCondition(key=key, match=qm.MatchValue(value=value))
+        for key, value in (("semester", semester), ("fach", fach), ("typ", typ))
+        if value
+    ]
+    return qm.Filter(must=conditions) if conditions else None
+
+
+def search_chunks(
+    client: QdrantClient,
+    name: str,
+    vector: list[float],
+    limit: int,
+    semester: str | None = None,
+    fach: str | None = None,
+    typ: str | None = None,
+) -> list[dict[str, Any]]:
+    """Vector search with optional metadata filtering.
+
+    Returns one dict per hit: the indexed payload fields plus the similarity
+    ``score``. Caller must pass a vector embedded with the *same* model used
+    at ingest time, otherwise results are meaningless.
+    """
+    response = client.query_points(
+        collection_name=name,
+        query=vector,
+        limit=limit,
+        query_filter=_payload_filter(semester, fach, typ),
+        with_payload=True,
+    )
+    out: list[dict[str, Any]] = []
+    for point in response.points:
+        payload = point.payload or {}
+        row: dict[str, Any] = {field: payload.get(field) for field in _RESULT_FIELDS}
+        row["score"] = point.score
+        out.append(row)
+    return out
+
+
+def get_chunks_by_path(
+    client: QdrantClient, name: str, file_path: str
+) -> list[dict[str, Any]]:
+    """Return up to 10,000 chunks of one document, ordered by ``chunk_index``."""
+    points, _ = client.scroll(
+        collection_name=name,
+        scroll_filter=qm.Filter(
+            must=[qm.FieldCondition(key="file_path", match=qm.MatchValue(value=file_path))]
+        ),
+        limit=10_000,
+        with_payload=True,
+        with_vectors=False,
+    )
+    rows = [
+        {
+            "chunk_index": p.payload.get("chunk_index"),
+            "page": p.payload.get("page"),
+            "text": p.payload.get("text"),
+        }
+        for p in points
+        if p.payload is not None
+    ]
+    rows.sort(key=lambda r: r["chunk_index"] if r["chunk_index"] is not None else 0)
+    return rows
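
Note on pagination: `get_chunks_by_path` issues a single `scroll` call with `limit=10_000` and discards the next-page offset that `scroll` returns alongside the points, so a document with more chunks than that would be cut off silently. Should that ever matter, one possible approach is to follow the offset until it is `None`. The helper below is a hedged sketch, not part of this commit: `iter_scroll` and `page_size` are hypothetical names, and it only assumes a client whose `scroll` accepts `limit` and `offset` and returns a `(points, next_offset)` pair, as `QdrantClient.scroll` does.

```python
from collections.abc import Iterator
from typing import Any


def iter_scroll(client: Any, *, page_size: int = 256, **scroll_kwargs: Any) -> Iterator[Any]:
    """Yield every point matching a scroll request, one page at a time.

    `scroll_kwargs` is forwarded unchanged, e.g. collection_name,
    scroll_filter, with_payload, with_vectors.
    """
    offset = None  # None asks for the first page
    while True:
        points, offset = client.scroll(limit=page_size, offset=offset, **scroll_kwargs)
        yield from points
        if offset is None:
            # The server reported no further pages.
            return
```

Sorting by `chunk_index` would still have to happen after all pages are collected, because the pages are not returned in chunk order.
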