feat: MCP server for RAG retrieval + webhook hardening

app/mcp_server.py: FastMCP (mcp SDK), streamable-http on /mcp, static
Bearer token (constant-time ASGI middleware), fail-fast when RAG_MCP_TOKEN is
unset. Tools: rag_search (with semester/fach/typ filters) + get_file_chunks.
Runs from the same image as the ingestor and reuses its embed path, so query
vectors are guaranteed to be compatible with the ingested ones (the official
qdrant MCP server only supports fastembed → dimension/schema mismatch).
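
A minimal sketch of the token check described above, using only the standard library. The names `require_token` and `BearerAuthMiddleware` are hypothetical and not taken from app/mcp_server.py; the sketch assumes a plain ASGI app is being wrapped and that the token arrives in an `Authorization: Bearer …` header.

```python
import hmac
import os


def require_token(env_var: str = "RAG_MCP_TOKEN") -> str:
    """Fail fast: refuse to start when the token is not configured."""
    token = os.environ.get(env_var, "")
    if not token:
        raise RuntimeError(f"{env_var} must be set")
    return token


class BearerAuthMiddleware:
    """Pure ASGI middleware (hypothetical name) that rejects requests without the token."""

    def __init__(self, app, token: str) -> None:
        self.app = app
        self._expected = f"Bearer {token}".encode()

    async def __call__(self, scope, receive, send):
        # Only HTTP requests carry headers; pass lifespan etc. through untouched.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers", []))
        supplied = headers.get(b"authorization", b"")
        # compare_digest does not short-circuit on the first differing byte,
        # which avoids leaking the token through response timing.
        if not hmac.compare_digest(supplied, self._expected):
            await send({
                "type": "http.response.start",
                "status": 401,
                "headers": [(b"www-authenticate", b"Bearer")],
            })
            await send({"type": "http.response.body", "body": b"unauthorized"})
            return
        await self.app(scope, receive, send)
```

Calling `require_token()` at import or startup time is one way to get the fail-fast behavior: the process exits before it ever serves an unauthenticated request.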

app/qdrant_store.py: search_chunks (query_points + optional payload filter)
and get_chunks_by_path (scroll, sorted by chunk_index).

app/bulk.py: amplification guard. /bulk-import responds with 409 while a
previous bulk run is still working through its BackgroundTasks.
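
The guard can be sketched roughly as follows. This is a framework-free illustration, not the code in app/bulk.py: `BulkGuard` and `handle_bulk_import` are hypothetical names, and the 202 status for an accepted run is an assumption (only the 409 comes from this commit).

```python
import threading


class BulkGuard:
    """Tracks how many tasks of the current bulk run are still pending (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending = 0

    def try_begin(self, task_count: int) -> bool:
        """Reserve the guard for a new run; False while an earlier run is unfinished."""
        with self._lock:
            if self._pending > 0:
                return False
            self._pending = task_count
            return True

    def task_done(self) -> None:
        """Called by each background task when it finishes, successfully or not."""
        with self._lock:
            self._pending = max(0, self._pending - 1)


def handle_bulk_import(guard: BulkGuard, paths: list[str]) -> tuple[int, dict]:
    """Return (status, body); 409 while a previous bulk run is still in flight."""
    if not guard.try_begin(len(paths)):
        return 409, {"detail": "previous bulk import still running"}
    # Assumption: the real endpoint would enqueue one background task per path here.
    return 202, {"queued": len(paths)}
```

Rejecting instead of queueing keeps a repeated webhook call from multiplying the work: at most one bulk run's worth of tasks is ever outstanding.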

docker-compose.coolify.yml: rag-mcp service (not public, external
metamcp-net network instead of stack coupling) + Traefik rate-limit
middleware on the ingestor.

tests/conftest.py: neutralize the Settings env_file in tests (a dev .env must
not contaminate the suite). 68 passed, ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:08:37 +02:00
parent a6a2175f8b
commit 9643011e64
12 changed files with 935 additions and 8 deletions


@@ -61,3 +61,84 @@ def delete_by_path(client: QdrantClient, name: str, file_path: str) -> None:
        )
    )
    client.delete(collection_name=name, points_selector=selector)


# Payload fields copied into every search result row.
_RESULT_FIELDS = (
    "text",
    "file_path",
    "file_name",
    "semester",
    "fach",
    "typ",
    "page",
    "chunk_index",
)


def _payload_filter(
    semester: str | None, fach: str | None, typ: str | None
) -> qm.Filter | None:
    """Build a Qdrant filter from optional metadata constraints, or None."""
    # Empty strings are treated like None: no condition is added for them.
    conditions = [
        qm.FieldCondition(key=key, match=qm.MatchValue(value=value))
        for key, value in (("semester", semester), ("fach", fach), ("typ", typ))
        if value
    ]
    return qm.Filter(must=conditions) if conditions else None


def search_chunks(
    client: QdrantClient,
    name: str,
    vector: list[float],
    limit: int,
    semester: str | None = None,
    fach: str | None = None,
    typ: str | None = None,
) -> list[dict[str, Any]]:
    """Vector search with optional metadata filtering.

    Returns one dict per hit: the indexed payload fields plus the similarity
    ``score``. Caller must pass a vector embedded with the *same* model used
    at ingest time, otherwise results are meaningless.
    """
    response = client.query_points(
        collection_name=name,
        query=vector,
        limit=limit,
        query_filter=_payload_filter(semester, fach, typ),
        with_payload=True,
    )
    out: list[dict[str, Any]] = []
    for point in response.points:
        payload = point.payload or {}
        # Missing payload fields come back as None instead of raising.
        row: dict[str, Any] = {field: payload.get(field) for field in _RESULT_FIELDS}
        row["score"] = point.score
        out.append(row)
    return out


def get_chunks_by_path(
    client: QdrantClient, name: str, file_path: str
) -> list[dict[str, Any]]:
    """Return every chunk of one document, ordered by ``chunk_index``."""
    # Single scroll page: a document with more than 10_000 chunks would be truncated.
    points, _ = client.scroll(
        collection_name=name,
        scroll_filter=qm.Filter(
            must=[qm.FieldCondition(key="file_path", match=qm.MatchValue(value=file_path))]
        ),
        limit=10_000,
        with_payload=True,
        with_vectors=False,
    )
    rows = [
        {
            "chunk_index": p.payload.get("chunk_index"),
            "page": p.payload.get("page"),
            "text": p.payload.get("text"),
        }
        for p in points
        if p.payload is not None
    ]
    # Chunks without an index sort as if they had index 0.
    rows.sort(key=lambda r: r["chunk_index"] if r["chunk_index"] is not None else 0)
    return rows
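
The `scroll` call above fetches a single page of up to 10_000 points and discards the returned next-page offset. If a document could ever exceed that, one option is to follow the offset until it is exhausted. The helper below is a hedged sketch, not part of this commit: `scroll_all` is a hypothetical name, and it only assumes that `client.scroll` accepts `limit` and `offset` and returns `(points, next_offset)`, as `QdrantClient.scroll` does.

```python
from typing import Any


def scroll_all(client: Any, page_size: int = 256, **scroll_kwargs: Any) -> list[Any]:
    """Collect every point matching a scroll request, page by page.

    Assumes client.scroll(limit=..., offset=..., **kwargs) returns a
    (points, next_offset) tuple, with next_offset None on the last page.
    """
    points: list[Any] = []
    offset = None
    while True:
        page, offset = client.scroll(limit=page_size, offset=offset, **scroll_kwargs)
        points.extend(page)
        if offset is None:  # no further page to fetch
            return points
```

Under that assumption, `get_chunks_by_path` could pass its existing `collection_name`, `scroll_filter`, `with_payload`, and `with_vectors` arguments to `scroll_all` and keep the rest of the function unchanged.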