feat: MCP server for RAG retrieval + webhook hardening

app/mcp_server.py: FastMCP (mcp SDK), streamable-http on /mcp, static
bearer token (constant-time ASGI middleware), fail-fast without RAG_MCP_TOKEN.
Tools: rag_search (with semester/fach/typ filters) + get_file_chunks. Runs from
the same image as the ingestor and reuses its embed path, so query vectors are
guaranteed to be compatible with the ingested ones (the official qdrant MCP
server only supports fastembed, which would cause a dimension/schema mismatch).
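
The token check described above can be sketched as a pure-ASGI middleware. This is a minimal illustration using only the standard library, not the code from app/mcp_server.py: the names BearerAuthMiddleware, ok_app and status_for are hypothetical, and the real server may structure its check and its fail-fast differently.

```python
import asyncio
import hmac


class BearerAuthMiddleware:
    """Hypothetical pure-ASGI middleware that enforces a static bearer token."""

    def __init__(self, app, token: str):
        if not token:
            # Fail fast: refuse to build the app when no token is configured.
            raise RuntimeError("RAG_MCP_TOKEN must be set")
        self.app = app
        self._expected = f"Bearer {token}".encode()

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Lifespan events must pass through. WebSocket scopes are not
            # guarded in this sketch.
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers", []))
        supplied = headers.get(b"authorization", b"")
        # compare_digest avoids leaking the position of the first mismatch.
        if not hmac.compare_digest(supplied, self._expected):
            await send({
                "type": "http.response.start",
                "status": 401,
                "headers": [(b"www-authenticate", b"Bearer")],
            })
            await send({"type": "http.response.body", "body": b"unauthorized"})
            return
        await self.app(scope, receive, send)


async def ok_app(scope, receive, send):
    """Stand-in for the wrapped application: always answers 200."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def status_for(app, auth_header):
    """Drive one fake HTTP request through `app` and return the status code."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    headers = [] if auth_header is None else [(b"authorization", auth_header)]
    asyncio.run(app({"type": "http", "headers": headers}, receive, send))
    return sent[0]["status"]
```

Wrapping the app as BearerAuthMiddleware(ok_app, "s3cret") lets a request with the header "Bearer s3cret" through and answers every other request with 401. An empty token raises at construction time, before anything is served.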

app/qdrant_store.py: search_chunks (query_points + optional payload filter)
and get_chunks_by_path (scroll, sorted by chunk_index).
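
As a rough sketch of the two helpers' logic, the snippet below builds an optional payload filter in Qdrant's JSON filter shape and sorts scrolled points by chunk_index. It is written against plain dicts, not the Qdrant client, so it runs standalone. The function names are hypothetical, and it assumes the payload keys are named semester, fach and typ like the tool filters; the actual payload schema is not shown in this commit.

```python
from typing import Any, Dict, List, Optional


def build_payload_filter(
    semester: Optional[str] = None,
    fach: Optional[str] = None,
    typ: Optional[str] = None,
) -> Optional[Dict[str, Any]]:
    """Return a Qdrant-style filter dict, or None when no filter was given."""
    conditions = [
        {"key": key, "match": {"value": value}}
        for key, value in (("semester", semester), ("fach", fach), ("typ", typ))
        if value is not None
    ]
    # "must" means every listed condition has to match (logical AND).
    return {"must": conditions} if conditions else None


def sort_by_chunk_index(points: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order scrolled points by their payload's chunk_index, client-side."""
    return sorted(points, key=lambda point: point["payload"]["chunk_index"])
```

Returning None for the no-filter case lets a caller pass the result straight through as an optional filter argument. Sorting on the client is one way to get a stable order out of a scroll, assuming the result for a single file fits in memory.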

app/bulk.py: amplification guard: /bulk-import rejects with 409 while a
previous bulk run is still working through its BackgroundTasks.
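
One way such a guard could work is a pending-task counter that blocks new bulk runs until it drains. The sketch below is framework-free and illustrative only: BulkGuard, handle_bulk_import and the 202 success status are assumptions, and app/bulk.py may track in-flight work differently.

```python
import threading
from typing import List, Tuple


class BulkGuard:
    """Hypothetical tracker for background tasks still pending from a bulk run."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending = 0

    def try_begin(self, task_count: int) -> bool:
        """Claim the guard for a new run; fail if an earlier run is unfinished."""
        with self._lock:
            if self._pending > 0:
                return False
            self._pending = task_count
            return True

    def task_done(self) -> None:
        """Called by each background task when it finishes (or fails)."""
        with self._lock:
            if self._pending > 0:
                self._pending -= 1


def handle_bulk_import(guard: BulkGuard, files: List[str]) -> Tuple[int, str]:
    """Return (status, message) the way a /bulk-import handler might."""
    if not guard.try_begin(len(files)):
        # 409 Conflict: a previous bulk run has not drained yet.
        return 409, "bulk import already in progress"
    return 202, f"queued {len(files)} files"
```

Each queued task would call guard.task_done() in a finally block so that a failing task cannot leave the endpoint locked.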

docker-compose.coolify.yml: rag-mcp service (not public, external
metamcp-net instead of stack coupling) + Traefik rate-limit middleware on the
ingestor.

tests/conftest.py: neutralize the Settings env_file in tests (the dev .env
must not contaminate the suite). 68 passed, ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:08:37 +02:00
parent a6a2175f8b
commit 9643011e64
12 changed files with 935 additions and 8 deletions


@@ -23,6 +23,12 @@ class Settings(BaseSettings):
     chunk_overlap_words: int = 50
     log_level: str = "INFO"

+    # MCP server (app.mcp_server). Optional so the ingestor — which shares
+    # this Settings model — is unaffected. The MCP server itself refuses to
+    # start when rag_mcp_token is empty.
+    rag_mcp_token: str = ""
+    rag_mcp_port: int = 9009
+

 @lru_cache(maxsize=1)
 def get_settings() -> Settings: