app/mcp_server.py: FastMCP (mcp SDK), streamable-http on /mcp, static bearer token (constant-time ASGI middleware), fail-fast when RAG_MCP_TOKEN is not set. Tools: rag_search (with semester/fach/typ filters) and get_file_chunks. Runs from the same image as the ingestor and reuses the embed path, so the vectors are guaranteed to be compatible with ingest (the official qdrant MCP server only supports fastembed, which would cause a dimension/schema mismatch).
app/qdrant_store.py: search_chunks (query_points + optional payload filter) and get_chunks_by_path (scroll, sorted by chunk_index).
app/bulk.py: amplification guard: /bulk-import rejects with 409 while a previous bulk run is still working through its BackgroundTasks.
docker-compose.coolify.yml: rag-mcp service (not public, external metamcp-net instead of stack coupling) + Traefik rate-limit middleware on the ingestor.
tests/conftest.py: neutralize the Settings env_file in tests (the dev .env must not contaminate the suite).
68 passed, ruff clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
36 lines
909 B
Python
from functools import lru_cache

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8", extra="ignore")

    nextcloud_webdav_url: str
    nextcloud_user: str
    nextcloud_app_password: str

    ollama_url: str
    ollama_embed_model: str

    qdrant_url: str
    qdrant_collection: str

    webhook_secret: str

    ingest_root: str = "Documents/THB/Studium"
    chunk_size_words: int = 500
    chunk_overlap_words: int = 50
    log_level: str = "INFO"

    # MCP server (app.mcp_server). Optional so the ingestor — which shares
    # this Settings model — is unaffected. The MCP server itself refuses to
    # start when rag_mcp_token is empty.
    rag_mcp_token: str = ""
    rag_mcp_port: int = 9009


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    return Settings()
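The commit message and the comment on rag_mcp_token describe three behaviors that this file only hints at: settings are read once and cached, the MCP server refuses to start without a token, and the bearer token is compared in constant time. The following is a minimal stdlib-only sketch of those three ideas, not the code in app/mcp_server.py. The names McpSettings, get_mcp_settings, require_token, and is_authorized are hypothetical, and the dataclass merely stands in for the pydantic Settings model above.

```python
import hmac
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class McpSettings:
    """Hypothetical stand-in for the two MCP fields of Settings."""

    rag_mcp_token: str = ""
    rag_mcp_port: int = 9009


@lru_cache(maxsize=1)
def get_mcp_settings() -> McpSettings:
    # Read the environment once per process; later calls return the
    # cached instance until cache_clear() is called.
    return McpSettings(
        rag_mcp_token=os.environ.get("RAG_MCP_TOKEN", ""),
        rag_mcp_port=int(os.environ.get("RAG_MCP_PORT", "9009")),
    )


def require_token(settings: McpSettings) -> str:
    """Fail fast: refuse to continue when no token is configured."""
    if not settings.rag_mcp_token:
        raise RuntimeError("RAG_MCP_TOKEN must be set to start the MCP server")
    return settings.rag_mcp_token


def is_authorized(authorization_header: str, expected_token: str) -> bool:
    """Check a 'Bearer <token>' header against the expected token."""
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # hmac.compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```

Because the accessor is wrapped in lru_cache, a test that changes environment variables has to call get_mcp_settings.cache_clear() before the new values become visible. The same applies to get_settings() above, which is presumably one reason the test suite needs to control where Settings reads its values from.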