feat: Multi-stage deep research pipeline with source context
- DEEP_RESEARCH_PROMPT: 4-phase strategy (broad collection → gap analysis → targeted deep research → verification)
- Target of 15-25 sources from 5+ source types instead of 8-15 from mainstream outlets
- researcher.search(): new parameter existing_articles; already known sources are passed as context so that Claude specifically looks for new perspectives
- orchestrator: DB query moved ahead of the pipeline; existing articles are passed to the researcher as context (research type only)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
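The commit message says known sources are handed to the researcher as context. How `researcher.search()` turns `existing_articles` into prompt text is not shown in this diff, so the following is only a minimal sketch of one possible rendering. The helper name `format_existing_articles`, the `limit` parameter, and the wording of the section header are assumptions, not part of the commit.

```python
from typing import Optional


def format_existing_articles(
    existing_articles: Optional[list[dict]], limit: int = 50
) -> str:
    """Render already known sources as a prompt section (hypothetical helper)."""
    if not existing_articles:
        # Nothing known yet: the prompt gets no extra section.
        return ""
    lines = ["Already known sources (look for new perspectives beyond these):"]
    # Cap the list so a long-running incident does not bloat the prompt.
    for article in existing_articles[:limit]:
        source = article.get("source") or "unknown"
        headline = article.get("headline") or ""
        url = article.get("source_url") or ""
        lines.append(f"- [{source}] {headline} ({url})")
    return "\n".join(lines)
```

Capping the list is a design choice of this sketch; the actual pipeline may pass all rows or summarize them differently.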
@@ -571,6 +571,13 @@ class AgentOrchestrator:
                 "data": {"status": research_status, "detail": research_detail, "started_at": now_utc},
             }, visibility, created_by, tenant_id)
 
+            # Preload existing articles (for dedup AND context)
+            cursor = await db.execute(
+                "SELECT id, source_url, headline, source FROM articles WHERE incident_id = ?",
+                (incident_id,),
+            )
+            existing_db_articles_full = await cursor.fetchall()
+
             # Steps 1+2: run RSS feeds and Claude research in parallel
             async def _rss_pipeline():
                 """RSS feed search (feed selection + dynamic keywords + parsing)."""
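The preload above can be tried in isolation. The sketch below uses the synchronous standard-library `sqlite3` module instead of the async `db` handle from the diff, and the table schema and sample rows are assumptions made only so the example runs. It shows that with a `Row` factory the result supports both `row["source"]` and `row.keys()`, which the later hunks rely on.

```python
import sqlite3


def load_existing_articles(conn: sqlite3.Connection, incident_id: int) -> list:
    """Run the same SELECT as the diff, once, for reuse as dedup input and context."""
    cursor = conn.execute(
        "SELECT id, source_url, headline, source FROM articles WHERE incident_id = ?",
        (incident_id,),
    )
    return cursor.fetchall()


# Assumed schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # enables row["column"] and row.keys()
conn.execute(
    "CREATE TABLE articles ("
    "id INTEGER PRIMARY KEY, incident_id INTEGER, "
    "source_url TEXT, headline TEXT, source TEXT)"
)
conn.executemany(
    "INSERT INTO articles (incident_id, source_url, headline, source) VALUES (?, ?, ?, ?)",
    [
        (1, "https://example.test/a", "First headline", "Example Wire"),
        (1, "https://example.test/b", "Second headline", "Example Daily"),
        (2, "https://example.test/c", "Other incident", "Example Wire"),
    ],
)
rows = load_existing_articles(conn, 1)
```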
@@ -617,7 +624,20 @@ class AgentOrchestrator:
             async def _web_search_pipeline():
                 """Claude WebSearch research."""
                 researcher = ResearcherAgent()
-                results, usage = await researcher.search(title, description, incident_type, international=international, user_id=user_id)
+                # For research incidents: pass existing articles along as context
+                existing_for_context = None
+                if incident_type == "research" and existing_db_articles_full:
+                    existing_for_context = [
+                        {"source": row["source"] if "source" in row.keys() else "",
+                         "headline": row["headline"],
+                         "source_url": row["source_url"]}
+                        for row in existing_db_articles_full
+                    ]
+                results, usage = await researcher.search(
+                    title, description, incident_type,
+                    international=international, user_id=user_id,
+                    existing_articles=existing_for_context,
+                )
                 logger.info(f"Claude-Recherche: {len(results)} Ergebnisse")
                 return results, usage
 
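The research-only rule in this hunk could be factored into a small pure function, which would make it straightforward to unit-test without a database or a `ResearcherAgent`. This is a sketch of that idea, not code from the commit; the name `build_research_context` is hypothetical.

```python
from typing import Any, Optional, Sequence


def build_research_context(
    incident_type: str, rows: Sequence[Any]
) -> Optional[list[dict]]:
    """Return context dicts for research incidents, otherwise None.

    `rows` may be sqlite3.Row objects or plain dicts; both support
    row["key"] and row.keys(), which is all this function uses.
    """
    if incident_type != "research" or not rows:
        return None
    return [
        {
            # Mirrors the defensive lookup in the diff: tolerate a missing column.
            "source": row["source"] if "source" in row.keys() else "",
            "headline": row["headline"],
            "source_url": row["source_url"],
        }
        for row in rows
    ]
```

Returning `None` rather than an empty list keeps the call to `researcher.search()` identical to the pre-change behaviour for all other incident types.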
@@ -714,14 +734,10 @@ class AgentOrchestrator:
             }, visibility, created_by, tenant_id)
 
             # --- Set-based DB deduplication (instead of N×M queries) ---
-            cursor = await db.execute(
-                "SELECT id, source_url, headline FROM articles WHERE incident_id = ?",
-                (incident_id,),
-            )
-            existing_db_articles = await cursor.fetchall()
+            # existing_db_articles_full was already loaded above
             existing_urls = set()
             existing_headlines = set()
-            for row in existing_db_articles:
+            for row in existing_db_articles_full:
                 if row["source_url"]:
                     existing_urls.add(_normalize_url(row["source_url"]))
                 if row["headline"] and len(row["headline"]) > 20:
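The hunk is cut off before the sets are used, so the following is a sketch of how set-based deduplication of this kind typically proceeds, under stated assumptions. `normalize_url` is a stand-in for the project's `_normalize_url`, whose implementation is not shown here, and the case-insensitive headline comparison is an assumption as well.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Assumed normalization: lowercase scheme/host, drop query, fragment, trailing slash."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"), "", "")
    )


def headline_key(headline: Optional[str], min_len: int = 20) -> Optional[str]:
    """Return a comparison key, or None for headlines too short to match reliably."""
    key = (headline or "").strip().lower()
    return key if len(key) > min_len else None


def filter_new_articles(candidates: list[dict], existing_rows: list) -> list[dict]:
    """Drop candidates whose URL or headline is already stored for the incident."""
    # Build both sets once: O(N + M) instead of one query per candidate.
    existing_urls = set()
    existing_headlines = set()
    for row in existing_rows:
        if row["source_url"]:
            existing_urls.add(normalize_url(row["source_url"]))
        key = headline_key(row["headline"])
        if key:
            existing_headlines.add(key)

    fresh = []
    for item in candidates:
        url = item.get("source_url")
        if url and normalize_url(url) in existing_urls:
            continue
        key = headline_key(item.get("headline"))
        if key and key in existing_headlines:
            continue
        fresh.append(item)
    return fresh
```

Skipping short headlines avoids false positives on generic titles, which matches the length check of more than 20 characters visible in the diff.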