SEO stage 1: enable indexing + Lagen meta

- Removed noindex/nofollow from 12 indexable pages (main pages DE/EN, 3 Lagen pages DE/EN, legal DE/EN)
- robots.txt activated: crawling generally allowed; live-search AI bots (OAI-SearchBot, ChatGPT-User, ClaudeBot, PerplexityBot) allowed; training bots (GPTBot, CCBot, anthropic-ai, Google-Extended, Applebot-Extended, Bytespider, ...) blocked
- sitemap.xml: content taken over from sitemap-launch.xml, with a Sitemap reference in robots.txt
- Lagen pages (3 DE + 3 EN): added description, canonical, Open Graph, Twitter Card and Schema.org Article
- Lagen hero: topic default in <p id="incident-title"> as crawler fallback (JS overwrites it with the date on load)
- Added CLAUDE.md CHANGE_LOG entry
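
The page-level changes listed above (no noindex/nofollow, description, canonical, and a static text in <p id="incident-title">) can be spot-checked per page. What follows is a minimal sketch, not part of this commit: the class name `HeadAudit`, the function `audit`, and the sample markup with its example.org URL are assumptions made for illustration. It uses only Python's standard `html.parser`.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collects the SEO-relevant tags of one HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.robots = []            # directives from <meta name="robots">
        self.canonical = None       # href of <link rel="canonical">
        self.has_description = False
        self.fallback_title = ""    # static text inside <p id="incident-title">
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta":
            name = attr.get("name", "").lower()
            content = attr.get("content", "")
            if name == "robots":
                self.robots += [d.strip().lower() for d in content.split(",") if d.strip()]
            elif name == "description" and content.strip():
                self.has_description = True
        elif tag == "link" and attr.get("rel", "").lower() == "canonical":
            self.canonical = attr.get("href") or None
        elif tag == "p" and attr.get("id") == "incident-title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.fallback_title += data.strip()


def audit(html):
    """Return a small report for one page's HTML source."""
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    blocking = {"noindex", "nofollow", "none"} & set(parser.robots)
    return {
        "indexable": not blocking,
        "canonical": parser.canonical,
        "has_description": parser.has_description,
        "fallback_title": parser.fallback_title,
    }


# Assumed sample markup, standing in for a page before and after the change.
BEFORE = ('<html><head><meta name="robots" content="noindex, nofollow"></head>'
          '<body><p id="incident-title"></p></body></html>')
AFTER = ('<html><head><meta name="description" content="Topic summary">'
         '<link rel="canonical" href="https://example.org/lagen/topic/"></head>'
         '<body><p id="incident-title">Topic</p></body></html>')

if __name__ == "__main__":
    print(audit(BEFORE))
    print(audit(AFTER))
```

The check only reads static HTML, which is the point of the crawler fallback: it sees what a crawler sees before any JavaScript runs.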
claude-dev
2026-05-10 15:01:25 +02:00
Parent a6481a11c0
Commit 025ddfcebd
15 changed files with 527 additions and 107 deletions

@@ -1,94 +1,23 @@
# robots.txt for AegisSight UG
# Block ALL web crawlers and bots from the entire site
# robots.txt - AegisSight UG
# Crawling generally allowed, except API/internal paths
# No use as training data by AI crawlers (training bots blocked)
# Live-search AI bots (OAI-SearchBot, ChatGPT-User, ClaudeBot, PerplexityBot) are allowed
# Block all bots
User-agent: *
Disallow: /
Crawl-delay: 86400
Allow: /
Disallow: /api/
Disallow: /_archiv/
Disallow: /insights/
# Specifically block major search engines
User-agent: Googlebot
Disallow: /
# Sitemap
Sitemap: https://aegis-sight.de/sitemap.xml
User-agent: Bingbot
Disallow: /
User-agent: Slurp
Disallow: /
User-agent: DuckDuckBot
Disallow: /
User-agent: Baiduspider
Disallow: /
User-agent: YandexBot
Disallow: /
# Block social media crawlers
User-agent: facebookexternalhit
Disallow: /
User-agent: Twitterbot
Disallow: /
User-agent: LinkedInBot
Disallow: /
User-agent: WhatsApp
Disallow: /
User-agent: TelegramBot
Disallow: /
# Block SEO and analysis bots
User-agent: AhrefsBot
Disallow: /
User-agent: SemrushBot
Disallow: /
User-agent: DotBot
Disallow: /
User-agent: MJ12bot
Disallow: /
User-agent: SEOkicks-Robot
Disallow: /
User-agent: SeznamBot
Disallow: /
User-agent: MauiBot
Disallow: /
User-agent: Majestic-12
Disallow: /
User-agent: Majestic-SEO
Disallow: /
# Block archiving bots
User-agent: ia_archiver
Disallow: /
User-agent: Wayback Machine
Disallow: /
User-agent: SiteSnagger
Disallow: /
User-agent: WebCopier
Disallow: /
# Block AI/ML crawlers
# ----------------------------------------------------------------------
# AI training crawlers -- BLOCKED (no training on our content)
# ----------------------------------------------------------------------
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: CCBot
Disallow: /
@@ -98,15 +27,86 @@ Disallow: /
User-agent: Claude-Web
Disallow: /
# Block download managers
User-agent: wget
User-agent: Google-Extended
Disallow: /
User-agent: curl
User-agent: Applebot-Extended
Disallow: /
User-agent: Meta-ExternalAgent
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: cohere-ai
Disallow: /
User-agent: FacebookBot
Disallow: /
User-agent: ImagesiftBot
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: Omgilibot
Disallow: /
# ----------------------------------------------------------------------
# AI live-search crawlers -- ALLOWED (visibility in AI answers)
# OAI-SearchBot, ChatGPT-User, ClaudeBot, PerplexityBot are NOT
# blocked. They crawl for live answers, not for training.
# ----------------------------------------------------------------------
# ----------------------------------------------------------------------
# Archive bots
# ----------------------------------------------------------------------
User-agent: ia_archiver
Disallow: /
User-agent: archive.org_bot
Disallow: /
# ----------------------------------------------------------------------
# SEO/spam crawlers
# ----------------------------------------------------------------------
User-agent: AhrefsBot
Disallow: /
User-agent: SemrushBot
Disallow: /
User-agent: MJ12bot
Disallow: /
User-agent: DotBot
Disallow: /
User-agent: SEOkicks-Robot
Disallow: /
User-agent: MauiBot
Disallow: /
User-agent: Majestic-12
Disallow: /
User-agent: BLEXBot
Disallow: /
User-agent: SerendeputyBot
Disallow: /
# ----------------------------------------------------------------------
# Download managers
# ----------------------------------------------------------------------
User-agent: HTTrack
Disallow: /
# No sitemap provided
# No crawl permissions granted
User-agent: SiteSnagger
Disallow: /
User-agent: WebCopier
Disallow: /
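
The policy in this hunk (general crawling allowed, named training bots blocked, live-search bots left to the `*` group) can be sanity-checked with Python's standard `urllib.robotparser`. This is a sketch under assumptions: `ROBOTS_EXCERPT` is a shortened copy of the new file, not the full content, and `root_access` is a hypothetical helper name. It checks only the site root, because Python's parser evaluates the rules of a group in file order, whereas RFC 9309 picks the most specific match; results for paths such as /api/ may therefore differ from what a search engine applies.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the new robots.txt (assumption: groups separated by blank lines).
ROBOTS_EXCERPT = """\
User-agent: *
Allow: /
Disallow: /api/

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
"""


def root_access(robots_txt, site, agents):
    """Map each user agent to whether it may fetch the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, site + "/") for agent in agents}


if __name__ == "__main__":
    agents = ["GPTBot", "CCBot", "OAI-SearchBot", "PerplexityBot"]
    for agent, allowed in root_access(ROBOTS_EXCERPT, "https://aegis-sight.de", agents).items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Agents without their own group fall through to `User-agent: *`, which is why the live-search bots need no explicit entry.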