Today I set up a little project website on a new subdomain. It’s not a www subdomain or a newly registered domain, either of which would be easy to detect. We’re talking about:

Randomchars.mydomain.com

Within 20 minutes, Anthropic’s ClaudeBot was on it. I could tell because the nginx access log showed a hit to robots.txt and then a handful of pages.
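
For anyone who wants to check their own logs, here is a rough sketch of pulling crawler hits out of an nginx access log. It assumes the stock "combined" log format and that the crawler puts "ClaudeBot" in its user agent string. The `bot_hits` helper and the sample lines are made up for illustration; they are not my real logs.

```python
import re

# nginx "combined" format (assumed):
# ip - user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def bot_hits(lines, marker="ClaudeBot"):
    """Return (time, path) for each request whose user agent contains marker."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and marker in m.group("agent"):
            hits.append((m.group("time"), m.group("path")))
    return hits


# Hypothetical sample lines (documentation IP ranges, invented timestamps).
SAMPLE = [
    '203.0.113.7 - - [01/Jan/2025:10:20:01 +0000] "GET /robots.txt HTTP/1.1" 200 26 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"',
    '203.0.113.7 - - [01/Jan/2025:10:20:03 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)"',
    '198.51.100.9 - - [01/Jan/2025:10:21:00 +0000] "GET /favicon.ico HTTP/1.1" 404 153 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for when, path in bot_hits(SAMPLE):
        print(when, path)
```

To run it against a real log, pass an open file handle instead of `SAMPLE`. The user agent is self-reported, so a match only shows what the client claims to be.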

First off, how the hell did they find it? Next, is my DNS provider, Amazon Route 53, selling this kind of data now? Or is there some kind of DNS wildcard query?

  • SpaceMan9000@lemmy.world
    4 days ago

    What’s the default in nginx? Did they even need to know the actual subdomain? A lot of the time you can get it by querying the DNS servers directly, or the certs leak it.