Today I set up a little project website on a new subdomain. It’s not a www subdomain or a newly registered domain, either of which would be easy to detect. We’re talking about something like:

Randomchars.mydomain.com

Within 20 minutes, Anthropic’s ClaudeBot was on it. I could tell because the nginx access log showed a hit to robots.txt and then a handful of pages.

First off, how the hell did they find it? Next, is my DNS provider, Amazon Route 53, selling this kind of data now? Or is there some kind of DNS wildcard query?

  • techconsulnerd@programming.dev
    4 days ago

    Perhaps it was crawling a list of IP addresses, and your web server also serves the site when it is requested by bare IP address rather than by the domain or subdomain name. You can configure the web server to return a blank page or a 403 error when it is accessed by IP address.
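
    A minimal sketch of that idea for nginx, assuming the site is served over plain HTTP and HTTPS on the default ports and that your real `server` block names the subdomain explicitly in `server_name`. The catch-all block below is illustrative, not taken from your config, and `ssl_reject_handshake` needs nginx 1.19.4 or newer:

    ```nginx
    # Catch-all server: handles any request whose Host header does not match
    # a server_name defined elsewhere (for example, a request to the bare IP).
    server {
        listen 80 default_server;
        listen [::]:80 default_server;
        listen 443 ssl default_server;
        listen [::]:443 ssl default_server;

        # "_" is just a placeholder name that never matches a real host.
        server_name _;

        # Refuse the TLS handshake for unknown names so no certificate
        # (and therefore no hostname) is presented to the scanner.
        ssl_reject_handshake on;

        # Plain-HTTP requests get a 403 instead of the project site.
        return 403;
    }
    ```

    After reloading nginx, something like `curl -i http://<your-server-ip>/` should come back with a 403 rather than the project pages, while requests to the subdomain are still answered by the original `server` block.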