blob-on-fire

TankieTube is suffering from success.

  • PorkrollPosadist [he/him, they/them]@hexbear.netM
    24 days ago

    Going by the server stats, that’s 10% of the uploaded media, which should be pretty good I imagine (assuming a fraction of videos are popular and get a lot of requests while most videos don’t get many views at all).

    I guess another potential thing to look for is whether people are deliberately trying to DoS the site. Not quite bringing it down, but draining resources. I could imagine some radlibs or NAFO dorks trying something like this if they caught wind of the place. It could also be caused by scrapers (a growing problem on the Fediverse and the Internet generally, driven by legions of tech bros trying to feed data to their bespoke AI models so they can be bought out by Andreessen Horowitz).

    • TankieTanuki [he/him]@hexbear.netOP
      24 days ago

      I don’t know where to begin for traffic monitoring like that. HetrixTools?

      Do scrapers have a reason to download whole videos? Or are they just interested in the comments?

      • PorkrollPosadist [he/him, they/them]@hexbear.netM
        24 days ago

        Do scrapers download whole videos?

        I don’t know, each one is designed for a specific purpose. Some people might scrape for archival reasons, some might do it for AI training data, some might do it to build analytic user profiles, some might do it for academic reasons, some might do it to build search indices. I can’t think of a great reason to just download all the videos, but people do really dumb shit when someone else is paying the bill.

        I don’t know where to begin for traffic monitoring like that. HetrixTools?

        Unfortunately I don’t have any great recommendations here. I’m looking into this myself. Ideally you’ll want a tool that can monitor the network interface and aggregate data on bandwidth per IP or MAC. That will at least give you an idea if anything seems egregious. (If it is by IP, a single address could be a large number of machines behind a NAT though, like a university or something.) ntopng has piqued my interest. I might try it out and report back.
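
        In the meantime, a rough stand-in for per-IP bandwidth aggregation is to sum the bytes-sent field from the web server's access log. This is only a sketch, and it swaps in log parsing for real interface monitoring: it assumes the site sits behind nginx or Apache writing the standard "combined" log format, and the names `bytes_per_ip`, `top_talkers`, and `print_report` are made up for illustration. It only sees HTTP response bodies, so anything that doesn't pass through the web server won't show up, and the NAT caveat still applies.

```python
import re
from collections import Counter

# Assumption: the "combined" access log format, e.g.
# 203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /a.mp4 HTTP/1.1" 200 1000 "-" "curl/8.0"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def bytes_per_ip(lines):
    """Sum the response-size field per client address."""
    totals = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that don't look like the assumed format
        sent = m.group("bytes")
        if sent != "-":  # Apache logs "-" when no body was sent
            totals[m.group("ip")] += int(sent)
    return totals


def top_talkers(lines, n=10):
    """Return the n addresses that received the most bytes, largest first."""
    return bytes_per_ip(lines).most_common(n)


def print_report(path, n=10):
    """Print the top n addresses for a log file (the path is up to the caller)."""
    with open(path, encoding="utf-8", errors="replace") as f:
        for ip, total in top_talkers(f, n):
            print(f"{ip:>40}  {total / 1e9:8.2f} GB")
```

        Calling `print_report` on the current access log gives a quick list of the heaviest clients. If one address or a small range dominates, that is worth a closer look before reaching for heavier tooling.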

        • PorkrollPosadist [he/him, they/them]@hexbear.netM
          24 days ago

          Ntopng seems useful. They’re really trying to push licenses for “enterprise” features, but the “community edition” is available under the GPLv3 license and allows you to track throughput to remote hosts. Not sure how much of a performance impact it has.