• TechNerdWizard42@lemmy.world · 5 months ago

I believe you’d need roughly 500GB of RAM at minimum to run it at the full context length. There is chatter that a 125k context window alone used 40GB.

I know I can load the 70B models on my laptop at lower bit precision, but that consumes about 140GB of RAM.
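
For anyone who wants to sanity-check figures like these, here is a back-of-the-envelope sketch (my own rough math, not from either post; the helper functions are hypothetical, the 32-layer / 8-KV-head / 128-dim figures are the published Llama 3 8B attention shapes, and it ignores activations and runtime overhead):

```python
def model_ram_gb(params_b: float, bits: int) -> float:
    """Weight memory alone: parameter count x bits per weight."""
    return params_b * 1e9 * bits / 8 / 1e9

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context: int, bytes_per: int = 2) -> float:
    """KV cache: 2 tensors (K and V) per layer per token, fp16 by default."""
    return 2 * layers * kv_heads * head_dim * context * bytes_per / 1e9

# llama3-8B: 32 layers, grouped-query attention with 8 KV heads of dim 128,
# so each token of context costs ~128 KB of cache at fp16.
print(kv_cache_gb(32, 8, 128, 125_000))    # ~16.4 GB for a 125k context
print(kv_cache_gb(32, 8, 128, 1_000_000))  # ~131 GB for a 1M context

# 70B weights at fp16 vs. 4-bit quantization:
print(model_ram_gb(70, 16))  # ~140 GB
print(model_ram_gb(70, 4))   # ~35 GB
```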

• keepthepace@slrpnk.net · 5 months ago

It is llama3-8B, so it is not out of the question, but I am not sure how much memory you would need to actually reach a 1M context window. They use ring attention to achieve the large context window; I am unfamiliar with it, but it seems to greatly lower the memory requirements.
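
For the curious, the core idea as I understand it: the sequence is sharded across devices, and K/V blocks are passed around a ring while each device folds every arriving block into a running flash-attention-style softmax over its local queries. Here is a single-head NumPy toy of that accumulation (my own sketch, not the paper's implementation; real ring attention overlaps the communication with compute and handles causal masking):

```python
import numpy as np

def ring_attention(q_blocks, k_blocks, v_blocks):
    """Toy single-head ring attention over pre-sharded Q/K/V blocks.

    "Device" i owns q_blocks[i]; K/V blocks rotate around the ring, and
    each device folds every arriving block into a numerically stable
    running softmax. Peak memory per device is one K/V block plus the
    accumulators -- independent of the total sequence length.
    """
    n = len(q_blocks)
    d = q_blocks[0].shape[-1]
    out = []
    for i in range(n):                        # each loop body = one device
        q = q_blocks[i]
        m = np.full(len(q), -np.inf)          # running row max of scores
        l = np.zeros(len(q))                  # running softmax denominator
        acc = np.zeros_like(q)                # running weighted value sum
        for step in range(n):                 # n ring passes
            j = (i + step) % n                # the K/V block arriving now
            s = q @ k_blocks[j].T / np.sqrt(d)
            m_new = np.maximum(m, s.max(axis=-1))
            scale = np.exp(m - m_new)         # rescale old accumulators
            p = np.exp(s - m_new[:, None])
            l = l * scale + p.sum(axis=-1)
            acc = acc * scale[:, None] + p @ v_blocks[j]
            m = m_new
        out.append(acc / l[:, None])
    return np.concatenate(out)

# Sanity check against vanilla full attention.
rng = np.random.default_rng(0)
qs, ks, vs = ([rng.standard_normal((4, 8)) for _ in range(3)] for _ in range(3))
q, k, v = (np.concatenate(x) for x in (qs, ks, vs))
s = q @ k.T / np.sqrt(8)
p = np.exp(s - s.max(axis=-1, keepdims=True))
ref = (p / p.sum(axis=-1, keepdims=True)) @ v
assert np.allclose(ring_attention(qs, ks, vs), ref)
```

The key point is in the inner loop: only one K/V block is resident at a time, so per-device memory scales with the block size rather than the full context, which would explain the lowered memory requirements.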