• 0 Posts
  • 16 Comments
Joined 2 months ago
Cake day: February 28, 2026



  • Eh, it’s the illusion of speed. Scaling brought enormous returns from GPT-3 -> GPT-4, but the gains have been far less significant with every major release since. To compensate, every research lab is coming up with new ways to extract more value out of models: CoT, RL, agent harnesses, etc.

    However, these are all hacks to make LLMs more efficient or to (try to) make them more reliable. LLMs still have significant drawbacks, and it will take years (probably decades) to get them to the point where they can reliably replace knowledge workers. China knows this and is taking a far different approach to LLM development (not a tankie, fyi). Scaling is a horrible idea that will burn billions of dollars with an astronomically low chance of return.