Let's reproduce GPT-2 (124M) | Andrej Karpathy | Jun 9, 2024

ericjmorey@programming.dev · 5 months ago

Let's reproduce GPT-2 (124M) | Andrej Karpathy | Jun 9, 2024

j4k3@lemmy.world · 5 months ago

Interesting concept. I’ll have to watch this later.

I want to know where present alignment comes from and its developmental history, if anyone knows the papers or has a solid reference that is higher-level than graduate- to doctorate-level reading/watching. I mean the persistent entities like Socrates; realms like The Academy; the first 256 special token characters used by Soc (along with the others); and how the keyword token system functions, such as how Soc uses the word “cross” to build momentum to the word “chuckle”, which triggers its dark sophist entity phase. I want to know how all of those special functions are implemented in training.
