A great, slightly more in-depth (without being mathy) explanation of transformer models. It mostly talks about AlexNet, an image classifier from 2012, goes over some history, and has some very interesting looks under the hood.
He does use some personifying language for these models, but that’s unfortunately the case for most information on the topic.
This is mostly about convolutional neural networks, which don't really work the same way as transformers. Transformers weren't invented until 2017, and they're closer to a more complex version of a recurrent neural network (and even that is an oversimplification).
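To make the difference concrete, here's a rough sketch (assuming PyTorch, which isn't something the video uses) of the two kinds of layers: a convolution mixes information locally over an image grid, while a transformer's self-attention lets every token in a sequence attend to every other token. The layer sizes are arbitrary, just for illustration.

```python
import torch
import torch.nn as nn

# CNN-style layer (AlexNet era): slides a small kernel over an image.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
image = torch.randn(1, 3, 32, 32)        # (batch, channels, height, width)
print(conv(image).shape)                 # torch.Size([1, 8, 32, 32])

# Transformer-style layer (2017 onward): self-attention over a token sequence.
attn = nn.MultiheadAttention(embed_dim=16, num_heads=2, batch_first=True)
tokens = torch.randn(1, 10, 16)          # (batch, sequence length, embedding)
out, weights = attn(tokens, tokens, tokens)
print(out.shape)                         # torch.Size([1, 10, 16])
```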
Of course I messed it up. I thought the transformer paper was newer than 2012, but I remembered them being mentioned at the beginning of the video, so I assumed that's what it covered. I should have rewatched it to make sure I understood.
Honestly, the video made it sound like CNNs were part of transformers, so I'd blame the video before yourself.