Building Intuition
In my last post, I discussed how Andrej Karpathy fell a little short on building intuition around the attention mechanism in his Neural Networks: Zero to Hero series. In that post I shared how I was drawing my own diagrams of certain components of the transformer architecture, most notably the self-attention block, to immerse myself in the details and build some intuition around them. That helped a lot.
But in the meantime, I have also found a few other sources that are helping build my intuition. One is part of yet another YouTube playlist covering neural networks in general and, later on, transformers specifically. This is the first of the three chapters that cover transformers.
The creator also references other resources, including this from Anthropic. I’ve started reading them, and that has been helpful. The more angles I can get on this complex topic, the better I understand it.