How to Think About GPUs
This is reference-quality material for anyone trying to reason about accelerator performance without hand-waving. It treats GPUs as machines constrained by memory, bandwidth, communication, and parallelism.
It is useful far beyond JAX: the same mental model helps with PyTorch, Transformers, NeMo, and any conversation where "just add GPUs" is hiding the actual bottleneck.