THE FUTURE IS HERE

Why Do Neural Networks Love the Softmax?

My consulting company: https://truetheta.io

Neural Networks see something special in the softmax function.

SOCIAL MEDIA

Patreon: https://www.patreon.com/MutualInformation

Twitter: https://twitter.com/DuaneJRich

GitHub: https://github.com/Duane321

SOURCE NOTES

I decided to make this video while inspecting Jacobians and gradients starting from the end of a small network. Right near the softmax, the Jacobian looked simple enough that I suspected interesting math behind it, and there was. I came across several excellent blog posts on the softmax's Jacobian and its interaction with the negative log-likelihood. [1] was the primary source, since it is well explained and uses condensed notation. [2] was useful for understanding the broader context, and [3] offers a separate, thorough perspective.
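For anyone curious about the simplification hinted at above: when the softmax feeds into a negative log-likelihood loss, the gradient with respect to the logits collapses to the softmax output minus the one-hot target. Below is a minimal NumPy sketch (my own illustration, not code from the video; the variable names are made up) that checks this identity against a finite-difference gradient.

import numpy as np

def softmax(z):
    # Shift by the max for numerical stability; the result is unchanged.
    e = np.exp(z - z.max())
    return e / e.sum()

def nll(z, k):
    # Negative log-likelihood of class k under softmax(z).
    return -np.log(softmax(z)[k])

rng = np.random.default_rng(0)
z = rng.normal(size=5)   # logits
k = 2                    # true class index

# Analytic gradient: softmax output minus the one-hot target.
analytic = softmax(z) - np.eye(len(z))[k]

# Numerical gradient via central differences, for comparison.
eps = 1e-6
numerical = np.array([
    (nll(z + eps * np.eye(len(z))[i], k) - nll(z - eps * np.eye(len(z))[i], k)) / (2 * eps)
    for i in range(len(z))
])

print(np.allclose(analytic, numerical, atol=1e-6))  # prints True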

SOURCES

[1] M. Petersen, "Softmax with cross-entropy," https://mattpetersen.github.io/softmax-with-cross-entropy, 2017

[2] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016, section 6.2.2.3

[3] L. J. V. Miranda, "Understanding softmax and the negative log-likelihood," https://ljvmiranda921.github.io/notebook/2017/08/13/softmax-and-the-negative-log-likelihood/, 2017

TIME CODES
0:00 Everyone uses the softmax
0:23 A Standard Explanation
3:20 But Why the Exponential Function?
3:57 The Broader Context
6:05 Two Choices Together
6:51 The Gradient
10:07 Other Reasons