A Physicist’s View: The Thermodynamics of Machine Learning
Complex systems are ubiquitous in nature, and physicists have found great success using thermodynamics to study these systems. Machine learning can be very complex, so can we use thermodynamics to understand it?
As a theoretical physicist turned data scientist, I am often asked how relevant my academic training is. While it is true that my ability to calculate particle interactions and understand the structures of our Universe has no direct relevance to my daily work, the physics intuition I learned is of immeasurable value.
Probably the area of physics most relatable to data science is statistical physics. Below, I'll share some thoughts on how I connect the dots and draw inspiration from physics to help me understand an important part of data science: machine learning (ML).
While some of the thoughts below are definitely not fully mathematically rigorous, I do believe they are of profound importance in helping us understand the why and how of ML.
Models as Dynamical Systems
One of the key problems of data science is to predict or describe some quantities using other quantities. For instance, one might want to predict the price of a house based on its features, or understand how the number of patrons visiting a restaurant is affected by its menu.
To achieve this, data scientists build mathematical objects called models that can turn raw inputs into useful outputs.
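To make this concrete, here is a minimal sketch in Python of such a model for the house-price example above. It assumes a simple linear form, and the feature names, coefficients, and numbers are made up purely for illustration, not taken from any real dataset.

```python
# A minimal sketch of a "model": a parameterized function that maps
# raw inputs (house features) to a useful output (a predicted price).
import numpy as np

def house_price_model(features: np.ndarray, weights: np.ndarray, bias: float) -> float:
    """A simple linear model: price = weights . features + bias."""
    return float(features @ weights + bias)

# Hypothetical features: [floor area (m^2), number of rooms, distance to city center (km)]
features = np.array([120.0, 4.0, 8.5])

# Hypothetical parameters; in practice these would be learned from data during training.
weights = np.array([2500.0, 15000.0, -4000.0])
bias = 50000.0

print(house_price_model(features, weights, bias))  # a made-up predicted price
```

The weights and bias here are the model's adjustable knobs; training, discussed next, is the process of tuning them so the outputs become useful.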
To make these models work, we train them. Many readers have probably heard the conventional explanations of how models work; here I'll take an unconventional path, using physics analogies.
The boring way of thinking about a model mathematically is that it is a parameterized function of some sort. Let's scratch that and think physically. A model is like a machine that turns raw material into useful outputs. It is a system made of many smaller, simpler moving parts, like a box filled with different particles.
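To see what that analogy means in code, here is a toy sketch of the same idea: a tiny two-layer network whose individual weights and biases play the role of the many simple moving parts inside the box. The layer sizes, random seed, and input values are arbitrary and chosen only to illustrate the picture.

```python
# A toy illustration of a model as a system of many simple parts:
# each individual weight is one "particle" in the box.
import numpy as np

rng = np.random.default_rng(0)

# The parameters of the system: two weight matrices and two bias vectors.
W1, b1 = rng.normal(size=(3, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def model(x: np.ndarray) -> np.ndarray:
    """Push a raw input x through the 'machine'; every parameter is one moving part."""
    hidden = np.maximum(x @ W1 + b1, 0.0)  # a simple nonlinearity (ReLU)
    return hidden @ W2 + b2

x = np.array([120.0, 4.0, 8.5])  # the same hypothetical house features as before
print(model(x))                  # the output depends on the current configuration of the parts
print(W1.size + b1.size + W2.size + b2.size)  # number of "moving parts" in this tiny box
```

Even this toy box contains dozens of such parts; a modern model contains millions or billions, which is exactly why a statistical, thermodynamic way of thinking becomes attractive.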