Paper reaction: "Deep learning for NLTE spectral opacities"
On using neural networks to accelerate fusion simulations
Theoretical physicists often talk about elegance as a guide in finding a solution to a problem. This is all well and fine if you are studying a single particle interaction or string. But what if you have to solve a problem in the real world, say nuclear fusion at the National Ignition Facility? You have the world’s most powerful laser blasting a cylinder of gold to produce an x-ray bath which ablates the surface of a sphere, causing it to implode at 350 km/s and rapidly compress a pellet of deuterium-tritium fuel until it achieves fusion in the center and produces alpha particles which ideally generate more heat than is lost from x-ray emissions, bootstrapping into a chain reaction.
Does that seem elegant to you?
As you can imagine, there is no beautiful equation to describe that. Instead, you have a whole series of equations: some for the hydrodynamics of the plasma, some for radiation transport, some for nuclear processes, some for electronic transitions, some for magnetic fields, and many more. These equations are all coupled to each other, and need to be calculated at each time step for each cell in the mesh representing the plasma. The higher the space and time resolution, the better, but calculating all of this gets out of hand extremely quickly, even with some of the world’s best supercomputers.
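To make the cost concrete, here is a deliberately toy sketch of that nested loop over time steps and mesh cells. Everything here (the stand-in physics functions, the cell fields, the numbers) is made up for illustration and not taken from any real code; the point is just that every package runs on every cell at every step, so refining either resolution multiplies the runtime.

```python
# Toy sketch: coupled multiphysics cost scales like N_STEPS * N_CELLS * (all packages).
# All names and update rules here are illustrative stand-ins, not real solvers.

N_CELLS = 1_000        # real meshes use orders of magnitude more cells
N_STEPS = 100          # and far more time steps
DT = 1e-12             # toy time step, in seconds

def update_hydro(cell, dt):            # stand-in for the hydrodynamics package
    cell["density"] *= 1.0 + 1e-6 * dt

def update_radiation(cell, dt):        # stand-in for radiation transport
    cell["temperature"] *= 1.0 + 1e-6 * dt

def update_atomic_physics(cell, dt):   # stand-in for e.g. the NLTE solve discussed below
    cell["zbar"] = min(79.0, cell["zbar"] + 1e-9 * dt)

mesh = [{"density": 1.0, "temperature": 1.0, "zbar": 0.0} for _ in range(N_CELLS)]

for step in range(N_STEPS):
    for cell in mesh:                  # every package, every cell, every step,
        update_hydro(cell, DT)         # and each package depends on the state
        update_radiation(cell, DT)     # the others just updated, so none of
        update_atomic_physics(cell, DT)  # them can be skipped or decoupled
```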
There are many variables of interest in running this experiment to try and optimize the gain of the fusion chain reaction. When it’s this expensive and energy-intensive to run the experiment even once, you want to simulate as much as you can and save your experimental runs for the most promising-looking parameters. But what happens when the simulation itself becomes too expensive and time-intensive? As it turns out, a simulation is ultimately just a function (albeit a convoluted one) from an input to an output - and a neural network can approximate that function.
To illustrate, let’s start with a specific simulation for non-local thermodynamic equilibrium (NLTE), which models the interaction between the plasma and electromagnetic radiation, calculating the absorptivity (inverse mean free path), emissivity, and mean ionization of the plasma. Every point in the plasma is characterized by a temperature, density, and radiation frequency distribution, and a system of equations derived from a steady-state condition on transitions between ion states must be solved to produce the three output values. Solving that system of equations consumes between 10% and 90% of the total computational time of a fusion simulation.
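To give a flavor of what that solve looks like, here is a minimal sketch under my own simplifications: a tiny random rate matrix stands in for the real collisional-radiative rates (which in practice depend on the temperature, density, and radiation spectrum), and we solve for the steady-state populations and the mean ionization they imply. The real atomic models involve vastly more states, which is where the cost comes from.

```python
# Minimal sketch of a steady-state rate-equation solve. The 4-state rate matrix
# below is a made-up stand-in for the real collisional-radiative rates.
import numpy as np

def steady_state_populations(rate_matrix):
    """Solve R @ n = 0 with populations summing to 1.

    rate_matrix[i, j] is the transition rate from state j to state i;
    each diagonal entry holds the negative total outflow from that state.
    """
    n_states = rate_matrix.shape[0]
    a = rate_matrix.copy()
    a[-1, :] = 1.0                      # replace one equation with sum(n) = 1
    b = np.zeros(n_states)
    b[-1] = 1.0
    return np.linalg.solve(a, b)

# Toy 4-state example: random off-diagonal rates, conservation on the diagonal.
rng = np.random.default_rng(0)
off_diag = rng.random((4, 4))
np.fill_diagonal(off_diag, 0.0)
rates = off_diag - np.diag(off_diag.sum(axis=0))

populations = steady_state_populations(rates)
charge_states = np.array([0, 1, 2, 3])
zbar = charge_states @ populations      # mean ionization from the populations
print(populations, zbar)
```

With thousands of coupled states instead of four, and with the rates themselves needing to be rebuilt for every cell at every time step, it is easy to see how this one piece ends up dominating the runtime.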
Often, to save time, a series of simplifying assumptions is made to keep the simulations tractable. The main one concerns the fidelity of the atomic model: different ion states are lumped together rather than treated separately. These shortcuts work well enough when the simulations are well behaved, but they lead to large divergences when the system enters a highly nonlinear phase, which, ironically enough, is exactly when you need the higher fidelity.
In a study, researchers at the National Ignition Facility took sets of inputs and outputs from their simulation and trained a neural network to reproduce the mapping, which it did with a mean error of 1-3%. This offered a 10x speedup in the computational time for this specific calculation, which is huge. Of course, in a real setting the NLTE solve is only one piece of the full simulation, and the other coupled physics packages still have to run at every step, so the end-to-end time savings will most likely be far less than 10x, but still very substantial.
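The general recipe looks something like the sketch below. The data here is synthetic and the network is a generic off-the-shelf MLP; the paper’s actual dataset, features, and architecture differ, so treat this purely as an illustration of the surrogate idea.

```python
# Sketch of a neural-network surrogate for an expensive solver: learn the
# mapping from plasma conditions to NLTE-style outputs from tabulated runs.
# Synthetic data and a generic MLP stand in for the paper's actual setup.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Pretend inputs: temperature, density, and a coarse radiation spectrum
# (a handful of frequency-bin intensities) for each training sample.
n_samples, n_freq_bins = 5000, 8
X = rng.random((n_samples, 2 + n_freq_bins))

# Pretend outputs: absorptivity, emissivity, mean ionization. In reality these
# come from running the expensive NLTE solver offline on each input.
y = np.column_stack([
    np.exp(-X[:, 0]) + 0.1 * X[:, 1],        # stand-in "absorptivity"
    X[:, 0] * X[:, 1],                       # stand-in "emissivity"
    10.0 * X[:, 0] / (1.0 + X[:, 1]),        # stand-in "mean ionization"
])

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000)
surrogate.fit(X[:4000], y[:4000])

# At runtime, a single forward pass replaces the iterative solve.
pred = surrogate.predict(X[4000:])
rel_err = np.abs(pred - y[4000:]) / (np.abs(y[4000:]) + 1e-9)
print("mean relative error:", rel_err.mean())
```

Once trained, evaluating the surrogate is just a handful of matrix multiplications, which is where the speedup over the iterative solve comes from.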
Perhaps the biggest result from all this is that scientists can actually start using the highest-fidelity versions of their simulations again. Rather than running the low-fidelity model each time at runtime, you can instead spend a ton of resources all at once on generating high-fidelity training data for the neural network, with which you can then run a blazing fast approximation of the high-fidelity simulation as many times as you want.
To end on a more philosophical note, remember that we use the simulation with the iterated microscopic dynamics because we don’t actually have the concise equation that describes the macroscopic phenomenon. But in some sense, this neural network has found one, even if it’s nigh impossible for a human to parse. Within the byzantine weights of that neural network lies some relatively compact algorithm for turning one set of numbers into another without any hacky mesh cells or time steps. Perhaps one day, if machine learning interpretability advances enough, we’ll be able to discover that mapping. And who knows, it might even be elegant.
Link to original paper: https://pubs.aip.org/aip/pop/article/27/5/052707/290571/Deep-learning-for-NLTE-spectral-opacities
Isn’t the notion of beauty as a pointer towards a solution particular to theoretical physics rather than experimental physics? In some ways, the experimental demands that contemporary physics has placed on researchers to make theoretical progress have made more experimental physicists out of theoretical ones, because it’s more difficult to do “armchair philosophizing” and thought experiments when the phenomena of study aren’t as easily accessible as they were in the past with, say, electricity, magnetism, and gravity. The phenomena of study are either super small or super large, outside the scale of human interaction. You could say we’ve picked the low-hanging fruit that was at the human scale, and what’s left are phenomena at the extremes of scale.
The notion of an ML model representing an equation is a cool one. Hopefully interpretability does indeed get good enough for us to translate the weights into notation.