Building domain-informed models
Resolve Magazine, Fall 2023: Making Sense of Machine Learning
Collaboration between those conducting foundational research into machine learning and those who apply machine learning methods is laying the groundwork at Lehigh for creating more precise tools to solve a range of compelling science and engineering problems.
The Program in the Foundations and Applications of Mathematical Optimization and Data Science (MODS) is an interdisciplinary group of faculty within Lehigh’s Institute for Data, Intelligent Systems, and Computation (I-DISC). MODS researchers are working on topics including optimization theory and algorithms, as well as the application of modern data science and machine learning tools in a diverse range of fields, such as energy, healthcare systems, and quantum chemistry.
“We’re no longer building purely black box models,” says MODS member Srinivas Rangarajan, an associate professor of chemical and biomolecular engineering. “There are other ways of thinking about how these models are built, where you don't build them purely from data, but you subject those neural networks to constraints that satisfy the underlying laws of your domain.”
For instance, he says, in most real-world systems—like the chemical processes that produce fuel from raw materials—any model must satisfy known physical laws such as those concerning mass, temperature, and thermodynamics. These laws cannot be violated.
“If you just blindly build a data-driven model, it might give you negative concentrations, negative temperature, or violate conservation of mass,” says Rangarajan, whose primary research area is sustainable catalysis. He and his team use machine learning to develop models to better understand catalytic reactions and ultimately design energy- and carbon-efficient catalysts and optimal operation conditions. “You want to ensure you’re infusing domain information in machine learning models. And that requires solving constrained optimization problems.”
Those problems are currently being solved by MODS member Frank Curtis, a professor of industrial and systems engineering, and his team. Curtis is building novel algorithms to train neural networks capable of addressing the unique domain problems of researchers like Rangarajan.
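As a rough illustration of what a constrained problem of this kind can look like, consider a toy case: a purely data-driven estimate of three mole fractions comes out with a negative value and a total above 1. The sketch below finds the closest set of values that is nonnegative and sums to the required total, which is a small constrained optimization problem with a well-known exact solution (Euclidean projection onto a simplex). It is not the method used by the Lehigh teams; the function name `project_to_simplex` and the sample numbers are hypothetical, chosen only to show the idea.

```python
def project_to_simplex(values, total=1.0):
    """Return the point closest to `values` (in least-squares distance)
    whose entries are all >= 0 and sum to `total`.

    Uses the standard sort-based projection: find a single shift `theta`
    such that clipping (v - theta) at zero gives the required sum.
    """
    sorted_desc = sorted(values, reverse=True)
    running_sum = 0.0
    theta = 0.0
    for k, v in enumerate(sorted_desc, start=1):
        running_sum += v
        candidate = (running_sum - total) / k
        # Keep the shift from the largest k whose entry stays positive.
        if v - candidate > 0:
            theta = candidate
    return [max(v - theta, 0.0) for v in values]


# Hypothetical raw model output: one negative fraction, total above 1.
raw_estimate = [0.7, 0.45, -0.1]
corrected = project_to_simplex(raw_estimate)

print("raw:      ", raw_estimate, "sum =", round(sum(raw_estimate), 6))
print("corrected:", [round(c, 6) for c in corrected],
      "sum =", round(sum(corrected), 6))
```

With these sample numbers the corrected fractions come out to approximately 0.625, 0.375, and 0, which respect both constraints. Training a neural network under physical constraints is a much harder version of the same idea, since the constraints must hold for the model's predictions rather than for a single vector of numbers.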
“Beyond catalysis, there are lots of problems that could benefit from a machine learning model that’s actually informed by the domain so it doesn’t give us rubbish results,” says Rangarajan. “Our goal here is to bring foundational and application people together to solve these problems that our current tools can’t solve.”
Main image: Douglas Benedict/Academic Image