The building blocks of artificial intelligence—data, computing power, and mathematical models—have been around for decades. But only recently have they been employed at a level of sophistication and on a large enough scale to weave machine learning into our everyday lives.
“Anybody that’s been using tools powered by AI, like your smartphone recognizing speech or your photos app classifying images, has seen a substantial improvement over the past five to 10 years,” says Frank E. Curtis, an associate professor of industrial and systems engineering.
As researchers learned how to put the pieces together effectively and sought out bigger and better applications of AI, such as self-driving cars, new algorithms used to “train” machine learning models were springing up everywhere.
Curtis, an optimization expert who focuses on algorithm design, and his colleagues Jorge Nocedal, an engineering professor at Northwestern University (and Curtis’ PhD advisor), and Léon Bottou, of Facebook AI Research, recognized this as an opportunity.
“We saw the need to take all the different approaches people were proposing, solidify them, and share some perspective on what these algorithms could accomplish,” Curtis explains. “Doing so would not only help researchers better understand what others are doing but also characterize these approaches in a way that would help reveal new possibilities and new directions that people should explore.”
In 2018, the team published an influential paper, “Optimization Methods for Large-Scale Machine Learning,” in SIAM Review. Less than two years after its publication, the work has been recognized as a “Hot Paper” and a “Highly Cited Paper” in the field of mathematics by Clarivate Analytics’ Web of Science and ranks in the top 5 percent of all research outputs scored by Altmetric, a tracker of online mentions of research articles.
“The stochastic gradient method is the most popular algorithm used in large-scale machine learning applications like text recognition and image classification,” says Curtis. “We analyzed that algorithm concisely, generalizing the known theory for it in useful ways, so that someone could take some other algorithms that are modified versions and use the same analysis—citing our work instead of redoing things from scratch.
“When you have all these people working in the same area, the wheel tends to get reinvented many times,” he continues. “We’ve created a resource that characterizes different types of algorithms out there in an elegant way, and people can look at the landscape of possibilities and identify where their work fits in and what gaps they can fill in our understanding.”
In the marketplace, Curtis says, Facebook, Google, and other big internet players are pouring vast amounts of money into machine learning and into the high-performance computers and data gathering that fuel the technology. More finely tuned algorithms cut associated costs and improve efficiency over the long term, he explains. They also have the potential to lower the barriers to entry, allowing smaller companies and even individuals to leverage artificial intelligence, and to push the technology further.
In the future, Curtis says, more sophisticated algorithms could support machine learning models (in areas like text recognition, for example) that operate simultaneously across multiple computers instead of on a single supercomputer.
“Millions and millions of people have smartphones. They’re constantly speaking to them and seeing the results. And if something is wrong, they’re correcting it. There is so much data that people are generating, but it’s not put together on one computer, nor would everyone want their data collected. But you’re essentially training your own model locally. What if you could create an algorithm that would allow you to share that intelligent model without the identifiable data?”
Although he’s keenly aware of the innovative (albeit sometimes unsettling) possibilities of AI, Curtis’ own work revolves around building foundational knowledge, rather than focusing on specific applications—a direction he, as a mathematician, finds particularly satisfying.
“The next level of machine learning models will require even more advanced algorithms,” he says. “That’s where the future is. The optimization problems I’m working on might involve energy systems or something else, but to me, it’s great that I can take the same expertise and apply it wherever algorithms are used.”