Tackling computing efficiency and security challenges on the cutting edge of artificial intelligence

“My philosophy is, maybe you publish 100 or so papers in your career, but a few of them should have a direct and immediate impact on the real world,” says Wujie Wen.

As an assistant professor in the Department of Electrical and Computer Engineering, Wen puts that philosophy into regular practice. He researches how to improve computing efficiency and security, primarily for the machine learning and AI applications that will play key roles in the future of autonomous vehicles, healthcare, and other fields. Three of his most recent projects have been supported by the National Science Foundation (NSF).

The first addresses the intense energy demands of running deep learning algorithms (the kind that power computer vision and language processing technologies). In traditional computer architecture, computation and memory are handled by separate hardware units. “But if you look at neurons inside the human brain,” says Wen, “they can process and memorize things, which is why the human brain is much, much more efficient than the modern computer.”

Neuromorphic computing—also known as brain-inspired computing—uses an emerging post-CMOS (complementary metal oxide semiconductor) device called a memristor crossbar to essentially mimic how neurons can accomplish both tasks simultaneously.
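In a crossbar array, a neural network layer's weights are stored as device conductances, and the multiply-accumulate work happens in the analog domain: input voltages applied to the rows are scaled by each cross-point's conductance (Ohm's law) and summed as currents along the columns (Kirchhoff's current law). Here is a minimal NumPy sketch of that idea, with made-up sizes and values rather than anything from Wen's hardware:

```python
import numpy as np

# Illustrative 4x3 crossbar: each cross-point stores a neural network
# weight as a memristor conductance (in siemens); values are made up.
G = 1e-6 * np.array([[1.0, 0.5, 0.2],
                     [0.3, 0.9, 0.4],
                     [0.7, 0.1, 0.6],
                     [0.2, 0.8, 0.5]])

v_in = np.array([0.8, 0.2, 0.5, 1.0])  # input activations as row voltages

# Ohm's law at each cross-point and Kirchhoff's current law along each
# column yield the matrix-vector product in a single analog step:
i_out = v_in @ G  # column currents = weighted sums of the inputs

print(i_out)  # one output current per "neuron" (column)
```

Because the multiply and the accumulate happen where the weights are stored, no data shuttles between a separate memory and processor, which is the efficiency advantage Wen describes.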

The device makes neuromorphic computing systems more efficient and enables on-device intelligence in pervasive edge and Internet of Things platforms with stringent resource limits. But to date, these devices have been highly unreliable at programming, reading, and retaining data because of immature fabrication processes. Left unchecked, they introduce errors that accumulate over the course of a deep learning task and eventually lead to inaccurate results.

Wen and his team are building a self-healing framework with an integrated test, diagnosis, and recovery loop to maintain the health of next-generation AI hardware accelerators built on these devices.
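The specifics live in the team's publications, but the general shape of such a loop can be sketched with a simulated, drifting weight array standing in for the real device. Everything below is an illustrative toy, not the team's framework:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ideal weights the crossbar should hold, and a "device" copy whose
# values drift because of imperfect programming/retention (simulated).
ideal = rng.uniform(0.1, 1.0, size=(4, 3))
device = ideal + rng.normal(scale=0.05, size=ideal.shape)

test_v = rng.uniform(0.0, 1.0, size=(5, 4))  # known test inputs
golden = test_v @ ideal                      # expected outputs

for step in range(1, 11):
    observed = test_v @ device                   # test
    err = np.abs(observed - golden).max(axis=0)
    faulty = np.flatnonzero(err > 0.05)          # diagnose
    if faulty.size == 0:
        print(f"healthy after {step - 1} repair pass(es)")
        break
    # recover: rewrite the drifted columns (write-verify is still noisy)
    device[:, faulty] = ideal[:, faulty] + rng.normal(
        scale=0.005, size=(4, faulty.size))
```

The loop repeats until the observed outputs fall back within tolerance, which is what keeps accumulated device errors from corrupting a long-running deep learning task.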

“We’ve developed a very scalable method to address these issues, and make the computation in these emerging devices more reliable and sustainable,” he says.

His second project (now complete) was focused on developing defense solutions to ensure the security of artificial intelligence systems.

“AI is great, but it’s also dangerous,” Wen says. “The human eye can easily recognize a stop sign, but an image of a stop sign that has a scratch on it could be misclassified as a speed limit sign by the AI. And because so many vendors like Google allow users to upload data to a cloud platform, attackers could introduce abnormal data or inject malicious intent into the AI model and destroy it. So if you have an application like a self-driving car that accesses the cloud for decision-making, the result could be disastrous.”
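The scratched stop sign is a classic adversarial example. One standard way such inputs are generated, the well-known fast gradient sign method (shown here as a generic illustration, not necessarily the attack Wen's team studied), perturbs each pixel slightly in whichever direction most increases the model's loss:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, label, epsilon=0.03):
    """Fast gradient sign method: nudge each pixel to raise the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # A perturbation this small is invisible to a person but can
    # flip the model's prediction.
    return (x + epsilon * x.grad.sign()).clamp(0.0, 1.0).detach()

# Demo on a toy classifier (purely illustrative).
model = torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 32 * 32, 10))
x, label = torch.rand(1, 3, 32, 32), torch.tensor([0])
x_adv = fgsm(model, x, label)
print((x_adv - x).abs().max())  # bounded by epsilon
```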

He and his team developed a new low-cost, high-accuracy safeguarding paradigm by rooting the defense directly in the compression techniques that are already necessary for efficient AI processing in hardware. In this way, the resource overhead incurred by the defense is minimized.
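The details of that paradigm are in the team's publications, but the underlying intuition can be illustrated with one common compression step, uniform quantization, which happens to snap small adversarial perturbations back onto the same discrete values. This is a generic toy, not Wen's actual defense:

```python
import numpy as np

def quantize(x, levels=16):
    """Uniform quantization: a standard compression step for efficient
    AI hardware that also snaps away small input perturbations."""
    return np.round(x * (levels - 1)) / (levels - 1)

x = np.random.rand(8)                                   # clean activations
x_adv = np.clip(x + np.random.uniform(-0.02, 0.02, 8),  # tiny attack
                0.0, 1.0)

# After quantization the two inputs frequently land on identical levels,
# so the perturbation never reaches the rest of the network.
print(np.abs(quantize(x) - quantize(x_adv)).max())
```

Because the quantizer has to run anyway for efficiency, a defense built into it adds little extra cost, which is the "low-cost" part of the claim.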

Wen’s third project explores techniques to better harness AI for future applications, such as medical imaging diagnosis and edge computing. Images like CT scans, for example, are enormous, and sending them to a processing unit—like an edge server—can take longer than computing on the image itself.

“We’re thinking that in the future, you’re going to have a lot of resource-limited devices,” says Wen. “And these devices need to communicate with the server or with the cloud to do some kind of machine learning inference service. Traditionally, the focus has been on the inference, or the computation side. But people have ignored the communication side. It can take more than 10 seconds to send data from your sensor to the processing unit, and if your computation in the cloud takes only 5 milliseconds, that’s a big issue. And we are actually the first ones to examine it.”
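A back-of-the-envelope calculation makes the imbalance concrete. The file size and link speed below are illustrative assumptions; only the 5-millisecond compute time comes from Wen's example:

```python
# Illustrative numbers, not measurements from Wen's work.
scan_bytes = 500e6        # a large medical scan, ~500 MB
uplink_bps = 400e6        # a 400 Mbit/s uplink
compute_s = 0.005         # 5 ms of inference in the cloud

transfer_s = scan_bytes * 8 / uplink_bps   # bits sent / bits per second
print(f"transfer: {transfer_s:.0f} s, compute: {compute_s * 1e3:.0f} ms")
print(f"communication costs {transfer_s / compute_s:.0f}x the computation")
```

Under those assumptions the transfer takes 10 seconds, two thousand times longer than the inference, so shaving milliseconds off the computation alone barely moves the end-to-end latency.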

The ideal, he says, is end-to-end low latency, especially for real-time machine learning services like collision avoidance in vehicles. So Wen and his colleagues developed new compression frameworks that dramatically reduce the combined latency of communication and computation.

“The existing image compression frameworks are designed for the human eye, which is more sensitive to the low-frequency components,” he says. “But machine learning systems are not like that. We were the first ones to look at this issue by examining the image perception difference between human vision and machine vision. And we were able to develop a new compression framework that reduced the communication time by a factor of four to five.”
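JPEG-style codecs spend their bits on the low-frequency DCT coefficients the human eye notices; a machine-vision-oriented codec can keep a different subset entirely. Here is a toy sketch of that idea on a single 8x8 block, where the coefficient mask is a made-up stand-in for a learned, model-specific importance map, not Wen's actual framework:

```python
import numpy as np
from scipy.fft import dctn, idctn

block = np.random.rand(8, 8)             # one 8x8 image block
coeffs = dctn(block, norm="ortho")       # JPEG-style frequency transform

# Human-oriented codecs keep the top-left (low-frequency) coefficients.
# A machine-oriented codec keeps whichever coefficients the downstream
# model is most sensitive to; this mask is a made-up stand-in for that.
mask = np.zeros((8, 8), dtype=bool)
mask[:3, :3] = True                      # the usual low-frequency set...
mask[6:, 6:] = True                      # ...plus some high-frequency terms

recon = idctn(np.where(mask, coeffs, 0.0), norm="ortho")
print(f"kept {mask.sum()}/64 coefficients, "
      f"max error {np.abs(recon - block).max():.3f}")
```

Keeping only the coefficients the model actually needs is what shrinks the transmitted data, and with it the communication time.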

It’s all work with the potential to translate into regular use in the real world, exactly in line with Wen’s mission and philosophy as a researcher. His methods span both hardware and software, and honing that range of expertise is something he stresses to his students.

“Computer engineering requires so much knowledge. It’s not just the programming. It’s learning both the software and hardware components,” he says. “That’s what I tell my students. And when they develop that level of experience, they’re able to get hired by these big companies and solve practical engineering problems that have a real impact on society.”

Illustration: Shutterstock/Lyolik84