Whether you’re mapping a route, buying a book, or simply checking the time, if it’s on the web, Yinzhi Cao, assistant professor of computer science and engineering, is interested in—and concerned about—how it impacts your digital privacy and security.
“People are gradually beginning to understand the importance of security, but it’s still a second-tier citizen,” Cao says. “Functionality, profit, convenience—they all still come first.”
It was the website coupons.com that first introduced Cao to the idea of web security as a field of interest and study. “I was pursuing my Ph.D. at Northwestern at the time, and I had a colleague with a baby who asked me to help him buy diapers through the site,” he says. “It allows you to print coupons for various goods and activities, but in this case, you could only do it twice, and then it was gone from your computer.”
Cao bought his share of the diapers, but his gears kept turning. “I thought it was so interesting how the site could know that you’ve printed twice,” he says. “I knew something was happening in the browser, and I knew that was a topic I wanted to look at more closely—web tracking.”
On the user side, web tracking takes many different forms, Cao says. Some, like targeted couponing and online banking authentication, are good. Others, including web advertisement personalization, are privacy violations that offer users little benefit.
One way Cao has put his own stamp on the field is by developing a way to track users across multiple web browsers. The technique generates a unique fingerprint for each user with data from operating systems, graphics cards, audio hardware, and central processing units, and it helps save time and protect user data across browsers like Internet Explorer, Mozilla Firefox, and Google Chrome. According to a paper Cao and his colleagues published on their work, their method can successfully fingerprint as many as 99.24 percent of users.
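To make the idea concrete, here is a minimal sketch of how a fingerprint might stay the same across browsers when it is built only from machine-level traits. It is an illustration under stated assumptions, not the method from the paper: the article names only the categories of data used, so the feature names, the sample values, and the `device_fingerprint` helper are all hypothetical.

```python
import hashlib
import json


def device_fingerprint(features: dict) -> str:
    """Hash machine-level traits into a single identifier.

    Hypothetical helper: the feature keys are illustrative and are
    not taken from the published work.
    """
    # Sort keys so the order in which a browser reports traits does not matter.
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up readings from two browsers running on the same machine.
from_chrome = {"os": "ExampleOS 10", "gpu": "ExampleGPU 1000",
               "audio": "ExampleAudio 2", "cpu_cores": 4}
from_firefox = {"cpu_cores": 4, "audio": "ExampleAudio 2",
                "gpu": "ExampleGPU 1000", "os": "ExampleOS 10"}

# Made-up readings from a different machine.
other_machine = {"os": "ExampleOS 10", "gpu": "ExampleGPU 2000",
                 "audio": "ExampleAudio 2", "cpu_cores": 8}

same_device = device_fingerprint(from_chrome) == device_fingerprint(from_firefox)
different_device = device_fingerprint(from_chrome) != device_fingerprint(other_machine)
```

Because none of the inputs depend on the browser itself, switching browsers leaves the hash unchanged, while a change in hardware produces a different one.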
But tracking is merely one facet of a deeply complex and constantly changing area of study.
Cao has also developed a digital product called Safepay that aims to limit the ability of scam artists to skim someone’s credit card number and use it to make fraudulent purchases.
The idea came about because the incidence of fraud is so much higher for magnetic stripe cards than for those with chips. However, for all the added security of chip cards, many retailers are still reluctant to adopt the technology.
“Even after the big Target breach in 2013, companies don’t want to pay up front against potential loss later,” Cao says. “Even with fraud responsibility shifting from the banks to the retail store and those entities, they still won’t do it.”
Safepay takes a novel approach to this problem by generating a unique credit card number every time you make a purchase. Using a mobile banking app on a smartphone, it generates this unique number and transmits it electronically through the phone’s audio jack and into an old-fashioned magnetic reader. While these readers are most closely associated with skimming problems, the fact that your credit card number would change with every purchase neutralizes this concern.
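The article does not say how Safepay derives its numbers, so the following is only a sketch of one plausible approach: the app and the bank share a secret, and each purchase feeds a counter through an HMAC to produce a fresh 16-digit number with a valid Luhn check digit. The shared secret, the counter, the `400000` prefix, and every function name here are assumptions made for illustration.

```python
import hashlib
import hmac


def luhn_check_digit(partial: str) -> str:
    """Return the digit that makes partial + digit pass the Luhn check."""
    total = 0
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:  # these positions are doubled once the check digit is appended
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """Standard Luhn validation of a full card number."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def one_time_number(secret: bytes, counter: int, prefix: str = "400000") -> str:
    """Derive a 16-digit number from a shared secret and a purchase counter.

    Hypothetical scheme: the prefix and the derivation are illustrative only.
    """
    mac = hmac.new(secret, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    body = str(int.from_bytes(mac, "big") % 10**9).zfill(9)
    partial = prefix + body  # 15 digits before the check digit
    return partial + luhn_check_digit(partial)


first_purchase = one_time_number(b"demo-shared-secret", 1)
second_purchase = one_time_number(b"demo-shared-secret", 2)
```

In a scheme like this, a skimmer that captures `first_purchase` gains nothing, because the issuer would expect the number derived from the next counter value on the following purchase.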
“The reluctance to change makes backward compatibility really important in the field of web security,” Cao says. “Safepay is a secure alternative for those retailers that aren’t ready to adopt the chip model yet.”
Cao also studies security and data privacy in mobile apps. “I’m very interested in app behavior and intent,” he says. “You have two apps that send your location to a third-party server. One is a map—that’s fine. The other is one that sends a ‘Merry Christmas’ notification to your friends. That’s an illegitimate use. It doesn’t need your location data.”
What Cao and his research team are trying to determine is whether there’s a correlation between the way app developers describe their products in the digital marketplace and the way those products actually handle sensitive information. This may help app users make smarter, more informed choices about which apps get access to their contacts, location, and other personal data.
Cao completed his doctoral work at Northwestern in 2014 and took a position as a postdoctoral scientist at Columbia. A year later, he was interviewing for other positions around the country, including one at IBM. There, he met Ting Wang, now a fellow assistant professor of computer science and engineering. They were both exploring faculty positions in Lehigh’s Department of Computer Science and Engineering. They compared notes, and both felt that Lehigh would be the right place for them. Now, they’re both involved in various projects related to web security and machine learning.
“One project I have right now is looking at the way bugs in autonomous systems may impact people’s lives,” he says. With self-driving cars, he can shade a few pixels in the images of the road they’re processing. “The car may need to turn left, but by introducing this bug, it’ll turn right and crash into the guardrail,” he says. By understanding how these bugs are introduced and interpreted, researchers can develop defense mechanisms.
Alongside a colleague from Columbia, he also held a four-year, $1.2 million research grant from the National Science Foundation to explore a topic called “machine unlearning,” which sought to give autonomous learning systems methods for permanently forgetting personal data.
Cao knows privacy and security are really important aspects of digital life that tend to get pushed aside for cost and other reasons. That’s why he seeks to understand them from many different perspectives, stay ahead of the curve, and create solutions that work for various customers and platforms.
It’s also why he stresses security when he teaches the next generation of computer scientists and engineers.
“It’s really important to me to integrate these concepts into my teaching so that students remember how important it is,” he says. “I want them to keep pushing security forward.”