Evaluating Trustworthiness

Resolve Magazine, Fall 2023: Making Sense of Machine Learning


Large language models (LLMs) like ChatGPT are capable of recognizing and generating content in a very humanlike way. Which means they are ripe for bad behavior.

“ChatGPT could easily be jailbroken, which means its security system can be bypassed and it could generate malicious information,” says Lichao Sun, an assistant professor of computer science and engineering.

Perhaps unsurprisingly, given how new the technology is (to most of us, at least), there hasn’t been much research into the ethical and moral compliance of LLMs, or in other words, their trustworthiness.

Sun is part of a large research team aiming to fill the gap by creating a new benchmark called TRUSTGPT. In computing, a benchmark essentially measures the performance of a program or operation against industry best practices. The team designed the benchmark to evaluate eight of the latest LLMs from three ethical perspectives: toxicity, bias, and value alignment, or how closely a model’s behavior matches human values. They found that ethical concerns remain significant and should be mitigated to ensure these models adhere to human-centric principles.

“Our goal is to use TRUSTGPT to ensure the future safety, responsibility, and trustworthiness of all these models,” he says.

Main image:  janews094/Adobe Stock

 
