AI 'hallucinations' can lead to catastrophic mistakes, but a new approach makes automated decisions more reliable
Researchers have developed a new method to improve the accuracy and transparency of automated anomaly detection systems deployed in critical infrastructure.
Scientists have developed a new multi-stage method to ensure that artificial intelligence (AI) systems designed to identify anomalies make fewer mistakes and produce explainable, easy-to-understand recommendations.
Recent advances have made AI a valuable tool to help human operators detect and address issues affecting critical infrastructure such as power stations, gas pipelines and dams. But despite showing plenty of potential, models may generate inaccurate or vague results — known as "hallucinations."
Hallucinations are common in large language models (LLMs) like ChatGPT and Google Gemini. They stem from low-quality or biased training data and user prompts that lack additional context, according to Google Cloud.
Some algorithms also exclude humans from the decision-making process: the user enters a prompt, and the AI does the rest, without explaining how it arrived at a prediction. When this technology is applied to a serious area like critical infrastructure, a major concern is that the lack of accountability, and the resulting lack of trust in the AI, could lead human operators to make the wrong decisions.
Some anomaly detection systems, for example, have previously been constrained by so-called "black box" AI algorithms, which are characterized by opaque decision-making processes that produce recommendations that are difficult for humans to understand. This makes it hard for plant operators to determine the algorithm's rationale for flagging a particular anomaly.
A multi-stage approach
To increase AI's reliability and minimize problems such as hallucinations, researchers have proposed four measures, which they outlined in a paper published July 1 at the CPSS '24 conference. In the study, they focused on AI used for critical national infrastructure (CNI), such as water treatment.
First, the scientists deployed two anomaly detection systems, known as Empirical Cumulative Distribution-based Outlier Detection (ECOD) and Deep Support Vector Data Description (DeepSVDD), to identify a range of attack scenarios in datasets taken from the Secure Water Treatment (SWaT) testbed, a system used for water treatment research and training.
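Both detectors are available in open-source libraries, so a rough sense of the workflow can be sketched without access to the study's code. The snippet below is a minimal illustration using the pyod Python library; the file names, column contents and contamination rate are placeholders rather than details from the paper, and the SWaT data itself is distributed separately by its maintainers.

```python
# Minimal sketch (not the authors' code) of running the two detectors named in the
# study on tabular sensor data, using the open-source pyod library.
import pandas as pd
from pyod.models.ecod import ECOD
from pyod.models.deep_svdd import DeepSVDD

# Hypothetical CSV files of sensor readings (flow, tank level, pump state, ...);
# real SWaT data must be obtained separately from its maintainers.
X_train = pd.read_csv("sensor_readings_train.csv").values
X_test = pd.read_csv("sensor_readings_test.csv").values

# ECOD estimates how far each value sits in the tails of the empirical
# distribution of each feature; it needs no tuning beyond the contamination rate.
ecod = ECOD(contamination=0.05)  # assume roughly 5% of samples are anomalous
ecod.fit(X_train)

# DeepSVDD trains a neural network that maps normal data close to the center of a
# hypersphere; points that land far from that center are scored as anomalies.
# (Recent pyod versions require n_features; older ones infer it automatically.)
dsvdd = DeepSVDD(n_features=X_train.shape[1], contamination=0.05, epochs=10)
dsvdd.fit(X_train)

ecod_labels = ecod.predict(X_test)              # 0 = normal, 1 = flagged anomaly
dsvdd_scores = dsvdd.decision_function(X_test)  # higher score = more anomalous
```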
The researchers said both systems had short training times, provided fast anomaly detection and were efficient, enabling them to detect myriad attack scenarios. But, as noted by Rajvardhan Oak, an applied scientist at Microsoft and computer science researcher at UC Davis, ECOD had a "slightly higher recall and F1 score" than DeepSVDD. He explained that the F1 score balances precision (how many of the flagged samples are genuine anomalies) against recall (how many of the genuine anomalies are caught), allowing users to determine the "optimal operating point."
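As a concrete illustration of those metrics, using made-up labels rather than results from the study, precision, recall and the F1 score can be computed directly from a detector's hits and misses:

```python
# Toy example: how precision, recall and F1 summarize an anomaly detector's output.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]  # 1 = a real anomaly in labelled test data
y_pred = [0, 0, 1, 1, 0, 0, 1, 1, 0, 1]  # 1 = a sample the detector flagged

precision = precision_score(y_true, y_pred)  # share of flagged samples that were real anomalies
recall = recall_score(y_true, y_pred)        # share of real anomalies the detector caught
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(f"precision={precision:.2f}  recall={recall:.2f}  F1={f1:.2f}")
# precision=0.80  recall=0.80  F1=0.80
```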
Second, the researchers combined these anomaly detectors with explainable AI (XAI), tools that help humans better understand and assess the results generated by AI systems, making the detectors more trustworthy and transparent.
They found that XAI models like Shapley Additive Explanations (SHAP), which allow users to understand the role different features of a machine learning model play in making predictions, can provide highly accurate insights into AI-based recommendations and improve human decision-making.
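The open-source shap package implements SHAP and can wrap an arbitrary anomaly-scoring function. The sketch below shows the general pattern only; the detector, the random stand-in data and the sensor names are illustrative assumptions, not the exact setup reported in the study.

```python
# Hedged sketch: attaching SHAP feature attributions to an anomaly detector's score.
import numpy as np
import shap
from pyod.models.ecod import ECOD

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 4))  # stand-in for normal sensor readings
feature_names = ["flow", "tank_level", "pressure", "pump_speed"]  # hypothetical sensors

detector = ECOD()
detector.fit(X_train)

# KernelExplainer is model-agnostic: it only needs the scoring function and a
# background sample to estimate each feature's contribution to the anomaly score.
background = shap.sample(X_train, 50)
explainer = shap.KernelExplainer(detector.decision_function, background)

suspicious = np.array([[0.1, 4.5, -0.2, 0.3]])  # e.g., tank level far outside normal
shap_values = explainer.shap_values(suspicious)

# Larger contributions point the operator to the sensors driving the alert.
for name, contribution in zip(feature_names, shap_values[0]):
    print(f"{name}: {contribution:+.3f}")
```

Attributions like these are what would let an operator see, for instance, that an alert was driven mainly by an implausible tank-level reading rather than by the model's quirks.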
The third component revolved around human oversight and accountability. The researchers said humans can question AI algorithms' validity when provided with clear explanations of AI-based recommendations. They could also use these to make more informed decisions regarding CNI.
The final part of this method is a scoring system that measures the accuracy of AI explanations. These scores give human operators more confidence in the AI-based insights they are reading. Sarad Venugopalan, co-author of the study, said this scoring system — which is still in development — depends on the "AI/ML model, the setup of the application use-case, and the correctness of the values input to the scoring algorithm."
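The paper's scoring system is still under development and its internals aren't spelled out here. Purely to illustrate the general idea of checking an explanation against the model, the hypothetical sketch below measures how much of an anomaly score disappears when the features an explainer ranked as most important are reset to typical values; a faithful explanation should account for most of the score. This is a generic perturbation check, not the authors' method.

```python
# Hypothetical illustration only: a simple perturbation-based check of how well an
# explanation accounts for a model's anomaly score. Not the scoring system from the study.
import numpy as np

def explanation_fidelity(score_fn, sample, attributions, baseline, top_k=2):
    """Fraction of the anomaly score removed when the top-k attributed features
    are replaced by baseline (typical) values."""
    original = score_fn(sample.reshape(1, -1))[0]
    top_features = np.argsort(-np.abs(attributions))[:top_k]
    neutralized = sample.copy()
    neutralized[top_features] = baseline[top_features]
    reduced = score_fn(neutralized.reshape(1, -1))[0]
    return 0.0 if original == 0 else (original - reduced) / original

# Example use with the detector and SHAP values from the previous sketch:
# fidelity = explanation_fidelity(detector.decision_function, suspicious[0],
#                                 shap_values[0], X_train.mean(axis=0))
```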
Improving AI transparency
Speaking to Live Science, Venugopalan went on to explain that this method aims to give plant operators the ability to check whether AI recommendations are correct.
"This is done via message notifications to the operator and includes the reasons why it was sent," he said. "It allows the operator to verify its correctness using the information provided by the AI, and resources available to them."
Encouraged by how this research addresses the AI black box problem, Oak said: "With explanations attached to AI model findings, it is easier for subject matter experts to understand the anomaly, and for senior leadership to confidently make critical decisions. For example, knowing exactly why certain web traffic is anomalous makes it easier to justify blocking or penalizing it."
Eerke Boiten, a cybersecurity professor at De Montfort University, also sees the benefits of using explainable AI systems for anomaly detection in CNI. He said it will ensure humans are always kept in the loop when making crucial decisions based on AI recommendations. “This research is not about reducing hallucinations, but about responsibly using other AI approaches that do not cause them,” he added.
Nicholas Fearn is a freelance technology and business journalist from the Welsh Valleys. With a career spanning nearly a decade, he has written for major outlets such as Forbes, Financial Times, The Guardian, The Independent, The Daily Telegraph, Business Insider, and HuffPost, in addition to tech publications like Gizmodo, TechRadar, Computer Weekly, Computing and ITPro.