Chinese researchers develop AI hallucination correction tool

The tool is said to achieve better accuracy than baseline models.

A team of researchers at the University of Science and Technology of China (USTC) and Tencent's YouTu Lab has developed Woodpecker, a tool that corrects hallucinations in multimodal large language models (MLLMs).

Hallucination is a growing concern in generative artificial intelligence (AI): it occurs when a large language model (LLM) such as OpenAI's ChatGPT or Google Bard generates inaccurate or false information that is not based on real data or events.

According to the research paper, Woodpecker offers a novel approach through a "training-free method" that corrects hallucinations in the generated text.

It uses three pre-trained AI models, GPT-3.5 Turbo, Grounding DINO and BLIP-2-FlanT5, to detect and correct hallucinations.

The proposed framework performs correction in five stages: key concept extraction, question formulation, visual knowledge validation, visual claim generation, and hallucination correction.

The pipeline begins by identifying the main objects mentioned in the text, formulates questions around those objects, answers them using expert vision models, converts the answers into a visual knowledge base, and finally rewrites the hallucinated claims.
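How such a pipeline might be wired together can be sketched in a few lines of Python. The sketch below is purely illustrative and is not the authors' released code: every function name is a hypothetical placeholder, and in Woodpecker the stages are backed by GPT-3.5 Turbo (text steps), Grounding DINO (object detection) and BLIP-2-FlanT5 (visual question answering).

```python
# Illustrative sketch of a Woodpecker-style correction pipeline.
# All function names are hypothetical placeholders, not the released API;
# each stage would be backed by the expert models named in the article.

def extract_key_concepts(answer: str) -> list[str]:
    """Stage 1: identify the main objects mentioned in the MLLM's answer."""
    ...

def formulate_questions(concepts: list[str]) -> list[str]:
    """Stage 2: turn each concept into object- and attribute-level questions."""
    ...

def validate_visual_knowledge(image, questions: list[str]) -> dict[str, str]:
    """Stage 3: answer the questions with expert vision models (detector, VQA)."""
    ...

def generate_visual_claims(answers: dict[str, str]) -> list[str]:
    """Stage 4: convert the answers into a structured visual knowledge base."""
    ...

def correct_hallucinations(answer: str, claims: list[str]) -> str:
    """Stage 5: rewrite the answer so its claims match the knowledge base."""
    ...

def woodpecker_style_correction(image, answer: str) -> str:
    """Run the five stages in order and return the corrected answer."""
    concepts = extract_key_concepts(answer)
    questions = formulate_questions(concepts)
    answers = validate_visual_knowledge(image, questions)
    claims = generate_visual_claims(answers)
    return correct_hallucinations(answer, claims)
```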

The researchers said each step in the pipeline is clear and transparent, thus providing good interpretability.

They also claim these techniques have produced promising results on benchmark tests, showing improvements in accuracy, precision, recall and F1 score compared to baseline models.

It is said to "effectively" address both object-level and attribute-level hallucinations, providing a structured visual knowledge base for reference in the correction process.

"Woodpecker enhances the reliability of responses generated by MLMs and reduces hallucinations. It could achieve an accuracy of 79.2 percent with low rates of omission and miss-correction," the researchers stated.

The framework can be easily integrated with various MLLMs, serving as a general plug-and-play module. The researchers have open-sourced the tool, with an interactive demo available for further exploration.
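As a purely hypothetical usage pattern (not the tool's published interface), plugging such a corrector into an existing model could look like the following, where `my_mllm` stands in for any MLLM under test and `woodpecker_style_correction` is the sketch above.

```python
# Hypothetical plug-and-play usage; `my_mllm.answer` is an assumed interface
# for whatever MLLM is being evaluated, not a real API.
raw_answer = my_mllm.answer(image, "Describe this image.")
corrected = woodpecker_style_correction(image, raw_answer)  # post-hoc correction
print(corrected)  # the answer with hallucinated claims revised
```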
