Digitalisation & Technology, 11 December 2024

When AI hallucinates or is biased

Interview with Nicolas Konnerth, Head of Conversational AI at ERGO Group

Between fact and fiction: when AI hallucinates

Artificial intelligence (AI) offers us a whole range of impressive possibilities to support us in our daily work processes. But how reliable are the results? What about the so-called ‘hallucinations’, when AI invents ‘facts’ that are not real? This is a highly relevant topic, especially for companies that use AI systems to analyse their data or in customer contact. We talked about this with our colleague Nicolas Konnerth, Head of Conversational AI. We wanted to know why AI provides factually incorrect results and how users can protect themselves from them.

A look behind the scenes of AI

"Most people now know that AI does not deliver perfect results." Nicolas Konnerth makes this clear right at the beginning of our conversation, but then adds: “What is often underestimated is the question of why AI models sometimes make things up.” The answer to this is simple: artificial intelligence works with probabilities. This means that it calculates a probability for each statement in its answer based on existing training data – regardless of whether this information is correct or incorrect. This is precisely what can lead to so-called hallucinations. "An AI model like ChatGPT is trained to generate the next most likely word to make the text sound coherent," explains Nicolas Konnerth, adding, "But that also means that the AI makes no distinction between correct and incorrect information."

The less specific a request is, the more likely the AI is to get creative. Nicolas Konnerth cites the example of a hotel's AI-powered chatbot potentially providing incorrect information about room availability because it can't access the current data. "This isn't because the AI is 'bad', but because it didn't have access to the latest data at the time the text was generated," he clarifies.

Another problem is that AI models are based on data that is not always complete or consistent. This means that incorrect answers can be produced, especially for poorly documented topics. "For general questions that were very present in the training data, the AI usually provides quite reliable answers. But when it comes to the details, or when a topic rarely occurs, then it usually becomes difficult," says Konnerth.

Many people ask their questions too vaguely. This increases the probability that the AI will give a wrong or inaccurate answer because it fills the scope for interpretation creatively.

Nicolas Konnerth, Head of Conversational AI at ERGO Group

How our questions can provoke false results

But it's not just the training data that affects the quality of AI answers. Users also play a decisive role in whether a hallucination is triggered. "Many people ask their questions too vaguely," emphasises Nicolas Konnerth. "This increases the probability that the AI will give a wrong or inaccurate answer because it creatively fills the scope for interpretation." At this point, he recalls an example from his own work: a colleague had once tested the summary of a longer text with two different models. Model A provided a list of bullet points, while model B provided a continuous text. For our colleague, model A was the better one because it matched his expectation of a summary in the form of bullet points. "But he hadn't indicated in his prompt that he wanted a summary in the form of bullet points. It is precisely such misunderstandings that lead to users misjudging the quality of AI systems," emphasises Nicolas Konnerth.

Prompt optimisation as the key to valid AI results

The quality of the output also depends to a large extent on the quality of the input – a principle that Nicolas refers to as ‘prompt optimisation’. "The clearer the specifications, the more precise the AI's response," he says. However, you should not only say exactly what you want, but also assign the AI a role. One example of this would be to describe the AI as an ‘experienced communications professional’ or a ‘careful customer service representative’ in order to achieve more specific, higher-quality results.
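As an illustration of such a role assignment, here is a minimal sketch assuming the official OpenAI Python SDK; the model name and the wording of the prompt are placeholders, not a description of ERGO's actual setup.

```python
from openai import OpenAI  # assumes the official OpenAI Python SDK (v1+) is installed

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

# Assigning a role and stating the expected output format narrows the model's
# room for interpretation -- and with it the risk of creative "filling in".
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name -- substitute your own
    messages=[
        {
            "role": "system",
            "content": (
                "You are a careful customer service representative at an "
                "insurance company. Answer only on the basis of the text "
                "provided. If something is not covered by it, say so explicitly."
            ),
        },
        {
            "role": "user",
            "content": "Summarise the following policy text in five bullet points: ...",
        },
    ],
)
print(response.choices[0].message.content)
```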

Furthermore, specific examples of how to use the AI help. Nicolas Konnerth advises including a few positive and negative examples in the prompt to make it clear what is and is not desired: "This way, the AI can much better assess what the goal is," he explains.
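One way to picture this is a small, hypothetical helper that places positive and negative examples directly into the prompt text before the actual task; the function and the examples are purely illustrative.

```python
# Hypothetical helper: show the model what is wanted and what is explicitly
# not wanted before stating the actual task.
def build_prompt(task: str, good_examples: list[str], bad_examples: list[str]) -> str:
    lines = ["Desired style -- do write like this:"]
    lines += [f"- {example}" for example in good_examples]
    lines += ["", "Undesired style -- do NOT write like this:"]
    lines += [f"- {example}" for example in bad_examples]
    lines += ["", "Task: " + task]
    return "\n".join(lines)

prompt = build_prompt(
    task="Summarise the customer e-mail below in three bullet points.",
    good_examples=["Customer asks to move the payment date to the 15th of the month."],
    bad_examples=["The customer wrote a long e-mail about various topics."],
)
print(prompt)
```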

The ethical dimension: AI and social prejudices

In addition to the technical challenges, AI hallucinations also raise ethical questions. AI models are only as good or as bad as the data on which they are based. These data may reflect social prejudices and stereotypes. This is referred to as bias – the distortion or even systematic deviation of information. Konnerth cites the following example to illustrate this: "If the model is repeatedly presented with CEOs as white, middle-aged men in its training data, then it will reproduce exactly this image." This is precisely where he sees a great responsibility for both the developers and users of AI. "We have to make sure that we carefully select the data we use to train the AI and question existing stereotypes." Most providers are already setting a good example here and have integrated various filter and security mechanisms to minimise such biases in the data sets. "But the problem will remain, of course, as long as we don't also question our own assumptions," Konnerth points out, adding, "It becomes particularly critical when AI models are used to influence political or social discussions. It can be very dangerous if a model suggests that there is a 'general truth' when in reality it is only a distorted representation of reality."

Responsible use and solutions: human-in-the-loop

One approach that Nicolas Konnerth highlights is the so-called ‘human-in-the-loop’ concept. This involves a human checking every response generated by the AI before it is passed on to customers. "This minimises the risk of false information being disseminated – especially in sensitive areas such as insurance," says Nicolas, recalling an incident in Canada in this context: a company was sued because its chatbot had made a promise that the company did not consider binding; the customer who had relied on it ultimately won the lawsuit. "Examples like this show how important the human factor remains in our dealings with AI – at least for the time being," says Nicolas Konnerth with conviction.
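In practice, a human-in-the-loop setup can be as simple as a review queue that holds every AI draft until a person has approved or corrected it. The following sketch is hypothetical and only illustrates the gating idea.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DraftReply:
    customer_id: str
    ai_text: str
    approved: bool = False

# Nothing generated by the AI reaches a customer until a human has seen it.
review_queue: list[DraftReply] = []

def queue_for_review(customer_id: str, ai_text: str) -> None:
    """Store the AI-generated draft instead of sending it directly."""
    review_queue.append(DraftReply(customer_id, ai_text))

def approve_and_send(draft: DraftReply, correction: Optional[str] = None) -> str:
    """A human approves the draft (optionally correcting it) before delivery."""
    final_text = correction if correction is not None else draft.ai_text
    draft.approved = True
    # send_to_customer(draft.customer_id, final_text)  # actual delivery would happen here
    return final_text
```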

Accepting AI errors – the next logical step?

Interestingly, some AI systems are now outperforming humans in certain tasks. "A study by OpenAI from 2023 showed that ChatGPT achieved results in the US bar exam that were in the top 10 per cent of all participants," reports Nicolas Konnerth. He takes a nuanced view of this development: "Of course, AI still makes mistakes. But so do humans. The question is whether we as a society will eventually be willing to accept mistakes made by AI systems in the same way as human mistakes." However, from Nicolas Konnerth's point of view, there is one thing we must always bear in mind: "AI models are stochastic parrots. They can reproduce content, but they don't understand it." Particularly in the education sector, it is therefore important to start training young people at an early stage to deal critically with learning systems and to understand their weaknesses.

Practical tips for companies in their dealings with AI

The Head of Conversational AI at ERGO Group advises companies that want to use AI efficiently: "Precise questions are the be-all and end-all. The more precisely I formulate what I want, the less likely it is that the AI will give a wrong answer." It is also helpful to work with clear examples from your own practice and thus show the AI what the desired result should look like.

Another way to avoid hallucinations when using AI is the ‘retrieval-augmented generation’ (RAG) approach. "In this case, the AI is given a dedicated knowledge base and may only rephrase answers from the information it finds there. This prevents the AI from simply making things up," explains Nicolas Konnerth, but points out that this method does not eliminate all risks either. "At the end of the day, the fact remains that AI is always based on probabilities and can therefore sometimes be wrong," he concludes.
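To illustrate the principle, here is a deliberately naive RAG sketch in Python. Retrieval here is simple word overlap and the "knowledge base" consists of two invented sentences; a production system would use a vector database and an actual language-model call.

```python
# Naive retrieval: rank documents by how many words they share with the question.
def retrieve(question: str, knowledge_base: dict[str, str], top_k: int = 2) -> list[str]:
    question_words = set(question.lower().split())
    scored = sorted(
        knowledge_base.values(),
        key=lambda text: len(question_words & set(text.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

# The retrieved passages are placed into the prompt, and the model is instructed
# to answer only from them -- the core idea of retrieval-augmented generation.
def build_rag_prompt(question: str, knowledge_base: dict[str, str]) -> str:
    context = "\n".join(retrieve(question, knowledge_base))
    return (
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, say that you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

knowledge_base = {
    "doc1": "Household insurance covers damage caused by fire and tap water.",
    "doc2": "The waiting period for supplementary dental insurance is eight months.",
}
print(build_rag_prompt("Does household insurance cover fire damage?", knowledge_base))
```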

AI is a powerful tool, but it is only as accurate as those who design and use it.

We thank Nicolas Konnerth for the conversation and his valuable insights. 


Your opinion
If you would like to share your opinion on this topic with us, please send us a message to: next@ergo.de

