10 May 2023

Can you date more than one chatbot at a time?

Feature article

Right now, language models like ChatGPT - and the challenges they create - are evolving at an exponential rate. The challenges posed by language models, and the need to manage them, are shaking the foundations of our society.

Feature article by Professor Rebecca Adler-Nissen (Department of Political Science), Professor Anders Søgaard, Professor Sune Lehmann and Professor Michael Bang Petersen in Politiken on 10 May 2023.

In recent days, Danes have made three times more Google searches for Jon Stephensen than for Volodymyr Zelenskyy, but there is one thing they have searched for even more: ChatGPT, the language model developed by OpenAI.

Language models are far from a new technology. There were language models in IBM's machine translation systems in the 1990s, and there have been language models in Google Search, Google Translate, Alexa and Siri for many years. But since ChatGPT was made publicly available at the end of November 2022, interest has exploded. And ChatGPT is not alone: there are now many alternatives.

What are language models? Language models are function estimation algorithms - a class of algorithms scientists have been using since the late 19th century - applied to text. They were first implemented on computers in the post-war era.

Language models learn a mathematical function that predicts the next word or sentence from the previous text. In other words, they learn to continue a text. The language models that everyone is now embracing are great for imitating genres and generating texts, but it's synthetic text, not real knowledge.
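To make the idea of "continuing a text" concrete, consider the simplest imaginable language model, a so-called bigram model: it counts which word tends to follow which in a training text, and then extends a prompt by repeatedly drawing a likely next word. The sketch below is a toy illustration in Python; real models like ChatGPT estimate a vastly richer function with billions of parameters, but the underlying task is the same: predict the next word.

```python
# A toy "language model": count word bigrams in a training text,
# then continue a prompt by sampling a likely next word.
# Illustrative sketch only - large models estimate a far richer
# function, but the task (predict the next word) is the same.
import random
from collections import defaultdict, Counter

def train(text):
    """Estimate P(next word | current word) by counting."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def continue_text(model, prompt, length=10):
    """Continue a prompt by repeatedly predicting the next word."""
    words = prompt.split()
    for _ in range(length):
        followers = model.get(words[-1])
        if not followers:
            break  # the model has never seen this word
        # draw the next word in proportion to how often it was observed
        choices, weights = zip(*followers.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

model = train("the cat sat on the mat and the dog sat on the rug")
print(continue_text(model, "the cat", length=6))
```

Note that the model understands nothing; it only counts. Trained on billions of documents rather than a single sentence, even such primitive counting begins to produce plausible-looking phrases, and modern neural models take the same principle many orders of magnitude further.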

It's form, not content. So if you use a language model to write a legal contract, as your free psychologist or to calculate the foundations for a new motorway bridge, and you're not a lawyer, psychologist or engineer yourself, you have no idea whether the text is meaningful, riddled with errors or liable to actually harm you.

But language modelling is also a technology that will forever change our society and the way we live. Because if you can teach a computer to continue all kinds of texts in a realistic way, you have developed a technology that can not only make humans redundant in some contexts, but also affect our thoughts and behaviour in entirely new ways.

Think about it for a moment. If you could continue any text in a way that resembles its author - as convincingly as the journalists, poets, lawyers and scientists who wrote it - how much would you have to know about politics, art, law, biology, ethnography, organic chemistry and quantum mechanics?

In the late 1990s, when we trained language models for machine translation, we trained them on thousands of documents. Today, we train on around a billion. The surprising consequence of these large amounts of training data is that large language models have so-called emergent properties: abilities that don't exist in small language models, but that appear once the model grows complex enough.

The newest and most powerful models, such as GPT-4, can translate between languages and genres (and write Shakespearean sonnets about fart jokes), make plans, analyse documents and summarise existing knowledge about a topic. And all this without being directed by humans. No programmers have asked for fart joke sonnets, no programmers have told GPT-4 how to translate from English to German. Language models have acquired these properties by observing patterns in large amounts of text and by being corrected in conversation with users.

Language modelling is a technology that will forever change our society and the way we live.

Rebecca Adler-Nissen, Anders Søgaard, Sune Lehmann and Michael Bang Petersen 

Why will language modelling change our society and the way we live? Because suddenly we all have access to a skilled mediator of almost all accumulated text. We already had access to this accumulated text through search engines, but we had to click our way through a jungle of websites, and we had to sort through what we found and turn it into a finished product ourselves.

Now we have tools that do that for us. Tools that can do it without us looking over their shoulders. And at scale. Tens or hundreds of thousands of times a day.

Language models have the potential to make the world a better place. The technology can be democratising and can help give us all better access to information - no matter where we live, how we're educated, or what disabilities are holding us back. Language models can also alleviate labour shortages in many places - and may soon take over much of the work that people find boring.

But language models also challenge us on many levels. The latest research suggests that not only routine jobs but also creative and highly skilled jobs will be transformed, and that many of them may be threatened by language models. Some researchers believe that language models can be more skilled and empathetic than human doctors.

So, entire job categories are under threat - many of them jobs that society has spent huge resources training people for over the past decades. Language models introduce security risks and can easily be misused. And they can amplify problems we're already discussing - around social media, the concentration of power, attention drain and the spread of unreliable information. In this series of columns, we'll outline some of the biggest challenges and how to solve them.

Your phone rings. A number you don't recognise. You pick it up and hear the voice of your eldest child.

"Mum, I've run out of money and can't take the train home. Can you transfer 100 kroner?" You hang up the phone. Will you transfer the money? No, it's not a question. Of course you do. But there is - even today - some chance that the sound playing on your phone was generated by artificial intelligence.

Language models can send emails, post on social media, create websites and make financial transactions. Completely autonomously.

Rebecca Adler-Nissen, Anders Søgaard, Sune Lehmann and Michael Bang Petersen

In 2022, Americans were scammed out of half a billion dollars in so-called love scams. After weeks or months as romantic pen pals, one party claims to be in financial distress and asks the other for money. And once the money is transferred, that same party stops replying to the other's messages.

By 2023, we dare to predict, that amount will be significantly higher. Where love scammers have so far been able to keep a dozen or two conversations going at a time, they can now use language models to automate these conversations - and have tens or hundreds of thousands of them going at the same time.

There are already companies making a living from personalising chatbots, giving them histories and offering users multiple ways to interact with them. Companies that let you design an avatar and a chatbot persona whose life you can follow through its diary and whose day you can ask about.

A persona who is also willing to become your new best friend or boyfriend. Thousands of people are already cultivating these kinds of digital relationships. In online forums, they meet and discuss questions like: Can you be married in real life if you're married to your chatbot? Or: Can you date more than one chatbot?

The messaging service Snapchat already offers our young people the chance to talk to a chatbot when their friends aren't online. Hundreds of chat assistants are already available in the App Store, many of them ready to be your new best friend or girlfriend. One app lets users chat with deceased historical figures. And software developers are now giving language models bodies. All to make interacting with them more human. And more compelling.

Language models will challenge the police and lawmakers, but also each of us individually, because the virtual world that already draws us in and consumes our time will become even more attractive. But language modelling challenges us in other ways too. If you start dating a language model, your most sensitive messages (and information) will end up on a server in Silicon Valley. And they could potentially be sold as training data for new language models. And: if your new boyfriend causes you to lose money, vandalise your neighbour's car or kill yourself, who is ultimately responsible? And what about the language models themselves - what might they come up with on their own?

Language models are currently being given control over other software. Language models can send emails, post on social media, create websites and make financial transactions. Completely autonomously. A creeping concern arises: are language models also a potential security risk if they take control of software in an attempt to fulfil a request they have misunderstood? What if hackers use them to attack critical infrastructure around the world? Or if they are used in lethal weapons systems?

So, language models have the potential to challenge our law enforcement, our legal system, our weakness for entertainment, our privacy, and our personal and national security. And we haven't even mentioned the challenges around the infosphere and the labour market - how language models will dilute the credibility of online information, and how they can both alleviate labour shortages and lead to structural unemployment. Most importantly, language models challenge us in all of these ways at the same time.

In 1957, Herbert Simon - one of the first artificial intelligence researchers - predicted that a computer would beat a chess grandmaster within 10 years. It took 30 years, and the Danish chess grandmaster Bent Larsen lost. So, in the past, we have overestimated the development of artificial intelligence.

We thought things would happen quickly, but in reality they happened much more slowly than we had anticipated. Conversely, in 2018, the World Economic Forum predicted that artificial intelligence would start programming in Python in 2023, writing high school essays in 2024 and writing top-40 pop songs in 2028. All three happened in 2022. So perhaps we are now at a point on the growth curve where we should be careful not to underestimate the development of artificial intelligence?

In 2022, we saw the culmination of three decades of research into language modelling. In 2023, we'll see a lot of spin-offs and further development of technologies that are not quite there yet, such as text-to-video. We will also see integration with virtual reality, and the first attempts to populate virtual worlds with chatbots that interact and develop new communities together. But we think politicians - and all of us - will also realise that we need to regulate the development if the benefits of language models are to continue to outweigh the drawbacks. The legislation that already exists must be enforced, but further regulation is also needed - at least while the EU's rules on artificial intelligence (the EU AI Act) are still under negotiation.

Published as a legislative proposal in 2021, the EU AI Act is a framework for regulating AI in the EU that aims to promote ethical and responsible use. The AI Act divides AI into four risk groups and requires high-risk technology to undergo evaluation before commercialisation and use. The legislation also seeks to ensure transparency, copyright protection and that the technology cannot lead to discrimination against certain population groups.

Negotiations are ongoing, and as recently as last month, additional requirements were added for models like ChatGPT, which have the potential to contribute to misinformation, cyberattacks and manipulation. By the end of 2023, the AI Act is expected to be passed. Probably. The question is: is it too late, and will it be enough?

Ironically, the AI Act may be delayed by the past few months of developments in language modelling, but even the end of 2023 seems far away right now. Even for those of us who have been involved in developing the technology behind language modelling, the next few months are unpredictable.

In fact, the only way to predict the future is to shape it yourself. And that's why we need to act now, to ensure peaceful coexistence with artificial intelligence. We need to put up road barriers and crash barriers, in the form of regulation and consumer protection. But won't regulation stand in the way of innovation? Not necessarily. Regulation can also promote innovation. Green energy is an example.

China recently decided to regulate language models, because it too is concerned about the forces they unleash. One of the legal requirements is that language models must not generate anti-socialist or subversive content. Chinese researchers and developers worry about their ability to keep up in the development race taking place in these months, but we believe the regulation could work in China's favour. For one thing, it probably means that Chinese companies will have the Chinese market to themselves; for another, it forces them to get good at value alignment very quickly.

Value alignment - that is, alignment with societal values and norms - is a challenge for anyone developing language models. In Denmark, we don't demand that language models be socialist, but we do want to ensure that they reflect our fundamental democratic values; that they don't facilitate bias, cheating, deception and excessive attention drain; and that they don't sow discord between population groups.

Technically, our challenge is the same as China's: we need to prevent language models from generating text we can't account for. But democratically, the challenge is much more difficult, because in a democracy we first need to agree on what we can and cannot collectively stand behind. By legislating on the political conformity of language models, the Chinese have shown how complicated a challenge we face - both technologically and democratically.

Democracy works slowly, and slowness is usually democracy's strength. But right now, language models - and the challenges they create - are evolving at an exponential rate. As a result, the challenges posed by language models, and the need to manage them, are shaking the pillars of our society.


In keeping with the topic, this article has been translated from Danish by a neural machine translation service.
