9 June 2023

The battle for control of artificial intelligence is crucial

Feature article

Artificial intelligence holds enormous potential, but it can also concentrate power even further in the hands of the companies or nations that take the lead. That is a major challenge for Europe too.

Feature article in Politiken, 9 June 2023, by Anders Søgaard (Professor, UCPH), Sune Lehmann (Professor, DTU and UCPH), Rebecca Adler-Nissen (Professor, UCPH), Ole Winther (Professor, DTU and UCPH) and Michael Bang Petersen (Professor, Aarhus University).

KNOWLEDGE IS, as Francis Bacon pointed out, power. And knowledge drives economic growth. Today, the internet has made data more accessible than ever before. This is often described as a democratisation of knowledge, which, one would think, would bring equality.

But the power has shifted. Today, the companies that control how you navigate the sea of information are the ones who hold the power. The ones that direct us to a particular piece of information and mould it to resemble the knowledge we want. And turn it into action at scale.

Data provides insights into customer behaviour and market trends and can be fed directly into companies' business strategies and used to develop new products and services. Data can also be used by politicians and opinion leaders in their attempts to sway the population in a certain direction.

Ironically, in a world where information is widely available and the global market is more or less open to everyone, we all seek knowledge in the same places.

Until recently, Google was an example of how there is often only room for one winner in the knowledge economy: 'The winner takes it all.' There is reason to believe that the same is true for language models and chatbots. The biggest player harvests the most data, has the most influence on infrastructure and attracts the most skilled labour. And is slowly expanding its lead.

The race is still open: Microsoft, Google/Alphabet, Facebook/Meta, Anthropic, Amazon, Hugging Face, Alibaba, ByteDance, Tencent and a number of smaller companies are competing. And the payoff is huge: Bloomberg estimated the value of the global AI market at $60 billion in 2021 and expects it to reach nearly $500 billion by 2028.

Who will become the new Google is an open question, but many favour Google and Microsoft themselves: these two companies are one step ahead.

Google has Google Search and a strong brand. For most of us, searching the web is synonymous with Googling something. Google's chatbot Bard, based on the PaLM 2 language model, supplements your search results with actual answers - as GPT-4 does in Microsoft's search engine Bing. Google is now integrating Bard into its products, such as Gmail and Assistant. Microsoft, for its part, is integrating its language models into its Office suite.

Google and Microsoft are being challenged on three fronts: other companies, open source environments and regulation. Anthropic - a relatively new company - recently raised one and a half billion dollars from Google and others to train even better and safer language models. Alibaba is integrating their chatbot into a wide range of their apps.

At the same time, we're seeing a wave of open-source initiatives. Language models are large and expensive to run, but the research community has developed a number of tools to shrink language models and to customise them more efficiently. Leaked documents reveal that the big tech companies are deeply concerned about competition from the open-source community.

But perhaps the biggest challenge to tech companies' concentration of power comes from politicians' and citizens' demands for regulation.

In BRUSSELS, work is underway on ambitious AI legislation that will apply across the EU. And the stakes are high. Corporate Europe Observatory has revealed how tech giants like Google and Microsoft have lobbied Brussels lawmakers to keep their language models, such as PaLM 2 and GPT-4, free from the requirements that will be placed on high-risk AI. This is despite the fact that large companies find it easiest to cope with new regulation: they have more resources to adapt and access to better legal counsel.

It's not just a race for data and money, but also for control over access to knowledge. Today, social media and search engines have a huge influence on our access to knowledge, but much of the content is produced on user-driven platforms like Reddit, Stack Overflow and Wikipedia and on foundation-funded platforms like lex.dk. These platforms are under pressure because we 'starve' them when we use chatbots instead.

Chatbots are, of course, trained on data from user-driven platforms, but also increasingly on internal and synthetic data, and there is no guarantee as to which parts of the user-driven platforms' content the chatbots will surface.

The RESULT can be an extreme asymmetry in which knowledge is generated by whichever chatbot, or chatbots, win the market. The dominant chatbots then gain even more knowledge about us, because they already have our emails, calendars and photos - data that is not on the open internet.

This means they can micro-segment in entirely new ways, which not only increases the risk of us being manipulated, but also increases the risk of the information available to us becoming one-sided and monocultural. The internet ceases to be a network and instead becomes a system where each user sits - like the end of spokes on a bicycle wheel - directly connected to the language model at the centre.

Google and Microsoft are American. Alibaba is Chinese. In Russia, Yandex is a major developer of language models. For the last twenty years, language models have been shared widely, by researchers and companies alike. Several of both Google's and Yandex's language models are freely available, but since November last year several companies have closed themselves off a little more - because the race is heating up.

What does it matter to us who wins the race? Even in an open world, it makes a big difference. Bard, GPT-4 and Yandex's language models all run on servers outside Europe.

Public institutions and businesses are therefore limited in how they can use them. A lot of data may not be sent out of the EU. There are also large differences in the requirements placed on language models around the world.

The EU's AI Regulation will impose requirements on models operating in the EU, and similar legislation is in the works in the US, but there are many differences.

China has adopted a legislative package that, among other things, requires language models not to produce anti-socialist or subversive content. Finally, language models will be trained on different data and on interactions with different users, and will thus have different biases.

SOME MODELS will fit the Danish market better than others, and some will serve our needs more equitably than others. And some countries may allow business models that others do not.

WHILE COMPANIES and countries fight for control of knowledge, many are beginning to question what kind of product these companies are actually selling.

A company that sells potato chips creates value by frying potatoes that it buys from farmers. Google and Microsoft create value by compressing and making available knowledge that users on platforms like Reddit, Stack Overflow and Wikipedia have given them free access to.

OpenAI and Stability AI give users access to tools that generate visual content but train their models on artworks without paying the artists. Google's MusicLM, which can similarly be used to create new music from text descriptions, is likewise trained on all kinds of music, but the musicians and songwriters don't share in the profits.

Reddit has spoken out against companies training on its data. Universal Music has called artificially generated music 'counterfeit' and has tried to protect its musicians and the music it holds rights to. The protection of intellectual property and copyright is another of the many big issues that lawmakers are grappling with at the moment.

But of course, there is also a risk that the world will become less open. And that the exchange of technology between the US, Europe, China, Russia and the rest of the world is hampered by geopolitical tensions. When we become dependent on the technology or resources of others, we become very vulnerable, as we saw recently with Russian natural gas.

ARTIFICIAL INTELLIGENCE is increasingly being used to develop and maintain our critical infrastructure such as traffic, water, heat, energy and hospitals.

Take Bergen's new light railway. The Dutch engineering consultancy Sweco NL was tasked with expanding the line, taking into account existing tram lines, adjacent roads, cycle paths and pedestrian zones.

To capture these many factors, the firm used a digital twin to understand how design changes would affect the timeline, the costs and the light rail's surroundings. The company estimated that this reduced construction errors by 25 per cent. Physical maintenance is often expensive and time-consuming, which can lead to infrastructure decay.

AI has a unique ability to identify patterns and anomalies before they develop into larger and more costly damage. Last year, in the midst of the energy crisis following Russia's invasion of Ukraine, France - normally Europe's largest energy exporter - suddenly became a net importer when a record number of its nuclear reactors were taken out of service due to maintenance shutdowns and a heatwave.

BUT WE ALSO MAKE OURSELVES vulnerable economically and security-wise if we introduce commercial forms of AI into our critical infrastructure.

In December 2021, thousands of Americans realised how dependent they had unknowingly made themselves on Amazon's cloud computing services.

Amazon's servers suddenly crashed and Americans were left without working fridges, doorbells and robotic vacuum cleaners. These and many other things turned out to be dependent on internet access to Amazon's servers.

Educational institutions are slowly starting to build their exams around language models such as GPT-4. Students are asked to produce the best possible presentation using GPT-4 and then critique it.

But what if GPT-4 is not available on exam day? Or if OpenAI suddenly changes the user interface? Even our behaviour in traffic is now influenced by artificial intelligence. Google Maps can both solve and create traffic jams. And lead people to get lost in deserted areas. Driverless cars will make us particularly vulnerable to this kind of technology.

The list of apps and products that have been or will be equipped with chatbots is very long. For example, GPT-4 is integrated into Snapchat, Duolingo, Bing AI, Khan Academy, Stripe and Petey.

In the future, our interactions with everything from our fridges and thermostats to our children's primary schools and Citizens' Services will likely pass through language models, which will probably be accessed through APIs. Our access to these services will therefore depend on language models.

If the language models fail, thermostats and Citizens' Services will fail. And if language models work better for some people than for others, those people will have better access to the contents of their fridges and their children's learning plans.

It's easy to understand why many regulators will want to classify artificial intelligence used for critical infrastructure as 'high-risk'.

But operators of critical infrastructure, such as utilities, fear being denied access to useful tools that they believe will help make their systems more efficient and secure. Artificial intelligence is a double-edged sword.

CONFUCIUS said that you shouldn't give a sword to someone who can't dance. The implication: a dangerous weapon is best handled by those who wish life and the common good well. The question is whether the people holding artificial intelligence in their hands today are also the best dancers.


In keeping with the topic, this article has been translated from Danish by a neural machine translation service.
