Most of us will soon be sitting down to holiday dinner with our family and friends, and it’s likely that once the eggnog gets flowing, someone at the table will ask, “What are your thoughts on all this artificial intelligence (AI) hubbub?”
How ready are you to surf this AI tide through the Yuletide?
Usually, people who have opinions about AI fall on a continuum between AI “doomers” (who think AI will usher in the end of humanity) and “accelerationists” (who believe we should develop and integrate powerful AI systems as fast as possible). Where you fall on this scale depends on your “p(doom)” – a tongue-in-cheek term, which emerged from Silicon Valley after OpenAI’s release of ChatGPT, for how likely you think AI is to ruin the world.
If you’re wondering why some people think AI is so scary, you’re not alone. Most AI systems (robots excluded) don’t even have a body. What can unembodied intelligence do all by itself?
In the holiday spirit, let’s start with the good things first. Advancements in AI – and large language models (LLMs) in particular – are helping humans to tackle big data challenges more effectively by identifying patterns in data better and faster. These models leverage a fine-grained understanding of how language functions (e.g., grammar patterns, syntax, word associations).
It turns out there is something special about human language that distinguishes it from how other animals communicate. Unlike wolf howls or whale moans, human language is compositional. We can flexibly combine and recombine sets of words until the reindeer come home. Just 25 different words can generate more than 15,000 unique sentences.
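For the curious, here is one way a number of that size can arise. This is a back-of-the-envelope sketch, not necessarily how the figure above was derived: it assumes a "sentence" is simply an ordered sequence of three words drawn from a 25-word vocabulary, with repeats allowed and no grammar check. The variable names and placeholder words are made up for illustration.

```python
from itertools import product

# Assumptions for illustration only: a 25-word vocabulary and
# "sentences" that are ordered three-word sequences (repeats allowed).
vocabulary_size = 25
sentence_length = 3

# Each position can hold any of the 25 words, so the count is 25 ** 3.
by_formula = vocabulary_size ** sentence_length

# Cross-check by actually enumerating sequences of placeholder words.
words = [f"word{i}" for i in range(vocabulary_size)]
by_enumeration = sum(1 for _ in product(words, repeat=sentence_length))

print(by_formula, by_enumeration)  # 15625 15625
```

Not every such sequence would be grammatical, but the point stands: the number of combinations grows explosively as the vocabulary or the sentence length increases.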
Now imagine all the words in all the languages. An AI system trained to understand the dynamics embedded in all (or most) human languages can grow so powerful that it unlocks hidden patterns in non-linguistic kinds of data, like genetic and biological data. This is because all information can be treated as a “language” with its own intrinsic patterns.
A tool that recognizes patterns is also very good at predicting which patterns follow others, based on features it has already learned. AI predicts what should follow a prompt, like “Ho ho ___,” by estimating the probability that certain words go together.
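To make the “Ho ho ___” idea concrete, here is a toy sketch of next-word prediction. It is only an illustration under stated assumptions: real LLMs use neural networks trained on enormous datasets, whereas this example simply counts which word follows each pair of words in a tiny, made-up corpus. The function names `train` and `next_word_probabilities` are hypothetical.

```python
from collections import Counter, defaultdict

# A tiny, made-up corpus; a real model learns from billions of words.
corpus = [
    "ho ho ho merry christmas",
    "ho ho ho happy holidays",
    "ho ho hum what a day",
]

def train(sentences, context_size=2):
    """Count which word follows each run of `context_size` words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(context_size, len(words)):
            context = tuple(words[i - context_size:i])
            counts[context][words[i]] += 1
    return counts

def next_word_probabilities(counts, prompt, context_size=2):
    """Turn the follower counts for the prompt's last words into probabilities."""
    context = tuple(prompt.lower().split()[-context_size:])
    followers = counts.get(context, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # this context never appeared in the corpus
    return {word: n / total for word, n in followers.most_common()}

model = train(corpus)
probs = next_word_probabilities(model, "Ho ho")
best = max(probs, key=probs.get)
print(best, probs[best])  # ho 0.4
```

In this toy corpus, “ho” follows “ho ho” in two of the five observed cases, so it wins with a probability of 0.4. Scaling the same intuition up to vastly more data and far richer statistics is, loosely speaking, what lets modern systems complete a prompt.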
This generative capacity has changed the playing field of AI because systems can sift through vast possibilities in mere minutes or seconds to propose unique examples (e.g., drug compounds, molecular interactions, protein structures) never before thought of by humans. This reduces the need for costly and time-consuming trial and error, significantly speeding up scientific and medical discovery.
Another perk is that, unlike the previous generation of AI systems that were mainly designed to do one thing (e.g., predict the risk of cardiovascular disease) with limited data, LLMs are trained on such vast amounts of different kinds of data (e.g., text, image, audio, video) that they have become “multimodal.” This means that inferences from one type of data (like text) can be leveraged to generate outputs in another (such as images or sound), enabling programs like DALL-E and Midjourney to translate simple strings of words into breathtaking, original panoramas.
This “multimodality” is a holy grail of the AI revolution. Generating one kind of data using another can accomplish more than just funny images from the prompt “avocado armchair.” Think of what doctors and researchers might accomplish by triangulating medical images, genomic information and electronic health record data into a single, harmonized language. Or consider how existing audio or images of you can become fodder for new, photo-realistic digital clones of you doing or saying things you never did or said.
This is where the tinsel starts to tangle a bit. The same technologies that allow AI systems to generate novel chemical compounds or genetic sequences are also quite handy for creating deepfakes and other fabricated information. AI policymakers and safety advocates are on the edge of their seats waiting to see what mayhem generative AI might cause in the 2024 election season or how it might fuel authoritarian governments, extremist groups, conspiracy networks, social media trolls and others who thrive on misinformation.
Can’t we outlaw these misuses of AI? Some countries are trying, but it’s hard to control (mis)uses of open-source technologies, whose source code is freely available for others to modify and redistribute, as is the case for many LLMs and generative AI (genAI) models. The catch-22 is that limiting open-sourcing by making AI models strictly proprietary or government-controlled also limits their auditability and transparency. That risks perpetuating social injustices and (further) concentrating wealth and power among a privileged few with exclusive control over what some call one of the most important inventions in human history.
The battle over who will control AI is a battle over who controls the flow and integrity of information. It is no wonder people find that scary. AI is voracious for information – yours and mine included – and we feed it every time we use it (without credit or compensation). In essence, these models are us, a representation of all the knowledge our species has documented (and digitized), for better or worse. Naughty or nice. AI doesn’t care, can’t care.
That means we have to trust other humans to make AI care – to build guardrails in our (and our children’s and our species’) best interests. Maybe it’s that, and not the AI, that scares us.