I want to love Siri. I really do. Every day, I walk around with a little robot that lives in my pocket and responds to simple commands. It is a full-blown miracle, the kind of thing our ancestors could only dream of. The problem is Siri’s an idiot. My digital assistant and others like it from Google and Amazon are close to useless for anything more sophisticated than checking the weather. In 2024, that’s going to change. Thanks to AI, Siri is about to get a whole lot smarter, and in small but meaningful ways, that will change your life.
Next year, the tech giants will give their large language models voices and put them in charge of our phones and gadgets. It may not sound like a big deal, but it’s a sea change. Right now, you have to translate your thoughts and intentions into the language of computers. You use their mice and keyboards, manually swiping around your phone and other devices and following convoluted step-by-step processes. Computers already let us do incredible things, but the human-machine communication barrier is a serious limit on what you can achieve. When Siri meets the kind of tech that powers ChatGPT, you’ll be able to interact with a computer the same way you talk to other people, which is a lot closer to how your brain works. It’s going to make your phone less annoying — and it will revolutionize our relationship not just with AI, but with technology in general.
“In 2024, we’re going to get to this multimodal point where you have advanced, powerful virtual assistants embedded in all kinds of apps and devices,” said Yory Wurmser, a principal analyst at Insider Intelligence who covers the AI market. “That’s going to be a major inflection point. It will transform the way we search and interact with our phones, essentially turn all of these limited smart devices in our homes and cars into laptops, and give people a more personal connection to computing that’s going to raise all kinds of interesting questions.”
Think about the last time you got mad at your phone. Typically, there’s a specific and limited combination of steps you have to execute to accomplish any given computing task. User experience experts sometimes call this the “happy path.” Take one step off that happy path, and you’re in for serious frustration. Now, AI will let you compute with words instead of clicks. A new way to interact with machines is about to emerge, and it’s already started.
Silicon Valley wants you to think that tech companies are going to make AI as smart as a guy, and then that fake guy is going to take your job. Despite what you hear from OpenAI, that sci-fi vision of the future may never come to pass. But the real promise of AI isn’t robot superintelligence; it’s a world where all of your devices can understand what you’re saying.
Hey Siri, grow up.
This fall, a company called Humane released an AI Pin you wear on your shirt that’s pitched as a replacement for your smartphone. It generated a lot of early buzz, but this $700 device was mostly met with shrugs and mockery when it debuted. Among other flubs, the video introducing the product featured its AI making several factual errors. There are a lot of problems with Humane’s pin, but above all else, it misses the mark conceptually. We don’t need a device that unlocks the power of AI; we need AI to unlock the power of our devices.
In early December, Google introduced Gemini, the company’s next-generation AI. Among other announcements, Google said it also built a version of Gemini called Nano that’s customized to run on cell phones. Nano came with a promise. Gemini now powers Bard, the company’s GPT-style chatbot, and sometime early next year, Google says it will marry Bard with Assistant, the Siri-esque tool that answers when you say “Hey Google.”
“Bard is a form of AI that we call ‘augmented imagination.’ It takes the ideas that you have in your head and helps you explore different ways to bring them to life,” Jack Krawczyk, who heads up the Bard team at Google, said in a recent interview with Gizmodo. Like Siri or Alexa, Google Assistant can take orders like “set a timer for ten minutes.” Bard, on the other hand, can collaborate, following prompts like “help me write a letter.” The real possibilities come when you combine these technologies.

“We’re working on this notion of combining Bard’s capability of doing things with you with Google Assistant’s ability to do things for you,” Krawczyk said. “We certainly believe computing is changing at a fundamental level.”

As usual, OpenAI and its partners at Microsoft are first out of the starting gate. The ever-maligned search engine Bing has a ChatGPT-powered assistant built in that can seamlessly transition from conversation to generating text to searching the web. In September, OpenAI gave the ChatGPT app voice functionality. You can’t pull it up by saying “Hey ChatGPT,” but that’s an easy fix. Apple is behind, but that doesn’t matter. The company has launched several ambitious AI initiatives of its own, and it doesn’t have to be first; it just needs to add a useful AI product to the iPhones that hundreds of millions of people already have. Amazon’s Alexa is certainly on the same path: the company unveiled its own chatbot, called Q, earlier this year.
It starts with easing minor frustrations. I can’t tell Siri or Google Assistant to open the Netflix app and play a movie, because building that functionality would take a lot of engineering work, and it’s not worth Apple’s time. Forget that; Siri can barely Google things for me. Alexa, Google Assistant, and Siri can only handle commands that teams of engineers set up in advance, each triggered by a predetermined combination of words. You have to use exactly the right language to stay on the happy path. The more complicated your task, the more ways there are to say what you want to do, and the less likely it is that a digital assistant will understand.
But if you’ve used ChatGPT, you know it’s a different story. You can ask ChatGPT the same question a million different ways, and more often than not, the AI gets what you’re trying to say. Successfully combine these technologies, and there’s nothing you can’t ask Siri to do.
AI lowers those barriers. It makes it more likely that a digital assistant will be able to handle any given task in the far reaches of your smartphone. Its responses to search queries won’t always be accurate, and the problem of hallucination will persist, but it will be able to handle multi-part instructions or help you with vague requests in ways we’ve never seen. Imagine looking at a product page in the Amazon app and casually saying, “Siri, make a note about this, send a screenshot to my brother, and buy it at 10 a.m. next Thursday.” Can’t remember your brother-in-law’s name? “Hey Google, call what’s-his-name who was texting me about LeBron James yesterday.”
What happens when your refrigerator goes woke?
The secret isn’t just building an interface between people and computers. Incorporating AI into our operating systems is going to make it easier for computers to talk to each other, too. Right now, making your apps and devices interact requires painstaking manual effort. You have to get computer programs, often ones written in different languages, to send and receive commands using customized, non-standard protocols. AI will help solve that problem. Large language models excel at translating one computer programming language into another for the same reasons they’re good at understanding English.
“In the last 30-odd years of mainstream personal computing, what you have is layers upon layers of metaphors,” said Yousef Ali, CEO of the social audio platform Blast Radio and a long-time tech industry insider. “You have your file system and your windowing system. Then there’s a layer of applications, and within that you’ve got your browser, and built on that is all your cloud-based software. AI gives us one final level of abstraction: I can tell the computer in plain English to go navigate all of that nonsense to find the setting that turns off my stupid Bluetooth.”
If your phone gets smarter, all your other devices get smarter too. The AI-fueled simplicity that will make your phone easier to operate will extend to all the connected products in your life.
“Natural language processing is interoperable by definition, it makes it so much easier for two machines to talk to each other,” Wurmser said. “The level of data and complexity means it will be a long time before everything is fully interoperable, but using AI as the linkage variable definitely eases everything, and that’s happening soon.”
If you follow that idea to its logical conclusion, we’re opening the door to a very strange world. “Google in particular has this vision of ambient computing, where there’s just a computer everywhere you look,” Wurmser said. “I’m not saying that’s going to happen, but you are going to be able to talk to your refrigerator.” And if your refrigerator can talk, do you want it to help keep you healthy? Does that mean we’ll have to step in to make sure it won’t body-shame you?
Starting in 2024, we’re going to have a reckoning about the moral and ethical codes baked into every appliance in our houses. A talking device cannot be neutral; someone has to decide what it’s going to say. We’ve gotten a preview of this boring dystopia through the likes of Elon Musk, who was so furious about ChatGPT’s “woke” responses that he commissioned his own chatbot, Grok. (Musk was unhappy to learn that, like ChatGPT, Grok displayed what many would characterize as left-leaning political positions in its early days.)
“It’s artificial, but these chatbots have personalities. I think people will become more attached, not necessarily to their physical devices, but to the behaviors that these devices have,” Wurmser said. “It opens up a lot of exciting possibilities, but it comes with dangers, too, in terms of people relying on their devices too much, not to mention the questions around privacy and data usage.”
By now, you’ve probably played around with ChatGPT, or at least seen others do it. Like Siri, this generation of chatbots is great for a limited set of tasks and not much else: writing limericks, helping engineers code, composing a first draft of an email. Impressive, but basic stuff. If you’re like most people, this technology has almost no effect on your daily life, yet everyone seems to agree it will be transformative in one way or another. Experts can’t agree on the ultimate limits of AI, let alone the possibilities of our current models. But long before we reach robot superintelligence, the AI we already have is going to seep into every corner of our lives.
These updates will arrive in a long series of stuttering changes, so you may not notice the shift. It’s a process that starts in 2024 and will go on for a long, long time. But when you look back this time next year, the world is going to feel a lot different.