Google unveiled its new AI model Gemini on Wednesday, giving the public a first look at a technology that’s had the tech press mired in rumors. Gemini, the company’s most powerful AI to date, comes to Bard and Pixel 8 Pro smartphones starting today, and will soon integrate with other products across Google’s services including Chrome, Search, Ads, and more. Google has a top-line message it wants you to hear: this thing is way better than anything you’ll get from OpenAI.
“This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company,” Google CEO Sundar Pichai said in a statement. “I’m genuinely excited for what’s ahead, and for the opportunities Gemini will unlock for people everywhere.”
Just over a year ago, OpenAI dropped ChatGPT on the world, sending Google and other companies scrambling to prove their tools are just as advanced. So far, Google’s chatbot Bard pales in comparison to ChatGPT. The search giant says that’s changing, starting now. Bard will be most people’s first exposure to Gemini, though it won’t launch with the model’s full capabilities.
Meet the New Bard
Gemini comes in three tiers. Gemini Ultra is Google’s most powerful model, pitched as a competitor to OpenAI’s GPT-4. Gemini Pro is a mid-range model built to beat GPT-3.5, the model behind the free version of ChatGPT. Last is Gemini Nano, a more efficient model built to run on mobile devices.
As of Wednesday, Bard is running on a “finely tuned version of Gemini Pro,” said Sissie Hsiao, Vice President of Google Assistant and Bard, at a press conference. “This will have more advanced reasoning, planning, understanding and other capabilities.”
Hsiao said that early next year Google will roll out a paid version of the chatbot, called Bard Advanced, that runs on Gemini Ultra. She declined to share details on pricing.
Google shared a long list of benchmarks showing that on almost every measure, the new Bard outperforms the free version of ChatGPT. The company shared several demonstrations of Bard’s new supercharged abilities, including a collaboration with YouTuber Mark Rober in which the AI helps build a hyper-accurate paper airplane.
Gemini is also coming to Pixel 8 Pro Android phones in a Wednesday update, albeit in a limited capacity. Gemini Nano now powers the Summarize feature in the Recorder app on Pixel 8 Pros. Google says the AI will also power Android’s Smart Reply feature on the Pixel 8 Pro, but only if you’re using the Google keyboard, and only in WhatsApp. The company says Gemini is coming to more messaging apps and other parts of the operating system next year.
Google Says Gemini Is Better Than GPT-4
For now, GPT-4 is the most powerful model available to the public. Google says it has GPT-4 beat, and that Gemini Ultra will be the best AI on the market when it rolls out.
“With a score of over 90%, Gemini is the first AI model to outperform human experts on the industry-standard benchmark MMLU,” said Eli Collins, Vice President of Product at Google DeepMind. “It’s our largest and most capable AI model.” MMLU, short for Massive Multitask Language Understanding, measures AI capabilities using standard tests across 57 subjects such as math, physics, history, law, medicine, and ethics.
It’s unclear when the public will get to see the proof, however. In the past week, The Information reported that Google pushed back the Gemini launch because the AI “didn’t reliably handle some non-English queries.” Google’s in-person Gemini demos, which were slated for this week, were postponed indefinitely. In response to questions about the alleged foreign-language problems, Collins said, “Gemini is, actually, quite performant with regards to multilingual capabilities.” Google wouldn’t get more specific than to say Gemini Ultra will be available “early next year.”
“Gemini’s performance also exceeds current state-of-the-art results on 30 out of 32 widely used industry benchmarks,” Collins said.
Google stressed that Gemini is built for “multimodal performance,” meaning it can comprehend different kinds of information, including text, images, video, and audio. Google shared a video in which a Gemini-powered Bard helps with a student’s physics homework, starting from a photo of the assignment with handwritten questions. The AI then seamlessly transitions to written advice, complete with equations and step-by-step answers.
In November, Reuters reported that OpenAI had made progress toward “artificial general intelligence,” or AGI, the industry term for AI that’s smarter than human beings, with a secret model called “Q-Star” or “Q*.” According to the report, Q* demonstrated the ability to answer basic math questions, which is more significant than it sounds, as LLMs aren’t trained to handle questions with one right answer. Competency in math would demonstrate high-level reasoning capabilities.
Google repeatedly stressed Gemini’s math and physics performance, but AGI wasn’t mentioned during the press conference. Gizmodo asked if Gemini’s math performance gave any indication of AGI progress.
“I didn’t see the details of the OpenAI work, so I can’t really speak to that,” Collins said. “However, we have, per the presentation, made a lot of progress on multimodal reasoning as well as advanced reasoning in mathematics.”