Thursday, May 23, 2024
Tech News

Google teases an AI camera feature ahead of I/O that looks better than Rabbit R1’s

Google teases advancements in Gemini’s multi-modal AI capability.


Ahead of its much-anticipated annual I/O event, Google released a short teaser video on X showing off some new multimodal AI functionality that is sure to have the makers of Rabbit’s R1 quaking in their boots.

In the video, the user holds up their (Android) phone’s camera to the I/O stage and asks, “What do you think is happening here?” Gemini, Google’s AI model, then responds, “It looks like people are setting up for a large event, perhaps a conference or a presentation.” Then, Gemini asks its own question: “Is there something in particular that catches your eye?”

When the user asks Gemini what the large letters on the stage mean, Gemini correctly identifies them as a reference to Google’s developer conference. That exchange likely helps the AI gain contextual information, which in turn positions it to provide more useful answers. The chatbot then follows up with another question: “Have you ever attended Google I/O?” The conversation appears natural and effortless, at least in the video.

In April, Rabbit showed off similar multimodal AI technology during its R1 launch demo, a feature that many lauded as exciting. Google’s teaser video shows the company has been hard at work developing similar functionality for Gemini, and from the looks of it, Google's version might even be better.

Google and Rabbit aren’t alone. Also today, OpenAI showed off its own suite of developments in its OpenAI Spring Update livestream, including GPT-4o, its newest AI model, which now enables ChatGPT to “see, hear, and speak.” During the demo, presenters showed the AI a host of different things via their smartphone’s camera, including a handwritten math problem and the presenter’s facial expressions. The AI correctly identified each of them through a similar conversational back-and-forth with its users.

When Google updates Gemini on mobile with this feature, the company’s technology could jump to the front of the pack in the AI assistant race, particularly with Gemini’s exceedingly natural-sounding cadence and follow-up questions. Although the exact breadth of capabilities will be revealed at I/O, this development certainly puts Rabbit in a tricky position, making one of its standout features essentially redundant.

As with any demo that isn’t shown off live, you should take this one with a grain of salt. The strategic release of this video just an hour before OpenAI’s livestream suggests Google will have a lot more to say about Gemini this week.