Will Google's Gemini Live AI Finally Deliver After Years Of Better Assistant Promises?

What's the definition of a truly helpful assistant on a phone? I'd say one that talks less like a robot and gets work done across the various apps installed on your phone. Google's Gemini, the spiritual successor to Google Assistant, is headed in that direction. Earlier today, Google took the wraps off its Pixel 9 series phones, and also detailed a handful of cool new tricks arriving within the Gemini experience.

Remember the ChatGPT Voice Mode that wowed us not too long ago with its eerily natural (read: too flirtatious) conversational capabilities? Well, Google is targeting something similar with a new feature called Gemini Live. Simply launch it and start talking to the virtual assistant, almost like a voice call. From web search to finding job opportunities, it will handle it all in natural language.

Google says you can even interrupt it mid-way, ask follow-up questions, and keep the conversation going even when the screen is locked. There are 10 voices to pick from, and support for the English language is available from the get-go. However, as they say, nothing good in this world comes free. Gemini Live will need you to cough up extra cash for a Gemini Advanced subscription, and if that sounds like a non-issue, well, the feature starts rolling out today. Android will be the first to get it, but support for iOS is also on the way.

A boost to comprehension capabilities

A free-flowing conversation with an assistant sounds appealing, but what about getting work done in apps that one uses on a daily basis? Well, just like Siri's new avatar in Apple Intelligence, Gemini is finally going to talk with apps. Or to put it simply, when you ask it to do something, the AI will automatically pick the right app and get it done. For example, you can ask it to add Himalayan salt to the shopping list, and Gemini will add the item to your shopping list in Google Keep.

Google refers to this app integration as "extensions," and in the coming weeks, its horizons will expand to apps and tools like YouTube Music, alarms, media playback, and system-level controls such as wireless connectivity, flashlight, and more. Here is the best part: Gemini will do your bidding with a multi-modal approach, which means it can read, hear, and see whatever you give it as input.

For example, open the Gemini interface, take a picture of a wedding card, and ask it to check whether your calendar is free. Gemini will pick out the date printed on the card, match it against your calendar entries, and tell you whether you are free on that date. These are the kinds of features that add some quality-of-life convenience to your day-to-day phone usage. If Google can manage to bring third-party apps under the Gemini umbrella, it would redefine how we interact with Android phones forever.

More versatile understanding and interactions

Gemini is also getting another superpower that Apple, too, demoed for Siri at its recent developers conference: on-screen content awareness. Say you are streaming a video on your phone. Pull up Gemini, tap on the "Ask about this video" option, and the AI assistant will glean information from that video and present it in a neatly summarized fashion.

This capability will work with system-level interactions, as well. The overarching theme here is the "Ask about this screen" functional template. Simply put, Gemini will try to make sense of whatever is showing on your phone's display. It's also a massive accessibility boost, especially for people living with vision and hearing challenges.

Now, Gemini seems to have a few advantages here compared to Apple's reborn Siri. First, Gemini Live is launching in over 150 countries, and will soon add support for non-English languages. Gemini Extensions, meanwhile, will work across mobile and web platforms and already support over 35 languages. On-screen content awareness, which arrives in the coming weeks, will be available in all markets.

Apple Intelligence, on the other hand, is currently limited to the U.S. market and the English language, and its fate in the EU is still a bit shaky. Then there's the convenience aspect: All visual media that appears in the Gemini window can be imported directly into apps like Gmail or Keep running in the background, saving users the chore of switching back and forth between two separate apps.
