Google Lens can now answer questions about videos

Google is expanding its visual search app Lens with the ability to answer questions about your surroundings in near real time.

English-speaking Android and iOS users with the Google app installed can now start recording a video through Lens and ask questions about interesting objects in the video.

Lou Wang, head of product management at Lens, said the feature uses a “tailor-made” Gemini model to understand the video and relevant questions. Gemini is Google's family of AI models and powers a number of products across the company's portfolio.

“Let’s say you want to learn more about some interesting fish,” Wang said in a press conference. “[Lens will] create an overview that explains why they swim in circles, along with other resources and helpful information.”

To access Lens' new video analysis feature, you'll need to enroll in Google's Search Labs program and opt in to the experimental “AI Overviews and more” feature in Labs. In the Google app, holding down your smartphone's shutter button activates Lens' video recording mode.

If you ask a question while recording a video, Lens will link to an answer provided by AI Overviews, the feature in Google Search that uses AI to summarize information from the web.

Photo credit: Google

According to Wang, Lens uses AI to determine which frames in a video are most “interesting” and salient – and, most importantly, relevant to the question being asked – and uses these to “ground” the answer from AI Overviews.

“All of this comes from observing how people are currently trying to use things like Lens,” Wang said. “If you lower the barrier to asking these questions and help people satisfy their curiosity, people will naturally pick up on it.”

The introduction of video for Lens follows a similar feature that Meta unveiled last month for its Ray-Ban Meta AR glasses. Meta plans to equip the glasses with real-time AI video capabilities that will allow wearers to ask questions about their surroundings (e.g. “What kind of flower is that?”).

OpenAI has also previewed a feature that lets its Advanced Voice Mode tool understand videos. Eventually, Advanced Voice Mode – a premium ChatGPT feature – will be able to analyze videos in real time and factor that context into its responses.

Google appears to have beaten both companies to the punch – with the caveats that Lens is asynchronous (you can't converse with it in real time) and that the video feature has yet to be shown working as advertised. We weren't shown a live demo during the press conference, and Google has a history of overpromising when it comes to the capabilities of its AI.

In addition to video analysis, Lens can now also search images and text in one go. English-speaking users, even those who don't participate in Labs, can launch the Google app and press and hold the shutter button to take a photo, then ask a question by speaking out loud.

Finally, Lens gets new e-commerce specific features.

Starting today, when Lens on Android or iOS detects a product, it will display information about it, including price and deals, brand, reviews, and stock. Product identification works on uploaded and newly taken photos (but not videos) and is initially limited to select countries and certain shopping categories, including electronics, toys, and beauty.

Shopping with Google Lens
Photo credit: Google

“Let’s say you see a backpack you like,” Wang said. “You can use Lens to identify that product and instantly see details you might be wondering about.”

There's also an advertising component. According to Google, the results page for products identified by Lens will also show “relevant” shopping ads with options and prices.

Why stick advertising in Lens? According to Google, around 4 billion Lens searches are related to shopping every month. For a tech giant whose lifeblood is advertising, this opportunity is simply too lucrative to pass up.