Google, a subsidiary of Alphabet, is launching a new version of its large artificial intelligence model, which it says can handle larger amounts of text and video than models produced by competing companies.
The updated AI model, called Gemini 1.5 Pro, will be rolled out on Thursday to cloud platform customers and developers so they can test its new features and eventually build commercial applications. Google and its rivals have spent billions enhancing their generative AI capabilities, and are eager to attract corporate clients to prove their investments are paying off.
“Today we are focused first and foremost on showcasing the research that enabled this model in your hands,” said Oriol Vinyals, Google vice president and Gemini co-chief technology officer, during a press conference. “As for tomorrow, we are excited to see what the world will do with the new capabilities.”
Google said that the medium-sized version of the new artificial intelligence model, Gemini 1.5 Pro, performs at a level similar to the larger Gemini 1.0 Ultra model.
Since OpenAI's tremendous success in late 2022 with its chatbot, ChatGPT, Google has been seeking to highlight its own strength in advanced generative artificial intelligence, which can create new text, images or even videos based on user prompts. More companies are experimenting with this technology, which can be used to automate tasks such as programming, summarizing reports, or creating marketing campaigns.
Google launched the Gemini AI model last December in three versions, allowing it to be tailored to the task at hand, and it can run on everything from mobile devices to large-scale data centers. Gemini is Google's answer to the strong alliance between Microsoft and OpenAI, which some say has been quickest to take advantage of the current AI boom, including among cloud platform customers and developers.
More powerful tools
Now, Google is seeking to attract these users to its system with more powerful tools. Gemini 1.5 can be trained faster and more efficiently, and can process a huge amount of information each time it is prompted, Vinyals said. For example, developers can use Gemini 1.5 Pro to query a video up to an hour long, an audio file 11 hours long, or a document containing more than 700,000 words, an amount of data that Google says gives it “the longest context window” of any large-scale AI model to date. Gemini 1.5 can process more data than the latest AI models from OpenAI and Anthropic can handle, according to Google.
In a pre-recorded video prepared for journalists, Google showed how engineers asked Gemini 1.5 Pro to digest a 402-page PDF document covering the Apollo 11 moon landing, and then asked it to find quotes from the document showing “three humorous moments.” One of the model's answers noted that five hours into the Apollo 11 mission transcript, astronaut Michael Collins told mission control: “If we're late getting back to you, it's because we're eating sandwiches.”
In another pre-recorded demo, Google engineers asked Gemini 1.5 Pro to find a specific scene in a 44-minute Buster Keaton movie, based on a rough description of the scene they remembered. Gemini successfully found the scene, noting that it occurs about 15 minutes into the video.
Test phase
However, Google cautioned that the model's responses, like those of all generative models, are not always perfect. Gemini 1.5 Pro is still prone to hallucination, runs slowly at times and doesn't always understand what users mean, forcing them to rephrase their questions before the model comes up with the correct answer. Vinyals said that the company is "working to improve" the performance of Gemini 1.5 to make it faster, and that it is "still in the experimental and research phase."
The company said developers can explore Gemini 1.5 Pro using Google AI Studio, while some cloud platform customers can access the AI model through a private preview on the Vertex AI enterprise platform. Google also said Thursday that it will expand access to its larger Gemini 1.0 Ultra model, making it available to more global customers on Vertex AI.