Thursday, Google opened The Gemini 1.5 Pro is what the company describes as offering “dramatically improved performance” over the previous model. The company’s AI trajectory is under review internally increasingly critical for its future – follows Introducing Gemini 1.0 Ultra last week, along with rebranding the Bard chatbot (for Gemini) to match the new model’s more powerful and versatile capabilities.
In an announcement blog post, Google CEO Sundar Pichai and Google DeepMind CEO Demis Hassabis attempt to reassure viewers about ethical AI safety while demonstrating the rapidly evolving capabilities of their model. “Our teams continue to push the boundaries of our latest models in terms of security,” Pichai said.
The company should emphasize safety for AI skeptics (including one former Google CEO) and government regulators. But it also needs to highlight its model’s accelerating performance to AI developers, potential customers and investors who worry the company is too slow to react. The success of OpenAI with ChatGPT.
Pichai and Hassabis say the Gemini 1.5 Pro delivers comparable results to the Gemini 1.0 Ultra. However, Gemini 1.5 runs more efficiently at this level with reduced computing requirements. Multimodal capabilities include processing text, images, videos, audio or code. As AI models evolve, they will continue to provide more versatility in a query box (another recent example OpenAI integrates DALL-E 3 image generation into ChatGPT).
Gemini 1.5 Pro can also handle up to one million tokens or units of AI models in a single request. Google says Gemini 1.5 Pro can process codebases of more than 700,000 words, one hour of video, 11 hours of audio, and more than 30,000 lines of code. The company even says it has “successfully tested” a version that supports up to 10 million tokens.
The company says that Gemini 1.5 Pro maintains high accuracy in queries with a larger number of tokens when there is more new information to learn. The model is said to be impressed Evaluating a needle in a haystack. In this test, developers insert a small piece of data inside a long block of text to see if the AI model picks it up. Google says Gemini 1.5 Pro can find text embedded in data blocks of up to a million tokens 99 percent of the time.
Google says Gemini 1.5 Pro can think of various details from the 402-page Apollo 11 moon mission transcripts. Additionally, it can analyze plot points and events from a downloaded 44-minute silent film starring Buster Keaton. “Since 1.5 Pro’s long context window is the first of its kind among large-scale models, we are continuously developing new benchmarks and benchmarks to test its new capabilities,” Hassabis wrote.
Google introduces Gemini 1.5 Pro with a capacity of 128,000 tokens. the same number which OpenAI’s (publicly announced) GPT-4 models max out. Hassabis says Google will eventually introduce new pricing tiers that support up to one million token requests.
Gemini 1.5 Pro is also adept at learning new skills from data in long queries without additional fine-tuning (“learning in context”). a benchmark called Machine Translation from a Book, the model learned grammar instruction for Kalamang, a language with fewer than 200 speakers globally, and was not previously taught the language. The company says that Gemini 1.5 Pro has learned to perform at the same level as a human learning the same content when translating English to Kalamang.
In a part of the announcement that will be of interest to developers, Google said that Gemini 1.5 Pro will be able to perform troubleshooting tasks in longer blocks of code. “Given a query of more than 100,000 lines of code, it can reason better between examples, suggest useful changes, and explain how different parts of the code work,” Hassabis said.
On the ethics and security front, Google said it’s taking the “same approach to responsible hosting” as it did with its Gemini 1.0 models. This includes the development and implementation of red team techniques, where a group of ethical developers essentially serve as devil’s advocate and test “a range of potential harms.” In addition, the company says it is seriously investigating areas such as content security and representational damages. The company says it continues to develop new ethical and safety tests for its AI tools.
Google is launching Gemini 1.5 in early access for developers and enterprise customers. The company plans to make it more widely available eventually. Gemini 1.0 is currently available for consumers as well as a Pro version it costs $20 per month.