Last week, Google took its latest stride in artificial intelligence, a big but controversial one, with the introduction of Gemini, putatively the largest and most capable AI model to date. Designed to make AI more accessible and useful for a global audience, Gemini comes in three variants: Ultra, Pro, and Nano. While Nano has already found its place in Android, specifically on the Pixel 8 Pro, Gemini Pro is now available for developers and enterprises to build into their own applications.

Gemini Pro, accessible through the Gemini API, outperforms similar-sized models on research benchmarks, according to Google. In its current form, it has a 32K context window for text, with larger windows planned for future versions. Developers can use it for free within certain limits, and Google says competitive pricing will be introduced later. The feature set includes function calling, embeddings, semantic retrieval, custom knowledge grounding, and chat functionality, with support for 38 languages across 180+ countries.

As of today’s release, Google says, Gemini Pro accepts text as input and generates text as output. A dedicated Gemini Pro Vision multimodal endpoint has also been introduced, which accepts both text and image inputs and returns text. Google provides Software Development Kits (SDKs) for integrating the model into apps, covering Python, Android (Kotlin), Node.js, Swift, and JavaScript.
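For readers who want to see what "text in, text out" might look like in practice, here is a minimal Python sketch that uses only the standard library. The endpoint URL, the `gemini-pro` model name, and the request and response JSON shapes are assumptions based on the public REST interface and are not taken from Google's announcement; the helper names `build_request`, `extract_text`, and `ask` are hypothetical. Check the current API reference before relying on any of them.

```python
import json
import urllib.request

# Assumed endpoint and model name; verify against the current API reference.
API_URL = "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a text-only generateContent request."""
    # Assumed request shape: a list of contents, each with a list of parts.
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        f"{API_URL}?key={api_key}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_body: dict) -> str:
    """Join the text parts of the first candidate in a response body."""
    parts = response_body["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


def ask(prompt: str, api_key: str) -> str:
    """Send the prompt and return the generated text (needs network access)."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        return extract_text(json.load(resp))
```

Calling `ask(prompt, api_key)` with a key obtained from Google would send the request over the network. In a real app, one of the official SDKs is likely the more convenient route.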

Google also introduced Google AI Studio, a web-based developer tool for rapidly developing prompts and obtaining API keys for app integration. With a generous free quota, developers can iterate quickly, benefiting from 60 requests per minute, which Google says is 20 times more than other free offerings. AI Studio serves as a gateway into the broader Gemini ecosystem, connecting developers with Gemini Pro and, in the future, Gemini Ultra.
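An app that wants to stay inside a quota such as 60 requests per minute can throttle itself on the client side. The sketch below is one possible approach, a sliding-window limiter written with the Python standard library; the `RateLimiter` class and its parameters are hypothetical names chosen for illustration, not part of any Google SDK, and how the service actually counts requests is not described in the announcement.

```python
import time
from collections import deque


class RateLimiter:
    """Sliding-window limiter: at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have aged out of the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))
```

A caller would create one `RateLimiter()` and invoke `limiter.acquire()` before each API request. This only smooths the app's own traffic; the server may still reject requests, so error handling remains necessary.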

Developers can also move from AI Studio to Vertex AI on Google Cloud for a fully managed AI platform. The move allows Gemini to be customised with full data control and gives access to additional Google Cloud features, with a focus on enterprise security, safety, privacy, and data governance. Vertex AI developers can also explore these models at no cost until general availability early next year, after which charges will apply per 1,000 characters or per image across Google AI Studio and Vertex AI.

Looking ahead, Google expects to launch Gemini Ultra, supposedly its most advanced model for highly complex tasks, early next year. Google also aims to extend Gemini to more developer platforms, including Chrome and Firebase. With Gemini, Google is seeking to match and surpass industry benchmarks. The platform’s efficiency, support for multiple languages, and multimodal capabilities position it as a handy tool for developers.