Google Cloud Next ’23: AI-optimized Infrastructure, AI Models Supporting Local Languages, and Tools for Businesses in Southeast Asia to Build Bold and Responsible Gen AI Applications

  • Cloud TPU v5e, Google Cloud’s most cost-efficient, versatile, and scalable purpose-built AI accelerator to date, to be available in public preview in Google Cloud’s Singapore cloud region later this year
  • Google Cloud and NVIDIA expand partnership to help organizations utilize the same NVIDIA technologies employed over the past two years by Google DeepMind and Google research teams
  • Businesses in Southeast Asia can now build gen AI applications that better serve local users, with Simplified Chinese, Traditional Chinese, Indonesian, Thai, and Vietnamese available for PaLM 2 for Text and Chat
  • Google Cloud becomes the first hyperscale cloud provider to enable the creation of invisible and tamper-resistant watermarks in AI-generated images

At Google Cloud Next ’23, Google Cloud announced a series of new partnerships and product innovations to empower every business and public sector organization in Southeast Asia to easily experiment and build with large language models (LLMs) and generative AI (gen AI) models, customize them with enterprise data, and smoothly integrate and deploy them into applications with built-in privacy, safety features, and responsible AI.

Enhancements to Google Cloud’s purpose-built, AI-optimized infrastructure portfolio

The capabilities and applications that make gen AI so revolutionary demand the most sophisticated and capable infrastructure. Google Cloud has been investing in its data centers and network for 25 years, and now has a global network of 38 cloud regions, with a goal to operate entirely on carbon-free energy 24/7 by 2030. This global network includes cloud regions in Indonesia and Singapore, with new cloud regions coming to Malaysia and Thailand. Building on this, Google Cloud’s AI-optimized infrastructure is the leading choice for training and serving gen AI models, with more than 70% of gen AI unicorns already building on Google Cloud, including AI21, Anthropic, Cohere, Jasper, Replit, Runway, and Typeface.

To help organizations in Southeast Asia run their most demanding AI workloads cost-effectively and at scale, Google Cloud today unveiled significant enhancements to its AI-optimized infrastructure portfolio: Cloud TPU v5e, available in public preview, and the upcoming general availability of A3 VMs with NVIDIA H100 GPUs.

Cloud TPU v5e is Google Cloud’s most cost-efficient, versatile, and scalable purpose-built AI accelerator to date. Now, customers can use a single Cloud Tensor Processing Unit (TPU) platform to run both large-scale AI training and inferencing. Cloud TPU v5e delivers up to 2 times higher training performance per dollar and up to 2.5 times higher inference performance per dollar for LLMs and gen AI models compared to Cloud TPU v4, making it possible for more organizations to train and deploy larger, more complex AI models. Cloud TPU v5e is currently available in public preview in Google Cloud’s Las Vegas and Columbus cloud regions, with plans to expand to other regions, including Google Cloud’s Singapore cloud region later this year.
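
The performance-per-dollar figures above can also be read as cost: for a fixed amount of work, an N-times gain in performance per dollar implies spending 1/N as much. The short Python sketch below works through that arithmetic for the two "up to" figures quoted. The function name is illustrative, and actual savings depend on workload and pricing, which this announcement does not specify.

```python
def relative_cost(perf_per_dollar_gain: float) -> float:
    """Cost of a fixed workload relative to the baseline,
    given an N-times gain in performance per dollar."""
    if perf_per_dollar_gain <= 0:
        raise ValueError("gain must be positive")
    return 1.0 / perf_per_dollar_gain


# Best-case ("up to") figures quoted for Cloud TPU v5e versus Cloud TPU v4.
training_cost = relative_cost(2.0)   # fraction of the v4 cost for training
inference_cost = relative_cost(2.5)  # fraction of the v4 cost for inference

print(f"Training: up to {1 - training_cost:.0%} lower cost for the same work")
print(f"Inference: up to {1 - inference_cost:.0%} lower cost for the same work")
```

In other words, the quoted gains correspond to at most about half the training cost and about two-fifths of the inference cost of Cloud TPU v4 for the same workload.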

A3 VMs, supercomputers powered by NVIDIA’s H100 Graphics Processing Unit (GPU), will be generally available next month, enabling organizations to achieve 3 times faster training performance compared to A2, its prior generation. A3 VMs are purpose-built to train and serve especially demanding LLM and gen AI workloads. On stage at Google Cloud Next ’23, Google Cloud and NVIDIA also announced new integrations to help organizations utilize the same NVIDIA technologies employed over the past two years by Google DeepMind and Google research teams.

Google Cloud also announced other key infrastructure advancements, including:

  • Google Kubernetes Engine (GKE) Enterprise: This enables the multi-cluster horizontal scaling required for the most demanding, mission-critical AI and machine learning (ML) workloads. Customers can now improve AI development productivity by leveraging GKE to manage large-scale AI workload orchestration on Cloud TPU v5e. In addition, GKE support for A3 VMs with NVIDIA H100 GPUs is now generally available.
  • Cross-Cloud Network: This is a global networking platform that helps customers connect and secure applications between clouds and on-premises locations. It is open and workload-optimized, which is crucial for end-to-end performance as organizations adopt gen AI, and it offers ML-powered security to deliver zero trust.
  • New AI offerings for Google Distributed Cloud (GDC): GDC is designed to meet the unique demands of organizations that want to run workloads at the edge or in their data centers. The GDC portfolio will bring AI to the edge, with Vertex AI integrations and a new managed offering of AlloyDB Omni on GDC Hosted.

Mark Lohmeyer, Vice President and General Manager, Compute and ML Infrastructure, Google Cloud, said: “For two decades, Google has built some of the industry’s leading AI capabilities: from the creation of Google’s Transformer architecture that makes gen AI possible, to our AI-optimized infrastructure, which is built to deliver the global scale and performance required by Google products that serve billions of users like YouTube, Gmail, Google Maps, Google Play, and Android. We are excited to bring decades of innovation and research to Google Cloud customers as they pursue transformative opportunities in AI. We offer a complete solution for AI, from computing infrastructure optimized for AI to the end-to-end software and services that support the full lifecycle of model training, tuning, and serving at global scale.”

Extending enterprise-ready gen AI development with new models and tools on Vertex AI

On top of Google Cloud’s world-class infrastructure, the company delivers Vertex AI, a comprehensive AI platform that enables customers to access, tune, and deploy first-party, third-party, and open-source models, and build and scale enterprise-grade AI applications. Building on the launch of gen AI support on Vertex AI, Google Cloud is now significantly expanding Vertex AI’s capabilities. These include:

  • Enhancements to PaLM 2: 38 languages, including Simplified Chinese, Traditional Chinese, Indonesian, Thai, and Vietnamese, are now generally available for PaLM 2 for Text and Chat – first-party models for summarizing and translating text, and maintaining an ongoing conversation. PaLM 2 for Text and Chat can be accessed through Vertex AI’s Model Garden alongside adapter tuning capabilities. This makes it possible for organizations in Southeast Asia to build gen AI applications that better serve users in local languages while grounding responses with their own enterprise data or private corpus. Google Cloud is also planning to host PaLM 2 for Text and Chat in its Singapore cloud region later this year. To support longer question-answer chats and summarize and analyze large documents like research papers, books, and legal briefs, PaLM 2 for Text and Chat will now also support 32,000-token context windows (i.e., enough to include an 85-page document in a prompt).
  • Enhancements to Codey: The quality of Codey, Google Cloud’s first-party model for generating and fixing software code, has improved by up to 25% in major supported languages for code generation and code chat. Enterprises can access Codey through Vertex AI’s Model Garden alongside adapter tuning capabilities. Google Cloud is also planning to host Codey in its Singapore cloud region later this year.
  • Enhancements to Imagen: Google Cloud introduced Style Tuning for Imagen, a new capability to help enterprises further align their images to their brand guidelines with 10 images or fewer. Imagen is Google Cloud’s first-party model for creating studio-grade images from text descriptions. Enterprises can access Imagen through Vertex AI’s Model Garden. Google Cloud also launched digital watermarking on Vertex AI, now in experimental availability, to give enterprises the ability to verify AI-generated images produced by Imagen. This makes Google Cloud the first hyperscale cloud provider to enable the creation of invisible and tamper-resistant watermarks in AI-generated images. The technology is powered by Google DeepMind SynthID, a state-of-the-art technology that embeds the digital watermark directly into the pixels of the image, making it invisible to the human eye and very difficult to tamper with without damaging the image.
  • New models: Llama 2 and Code Llama from Meta, and Technology Innovation Institute’s Falcon LLM, a popular open-source model, are now available on Vertex AI’s Model Garden. Google Cloud is also pre-announcing the availability of Claude 2 from Anthropic on Vertex AI’s Model Garden. Google Cloud will be the only cloud provider offering both adapter tuning and reinforcement learning from human feedback (RLHF) for Llama 2.
  • Vertex AI extensions: Developers can access, build, and manage extensions that deliver real-time information, incorporate company data, and take action on the user’s behalf.
  • Vertex AI Search and Conversation: Now generally available, these tools enable organizations to create advanced search and chat applications using their data in just minutes with minimal coding, and enterprise-grade management and security built in.
  • Grounding: Google Cloud announced an enterprise grounding service that works across Vertex AI Search and Conversation, and foundation models on Vertex AI’s Model Garden, giving organizations the ability to ground responses in their own enterprise data to deliver more accurate responses. The company is also working with a few early customers to test grounding with the technology that powers Google Search.
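
As a rough illustration of the 32,000-token context window mentioned for PaLM 2 above, the Python sketch below derives the tokens-per-page ratio implied by the announcement's own "85-page document" example and uses it to estimate whether a document of a given length fits in a single prompt. The ratio is an assumption inferred from that one example, not a published tokenizer figure. Real token counts vary by language and formatting, and the function names are illustrative.

```python
CONTEXT_WINDOW_TOKENS = 32_000  # context window stated in the announcement
EXAMPLE_PAGES = 85              # page count from the announcement's example


def estimated_tokens(pages: int) -> int:
    """Estimate tokens for a document, assuming the tokens-per-page ratio
    implied by the 85-page example (about 376 tokens per page).
    Uses integer ceiling division to avoid floating-point rounding."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return (pages * CONTEXT_WINDOW_TOKENS + EXAMPLE_PAGES - 1) // EXAMPLE_PAGES


def fits_in_context(pages: int) -> bool:
    """Return True if the estimated token count fits in the context window."""
    return estimated_tokens(pages) <= CONTEXT_WINDOW_TOKENS


for pages in (10, 85, 120):
    print(pages, estimated_tokens(pages), fits_in_context(pages))
```

Under this assumed ratio, anything longer than about 85 pages would need to be split or summarized before being sent in one prompt.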

Google rigorously evaluates its models to ensure they meet its Responsible AI Principles. When using Vertex AI, customers retain complete control over their data: it does not need to leave the customer’s cloud tenant, is encrypted both in transit and at rest, and is not shared or used to train Google models.

Thomas Kurian, CEO, Google Cloud, said: “Equally important to discovering and training the right model is controlling your data. From the beginning, we designed Vertex AI to give you full control and segregation of your data, code, and intellectual property, with zero data leakage. When you customize and train your model with Vertex AI—with private documents and data from your SaaS applications, databases, or other proprietary sources—you are not exposing that data to the foundation model. We take a snapshot of the model, allowing you to train and encapsulate it together in a private configuration, giving you complete control over your data. Your prompts and data, as well as user inputs at inference time, are not used to improve our models and are not accessible to other customers.”

Organizations across industries and around the world are already using Vertex AI to build and deploy AI applications, including affable.ai, Aruna, Bank Rakyat Indonesia, FOX Sports, GE Appliances, HCA Healthcare, HSBC, Jiva, Kasikorn Business-Technology Group Labs, KoinWorks, The Estée Lauder Companies, the Singapore Government, Mayo Clinic, Priceline, Shopify, Wendy’s, and many more.

“Since announcing gen AI support on Vertex AI less than six months ago, we’ve been thrilled and humbled to see innovative use cases from customers of all kinds – from enterprises like GE Appliances, whose consumer app SmartHQ offers users the ability to generate custom recipes based on the food in their kitchen, to startup unicorns like Typeface, which helps organizations leverage AI for compelling brand storytelling. We’re seeing strong demand, with the number of Vertex AI customer accounts growing more than 15 times in the last quarter,” added Kurian.

Source: Spark Communications