Amazon SageMaker HyperPod


Introducing auto scaling on Amazon SageMaker HyperPod

Today, we’re excited to announce that Amazon SageMaker HyperPod now supports managed node automatic scaling with Karpenter, so you can efficiently scale your SageMaker HyperPod clusters to meet your inference and training demands. Real-time inference workloads require automatic scaling to address unpredictable traffic patterns and maintain service level agreements (SLAs). As demand spikes, organizations must […]

Introducing auto scaling on Amazon SageMaker HyperPod Read More »
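As background, Karpenter-based node auto scaling on Kubernetes is typically configured through a NodePool resource that declares what capacity the autoscaler may provision. A minimal sketch using the standard open source Karpenter v1 schema follows; the node class name, requirements, and limits here are illustrative assumptions, not HyperPod-specific settings:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: inference-pool
spec:
  template:
    spec:
      # Reference to the provider-specific node class (name is hypothetical)
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        # Restrict this pool to on-demand capacity
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
  # Cap total provisioned capacity so scale-out stays bounded
  limits:
    cpu: "1000"
  disruption:
    # Reclaim nodes when they are empty or underutilized
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```

Karpenter then provisions and removes nodes to satisfy pending pods within these limits; consult the HyperPod documentation for the managed equivalents of these settings.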

Train and deploy AI models at trillion-parameter scale with Amazon SageMaker HyperPod support for P6e-GB200 UltraServers

Imagine harnessing the power of 72 cutting-edge NVIDIA Blackwell GPUs in a single system for the next wave of AI innovation, unlocking 360 petaflops of dense 8-bit floating point (FP8) compute and 1.4 exaflops of sparse 4-bit floating point (FP4) compute. Today, that’s exactly what Amazon SageMaker HyperPod delivers with the launch of support for P6e-GB200 UltraServers.


Beyond accelerators: Lessons from building foundation models on AWS with Japan’s GENIAC program

In 2024, the Ministry of Economy, Trade and Industry (METI) launched the Generative AI Accelerator Challenge (GENIAC), a Japanese national program to boost generative AI by providing companies with funding, mentorship, and massive compute resources for foundation model (FM) development. AWS was selected as the cloud provider for GENIAC’s second cycle (cycle 2). It provided infrastructure […]


Streamline machine learning workflows with SkyPilot on Amazon SageMaker HyperPod

This post is co-written with Zhanghao Wu, co-creator of SkyPilot. The rapid advancement of generative AI and foundation models (FMs) has significantly increased computational resource requirements for machine learning (ML) workloads. Modern ML pipelines require efficient systems for distributing workloads across accelerated compute resources, while making sure developer productivity remains high. Organizations need infrastructure solutions […]


New capabilities in Amazon SageMaker AI continue to transform how organizations develop AI models

As AI models become increasingly sophisticated and specialized, the ability to quickly train and customize models can mean the difference between industry leadership and falling behind. That is why hundreds of thousands of customers use the fully managed infrastructure, tools, and workflows of Amazon SageMaker AI to scale and advance AI model development. Since launching […]


Accelerate foundation model development with one-click observability in Amazon SageMaker HyperPod

Amazon SageMaker HyperPod now provides a comprehensive, out-of-the-box dashboard that delivers insights into foundation model (FM) development tasks and cluster resources. This unified observability solution automatically publishes key metrics to Amazon Managed Service for Prometheus and visualizes them in Amazon Managed Grafana dashboards, optimized specifically for FM development with deep coverage of hardware health, resource […]


Accelerating Articul8’s domain-specific model development with Amazon SageMaker HyperPod

This post was co-written with Renato Nascimento, Felipe Viana, and Andre Von Zuben from Articul8. Generative AI is reshaping industries, offering new efficiencies, automation, and innovation. However, it requires powerful, scalable, and resilient infrastructure that optimizes large-scale model training, enabling rapid iteration and efficient compute utilization through purpose-built infrastructure and automated cluster management. In this […]


Multi-account support for Amazon SageMaker HyperPod task governance

GPUs are a precious resource: they are in short supply and much more costly than traditional CPUs, yet highly adaptable to many different use cases. Organizations building or adopting generative AI use GPUs to run simulations, run inference (for both internal and external use), build agentic workloads, and run data scientists’ experiments.


How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod

This post is co-written with Ken Tsui, Edward Tsoi, and Mickey Yip from Apoidea Group. The banking industry has long struggled with the inefficiencies associated with repetitive processes such as information extraction, document review, and auditing. These tasks, which require significant human resources, slow down critical operations such as Know Your Customer (KYC) procedures, loan […]


Customize DeepSeek-R1 671b model using Amazon SageMaker HyperPod recipes – Part 2

This post is the second part of the DeepSeek series focusing on model customization with Amazon SageMaker HyperPod recipes (or recipes for brevity). In Part 1, we demonstrated the performance and ease of fine-tuning DeepSeek-R1 distilled models using these recipes. In this post, we use the recipes to fine-tune the original DeepSeek-R1 671b parameter model.
