Amazon Elastic Kubernetes Service

Auto Added by WPeMatico

Building a RAG chat-based assistant on Amazon EKS Auto Mode and NVIDIA NIMs

Chat-based assistants powered by Retrieval Augmented Generation (RAG) are transforming customer support, internal help desks, and enterprise search, by delivering fast, accurate answers grounded in your own data. With RAG, you can use a ready-to-deploy foundation model (FM) and enrich it with your own data, making responses relevant and context-aware without the need for fine-tuning […]

Building a RAG chat-based assistant on Amazon EKS Auto Mode and NVIDIA NIMs Read More »

Containerize legacy Spring Boot application using Amazon Q Developer CLI and MCP server

Organizations can optimize their migration and modernization projects by streamlining the containerization process for legacy applications. With the right tools and approaches, teams can transform traditional applications into containerized solutions efficiently, reducing the time spent on manual coding, testing, and debugging while enhancing developer productivity and accelerating time-to-market. During containerization initiatives, organizations can address compatibility,

Containerize legacy Spring Boot application using Amazon Q Developer CLI and MCP server Read More »

Fine-tune and deploy Meta Llama 3.2 Vision for generative AI-powered web automation using AWS DLCs, Amazon EKS, and Amazon Bedrock

Fine-tuning of large language models (LLMs) has emerged as a crucial technique for organizations seeking to adapt powerful foundation models (FMs) to their specific needs. Rather than training models from scratch—a process that can cost millions of dollars and require extensive computational resources—companies can customize existing models with domain-specific data at a fraction of the

Fine-tune and deploy Meta Llama 3.2 Vision for generative AI-powered web automation using AWS DLCs, Amazon EKS, and Amazon Bedrock Read More »

Beyond accelerators: Lessons from building foundation models on AWS with Japan’s GENIAC program

In 2024, the Ministry of Economy, Trade and Industry (METI) launched the Generative AI Accelerator Challenge (GENIAC)—a Japanese national program to boost generative AI by providing companies with funding, mentorship, and massive compute resources for foundation model (FM) development. AWS was selected as the cloud provider for GENIAC’s second cycle (cycle 2). It provided infrastructure

Beyond accelerators: Lessons from building foundation models on AWS with Japan’s GENIAC program Read More »

Supercharge generative AI workflows with NVIDIA DGX Cloud on AWS and Amazon Bedrock Custom Model Import

This post is co-written with Andrew Liu, Chelsea Isaac, Zoey Zhang, and Charlie Huang from NVIDIA. DGX Cloud on Amazon Web Services (AWS) represents a significant leap forward in democratizing access to high-performance AI infrastructure. By combining NVIDIA GPU expertise with AWS scalable cloud services, organizations can accelerate their time-to-train, reduce operational complexity, and unlock

Supercharge generative AI workflows with NVIDIA DGX Cloud on AWS and Amazon Bedrock Custom Model Import Read More »

Accelerate generative AI inference with NVIDIA Dynamo and Amazon EKS

This post is co-written with Kshitiz Gupta, Wenhan Tan, Arun Raman, Jiahong Liu, and Eiluth Triana Isaza from NVIDIA. As large language models (LLMs) and generative AI applications become increasingly prevalent, the demand for efficient, scalable, and low-latency inference solutions has grown. Traditional inference systems often struggle to meet these demands, especially in distributed, multi-node

Accelerate generative AI inference with NVIDIA Dynamo and Amazon EKS Read More »

Use K8sGPT and Amazon Bedrock for simplified Kubernetes cluster maintenance

As Kubernetes clusters grow in complexity, managing them efficiently becomes increasingly challenging. Troubleshooting modern Kubernetes environments requires deep expertise across multiple domains—networking, storage, security, and the expanding ecosystem of CNCF plugins. With Kubernetes now hosting mission-critical workloads, rapid issue resolution has become paramount to maintaining business continuity. Integrating advanced generative AI tools like K8sGPT and

Use K8sGPT and Amazon Bedrock for simplified Kubernetes cluster maintenance Read More »

Multi-account support for Amazon SageMaker HyperPod task governance

GPUs are a precious resource; they are both short in supply and much more costly than traditional CPUs. They are also highly adaptable to many different use cases. Organizations building or adopting generative AI use GPUs to run simulations, run inference (both for internal or external usage), build agentic workloads, and run data scientists’ experiments.

Multi-account support for Amazon SageMaker HyperPod task governance Read More »

Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon Bedrock

Generative artificial intelligence (AI) applications are commonly built using a technique called Retrieval Augmented Generation (RAG) that provides foundation models (FMs) access to additional data they didn’t have during training. This data is used to enrich the generative AI prompt to deliver more context-specific and accurate responses without continuously retraining the FM, while also improving

Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon Bedrock Read More »

How Hexagon built an AI assistant using AWS generative AI services

This post was co-written with Julio P. Roque Hexagon ALI. Recognizing the transformative benefits of generative AI for enterprises, we at Hexagon’s Asset Lifecycle Intelligence division sought to enhance how users interact with our Enterprise Asset Management (EAM) products. Understanding these advantages, we partnered with AWS to embark on a journey to develop HxGN Alix,

How Hexagon built an AI assistant using AWS generative AI services Read More »