Amazon Elastic Container Service

Build a scalable containerized web application on AWS using the MERN stack with Amazon Q Developer – Part 1

The MERN (MongoDB, Express, React, Node.js) stack is a popular JavaScript web development framework. The combination of technologies is well-suited for building scalable, modern web applications, especially those requiring real-time updates and dynamic user interfaces. Amazon Q Developer is a generative AI-powered assistant that improves developer efficiency across the different phases of the software development […]

How Amazon scaled Rufus by building multi-node inference using AWS Trainium chips and vLLM

At Amazon, our team builds Rufus, a generative AI-powered shopping assistant that serves millions of customers at immense scale. However, deploying Rufus at scale introduces significant challenges that must be carefully navigated. Rufus is powered by a custom-built large language model (LLM). As the model’s complexity increased, we prioritized developing scalable multi-node inference capabilities that […]

Accelerating AI innovation: Scale MCP servers for enterprise workloads with Amazon Bedrock

Generative AI has been moving at a rapid pace, with new tools, offerings, and models released frequently. According to Gartner, agentic AI is one of the top technology trends of 2025, and organizations are prototyping how to use agents in their enterprise environments. Agents depend on tools, and each tool might have its […]

How Rufus doubled their inference speed and handled Prime Day traffic with AWS AI chips and parallel decoding

Large language models (LLMs) have revolutionized the way we interact with technology, but their widespread adoption has been hindered by high inference latency, limited throughput, and the high costs associated with text generation. These inefficiencies are particularly pronounced during high-demand events like Amazon Prime Day, when systems like Rufus, the Amazon AI-powered shopping assistant, must handle massive scale […]
