Architecture

How Amazon scaled Rufus by building multi-node inference using AWS Trainium chips and vLLM

Amazon EC2, Amazon Elastic Container Service, Architecture, AWS Trainium, Customer Solutions / aiepicentre

At Amazon, our team builds Rufus, a generative AI-powered shopping assistant that serves millions of customers at immense scale. However, deploying Rufus at scale introduces significant challenges that must be carefully navigated. Rufus is powered by a custom-built large language model (LLM). As the model’s complexity increased, we prioritized developing scalable multi-node inference capabilities that […]

Build a serverless audio summarization solution with Amazon Bedrock and Whisper

Amazon API Gateway, Amazon Bedrock, Amazon Bedrock Guardrails, Amazon EventBridge, Application Integration, Architecture, AWS Step Functions, Best Practices, Generative AI, Technical How-to / aiepicentre

Recordings of business meetings, interviews, and customer interactions have become essential for preserving important information. However, transcribing and summarizing these recordings manually is often time-consuming and labor-intensive. With the progress in generative AI and automatic speech recognition (ASR), automated solutions have emerged to make this process faster and more efficient. Protecting personally identifiable information (PII)

Build a scalable AI assistant to help refugees using AWS

AI/ML, Amazon Bedrock, Amazon Machine Learning, Amazon Rekognition, Amazon Translate, Architecture, Artificial Intelligence, Customer Solutions, Generative AI / aiepicentre

This post is co-written with Taras Tsarenko, Vitalil Bozadzhy, and Vladyslav Horbatenko. As organizations worldwide seek to use AI for social impact, the Danish humanitarian organization Bevar Ukraine has developed a comprehensive virtual generative AI-powered assistant called Victor, aimed at addressing the pressing needs of Ukrainian refugees integrating into Danish society. This post details our
