Amazon's new AI tool diagnoses and resolves outages in minutes
03 Dec 2025




Amazon Web Services (AWS) has launched an innovative artificial intelligence (AI) tool, the DevOps Agent.


The new technology was unveiled at the company's annual re:Invent conference in Las Vegas.


It is designed to automate the initial response to system failures, thereby reducing downtime and relieving pressure on engineers.


AWS claims that the tool can diagnose complex issues in as little as 15 minutes, work that could take experienced engineers hours to do manually.




DevOps Agent: A digital first responder
Tool functionality




The DevOps Agent is designed to function like a digital first responder, automatically investigating outages and suggesting fixes even before engineers log on.


Unlike traditional monitoring systems that only raise alarms, the DevOps Agent acts on the information it gathers.


It pulls data from popular observability tools such as Datadog and Dynatrace, then spawns multiple investigation threads simultaneously.


Each thread explores different root causes of an issue while human engineers are still joining the call.
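The idea of parallel investigation threads can be illustrated with a minimal sketch. AWS has not published how the DevOps Agent works internally, so everything here is an assumption made for illustration: the telemetry fields, the three hypothesis checks, and the scoring thresholds are all hypothetical, and no real Datadog or Dynatrace API is called.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical telemetry snapshot. A real system would pull this from an
# observability tool; these field names are illustrative assumptions.
TELEMETRY = {
    "error_rate": 0.31,
    "recent_deploy": True,
    "db_latency_ms": 40,
    "cpu_utilization": 0.55,
}


def check_bad_deploy(t):
    # Errors spiking right after a deployment point to the release itself.
    score = 0.9 if t["recent_deploy"] and t["error_rate"] > 0.1 else 0.1
    return ("bad deploy", score)


def check_database(t):
    # Very high query latency points to the database tier.
    score = 0.8 if t["db_latency_ms"] > 500 else 0.1
    return ("slow database", score)


def check_cpu(t):
    # Sustained CPU above 90% points to resource saturation.
    score = 0.7 if t["cpu_utilization"] > 0.9 else 0.1
    return ("CPU saturation", score)


HYPOTHESES = [check_bad_deploy, check_database, check_cpu]


def investigate(telemetry):
    """Run every hypothesis check concurrently and rank results by score."""
    with ThreadPoolExecutor(max_workers=len(HYPOTHESES)) as pool:
        results = list(pool.map(lambda check: check(telemetry), HYPOTHESES))
    # Highest-scoring hypothesis first.
    return sorted(results, key=lambda r: r[1], reverse=True)


ranked = investigate(TELEMETRY)
for cause, score in ranked:
    print(f"{cause}: {score}")
```

Each check runs in its own worker thread, and the ranked list is what an engineer joining the call might see first. A production agent would presumably replace the fixed thresholds with model-driven analysis.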




DevOps Agent's effectiveness and future prospects
Performance




The tool's effectiveness was first tested at the Commonwealth Bank of Australia, where it reportedly fixed issues within minutes that would normally take teams hours to resolve.


The advantage of this tool lies in its speed and independence. It doesn't just describe a problem but also proposes fixes, learning from each incident to improve future responses.


This launch highlights AWS's growing focus on "agentic AI," intelligent systems that not only analyze data but also autonomously take meaningful actions.




DevOps Agent's architecture and integration
Tool design




The DevOps Agent relies on a mix of Amazon's proprietary models and selected third-party AI systems.


Although AWS hasn't disclosed the exact architecture, the focus is on seamless integration with existing enterprise stacks that have been refined over years.


For companies, this could mean significant time and cost savings, fewer overnight calls for engineers, and faster recovery during critical outages.
