AlertD launched out of stealth, unveiling its agentic AI SRE (Site Reliability Engineering) and DevOps platform designed to tackle the mounting operational complexity of cloud operations.
AlertD empowers teams to quickly gain contextualized visibility across their AWS environments through an easy-to-use interface that delivers verifiable data and enables powerful collaboration.
AlertD was founded in 2024 by Geoff Hendrey (former Cisco Distinguished Engineer, AppDynamics Chief Architect, Splunk Principal Architect) and Freddy Mangum (former Cisco Entrepreneur-in-Residence, Fortinet VP of Products & Marketing, Venture Capital Cybersecurity Advisor) after experiencing firsthand the limitations of legacy observability, log analysis, cloud ops and alerting tools.
"I have decades of experience working with some of the most powerful observability tools in the industry," explains Geoff Hendrey, co-founder and CEO of AlertD. "While these legacy tools delivered rich instrumentation, they still required extensive manual setup to configure alerts that might serve as early indicators of production issues. But as application development velocity has accelerated—and with the rise of complex microservices architectures—SRE and DevOps teams are now struggling to keep up with the scale and demands of maintaining production uptime."
Hendrey's experience with foundational technologies—including work that preceded what we now know as Retrieval-Augmented Generation (RAG)—combined with years of enterprise experience at Cisco, AppDynamics, and Splunk, made it clear that thoughtfully applied LLMs have the potential to fundamentally transform the lives of SRE and DevOps teams.
The name 'AlertD' pays homage to the Unix daemon ('d')—the silent processes that power critical infrastructure. The company's vision is to build a suite of specialized AI SRE and DevOps agents that are always available, always helpful, and always working in the background to support uptime-critical teams.
The founding team brings deep expertise in launching products and scaling them from pre-revenue to hundreds of millions in revenue. AlertD has raised $3 million in pre-seed funding, led by Puneet Agarwal of True Ventures, with an emphasis on capital efficiency and hands-on partnership.
Agarwal is well known for backing visionary founders early, having led the first investments in companies like Duo Security (acquired by Cisco for $2.35B), Puppet Labs (acquired by Perforce), and numerous other foundational infrastructure startups.
"Geoff and Freddy bring a rare combination of technical depth and go-to-market instinct, shaped by decades of experience with some of the most widely used observability, monitoring, analysis and cybersecurity platforms in the industry," said Puneet Agarwal, partner at True Ventures. "With AlertD, they're applying LLMs in a way that has real potential to change how SRE and DevOps teams work day to day, bringing clarity and speed to some of the most demanding moments in software operations."
AlertD's AI SRE and DevOps platform has been shaped directly through ongoing collaboration with mid- to large-sized enterprises that operate mature SRE and DevOps functions. These design partnerships have enabled the team to focus on delivering measurable proactive and reactive outcomes—streamlining daily workflows for individual contributors while providing leadership with real-time visibility into system health and team performance.
Ryan Raines, Sr. Director of DevOps at Privateer—the geospatial intelligence and space sustainability company founded by Steve Wozniak in 2023—explained: "With 19 years in the industry, I lead one of the most talented SRE and DevOps teams out there. Yet even with great people, the demands we face have outpaced what we can solve by simply adding headcount.
"SREs and DevOps spend nearly 50% of their time on low-value work—not due to inefficiency, but because today's tooling for managing production uptime is overly complex while our environments continue to scale. I want my people focused on high-value work, which is why we're helping shape what an optimal AI-native tool for SREs and DevOps should look like.
"The time to embrace AI agents in SRE and DevOps is now. Just as developers have adopted AI co-pilots to accelerate coding, we must adopt intelligent automation to improve uptime. Despite the noise in the AI tooling space, few solutions truly address the breadth of needs our teams face. That's why we partnered with AlertD—their deep expertise and vision for transforming SRE and DevOps workflows makes this a tool our team will rely on daily for both proactive and reactive operations."
AlertD is not just another AI debugger relegated to incident response—it's a comprehensive and extensible platform designed for the full spectrum of cloud operations. AlertD's AI agentic platform is purpose-built to operate seamlessly within the AWS ecosystem, while maintaining a cloud-, LLM-, and SDLC-agnostic architecture. The platform supports both proactive and reactive use cases, empowering SRE and DevOps professionals to work alongside their existing tools and gain comprehensive insights across security, compliance, cost optimization, troubleshooting, account ownership, and infrastructure management. Through its intuitive interface and natural language capabilities, AlertD democratizes access to AWS environment data, enables verification of information retrieved by AI agents, and delivers actionable insights that help teams save significant time and costs.
"Today's cloud operations are overwhelmed by noise and manual toil. The volume and velocity of work demands and ticket queues outpace human capacity to drive fast, effective outcomes," said Freddy Mangum, Co-founder and COO of AlertD. "SRE and DevOps professionals are highly skilled, yet too often trapped in reactive workflows that limit their impact."
Mangum continued, "Specialized AI SRE and DevOps agents are the natural next step—operating 24/7 on behalf of cloud operations teams to filter noise, synthesize complex data across systems, and surface actionable, contextual insights within seconds. While many incumbents are introducing AI agents, most remain limited to reactive use cases.
"In contrast, AlertD is a multi-purpose, environment-agnostic platform designed to complement and extend existing AI solutions. Just as GitHub Copilot, Cursor, and Claude enhance developer productivity during the build phase, AlertD empowers teams during the run and production phase. Our vision is to make AlertD the 'Slack for production uptime'—an indispensable tool that gives cloud operations teams the confidence and clarity to manage infrastructure at scale."
Key Differentiators:
- AI SRE and DevOps Agents – Specialized AI agents designed for specific proactive and reactive operational tasks.
- Natural Language Interface – Query complex environments and receive actionable insights using plain language—no need for specialized syntax or scripts.
- Proactive and Reactive Operations – Surface insights from AWS metrics and resources to identify and act on security, cost, compliance, and optimization opportunities.
- SRE and DevOps Expert Driven Innovation – Co-built and validated with real-world enterprise design partners to ensure practical, scalable use cases.
- Cloud Security – Deploy securely within your own AWS Virtual Private Cloud (VPC) to maintain full data ownership and control.
- Toil Relief – Obtain AWS insights in seconds—eliminating hours of manual scripting or context switching across multiple tools.
- LLM Agnostic – Vendor-neutral architecture supporting OpenAI, Anthropic, Meta, and other leading LLM providers.
- AI Transparency – Full visibility into AI reasoning, data sources, and analyses, empowering users to verify and trust results.
- Team Collaboration – Easily share knowledge, queries, and insights generated by AlertD AI agents across teams.
- Powerful Search – Advanced search capabilities enable users to drill into AWS metrics and resources for deeper analysis and actionability.
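To ground the "Toil Relief" point above: the manual scripting that AlertD aims to replace often looks like the minimal sketch below, a one-off script that flags untagged EC2 instances. The data is hypothetical, and the dictionary shape merely mimics the reservations structure of a boto3-style `describe_instances` response; it is an illustration of the toil, not AlertD's API.

```python
# Illustrative only: a one-off script of the kind SRE/DevOps teams write by
# hand today. The nested dict shape loosely mirrors an EC2 describe_instances
# response; the sample data below is hypothetical.

def find_untagged_instances(reservations):
    """Return the IDs of instances that carry no tags at all."""
    untagged = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            if not instance.get("Tags"):
                untagged.append(instance["InstanceId"])
    return untagged

# Hypothetical payload standing in for a real API response.
sample = [
    {"Instances": [
        {"InstanceId": "i-0aaa", "Tags": [{"Key": "team", "Value": "core"}]},
        {"InstanceId": "i-0bbb", "Tags": []},
    ]},
    {"Instances": [{"InstanceId": "i-0ccc"}]},
]

print(find_untagged_instances(sample))  # → ['i-0bbb', 'i-0ccc']
```

Multiply this by every audit question (cost, security, compliance) and every AWS account, and the hours of scripting and context switching the list above describes add up quickly.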
The Latest
In live financial environments, capital markets software cannot pause for rebuilds. New capabilities are introduced as stacked technology layers to meet evolving demands while systems remain active, data keeps moving, and controls stay intact. AI is no exception, and its opportunities are significant: accelerated decision cycles, compressed manual workflows, and more effective operations across complex environments. The constraint isn't the models themselves, but the architectural environments they enter ...
As with most digital transformation shifts, organizations often prioritize productivity and leave security and observability struggling to keep pace. This usually translates into the mass adoption of new technology alongside fragmented monitoring and observability (M&O) tooling. In the era of AI and varied cloud architectures, a disparate observability function can be dangerous: IT teams lack a complete picture of their environment, making it harder to diagnose issues and lengthening mean time to resolve (MTTR). In fact, according to recent data from the SolarWinds State of Monitoring & Observability Report, 77% of IT personnel said the lack of visibility across their on-prem and cloud architecture was an issue ...
In MEAN TIME TO INSIGHT Episode 23, Shamus McGillicuddy, VP of Research, Network Infrastructure and Operations, at EMA discusses the NetOps labor shortage ...
Technology management is evolving, and in turn, so is the scope of FinOps. The FinOps Foundation recently updated their mission statement from "advancing the people who manage the value of cloud" to "advancing the people who manage the value of technology." This seemingly small change solidifies a larger evolution: FinOps practitioners have organically expanded to be focused on more than just cloud cost optimization. Today, FinOps teams are largely — and quickly — expanding their job descriptions, evolving into a critical function for managing the full value of technology ...
Enterprises are under pressure to scale AI quickly. Yet despite considerable investment, adoption continues to stall. One of the most overlooked reasons is vendor sprawl ... In reality, no organization deliberately sets out to create sprawling vendor ecosystems. More often, complexity accumulates over time through well-intentioned initiatives, such as enterprise-wide digital transformation efforts, point solutions, or decentralized sourcing strategies ...
Nearly every conversation about AI eventually circles back to compute. GPUs dominate the headlines while cloud platforms compete for workloads and model benchmarks drive investment decisions. But underneath that noise, a quieter infrastructure challenge is taking shape. The real bottleneck in enterprise AI is not processing power, it is the ability to store, manage and retrieve the relentless volumes of data that AI systems generate, consume and multiply ...
The 2026 Observability Survey from Grafana Labs paints a vivid picture of an industry maturing fast, where AI is welcomed with careful conditions, SaaS economics are reshaping spending decisions, complexity remains a defining challenge, and open standards continue to underpin it all ...
The observability industry has an evolving relationship with AI. We're not skeptics, but it's clear that trust in AI must be earned ... In Grafana Labs' annual Observability Survey, 92% said they see real value in AI surfacing anomalies before they cause downtime. Another 91% endorsed AI for forecasting and root cause analysis. So while the demand is there, customers need it to be trustworthy, as the survey also found that the practitioners most enthusiastic about AI are also the most insistent on explainability ...
In the modern enterprise, the conversation around AI has moved past skepticism toward a stage of active adoption. According to our 2026 State of IT Trends Report: The Human Side of Autonomous AI, nearly 90% of IT professionals view AI as a net positive, and this optimism is well-founded. We are seeing agentic AI move beyond simple automation to actively streamlining complex data insights and eliminating the manual toil that has long hindered innovation. However, as we integrate these autonomous agents into our ecosystems, the fundamental DNA of the IT role is evolving ...
AI workloads require an enormous amount of computing power ... What's also becoming abundantly clear is just how quickly AI's computing needs are leading to enterprise systems failure. According to Cockroach Labs' State of AI Infrastructure 2026 report, enterprise systems are much closer to failure than their organizations realize. The report ... suggests AI scale could cause widespread failures in as little as one year — making it a clear risk for business performance and reliability.