
tribe29 GmbH, supplier of the monitoring solution Checkmk, announced the availability of Checkmk version 1.6, offering enhanced monitoring capabilities for cloud and container environments as well as a new daemon for dynamic host configuration.
The new version includes new and improved features, especially for monitoring cloud environments and container infrastructures. For example, Checkmk retrieves data from Amazon Web Services directly via the AWS HTTP API and monitors all major AWS services. To keep costs low for users, the Checkmk developers have written the agent so that it requires as few costly API calls as possible. They have also implemented checks for monitoring AWS costs, so that users are alerted to exploding costs.
Azure users also get new plug-ins that monitor storage space, databases, virtual machines and the cost of the Microsoft cloud. Checkmk uses the Azure API to communicate with the cloud. All Azure resources are automatically integrated into Checkmk, which greatly reduces the required configuration effort.
Checkmk 1.6 also contains improved checks for Docker, Kubernetes and OpenShift. The monitoring solution keeps an eye on clusters, nodes, and persistent storage as well as on pods, deployments, and microservices. Docker monitoring in particular has changed: the developers have completely revised the Docker check, enhancing and simplifying it and ensuring that it works even with older Docker versions.
Checkmk 1.6 introduces the concept of labels for hosts and services. A host can have an unlimited number of labels. They work similarly to tags and can be used to create conditions for Checkmk rules. Labels are widely used in container and cloud environments, and Checkmk automatically detects and adopts them for better visibility of services and powerful configuration options.
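
To illustrate how labels can serve as rule conditions, here is a minimal sketch in plain Python. It is not Checkmk's implementation or API: the function name `rule_matches`, the key/value label format and the example labels are all assumptions made for illustration only.

```python
# Hypothetical sketch: label-based rule conditions.
# Nothing here is Checkmk code; names and data are illustrative assumptions.

def rule_matches(host_labels: dict[str, str], conditions: dict[str, str]) -> bool:
    """Return True if the host carries every label the rule requires."""
    return all(host_labels.get(key) == value for key, value in conditions.items())


# Assumed example: labels as they might be picked up from a container platform.
host_labels = {"env": "prod", "app": "web"}

# A rule that should only apply to production hosts.
print(rule_matches(host_labels, {"env": "prod"}))
# A rule for a different environment does not apply to this host.
print(rule_matches(host_labels, {"env": "staging"}))
```

In this sketch a rule with no conditions matches every host, which mirrors the usual behaviour of an unconditional rule.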
In cloud and container environments the number of hosts changes frequently because new ones are created automatically and old ones vanish. Since it is impractical to update the Checkmk configuration manually, the new version of the Checkmk Enterprise Edition (CEE) provides a brand-new dynamic configuration daemon (DCD). It simplifies the configuration process significantly by automatically detecting Kubernetes nodes, AWS EC2 instances, Azure resource groups, vSphere hosts and much more. The daemon even removes hosts that no longer exist from the Checkmk monitoring.
The developers have added plenty of new checks, for example for monitoring Elasticsearch, Splunk, SAP Hana, Oracle, Cisco UCS, Enviromux, Checkpoint, Dell, Fujitsu, and HP Management Boards. Checkmk 1.6 adds more than 100 new plug-ins compared with version 1.5, bringing the total number of Checkmk plug-ins to 1700.
In addition, the monitoring solution now works with i-doit, Slack, ServiceNow, JIRA, Opsgenie, VictorOps, PagerDuty and Mattermost. The new integrations ensure that Checkmk can "fill" the external platforms on its own. For example, notification rules can automatically create tickets for JIRA or ServiceNow.
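
The dynamic host configuration described above can be pictured as a reconciliation step: compare the hosts currently reported by the cloud or container platform with the hosts already in monitoring, then add the new ones and remove the vanished ones. The following is a minimal sketch of that idea in plain Python. It does not show how the DCD actually works; the function name `plan_sync` and the host names are hypothetical.

```python
# Hypothetical sketch: reconciling discovered hosts with monitored hosts.
# This is not the DCD's implementation; names and data are illustrative assumptions.

def plan_sync(discovered: set[str], monitored: set[str]) -> tuple[list[str], list[str]]:
    """Return (hosts to add, hosts to remove) so monitoring mirrors discovery."""
    to_add = sorted(discovered - monitored)      # exist in the platform, not yet monitored
    to_remove = sorted(monitored - discovered)   # still monitored, but no longer exist
    return to_add, to_remove


# Assumed example data: one node was created and one was deleted since the last sync.
discovered = {"k8s-node-1", "k8s-node-3"}
monitored = {"k8s-node-1", "k8s-node-2"}

to_add, to_remove = plan_sync(discovered, monitored)
print("add:", to_add)
print("remove:", to_remove)
```

Running such a comparison periodically keeps the monitored host list in step with the platform without manual configuration changes.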