Elastic Announces General Availability of LLM Observability for Google Cloud Vertex AI

SREs can now monitor, analyze and optimize the performance of AI deployments using models from Vertex AI

Elastic announced the general availability of the Elastic Google Cloud Vertex AI platform integration in Elastic Observability. 

This integration offers large language model (LLM) observability support for models hosted in Google Cloud’s Vertex AI platform, providing insights into costs, token usage, errors, prompts, responses and performance. Site reliability engineers (SREs) can now optimize resource usage, identify and resolve performance bottlenecks, and enhance model efficiency and accuracy.

“Comprehensive visibility into LLM performance is crucial for SREs and DevOps teams to ensure that their AI-powered applications are optimized,” said Santosh Krishnan, general manager of Observability and Security at Elastic. “Google Cloud’s Vertex AI platform integration provides users robust LLM observability and detection of performance anomalies in real-time, giving them critical insights into model performance that help with bottleneck identification and reliability improvements.”

Support for the Elastic Google Cloud Vertex AI platform integration is available now.

The Latest

From growing reliance on FinOps teams to the increasing attention on artificial intelligence (AI) and software licensing, the Flexera 2025 State of the Cloud Report digs into how organizations are improving cloud spend efficiency while tackling the complexities of emerging technologies ...

Today, organizations are generating and processing more data than ever before. From training AI models to running complex analytics, massive datasets have become the backbone of innovation. However, as businesses embrace the cloud for its scalability and flexibility, a new challenge arises: managing the soaring costs of storing and processing this data ...

Despite the frustrations, every engineer we spoke with ultimately affirmed the value and power of OpenTelemetry. The "sucks" moments are often the flip side of its greatest strengths ... Part 2 of this blog covers the powerful advantages and breakthroughs — the "OTel Rocks" moments ...

OpenTelemetry (OTel) arrived with a grand promise: a unified, vendor-neutral standard for observability data (traces, metrics, logs) that would free engineers from vendor lock-in and provide deeper insights into complex systems ... No powerful technology comes without its challenges, and OpenTelemetry is no exception. The engineers we spoke with were frank about the friction points they've encountered ...

Enterprises are turning to AI-powered software platforms to make IT management more intelligent and ensure their systems and technology meet business needs for efficiency, lower costs and innovation, according to new research from Information Services Group ...

The power of Kubernetes lies in its ability to orchestrate containerized applications with unparalleled efficiency. Yet, this power comes at a cost: the dynamic, distributed, and ephemeral nature of its architecture creates a monitoring challenge akin to tracking a constantly shifting, interconnected network of fleeting entities ... Due to the dynamic and complex nature of Kubernetes, monitoring poses a substantial challenge for DevOps and platform engineers. Here are the primary obstacles ...

The perception of IT has undergone a remarkable transformation in recent years. What was once viewed primarily as a cost center has transformed into a pivotal force driving business innovation and market leadership ... As someone who has witnessed and helped drive this evolution, it's become clear to me that the most successful organizations share a common thread: they've mastered the art of leveraging IT advancements to achieve measurable business outcomes ...

More than half (51%) of companies are already leveraging AI agents, according to the PagerDuty Agentic AI Survey. Agentic AI adoption is poised to accelerate faster than generative AI (GenAI) while reshaping automation and decision-making across industries ...

Real privacy protection thanks to technology and processes is often portrayed as too hard and too costly to implement. So the most common strategy is to do as little as possible just to conform to formal requirements of current and incoming regulations. This is a missed opportunity ...

The expanding use of AI is driving enterprise interest in data operations (DataOps) to orchestrate data integration and processing and improve data quality and validity, according to a new report from Information Services Group (ISG) ...
