Skip to main content

The No-BS Guide to Logging - Part 1

A vendor-neutral checklist to help you get your log strategy straight
Sven Dummer


We all know log files. We all use log data. At a minimum, every admin and developer knows how to fire up tail –f and use an arsenal of command line tools to dig into a system's log files. But the days where those practices would suffice for operational troubleshooting are long gone. Today, you need a solid log strategy.

Log data has become big data and is more relevant to your success than ever before. Not being able to manage it and make meaningful use of it can, in the worst case, kill your business.

You might have implemented good application monitoring, but that only tells you that something is happening, not why. The information needed to understand the why, and the ability to predict and prevent it, is in your log data. And log data has exploded in volume and complexity.

Why this explosion? The commoditization of cloud technology: one of the greatest paradigm shifts the tech industry has seen over the recent years. Cloud services like Amazon's AWS, Microsoft's Azure, or Rackspace have made it affordable even for small- and medium-sized businesses to run complex applications on elastic virtual server farms. Containers running microservices are the next step in this move toward distributed and modularized systems.

The downside is that complexity is multiplied in those environments. Running tens or hundreds of machines with many different application components increases the risk that one of them will start malfunctioning.

To allow for troubleshooting, each of these many components typically (and hopefully) writes log data. Not only do you have to deal with a staggering number of large log files, but they're also scattered all over your network(s).

To make things a bit more interesting, some components, like VMs or containers, are ephemeral. They're launched on demand, and take their log data with them once they're terminated. Maybe the root cause slowing down or crashing your web store was visible in exactly one of those lost log files.

If that's still not complex enough, add in that people mix different technologies – for example, hybrid clouds that keep some systems on-premise or in a colo. You might run containers inside of VMs or use a container to deploy a hypervisor. Or you also could need to collect data from mobile applications and IoT devices.

Your log management solution needs to be able to receive and aggregate the logs from all your systems and components and store them in one central, accessible place. Leave no log behind – including the ones from ephemeral systems.

Some log management solutions require installing agents to accomplish this, while others are agentless and use de-facto standards like syslog, which are part of copious systems that allow sending logs over the network. Using agents means it's vital to make sure they are available for all operating systems, devices, and other components. You'll also need strategy to keep the agents updated and patched.

Fortunately, there is a checklist of the must-haves when it comes to log management to help you choose and sustain the best practices for your data, which I'll be sharing in my next post.

Read The No-BS Guide to Logging - Part 2

Sven Dummer is Senior Director of Product Marketing at Loggly.

Hot Topics

The Latest

I've spent a lot of time in the channel, and one thing I keep coming back to is this: a partner program is only as good as what it looks like in the field. Many programs look great on paper, but when a partner is in front of a customer navigating a complex hybrid environment or trying to make the case for AI-powered observability, the gap between what a vendor promises and what it actually delivers becomes very clear, very fast ...

Enterprises today operate in a real-time environment where uninterrupted access to trusted data has become a baseline expectation for users, applications and automated systems. Traditional DataOps models, built on manual effort and human triage, cannot keep pace with this always active demand. AI agents are emerging as the operational backbone, ensuring consistent data availability, reinforcing trustworthiness and enabling a level of scale that manual processes cannot achieve ...

For decades, trust in the digital workplace rested on familiar signals. We trusted faces on video calls, voices on the phone, and emails that appeared to come from people we knew. These cues felt human and intuitive. They anchored how decisions were made, approvals were granted, and access was authorized. AI-powered deepfakes have quietly broken that model ...

Cloud migration was supposed to be a one-way door. For most enterprises, it turns out it isn't. Cloud data repatriation is a real and growing trend. A new survey ... finds that 89% of organizations plan to expand their on-premises infrastructure footprint over the next two years — and 75% have already moved at least some workloads back from public cloud in the past 24 months. The findings point to a broad rethinking of where data belongs ...

Over the past few years, large language models (LLMs) have revolutionized the software industry. Given their ability to excel at multi-step reasoning, LLMs have helped enterprises streamline workflows and adapt to the unknown. However, employing such models comes with sky-high costs, latency issues, and limited flexibility. In the realm of IT operations, it is generally wiser to employ smaller, domain-specific models instead ...

For years, DevOps teams operated under a simple assumption: collect enough telemetry, and you can find and fix any problem. That assumption is breaking down. Modern enterprises now operate across microservices, hybrid cloud environments, APIs, Kubernetes, and highly automated delivery pipelines. Releases happen continuously, dependencies shift constantly, and failures spread faster than teams can diagnose them ...

New Relic surveyed IT and engineering leaders from the media and entertainment (M&E) sector to understand what's working — and where challenges persist with their observability practices. The findings reveal how M&E organizations are navigating rising platform complexity, audience expectations, and AI-driven change. Below are five takeaways that stand out ...

Let me start with something I've seen play out more times than I can count. A team hits a wall with the cloud. Costs creep up, then spike. Performance starts to feel inconsistent. Someone in finance asks a simple question like "why did this double?" and nobody has a clean answer ... Maybe this isn't the right place for everything. That realization feels like a breakthrough, like you've identified the problem. In reality, you've just identified the starting line ...

In MEAN TIME TO INSIGHT Episode 24, Shamus McGillicuddy, VP of Research, Network Infrastructure and Operations, at EMA discusses network observability tool sprawl ... 

In cloud-native systems, scaling is often as simple as moving a slider. For on-premise databases, the stakes are different. Over-provisioning hardware is expensive. Under-provisioning leads to performance bottlenecks that are difficult to fix once the equipment is in the rack ...

The No-BS Guide to Logging - Part 1

A vendor-neutral checklist to help you get your log strategy straight
Sven Dummer


We all know log files. We all use log data. At a minimum, every admin and developer knows how to fire up tail –f and use an arsenal of command line tools to dig into a system's log files. But the days where those practices would suffice for operational troubleshooting are long gone. Today, you need a solid log strategy.

Log data has become big data and is more relevant to your success than ever before. Not being able to manage it and make meaningful use of it can, in the worst case, kill your business.

You might have implemented good application monitoring, but that only tells you that something is happening, not why. The information needed to understand the why, and the ability to predict and prevent it, is in your log data. And log data has exploded in volume and complexity.

Why this explosion? The commoditization of cloud technology: one of the greatest paradigm shifts the tech industry has seen over the recent years. Cloud services like Amazon's AWS, Microsoft's Azure, or Rackspace have made it affordable even for small- and medium-sized businesses to run complex applications on elastic virtual server farms. Containers running microservices are the next step in this move toward distributed and modularized systems.

The downside is that complexity is multiplied in those environments. Running tens or hundreds of machines with many different application components increases the risk that one of them will start malfunctioning.

To allow for troubleshooting, each of these many components typically (and hopefully) writes log data. Not only do you have to deal with a staggering number of large log files, but they're also scattered all over your network(s).

To make things a bit more interesting, some components, like VMs or containers, are ephemeral. They're launched on demand, and take their log data with them once they're terminated. Maybe the root cause slowing down or crashing your web store was visible in exactly one of those lost log files.

If that's still not complex enough, add in that people mix different technologies – for example, hybrid clouds that keep some systems on-premise or in a colo. You might run containers inside of VMs or use a container to deploy a hypervisor. Or you also could need to collect data from mobile applications and IoT devices.

Your log management solution needs to be able to receive and aggregate the logs from all your systems and components and store them in one central, accessible place. Leave no log behind – including the ones from ephemeral systems.

Some log management solutions require installing agents to accomplish this, while others are agentless and use de-facto standards like syslog, which are part of copious systems that allow sending logs over the network. Using agents means it's vital to make sure they are available for all operating systems, devices, and other components. You'll also need strategy to keep the agents updated and patched.

Fortunately, there is a checklist of the must-haves when it comes to log management to help you choose and sustain the best practices for your data, which I'll be sharing in my next post.

Read The No-BS Guide to Logging - Part 2

Sven Dummer is Senior Director of Product Marketing at Loggly.

Hot Topics

The Latest

I've spent a lot of time in the channel, and one thing I keep coming back to is this: a partner program is only as good as what it looks like in the field. Many programs look great on paper, but when a partner is in front of a customer navigating a complex hybrid environment or trying to make the case for AI-powered observability, the gap between what a vendor promises and what it actually delivers becomes very clear, very fast ...

Enterprises today operate in a real-time environment where uninterrupted access to trusted data has become a baseline expectation for users, applications and automated systems. Traditional DataOps models, built on manual effort and human triage, cannot keep pace with this always active demand. AI agents are emerging as the operational backbone, ensuring consistent data availability, reinforcing trustworthiness and enabling a level of scale that manual processes cannot achieve ...

For decades, trust in the digital workplace rested on familiar signals. We trusted faces on video calls, voices on the phone, and emails that appeared to come from people we knew. These cues felt human and intuitive. They anchored how decisions were made, approvals were granted, and access was authorized. AI-powered deepfakes have quietly broken that model ...

Cloud migration was supposed to be a one-way door. For most enterprises, it turns out it isn't. Cloud data repatriation is a real and growing trend. A new survey ... finds that 89% of organizations plan to expand their on-premises infrastructure footprint over the next two years — and 75% have already moved at least some workloads back from public cloud in the past 24 months. The findings point to a broad rethinking of where data belongs ...

Over the past few years, large language models (LLMs) have revolutionized the software industry. Given their ability to excel at multi-step reasoning, LLMs have helped enterprises streamline workflows and adapt to the unknown. However, employing such models comes with sky-high costs, latency issues, and limited flexibility. In the realm of IT operations, it is generally wiser to employ smaller, domain-specific models instead ...

For years, DevOps teams operated under a simple assumption: collect enough telemetry, and you can find and fix any problem. That assumption is breaking down. Modern enterprises now operate across microservices, hybrid cloud environments, APIs, Kubernetes, and highly automated delivery pipelines. Releases happen continuously, dependencies shift constantly, and failures spread faster than teams can diagnose them ...

New Relic surveyed IT and engineering leaders from the media and entertainment (M&E) sector to understand what's working — and where challenges persist with their observability practices. The findings reveal how M&E organizations are navigating rising platform complexity, audience expectations, and AI-driven change. Below are five takeaways that stand out ...

Let me start with something I've seen play out more times than I can count. A team hits a wall with the cloud. Costs creep up, then spike. Performance starts to feel inconsistent. Someone in finance asks a simple question like "why did this double?" and nobody has a clean answer ... Maybe this isn't the right place for everything. That realization feels like a breakthrough, like you've identified the problem. In reality, you've just identified the starting line ...

In MEAN TIME TO INSIGHT Episode 24, Shamus McGillicuddy, VP of Research, Network Infrastructure and Operations, at EMA discusses network observability tool sprawl ... 

In cloud-native systems, scaling is often as simple as moving a slider. For on-premise databases, the stakes are different. Over-provisioning hardware is expensive. Under-provisioning leads to performance bottlenecks that are difficult to fix once the equipment is in the rack ...