The No-BS Guide to Logging - Part 1
A vendor-neutral checklist to help you get your log strategy straight
December 07, 2015

Sven Dummer
Loggly

Share this


We all know log files. We all use log data. At a minimum, every admin and developer knows how to fire up tail –f and use an arsenal of command line tools to dig into a system's log files. But the days where those practices would suffice for operational troubleshooting are long gone. Today, you need a solid log strategy.

Log data has become big data and is more relevant to your success than ever before. Not being able to manage it and make meaningful use of it can, in the worst case, kill your business.

You might have implemented good application monitoring, but that only tells you that something is happening, not why. The information needed to understand the why, and the ability to predict and prevent it, is in your log data. And log data has exploded in volume and complexity.

Why this explosion? The commoditization of cloud technology: one of the greatest paradigm shifts the tech industry has seen over the recent years. Cloud services like Amazon's AWS, Microsoft's Azure, or Rackspace have made it affordable even for small- and medium-sized businesses to run complex applications on elastic virtual server farms. Containers running microservices are the next step in this move toward distributed and modularized systems.

The downside is that complexity is multiplied in those environments. Running tens or hundreds of machines with many different application components increases the risk that one of them will start malfunctioning.

To allow for troubleshooting, each of these many components typically (and hopefully) writes log data. Not only do you have to deal with a staggering number of large log files, but they're also scattered all over your network(s).

To make things a bit more interesting, some components, like VMs or containers, are ephemeral. They're launched on demand, and take their log data with them once they're terminated. Maybe the root cause slowing down or crashing your web store was visible in exactly one of those lost log files.

If that's still not complex enough, add in that people mix different technologies – for example, hybrid clouds that keep some systems on-premise or in a colo. You might run containers inside of VMs or use a container to deploy a hypervisor. Or you also could need to collect data from mobile applications and IoT devices.

Your log management solution needs to be able to receive and aggregate the logs from all your systems and components and store them in one central, accessible place. Leave no log behind – including the ones from ephemeral systems.

Some log management solutions require installing agents to accomplish this, while others are agentless and use de-facto standards like syslog, which are part of copious systems that allow sending logs over the network. Using agents means it's vital to make sure they are available for all operating systems, devices, and other components. You'll also need strategy to keep the agents updated and patched.

Fortunately, there is a checklist of the must-haves when it comes to log management to help you choose and sustain the best practices for your data, which I'll be sharing in my next post.

Read The No-BS Guide to Logging - Part 2

Sven Dummer is Senior Director of Product Marketing at Loggly.

Share this

The Latest

January 18, 2022

As part of APMdigest's list of 2022 predictions, industry experts offer thoughtful, insightful, and often controversial predictions on how Network Performance Management (NPM) and related technologies will evolve and impact business in 2022 ...

January 13, 2022

Gartner highlighted 6 trends that infrastructure and operations (I&O) leaders must start preparing for in the next 12-18 months ...

January 11, 2022

Technology is now foundational to financial companies' operations with many institutions relying on tech to deliver critical services. As a result, uptime is essential to customer satisfaction and company success, and systems must be subject to continuous monitoring. But modern IT architectures are disparate, complex and interconnected, and the data is too voluminous for the human mind to handle. Enter AIOps ...

January 11, 2022

Having a variety of tools to choose from creates challenges in telemetry data collection. Organizations find themselves managing multiple libraries for logging, metrics, and traces, with each vendor having its own APIs, SDKs, agents, and collectors. An open source, community-driven approach to observability will gain steam in 2022 to remove unnecessary complications by tapping into the latest advancements in observability practice ...

January 10, 2022

These are the trends that will set up your engineers and developers to deliver amazing software that powers amazing digital experiences that fuel your organization's growth in 2022 — and beyond ...

January 06, 2022

In a world where digital services have become a critical part of how we go about our daily lives, the risk of undergoing an outage has become even more significant. Outages can range in severity and impact companies of every size — while outages from larger companies in the social media space or a cloud provider tend to receive a lot of coverage, application downtime from even the most targeted companies can disrupt users' personal and business operations ...

January 05, 2022

Move fast and break things: A phrase that has been a rallying cry for many SREs and DevOps practitioners. After all, these teams are charged with delivering rapid and unceasing innovation to wow customers and keep pace with competitors. But today's society doesn't tolerate broken things (aka downtime). So, what if you can move fast and not break things? Or at least, move fast and rapidly identify or even predict broken things? It's high time to rethink the old rallying cry, and with AI and observability working in tandem, it's possible ...

January 04, 2022

AIOps is still relatively new compared to existing technologies such as enterprise data warehouses, and early on many AIOps projects suffered hiccups, the aftereffects of which are still felt today. That's why, for some IT Ops teams and leaders, the prospect of transforming their IT operations using AIOps is a cause for concern ...

December 16, 2021

This year is the first time APMdigest is posting a separate list of Remote Work Predictions. Due to the drastic changes in the way we work and do business since the COVID pandemic started, and how significantly these changes have impacted IT operations, APMdigest asked industry experts — from analysts and consultants to users and the top vendors — how they think the work from home (WFH) revolution will evolve into 2022, with a special focus on IT operations and performance. Here are some very interesting and insightful predictions that may change what you think about the future of work and IT ...

December 15, 2021

Industry experts offer thoughtful, insightful, and often controversial predictions on how APM, AIOps, Observability, OpenTelemetry, and related technologies will evolve and impact business in 2022. Part 6 covers the user experience ...