Start with The No-BS Guide to Logging - Part 1
Coming off of the last post outlining the necessity for log management, the process of choosing logging software can seem daunting. The following are major elements of a good log strategy and can also serve as checklist items when you shop for a log management solution:
Collect, Aggregate, Retain
It's crucial to think about your data retention needs and the costs associated with storing them. How long do you need to keep the logs? Do you need them just for troubleshooting, or also for business intelligence type of analysis? Are there regulatory or audit requirements that require you to keep the logs for a certain period of time?
Your daily log volume might already be large, but keep in mind that it doesn't take much to multiply the volume temporarily. For example, a component failure and the resulting log messages in a complex system could easily quadruple the amount of log messages. An external event could have the same effect: if you run an online store, Black Friday might balloon your sales as well as your log volumes. If your log aggregation doesn't scale, you could lose your main troubleshooting foundation when you need it most.
Handle Log Diversity
Log files come in a variety of formats, some following standards and conventions, others completely custom. Your log solution should be able to parse and present the data in a comprehensive form in near real-time, and it should allow to define custom parsing rules. A desirable feature is the ability to add metadata.
Reveal What Matters
Just having a search tool is not enough. To make sense of your log data and the correlation between different data points, you need real-time indexing and parsing, grouping, along with powerful analytics, customizable dashboards, and data visualization. Your log analytics solution should provide a treasure map to the contents of your logs, not just a metal detector that you must use to scan indiscriminately.
Detect Anomalies
Given the volume and complexity of log data, you can't rely on searching for problems. Things you never anticipated happening are typically the type of problems that hurt the most. A good log analytics solution should be able to learn what is “normal” in your log data, and automatically identify and highlight any deviations from norms.
Make Your Own Apps Log
If you write your own code, your log management solution must be able to parse and analyze it. Consider using a well-established data format like JSON (our recommendation) or XML. Whatever you choose, make sure it's plain text format (not binary), that it is human-readable, and easy to parse. Your log solution should be able to easily receive the logs from your application and allow you to set up custom parsing rules if needed.
Be Alert(ed)
Just like every good monitoring application, every good log management solution should allow to send you and your teams alerts based on defined events, like error messages. It should be possible to send these alerts through common third party collaboration tools.
Don't Break the Bank
Cloud technologies made running distributed systems and elastic compute farms affordable for SMBs. The bill for the troubleshooting tools should be affordable, too. There are fully cloud-based SaaS solutions out there, as well as on-premise products and hybrids, which typically come at higher costs (including those for hardware and datacenter footprint).
Key criteria to decide if SaaS or on-premise solutions are right for you are the sensitivity and volume of your data. Security or privacy concerns or regulatory requirements may keep you from transferring data across public networks. Similarly, the sheer data volume could make this impossible or too expensive.
Sven Dummer is Senior Director of Product Marketing at Loggly.