Introduction to Logstash Grok Patterns

Introduction

The grok filter – and its use of patterns – is the truly powerful part of Logstash.  Grok allows you to turn unstructured log text into structured data.

grok

The grok filter attempts to match a field against a pattern.  Think of a pattern as a named regular expression.  Patterns allow for increased readability and reuse.  If the pattern matches, Logstash can create additional fields (similar to a regex capture group).

This example takes the event’s “message” field and attempts to match it against five patterns (e.g. “IP”, “WORD”).  If the entire expression matches, Logstash adds a field for each named pattern (the value matched by “IP” is stored in the “client” field, and so on).

filter {
  grok {
    match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" ]
  }
}

If the input doesn’t match the pattern, a “_grokparsefailure” tag will be added to the event.  You can (and should; see best practices) customize this tag.
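
For example, the grok filter’s tag_on_failure option lets you replace the default tag.  The tag name below is just an illustration:

filter {
  grok {
    match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request}" ]
    tag_on_failure => [ "_grokparsefailure_myapp" ]
  }
}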

Patterns

Logstash ships with lots of predefined patterns.  You can browse them on GitHub.

Patterns consist of a label and a regex, e.g.:

USERNAME [a-zA-Z0-9._-]+

In your grok filter, you would refer to this as %{USERNAME}:

filter {
  grok {
    match => [ "message", "%{USERNAME}" ]
  }
}

Patterns can contain other patterns, e.g.:

SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}

Target Variables

A pattern can store the matched value in a new field.  Specify the field name in the grok filter:

filter {
  grok {
    match => [ "message", "%{USERNAME:user}" ]
  }
}

If you’re matching with a raw regular expression rather than a predefined pattern, you can still create a new field by using Oniguruma’s named-capture syntax:

filter {
  grok {
    match => [ "message", "(?<myField>[a-z]{3})" ]
  }
}

This would find three lower case letters and create a field called ‘myField’.

Casting

By default, grok’ed fields are strings.  Numeric fields (int and float) can be declared in the pattern:

filter {
  grok {
    match => [ "message", "%{NUMBER:bytes:int}" ]
  }
}

Note that this is just a hint that Logstash will pass along to Elasticsearch when it tries to insert the event.  If the field already exists in the index with a different type, this won’t change the mapping in Elasticsearch until a new index is created.

Custom Patterns

While Logstash ships with many patterns, you will eventually need to write a custom pattern for your application’s logs.  The general strategy is to start slowly, working your way from the left of the input string and parsing one field at a time.

Your pattern does not need to match the entire event message, so you can skip leading and trailing information if you just need something from the middle.
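
As a sketch of that workflow, suppose your application logs lines like this (the log line and field names are invented for illustration):

2015-01-15 04:23:17 WARN payment timeout after 3001 ms

A first pass anchors the timestamp on the left and sweeps up everything else with GREEDYDATA:

filter {
  grok {
    match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:rest}" ]
  }
}

Once that matches, replace the %{GREEDYDATA} with more specific patterns, one field at a time, until you end up with something like:

%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{WORD:component} timeout after %{NUMBER:duration:int} ms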

Grok uses Oniguruma regular expressions.

Be sure to use the debugger (see below) when developing custom patterns.
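
Once a custom pattern is working, you can give it a name and reuse it like the shipped patterns.  Put the definition in a file in a directory of your choosing, one “NAME regex” pair per line (the same format shown above); the path and pattern name here are placeholders:

MYAPP_TIMEOUT %{WORD:component} timeout after %{NUMBER:duration:int} ms

Then point grok at that directory with the patterns_dir option:

filter {
  grok {
    patterns_dir => [ "/etc/logstash/patterns" ]
    match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{MYAPP_TIMEOUT}" ]
  }
}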

Debugging

There is an online grok debugger available for building and testing patterns.

Monitoring your log files

Overview

If you’ve set up your ELK cluster and logs are flowing in from your shippers, you’re now sitting on a goldmine of data.  The question becomes, “what should I do?!??”

A first step is to make Kibana dashboards, but they provide little value in a lights-out environment (see http://svops.com/blog/?p=11).

When you’re ready to actively monitor the information that’s sitting in the cluster, you’ll want to pull it into your monitoring system (Nagios, Zabbix, ScienceLogic, whatever).

There are many benefits to this approach over Logstash’s built-in notifications, including:

  • one alerting system (common message format, distribution groups, etc).
  • one escalation system (*)
  • one acknowledgement system (*)
  • one dashboard for monitoring

(*) Logstash doesn’t provide these features.

This system is also better than using Logstash’s Nagios-related plugins, since you’ll be querying all the documents in Elasticsearch, not just one document at a time.  You’ll also be using Elasticsearch as a database, rather than using Logstash’s metric{} functionality as a poor substitute.

There are two kinds of checks that you should build.  I’ll reference Nagios as the target platform.

Individual Metrics

Querying Elasticsearch for the total number of Java exceptions that have occurred is a good example of an individual metric.

In Nagios, you would first define a virtual host (e.g. “elasticsearch”, “java”, “my_app”, etc) and a virtual service (e.g. “java exceptions”).  The service would run a new command (e.g. “run_es_query”).  Set the check interval to something that makes sense for your organization.

The magic comes in writing the underlying program that is run by the “run_es_query” command.  This program should take a valid Elasticsearch query_string as a parameter, and run it against the cluster.

In the Nagios world, the script signals OK, WARNING, CRITICAL, etc. through its exit code.  The output of the script can also include performance data, which is used for charting.

The Python elasticsearch module makes writing the script pretty easy.  Write one script for each query type (max, count, most recent document, etc.); this keeps each script readable, rather than making one script so generic that it becomes unreadable.
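
As a rough sketch, here is what a count-style check could look like using that module.  The index name, thresholds, and command-line arguments are assumptions, not a prescription:

#!/usr/bin/env python
# Sketch of a Nagios check that counts documents matching a query_string.
# The index name, thresholds, and argument handling are illustrative only.
import sys
from elasticsearch import Elasticsearch

def main():
    query_string = sys.argv[1]            # e.g. 'tags:java_exception'
    warn, crit = int(sys.argv[2]), int(sys.argv[3])

    es = Elasticsearch(["http://localhost:9200"])
    body = {"query": {"query_string": {"query": query_string}}}
    count = es.count(index="logstash-*", body=body)["count"]

    if count >= crit:
        status, label = 2, "CRITICAL"
    elif count >= warn:
        status, label = 1, "WARNING"
    else:
        status, label = 0, "OK"

    # Nagios takes the status from the exit code and the message from stdout;
    # everything after the pipe is performance data used for charting.
    print("%s - %d matching documents | count=%d" % (label, count, count))
    sys.exit(status)

if __name__ == "__main__":
    main()

The “run_es_query” command definition would then call this script with the query string and thresholds as arguments.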

Bulk Metrics

If you want to count the Java exceptions but report them on a machine-by-machine basis, you would not want to launch the “individual metric” command once for every physical host.  Doing so would run a separate query against Elasticsearch for each host, which doesn’t scale well at all.

The better alternative is to run one “bulk” script that pulls the data for all hosts from Elasticsearch and then passes that information to Nagios using the “passive check” system.  Nagios will react to the information as configured.
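
As a sketch of that approach (the index, field name, threshold, and command-file path are assumptions): one terms aggregation returns a per-host count, and each bucket becomes a passive check result written to the Nagios command file.

#!/usr/bin/env python
# Sketch of a bulk check: one aggregation query, one passive result per host.
# The index, field name, threshold, and command-file path are assumptions.
import time
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

body = {
    "size": 0,
    "query": {"query_string": {"query": "tags:java_exception"}},
    "aggs": {"per_host": {"terms": {"field": "host", "size": 1000}}},
}
result = es.search(index="logstash-*", body=body)
buckets = result["aggregations"]["per_host"]["buckets"]

# Each line written to the command file is one passive service check result.
now = int(time.time())
with open("/usr/local/nagios/var/rw/nagios.cmd", "a") as cmd_file:
    for bucket in buckets:
        host, count = bucket["key"], bucket["doc_count"]
        status = 2 if count > 100 else 0  # illustrative threshold
        cmd_file.write("[%d] PROCESS_SERVICE_CHECK_RESULT;%s;java exceptions;%d;%d exceptions\n"
                       % (now, host, status, count))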

Where’s the Code?

I’ve written this plugin a few times for different platforms, but always as (unsharable) work-for-hire.  I hope to rewrite this in my spare time some day, but this outline should get you started.