Introduction to Logstash Grok Patterns

Introduction

The grok filter – and its use of patterns – is the truly powerful part of logstash.  Grok allows you to turn unstructured log text into structured data.

grok

The grok filter attempts to match a field with a pattern.  Think of a pattern as a named regular expression.  Patterns allow for increased readability and reuse.  If the pattern matches, logstash can create additional fields (similar to a regex capture group).

This example takes the event’s “message” field and attempts to match it against five patterns (e.g. “IP”, “WORD”).  If the entire expression matches, logstash adds a field for each named pattern (the value matched by “IP” will be stored in the “client” field, etc.).

filter {
 grok {
   match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" ]
 }
}

If the input doesn’t match the pattern, the tag “_grokparsefailure” will be added to the event.  You can (and should; see best practices) customize this tag.
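
For example, a minimal sketch using the grok filter’s tag_on_failure option (the tag name here is just an example):

filter {
 grok {
   match => [ "message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" ]
   tag_on_failure => [ "_apache_grokparsefailure" ]
 }
}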

Patterns

Logstash ships with lots of predefined patterns.  You can browse them on github.

Patterns consist of a label and a regex, e.g.:

USERNAME [a-zA-Z0-9._-]+

In your grok filter, you would refer to this as %{USERNAME}:

filter {
 grok {
   match => [ "message", "%{USERNAME}" ]
 }
}

Patterns can contain other patterns, e.g.:

SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}
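
A composite pattern is used just like any other.  As a sketch (the field name “timestamp” is just an example):

filter {
 grok {
   match => [ "message", "%{SYSLOGTIMESTAMP:timestamp}" ]
 }
}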

Target Variables

A pattern can store the matched value in a new field.  Specify the field name in the grok filter:

filter {
 grok {
   match => [ "message", "%{USERNAME:user}" ]
 }
}

If you’re using a raw regular expression instead of a predefined pattern, you can create a new field with Oniguruma’s named-capture syntax:

filter {
  grok {
    match => [ "message", "(?<myField>[a-z]{3})" ]
  }
}

This would match three lowercase letters and store them in a field called ‘myField’.

Casting

By default, grok’ed fields are strings.  Numeric fields (int and float) can be declared in the pattern:

filter {
 grok {
   match => [ "message", "%{USERNAME:user:int}" ]
 }
}

Note that this is just a hint that logstash will pass along to elasticsearch when it tries to insert the event.  If the field already exists in the index with a different type, this won’t change the mapping in elasticsearch until a new index is created.
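
If you need to convert a field after the fact, one option is the mutate filter’s convert setting.  A minimal sketch, reusing the “bytes” field from the example above:

filter {
 mutate {
   convert => { "bytes" => "integer" }
 }
}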

Custom Patterns

While logstash ships with many patterns, you will eventually need to write a custom pattern for your application’s logs.  The general strategy is to start slowly, working your way from the left of the input string, parsing one field at a time.

Your pattern does not need to match the entire event message, so you can skip leading and trailing information if you just need something from the middle.
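
One way to work incrementally is to parse the leading field(s) you understand and sweep everything else into GREEDYDATA, then keep moving the boundary to the right.  A minimal sketch (the field names here are just examples):

filter {
 grok {
   match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:rest}" ]
 }
}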

Grok uses Oniguruma regular expressions.

Be sure to use the debugger (see below) when developing custom patterns.
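
Once a custom pattern works, you can give it a name in a patterns file and point grok at that directory with the patterns_dir option.  A sketch, assuming a hypothetical patterns file at ./patterns/myapp containing one pattern per line:

MYAPP_ID [A-F0-9]{8}

filter {
 grok {
   patterns_dir => [ "./patterns" ]
   match => [ "message", "%{MYAPP_ID:request_id}" ]
 }
}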

Debugging

There is an online grok debugger available for building and testing patterns.

15 responses to “Introduction to Logstash Grok Patterns”

  1. Hi guys, can someone help me with my logs? I want to make a grok filter for the line below:

    Jun 5 09:01:46 static-host-96-9-129-122.awxx.ox lawful_intercept: TM=1496636278.401667 IF=eth0.10 OF=eth1 IFS=eth0.10 UID=737716,1 BID=3165 MAC=AC:EE:9E:43:BD:FE PRO=6 OSA=10.128.239.235:55941 ODA=52.4.e.x:443 SA=10.128.239.235:55941 DA=52.4.208.241:443 MET=OTHER HOST= UA= URI=

    • Your best bet is to use the grok debugger, building up your pattern from left to right. If you still have problems, get on the #logstash IRC channel and ask for help. Be prepared to show examples of what you’ve tried and exactly what you’re stuck on.

  2. Kiranmai Reddy

    I want to grok logs with repeated fields, e.g.:
    [2017-05-29 02:17:18] INFO – [ActivityServiceRest:89] – [{"callId":"kjwefkjweqkfb"},{"callId":"nwekgwnkqgkqr"},{"callId":"ohjiwnwbnwbrk"}]
    where the number of times callId appears is dynamic.
    Can someone please help?

    • The array of callIds looks like JSON. Why not match it into one string and use the logstash json{} parser to create a real array?

  3. Sherine Davis

    How do I display the text that’s matched by a pattern (present in a variable) to stdout?

  4. I have a grok match like this:
    match => [ "message", "Duration: %{NUMBER:duration}", "Speed: %{NUMBER:speed}" ]

    I also want to add another field to the captures if it matches a grok pattern. For example, “type:duration_type” if it is duration text and “type:speed_type” if it is speed text. Can I do this within a match operation?

    I know I can use mutate plugin and if-else to add new fields but I have too many matches and it will be too long that way.

    • I don’t know of a way for grok to tell you which pattern matched the input. Note that, unless those patterns are mutually exclusive, it really feels like an inefficient pattern. Check out the kv{} filter.

  5. Prabhuanand Sivashanmugam

    %{URIHOST} %{TIME} 1501361 %{SYSLOG5424SD} %{UUID} – entry:"%{JAVACLASS}"

  6. Hi,

    We want to match:
    2016-01-07 11:20:39,134 1501361 [http-nio-10080-exec-475] f169be79-5b50-4003-9a5a-0b69dced83f0 – entry:"membership-bs-1.1"

    and remove these two fields: 1501361 and [http-nio-10080-exec-475]

  7. It would be great if you could provide some working examples of the above. This would be very helpful, as most of the information on grok lacks examples and is very difficult to follow.

    • There are 5 example configs in the article that should work. I’d love to expand the article if you can be more precise in describing what you’d like to see added.

