Processing common event information with grok{}

If you’re lucky, most of your log messages in a given input will arrive in a standard format, typically with a set of common fields at the front (date, time, server, etc.).

Rather than writing multiple grok{} patterns that each scan the entire message, like these:

grok {
    match => ["message", "%{SYSLOGTIMESTAMP:syslogtime} %{HOSTNAME:sysloghost} Save this %{WORD:word1}"]
    tag_on_failure => ["_grokparsefailure_match1"]
}

grok {
    match => ["message", "%{SYSLOGTIMESTAMP:syslogtime} %{HOSTNAME:sysloghost} Save this other %{WORD:word2}"]
    tag_on_failure => ["_grokparsefailure_match2"]
}

I like to split off the common stuff:

grok {
    match => ["message", "%{SYSLOGTIMESTAMP:syslogtime} %{HOSTNAME:sysloghost} %{GREEDYDATA:message}"]
    overwrite => [ "message" ]
    tag_on_failure => ["_grokparsefailure_syslog"]
}

Note that the last pattern puts the results into the field “message”. Since that field already exists, we have to use the “overwrite” setting to update it.

Then use smaller patterns against this smaller “message” for your application-specific info:

grok {
    match => ["message", "Save this %{WORD:word1}"]
    tag_on_failure => ["_grokparsefailure_match1"]
}

This is easier to read, and the later grok{}s will be running smaller regexps
against smaller input, which should be faster.
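Putting the pieces together, a complete filter section might look something like the sketch below. It is only one way to wire this up, and a few things in it are my own assumptions rather than part of the snippets above: the conditional that skips the second grok{} when the prefix failed to parse, the "^" anchors, the ordering of the two application patterns (the more specific "Save this other" goes first, since an unanchored "Save this %{WORD:word1}" would also match that line), and the tag name "_grokparsefailure_app", which is just an illustrative label.

```
filter {
    # Step 1: pull off the common prefix and keep the remainder in "message".
    grok {
        match => ["message", "%{SYSLOGTIMESTAMP:syslogtime} %{HOSTNAME:sysloghost} %{GREEDYDATA:message}"]
        overwrite => [ "message" ]
        tag_on_failure => ["_grokparsefailure_syslog"]
    }

    # Step 2: only try the application patterns if the prefix was parsed.
    if "_grokparsefailure_syslog" not in [tags] {
        grok {
            # Patterns are tried in order and grok stops at the first match
            # (break_on_match defaults to true), so the more specific
            # pattern is listed first.
            match => { "message" => [
                "^Save this other %{WORD:word2}",
                "^Save this %{WORD:word1}"
            ] }
            # Hypothetical tag name, chosen for this example.
            tag_on_failure => ["_grokparsefailure_app"]
        }
    }
}
```

Listing both application patterns in a single grok{} means an event is tagged as a failure only when none of them match, instead of picking up one failure tag per pattern that did not apply.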

2 responses to “Processing common event information with grok{}”

  1. Rajesh Swarnkar

    Hi DevOps, Thanks for great post!
    Given that we use just the standard regex templates, does the length of the regular expression affect performance “significantly”?
    Is the grok regex engine that slow?
    Isn’t overwrite costly in terms of efficiency?

    • The idea is that looking for “a b” or “a c” or “a d” is going to be slower than pulling off “a” and then looking for “b”, “c”, or “d”. It might be fun to explicitly measure it someday.

      I wouldn’t think overwrite is any/much worse than a basic assignment. They just make you be explicit with allowing the overwrite.
