{"id":56,"date":"2015-01-29T20:46:47","date_gmt":"2015-01-29T20:46:47","guid":{"rendered":"http:\/\/svops.com\/blog\/?p=56"},"modified":"2015-06-17T20:08:43","modified_gmt":"2015-06-17T20:08:43","slug":"introduction-to-logstash-grok-patterns","status":"publish","type":"post","link":"http:\/\/svops.com\/blog\/introduction-to-logstash-grok-patterns\/","title":{"rendered":"Introduction to Logstash Grok Patterns"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>The grok filter &#8211; and its use of patterns &#8211; is the truly powerful part of logstash. \u00a0 Grok allows you to turn unstructured log text\u00a0into structured data.<\/p>\n<h2>grok<\/h2>\n<p>The <a href=\"http:\/\/logstash.net\/docs\/1.4.2\/filters\/grok\">grok filter<\/a>\u00a0attempts to match a field with a pattern. \u00a0Think of a pattern as a named regular expression. \u00a0Patterns improve readability and reuse. \u00a0If the pattern matches, logstash can store the matched text in additional fields (similar to regex capture groups).<\/p>\n<p>This example takes the event&#8217;s &#8220;message&#8221; field and attempts to match it against an expression built from five patterns (e.g. &#8220;IP&#8221;, &#8220;WORD&#8221;). \u00a0If the entire expression matches, grok adds a field for each pattern (the text matched by &#8220;IP&#8221; is stored in the &#8220;client&#8221; field, etc.).<\/p>\n<pre>filter {\r\n grok {\r\n   match =&gt; [ \"message\", \"%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}\" ]\r\n }\r\n}<\/pre>\n<p>If the input doesn&#8217;t match the pattern, grok adds a &#8220;_grokparsefailure&#8221; tag to the event. \u00a0You\u00a0can (and should; see <a title=\"Logstash best practices\" href=\"http:\/\/svops.com\/blog\/logstash-best-practices\/\">best practices<\/a>) customize this tag with the tag_on_failure option.<\/p>\n<h2>Patterns<\/h2>\n<p>Logstash ships with lots of predefined patterns. 
\u00a0You can browse them <a href=\"https:\/\/github.com\/logstash-plugins\/logstash-patterns-core\/tree\/master\/patterns\">on GitHub<\/a>.<\/p>\n<p>A pattern definition consists of a name and a regex, e.g.:<\/p>\n<pre>USERNAME [a-zA-Z0-9._-]+<\/pre>\n<p>In your grok filter, you would refer to this as %{USERNAME}:<\/p>\n<pre>filter {\r\n grok {\r\n   match =&gt; [ \"message\", \"<span style=\"color: #ff0000;\">%{USERNAME}<\/span>\" ]\r\n }\r\n}<\/pre>\n<p>Patterns can contain other patterns, e.g.:<\/p>\n<pre>SYSLOGTIMESTAMP %{MONTH} +%{MONTHDAY} %{TIME}<\/pre>\n<h2>Target Variables<\/h2>\n<p>A pattern can store the matched value in a new field. \u00a0Specify the field name after the pattern name in the grok filter:<\/p>\n<pre>filter {\r\n grok {\r\n   match =&gt; [ \"message\", \"%{USERNAME<span style=\"color: #ff0000;\">:user<\/span>}\" ]\r\n }\r\n}<\/pre>\n<p>If you&#8217;re using a raw regexp instead of a named pattern, you can create a new field with an\u00a0Oniguruma\u00a0named capture group:<\/p>\n<pre class=\"p1\">filter {\r\n  grok {\r\n    match =&gt; [ \"message\", \"(?&lt;myField&gt;[a-z]{3})\" ]\r\n  }\r\n}<\/pre>\n<p class=\"p1\">This would find three lowercase letters and store them in a field called &#8216;myField&#8217;.<\/p>\n<h2>Casting<\/h2>\n<p>By default, grok&#8217;ed fields are strings. \u00a0Numeric fields (int and float) can be declared in the pattern:<\/p>\n<pre>filter {\r\n grok {\r\n   match =&gt; [ \"message\", \"%{NUMBER:bytes<span style=\"color: #ff0000;\">:int<\/span>}\" ]\r\n }\r\n}<\/pre>\n<p>Note that grok converts the value before logstash sends the event to elasticsearch, which then infers the field&#8217;s type from it. \u00a0If the field already exists in the index with a different type, this won&#8217;t change the mapping in elasticsearch until a new index is created.<\/p>\n<h2>Custom Patterns<\/h2>\n<p>While logstash ships with many patterns, you eventually will need to write a custom pattern for your application&#8217;s logs. 
\u00a0The general strategy is to start small: work your way from the left of the input string, parsing one field at a time.<\/p>\n<p>Your pattern does not need to match the entire event message, so you can skip leading and trailing information if you just need something from the middle.<\/p>\n<p>Grok uses\u00a0<a href=\"http:\/\/www.geocities.jp\/kosako3\/oniguruma\/doc\/RE.txt\">Oniguruma regular expressions<\/a>.<\/p>\n<p>Be sure to use the debugger (see below) when developing custom patterns.<\/p>\n<h2>Debugging<\/h2>\n<p>There is an online <a href=\"https:\/\/grokdebug.herokuapp.com\">grok debugger<\/a> for building and testing patterns.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction The grok filter &#8211; and its use of patterns &#8211; is the truly powerful part of logstash. \u00a0 Grok allows you to turn unstructured log text\u00a0into structured data. grok The grok filter\u00a0attempts to match a field with a pattern. &hellip; <a href=\"http:\/\/svops.com\/blog\/introduction-to-logstash-grok-patterns\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[11],"tags":[],"_links":{"self":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts\/56"}],"collection":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/comments?post=56"}],"version-history":[{"count":5,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts\/56\/revisions"}],"predecessor-version":[{"id":131,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts\/56\/revisions\/131"}],"wp:attachment":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/media?parent=56"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/categories?post=56"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/tags?post=56"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}