April | 2015 | SV Ops

Most logging frameworks have a concept of severity or priority, expressed as tags like “WARNING”, “CRITICAL”, etc.

Breaking these off into their own fields in logstash makes a lot of sense. Unfortunately, when you go to look for problems, you end up with a search like this:

log_level:("EMERG" OR "ALERT" OR "CRIT" OR "ERROR")

which is both inefficient (4 string comparisons) and unclear (did I miss or misspell one?).

What I like to do is create an additional, numeric representation of the log level, so that my search looks like this:

log_code:<=3

With the two fields, you can easily query for bad stuff with the numeric code, but still use log_level for display (in aggregations, etc.).

I have standardized on the Apache LogLevel definitions:

Level	Description	Example
`emerg`	Emergencies – system is unusable.	“Child cannot open lock file. Exiting”
`alert`	Action must be taken immediately.	“getpwuid: couldn’t determine user name from uid”
`crit`	Critical Conditions.	“socket: Failed to get a socket, exiting child”
`error`	Error conditions.	“Premature end of script headers”
`warn`	Warning conditions.	“child process 1234 did not exit, sending another SIGHUP”
`notice`	Normal but significant condition.	“httpd: caught SIGBUS, attempting to dump core in …”
`info`	Informational.	“Server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers)…”
`debug`	Debug-level messages	“Opening config file …”

My logstash config is broken into many smaller pieces. The filter{} stanza for each type of log is contained in a separate file, and there are generic stanzas that run before and after the business logic.

If you’re processing a log file whose levels are different, you need to normalize them. The translate{} filter is great for this:

translate {
  dictionary => [
    "WRN", "warn",
    "INF", "info",
    "DBG", "debug"
  ]
  field => "[loglevel][tmp]"
  destination => "[loglevel][name]"
  remove_field => [ "[loglevel][tmp]" ]
}

From what I can tell, translate{} won’t replace the source field, so the earlier grok{} captures the raw level into a temporary field, [loglevel][tmp]. The remove_field here deletes that temporary field once the translation succeeds.
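As a minimal sketch of that earlier grok{} step, assuming a hypothetical line format of timestamp, level, then free text: the pattern and the `timestamp` and `log_message` field names are illustrative and not taken from my actual config. It would run before the translate{} shown above.

```
filter {
  grok {
    # Hypothetical input line: "2015-04-01T12:00:00Z WRN disk almost full"
    # The raw level ("WRN") is captured into the temporary field [loglevel][tmp],
    # which the translate{} step later normalizes and removes.
    match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{WORD:[loglevel][tmp]} %{GREEDYDATA:log_message}" ]
  }
}
```

Capturing straight into a nested field keeps everything about the level under one [loglevel] object, so the generic post-processing only has to check one place.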

Once [loglevel][name] is normalized, I use a post-processing config file to add the second [loglevel][code] field:

if [loglevel] and [loglevel][name] {
  translate {
    dictionary => [
      "emerg", "0",
      "alert", "1",
      "crit", "2",
      "error", "3",
      "warn", "4",
      "notice", "5",
      "info", "6",
      "debug", "7"
    ]
    field => "[loglevel][name]"
    destination => "[loglevel][code]"
  }

  # make it an int, so range queries like <=3 work
  mutate {
    convert => [ "[loglevel][code]", "integer" ]
  }
}
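
Newer versions of the translate and mutate plugins document these options as hashes rather than flat arrays. Here is a sketch of the same post-processing step in that style, assuming a translate plugin version that accepts `source` and `target` (older versions use `field` and `destination`, as above). Check the documentation for your installed plugin version before relying on these option names.

```
if [loglevel] and [loglevel][name] {
  translate {
    # Assumption: newer option names; older plugin versions use field/destination
    source => "[loglevel][name]"
    target => "[loglevel][code]"
    # Same mapping as above, written as a hash
    dictionary => {
      "emerg"  => "0"
      "alert"  => "1"
      "crit"   => "2"
      "error"  => "3"
      "warn"   => "4"
      "notice" => "5"
      "info"   => "6"
      "debug"  => "7"
    }
  }

  # make it an int
  mutate {
    convert => { "[loglevel][code]" => "integer" }
  }
}
```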
