Monthly Archives: November 2014

Debugging your ELK cluster

Question

My ELK (Elasticsearch/Logstash/Kibana) cluster isn’t working. How do I fix it?

Answer

Start at the beginning.

The Shipper

There are several popular pieces of software to ship your logs from the client to the logstash indexer.  Whether you’re using a full logstash installation, the logstash-forwarder, beaver, or something else, start by testing the network connectivity from your client to the logstash indexer:

telnet <ls_server> <ls_port>

There is no standard logstash port, so check your server configuration for the correct value.

If you can reach the server manually, then your shipper should be able to as well.

If you cannot reach the server with telnet, then you have some networking or connectivity issue.  Go work on that!
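If telnet isn’t installed on the client, the same check can be scripted. The following is a minimal sketch, not part of any ELK tool: the function name can_connect and the three-second timeout are my own choices, and it only proves that a TCP connection can be opened, which is all the telnet test proves as well.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False
```

For example, can_connect("<ls_server>", <ls_port>) with your own indexer’s host and port returns False for anything from a firewall drop to a misspelled hostname, so it tells you that there is a problem, not which one.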

If you’re using the full logstash agent as your shipper, run it with “--debug” and check its log files in /var/log/logstash/.

For logstash-forwarder, run it with the “-quiet=false” flag (version 0.4) or the “-verbose -debug” flags (older versions).

Check the list of filenames that you’ve configured – do they really match your paths?  Do any wildcards expand as desired?  In logstash-forwarder’s debug mode, it will show you the list of files that it’s processing.
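A quick way to sanity-check your patterns outside the shipper is to expand them yourself. This is a rough sketch under one important assumption: Python’s glob rules are close to, but not guaranteed to be identical to, the ones your shipper uses, so treat a mismatch as a hint rather than proof. The function names are hypothetical.

```python
import glob
import os


def expand_paths(patterns):
    """Map each configured path pattern to the readable files it matches."""
    result = {}
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        # Keep only regular files that the current user is allowed to read.
        result[pattern] = [
            p for p in matches if os.path.isfile(p) and os.access(p, os.R_OK)
        ]
    return result


def report(patterns):
    """Print one line per pattern, flagging patterns that match nothing."""
    for pattern, files in expand_paths(patterns).items():
        status = ", ".join(files) if files else "NO MATCHING FILES"
        print(f"{pattern} -> {status}")
```

Run it as the same user that the shipper runs as, since a file that exists but is unreadable to that user looks the same to the shipper as a file that isn’t there.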

Logstash

First, check that logstash can reach elasticsearch, using the same method as before.  From your logstash server:

telnet <es_server> <es_port>

If you cannot reach the server, check the network.

If you can reach the server, the next step is to confirm that logstash is receiving events from the shipper, and to see what it’s doing with them.  Add the following to your logstash output stanza and restart logstash:

output {
    stdout { codec => rubydebug }
}

This instructs logstash to print a copy of each event that it processes.  These are usually written to /var/log/messages, though the exact location depends on where your startup script sends logstash’s stdout.

If information is being printed to the logs, then the shipper is sending good data to logstash.

Check the “@timestamp” value in these records.  By default, each document is written to a daily elasticsearch index chosen by that date, not by the time at which logstash processed the event.
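To work out which index a given record should land in, you can compute the name from its timestamp. This sketch assumes that you have not changed the elasticsearch output’s default index setting, which I understand to be "logstash-%{+YYYY.MM.dd}" evaluated in UTC; if you’ve set your own "index" option, adjust the prefix and format to match. The function name is hypothetical.

```python
from datetime import datetime, timezone


def index_for(timestamp, prefix="logstash-"):
    """Return the daily index name for an ISO 8601 @timestamp value."""
    # Older Pythons reject a trailing "Z" in fromisoformat(), so spell out UTC.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # The index date is taken from the UTC date, not the local one.
    return prefix + parsed.astimezone(timezone.utc).strftime("%Y.%m.%d")
```

Note that an event logged late in the evening in a timezone behind UTC ends up in the next day’s index, which is a common reason for looking in the wrong place.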

Don’t forget to disable the extra “output” section, or you’ll run out of disk space pretty quickly!

Logstash also has “--debug” and “--verbose” command-line options that you can enable in your startup script, e.g. /etc/init.d/logstash.

Elasticsearch

If you can ship logs to logstash and logstash can see them, then logstash should be sending them to elasticsearch.  Check to see that the total document count on your server is increasing:

curl -s "localhost:9200/_nodes/stats?pretty"

Then examine the beginning of the output:

{
 "cluster_name" : "my_cluster",
 "nodes" : {
   "my_node" : {
     "indices" : {
       "docs" : {
         "count" : 123456789,
         "deleted" : 0
       },

Run this a few times, several seconds apart.  If logs are arriving, the count should increase from one run to the next.
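The comparison can also be automated. This sketch fetches the same _nodes/stats endpoint twice and compares the totals; it assumes elasticsearch is listening on localhost:9200 as in the curl command, and that the response has the nodes → indices → docs → count layout shown in the sample output. The function names and the ten-second wait are my own choices.

```python
import json
import time
import urllib.request


def total_docs(stats):
    """Sum indices.docs.count over every node in a parsed _nodes/stats response."""
    return sum(node["indices"]["docs"]["count"] for node in stats["nodes"].values())


def fetch_stats(base_url="http://localhost:9200"):
    """Fetch and parse the node statistics from elasticsearch."""
    with urllib.request.urlopen(base_url + "/_nodes/stats", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def docs_increasing(base_url="http://localhost:9200", wait_seconds=10):
    """Return True if the total document count grew during the wait."""
    before = total_docs(fetch_stats(base_url))
    time.sleep(wait_seconds)
    after = total_docs(fetch_stats(base_url))
    print(f"before={before} after={after} delta={after - before}")
    return after > before
```

On a quiet system, ten seconds may not be long enough for a new log line to arrive, so generate some traffic or lengthen the wait before concluding that nothing is being indexed.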

If the document count is not increasing, check the elasticsearch log file, typically in /var/log/elasticsearch/elasticsearch.log

Kibana

If documents are being written to elasticsearch, but you can’t find them in kibana, there are a few things to check:

First, is the default index for your dashboard correct?  In Kibana 3, click the “gear” in the top-right corner, switch to the “Index” tab, and confirm the setting:

[Screenshot: the “Index” tab of the Kibana 3 dashboard settings]

Second, make sure that your kibana date range covers the “@timestamp” values of the documents being added to the index.  If the timestamp is being overwritten (using logstash’s date filter), the documents carry the time of the original log event, which may be well in the past.  If it is not being overwritten, they carry the time at which logstash processed them, and should show up at the current time.
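If you’re not sure where in time your documents have landed, you can ask elasticsearch for the newest one and set the kibana range around it. This is a sketch under several assumptions: elasticsearch is on localhost:9200, your indexes match the pattern "logstash-*", and “@timestamp” is mapped as a date so that it can be sorted on. The function names are hypothetical.

```python
import json
import urllib.request


def newest_timestamp_query():
    """Build a search body that returns only the most recent document."""
    return {"size": 1, "sort": [{"@timestamp": {"order": "desc"}}]}


def extract_timestamp(response):
    """Pull @timestamp out of the first hit, or return None if there are no hits."""
    hits = response.get("hits", {}).get("hits", [])
    return hits[0]["_source"].get("@timestamp") if hits else None


def newest_timestamp(base_url="http://localhost:9200", index="logstash-*"):
    """Query elasticsearch for the @timestamp of the newest indexed document."""
    body = json.dumps(newest_timestamp_query()).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=body,  # supplying a body makes urllib send a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return extract_timestamp(json.loads(resp.read().decode("utf-8")))
```

Changing "desc" to "asc" in the query gives the oldest document instead, and the two together bound the date range that kibana needs to cover.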