There seem to be a lot of old ELK guides on the internet. It’s time to make a new one that can begin its own eventual decay.
To help minimize the aging process, I’m not going to cover how to install specific packages on specific platforms, but rather discuss the choice of tools and configurations that are available.
You have to get your log files off the remote machines and eventually into Elasticsearch. It’s the shipper’s job to, um, ship the logs to the next step.
There are a few shippers, some of which are outlined here.
tl;dr: use filebeat if you're only moving logs around.
You can use the full logstash build as your shipper. There are almost no reasons to do this. Being JVM-based, it’s big and takes memory. It also has more features than you’ll typically need for a shipper.
The only excuse to run logstash as a remote shipper is if you have a ton of logs and need to apply some business logic about which ones to ship. For example, maybe DEBUG logging is enabled in your production environment (?); you could use a full logstash to only ship the more important levels for processing.
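To make the "only ship the important levels" idea concrete, here's a minimal sketch in Python rather than real logstash config. The level names, their ranking, and the `should_ship` and `filter_lines` helpers are all assumptions made up for illustration; your log format will differ.

```python
import re

# Hypothetical severity ranking; adjust to whatever levels your logs use.
LEVELS = {"DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40, "FATAL": 50}
LEVEL_RE = re.compile(r"\b(DEBUG|INFO|WARN|ERROR|FATAL)\b")


def should_ship(line, minimum="INFO"):
    """Return True if the line's level is at or above `minimum`.

    Lines with no recognizable level are shipped, on the theory that
    dropping unknown data is worse than shipping a little extra.
    """
    match = LEVEL_RE.search(line)
    if match is None:
        return True
    return LEVELS[match.group(1)] >= LEVELS[minimum]


def filter_lines(lines, minimum="INFO"):
    """Keep only the lines worth sending on for processing."""
    return [line for line in lines if should_ship(line, minimum)]
```

In a real deployment you'd express the same test as a conditional around logstash's `drop` filter; the point here is only that the decision happens on the remote machine, before anything crosses the network.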
NOTE: logstash-forwarder is dead. See filebeat, below.
logstash-forwarder was the right choice before filebeat replaced it. It’s a lightweight program that does nothing other than read log files and send them to logstash. Traffic is encrypted with SSL, so certs are required.
logstash-forwarder speaks the “lumberjack” protocol with logstash.
Filebeat is the replacement for logstash-forwarder. It’s also lightweight, gives you the option of not using encryption, and its developers are planning to add some nice client-side features (multiline and a basic ‘grep’).
Filebeat requires logstash 1.5+.
If you need a broker (see below), then beaver is a lightweight shipper that can talk to redis. It doesn’t encrypt its traffic.
Many guides describe the use of a broker like redis or rabbitmq between the shipper and logstash.
If you’re using logstash and/or logstash-forwarder as your shipper, you don’t need a broker. Both of these packages keep track of where they are in the local files, and should recover from a logstash outage. (If the outage lasts through a file rotation, this may not be true!)
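The "keep track of where they are" part is what makes a broker unnecessary, so here's a rough sketch of the idea in Python. This is not how logstash or logstash-forwarder actually implement it; the `read_new_lines` function and the JSON state file are hypothetical, and a real shipper would only save the offset once the receiving end had acknowledged the lines.

```python
import json
import os


def read_new_lines(log_path, state_path):
    """Return complete lines appended to log_path since the last call."""
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            offset = json.load(f).get("offset", 0)

    # If the file is now shorter than our saved offset, assume it was
    # truncated or rotated and start over from the beginning.
    if os.path.getsize(log_path) < offset:
        offset = 0

    lines = []
    with open(log_path, "rb") as f:
        f.seek(offset)
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial line; pick it up on the next call
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\n"))
            offset += len(raw)

    # Remember how far we got so a restart resumes from here.
    with open(state_path, "w") as f:
        json.dump({"offset": offset}, f)
    return lines
```

Notice that the rotation check is just a size comparison. If the file is rotated and the new one grows past the saved offset before the shipper looks again, this sketch will skip data, which is the same flavor of problem as the rotation caveat above.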
I only like to use brokers when shipping logs from systems that don’t automatically handle logstash failures (e.g. syslog, netflow). This covers for unplanned outages, and also lets you release changes to logstash without losing data.
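If it helps to see why a broker covers for outages, here's a toy model in Python. The `Broker` class is an in-memory stand-in I made up for a redis list or rabbitmq queue, and the `indexer_up` and `handle` callables are hypothetical placeholders for "is logstash running?" and "hand this event to logstash".

```python
from collections import deque


class Broker:
    """In-memory stand-in for a redis list or rabbitmq queue."""

    def __init__(self):
        self._queue = deque()

    def push(self, event):
        """Senders such as syslog fire and forget; the broker holds the event."""
        self._queue.append(event)

    def drain(self, indexer_up, handle):
        """Deliver queued events in order for as long as the indexer is up."""
        delivered = 0
        while self._queue and indexer_up():
            handle(self._queue.popleft())
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._queue)
```

While logstash is down (or being redeployed), `push` keeps succeeding and the queue just grows; once it's back, `drain` catches up in order. A real broker adds persistence and memory limits, which this sketch ignores.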
Elasticsearch is the storage part of the whole equation.
See our best practices.