Don’t forget to check out the Logstash best practices, too.
Memory
Give Elasticsearch half of your system’s RAM for its heap, up to 32GB; above roughly 32GB the JVM can no longer use compressed object pointers, so bigger heaps can actually hurt.
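The sizing rule above can be sketched as a tiny helper (`recommended_heap_gb` is an illustrative name, not anything Elasticsearch provides):

```python
def recommended_heap_gb(system_ram_gb):
    """Half of system RAM, capped at the 32GB compressed-pointers limit."""
    return min(system_ram_gb // 2, 32)

print(recommended_heap_gb(16))   # 8GB heap on a 16GB machine
print(recommended_heap_gb(128))  # capped at 32 even with plenty of RAM
```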
Make sure the allocated memory doesn’t get swapped out by using mlockall. In your config/elasticsearch.yml, add:
bootstrap.mlockall: true
You may need to allow this at startup by raising the locked-memory limit:
ulimit -l unlimited
On (at least) CentOS 6, you can have this run for you in the init.d script by adding this line to /etc/sysconfig/elasticsearch:
MAX_LOCKED_MEMORY=unlimited
For CentOS 7, add this line to the [Service] section of /usr/lib/systemd/system/elasticsearch.service:
LimitMEMLOCK=infinity
After restarting, confirm the setting took effect by checking that each node reports "mlockall" : true:
curl http://localhost:9200/_nodes/process?pretty
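As a sketch, the check can be automated by parsing that response. The JSON below is an abbreviated, hypothetical example of what `_nodes/process` returns; the exact structure varies by Elasticsearch version:

```python
import json

# Abbreviated, hypothetical _nodes/process response for illustration.
sample = """
{
  "nodes": {
    "abc123": { "name": "node-1", "process": { "mlockall": true } },
    "def456": { "name": "node-2", "process": { "mlockall": true } }
  }
}
"""

nodes = json.loads(sample)["nodes"]
for node in nodes.values():
    # Every node should report mlockall: true once the limits are in place.
    print(node["name"], node["process"]["mlockall"])
```

If any node prints False, revisit the ulimit / MAX_LOCKED_MEMORY / LimitMEMLOCK settings on that machine.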
Index Names
Use an index for each day. There are only two ways to delete data in Elasticsearch: deleting individual documents, which merely marks them deleted until segments merge, and dropping an entire index, which is cheap and immediate. Running curator against daily indexes uses the second, and is the right one.
Note that daily indexes are the default from Logstash.
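To illustrate why daily indexes make retention trivial, here is a sketch that picks which logstash-YYYY.MM.DD indexes have aged past a retention window (plain Python, not the curator API; the helper name is illustrative):

```python
from datetime import datetime, timedelta

def indexes_to_delete(index_names, today, keep_days=30):
    """Return the daily logstash-YYYY.MM.DD indexes older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    old = []
    for name in index_names:
        try:
            day = datetime.strptime(name, "logstash-%Y.%m.%d")
        except ValueError:
            continue  # skip indexes that aren't daily logstash indexes
        if day < cutoff:
            old.append(name)
    return old

names = ["logstash-2015.01.01", "logstash-2015.03.01", ".kibana"]
print(indexes_to_delete(names, datetime(2015, 3, 15)))
# → ['logstash-2015.01.01']
```

Each index to expire is then dropped whole, which is exactly the cheap deletion path curator takes.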
Run an odd number of nodes
An odd number of master-eligible nodes lets a strict majority (quorum) form on only one side of a network partition, preventing the split-brain problem; set discovery.zen.minimum_master_nodes to (number of master-eligible nodes / 2) + 1.
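Split-brain prevention comes down to quorum arithmetic: only a strict majority of master-eligible nodes may elect a master, and with an odd node count at most one side of a partition can hold that majority. A sketch (helper name is illustrative):

```python
def minimum_master_nodes(master_eligible):
    """Quorum: a strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1

for n in (2, 3, 4, 5):
    print(n, "nodes -> quorum of", minimum_master_nodes(n))
```

Note that an even count buys no extra failure tolerance: 4 nodes need a quorum of 3, so they tolerate only one failure, the same as 3 nodes.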
Run at least three nodes
With one replica (two copies of each shard), three nodes let Elasticsearch spread primaries and replicas across all three machines, so searches fan out over more disks for an I/O boost, and the cluster survives the loss of any single node.
Adjust the Mapping
Elasticsearch supports many different field types, and you should use the appropriate one for each field.
By mapping a field like http_status as an integer, you can use numeric comparisons (“http_status:>500”) or ranges (“http_status:[400 TO 499]”). Other field types give similar benefits over treating everything as strings.
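A sketch of what such a mapping might look like, built as a Python dict; the field names, the mapping type name, and the exact syntax are illustrative and vary across Elasticsearch versions:

```python
import json

# Hypothetical mapping fragment: store http_status as an integer so that
# range queries like http_status:[400 TO 499] compare numerically.
mapping = {
    "mappings": {
        "logs": {  # mapping type name; illustrative
            "properties": {
                "http_status": {"type": "integer"},
                "request_time": {"type": "float"},
                "clientip": {"type": "ip"},
            }
        }
    }
}

# This body would be applied at index-creation time (or via an index
# template so every new daily index picks it up automatically).
print(json.dumps(mapping, indent=2))
```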