Elasticsearch best practices

Don’t forget to check out the Logstash best practices, too.

Memory

Give elasticsearch half of your system’s RAM, up to 32GB.

Make sure the allocated memory doesn’t get swapped out by using mlockall.  In your config/elasticsearch.yml, add:

bootstrap.mlockall: true

You may need to allow this as part of the startup by running

ulimit -l unlimited

On (at least) centos6, you can have this run for you in the init.d script by adding this line to /etc/sysconfig/elasticsearch:

MAX_LOCKED_MEMORY=unlimited

For centos7, edit /usr/lib/systemd/system/elasticsearch.service:

LimitMEMLOCK=infinity

After restarting, confirm the setting is correct in elasticsearch:

curl http://localhost:9200/_nodes/process?pretty

Index Names

Use an index for each day.  There are only two ways to delete data in elasticsearch, and using curator against daily indexes is the right one.

Note that this is the default from logstash.

Run an odd number of nodes

This will prevent the split-brain problem.

Run at least three nodes

With one replica (two copies), using three nodes will give you an I/O boost.

Adjust the Mapping

Elasticsearch supports many different field types, and you should use the appropriate one for each field.

By using ‘int’, you can use comparisons (“http_status:>500”) or ranges (“http_status:[400 TO 499]”).  Other field types give similar benefits of just using strings.

Leave a Reply

Your email address will not be published. Required fields are marked *