Monthly Archives: August 2016

Elasticsearch configuration options

This is an attempt at a complete listing of elasticsearch config variables since they’re located all over the elastic.co website.

The list is not complete, and will start to “rot” as soon as it’s published, but… If you know of some variables that aren’t listed, please let me know.

Note that static settings must be set in the config file on every machine in the cluster.

 

Bootstrap

Name Type Notes Doc
bootstrap.mlockall Static Link

Cloud

Name Type Notes Doc
cloud.aws.access_key Unknown Link
cloud.aws.ec2 Unknown Link
cloud.aws.protocol Unknown Link
cloud.aws.proxy Unknown Link
cloud.aws.region Unknown Link
cloud.aws.s3 Unknown Link
cloud.aws.secret_key Unknown Link
cloud.aws.signer Unknown Link

Cluster

Name Type Notes Doc
cluster.blocks.read_only Dynamic Link
cluster.info.update.interval Dynamic Link
cluster.name Unknown Link
cluster.routing.allocation.allow_rebalance Dynamic Link
cluster.routing.allocation.awareness.attributes Dynamic Link
cluster.routing.allocation.awareness.force.zone.values Dynamic Link
cluster.routing.allocation.balance.shard Unknown Link
cluster.routing.allocation.balance.index Unknown Link
cluster.routing.allocation.balance.threshold Unknown Link
cluster.routing.allocation.cluster_concurrent_rebalance Dynamic Link
cluster.routing.allocation.disk.include_relocations Dynamic Link
cluster.routing.allocation.disk.threshold_enabled Dynamic Link
cluster.routing.allocation.disk.watermark.low Dynamic Link
cluster.routing.allocation.disk.watermark.high Dynamic Link
cluster.routing.allocation.enable Dynamic Link
cluster.routing.allocation.exclude Dynamic Link
cluster.routing.allocation.include Dynamic Link
cluster.routing.allocation.node_concurrent_recoveries Dynamic Link
cluster.routing.allocation.node_initial_primaries_recoveries Dynamic Link
cluster.routing.allocation.require Dynamic Link
cluster.routing.allocation.same_shard.host Dynamic Link
cluster.routing.allocation.total_shards_per_node Dynamic Link
cluster.routing.rebalance.enable Dynamic Link

Discovery

ec2 discovery can also have: groups, host_type, availability_zones, any_group, ping_timeout, and node_cache_time. Use this inside discovery.ec2, e.g. discover.ec2.groups.

Name Type Notes Doc
discovery.type Dynamic Link
discovery.zen.minimum_master_nodes Dynamic Link
discovery.zen.ping.multicast.enabled Unknown Removed in ES 2.2 Link
discovery.zen.ping.unicast.hosts Unknown Link

Gateway

Name Type Notes Doc
gateway.expected_nodes Unknw Link
gateway.expected_master_nodes Static Link
gateway.expected_data_nodes Static Link
gateway.recover_after_time Static Link
gateway.recover_after_nodes Static Link
gateway.recover_after_master_nodes Static Link
gateway.recover_after_data_nodes Static Link

HTTP

Name Type Notes Doc
http.port Static Link
http.publish_port Static Link
http.bind_host Static Link
http.publish_host Static Link
http.host Static Link
http.max_content_length Static Link
http.max_initial_line_length Static Link
http.max_header_size Static Link
http.compression Static Link
http.compression_level Static Link
http.cors.enabled Static Link
http.cors.allow-origin Static Link
http.cors.max-age Static Link
http.cors.allow-methods Static Link
http.cors.allow-headers Static Link
http.cors.allow-credentials Static Link
http.detailed_errors.enabled Static Link
http.pipelining Static Link
http.pipelining.max_events Static Link

Index

Name Type Notes Doc
index.analysis.analyzer Static Link
index.analysis.filter Static Link
index.analysis.tokenizer Static Link
index.auto_expand_replicas Dynamic Link
index.blocks.metadata Dynamic Link
index.blocks.read Dynamic Link
index.blocks.read_only Dynamic Link
index.blocks.write Dynamic Link
index.codec Static Link
index.gateway.local.sync Unknown Renamed to index.translog.sync_interval in ES 2.0 Link
index.max_result_window Dynamic Link
index.merge.policy.calibrate_size_by_deletes Unknown Removed in ES 2.0 Link
index.merge.policy.expunge_deletes_allowed Unknown Removed in ES 2.0 Link
index.merge.policy.max_merge_docs Unknown Removed in ES 2.0 Link
index.merge.policy.max_merge_size Unknown Removed in ES 2.0 Link
index.merge.policy.merge_factor Unknown Removed in ES 2.0 Link
index.merge.policy.min_merge_docs Unknown Removed in ES 2.0 Link
index.merge.policy.min_merge_size Unknown Removed in ES 2.0 Link
index.merge.policy.type Unknown Removed in ES 2.0 Link
index.merge.scheduler.max_thread_count Dynamic Link
index.number_of_replicas Dynamic Link
index.number_of_shards Static Link
index.recovery.initial_shards Dynamic Link
index.refresh_interval Dynamic Requires units in ES 2.0 Link
index.routing.allocation.exclude Dynamic Link
index.routing.allocation.include Dynamic Link
index.routing.allocation.require Dynamic Link
index.routing.allocation.total_shards_per_node Dynamic Link
index.search.slowlog.threshold Dynamic Link
index.shard.check_on_startup Static Link
index.similarity.default.type Static Link
index.store.throttle.type Unknown Removed in Es 2.0 Link
index.store.throttle.max_bytes_per_sec Unknown Removed in Es 2.0 Link
index.store.type Static memory and ram types removed in ES 2.0 Link
index.ttl.disable_purge Dynamic Link
index.translog.durability Dynamic Link
index.translog.fs.type Dynamic Link
index.translog.flush_threshold_ops Dynamic Link
index.translog.flush_threshold_period Dynamic Link
index.translog.flush_threshold_size Dynamic Link
index.translog.interval Dynamic Link
index.translog.sync_interval Static Link
index.unassigned.node_left.delayed_timeout Dynamic Link

Indices

indices.store.throttle.max_bytes_per_sec

Name Type Notes Doc
indices.analysis.hunspell.dictionary.location Unknown Removed in ES 2.0 Link
indices.recovery.concurrent_streams Dynamic Link
indices.recovery.concurrent_small_file_streams Dynamic Link
indices.store.throttle.type Unknown Removed in ES 2.0 Link
indices.store.throttle.max_bytes_per_sec Unknown Removed in ES 2.0 Link

Logger

Name Type Notes Doc
logger.indexes.recovery Dynamic Link
logger.transport.tracer Dynamic Link

Network

Name Type Notes Doc
network.bind_host Unknown Link
network.host Dynamic See special values Link
network.publish_host Unknown Link
network.tcp.no_delay Unknown Link
network.tcp.keep_alive Unknown Link
network.tcp.reuse_address Unknown Link
network.tcp.send_buffer_size Unknown Link
network.tcp.receive_buffer_size Unknown Link

Node

Name Type Notes Doc
node.data Unknown Link
node.enable_custom_paths Unknown Removed in ES 2.0 Link
node.master Unknown Link
node.max_local_storage_nodes Static Link
node.name Unknown Link

Path

Name Type Notes Doc
path.conf Static Link
path.data Static Link
path.home Static Link
path.logs Static Link
path.plugins Static Link
path.repo Static Link
path.scripts Static Link
path.shared_data Unknown Link

Plugin

Name Type Notes Doc
plugin.mandatory Static Link

Resource

Name Type Notes Doc
resource.reload.enabled Unknown
resource.reload.interval Unknown Link
resource.reload.interval.low Unknown
resource.reload.interval.medium Unknown
resource.reload.interval.high Unknown

Repositories

Name Type Notes Doc
repositories.url.allowed_urls Unknown Link

Script

Name Type Notes Doc
script.auto_reload_enabled Static Link
script.default_lang Static Link
script.disable_dynamic Unknown Removed in ES 2.0
script.file Static Link
script.index Static Link
script.inline Static Link
script.update Static Link
script.mapping Static Link
script.engine.expression Static Link
script.engine.groovy Static Link
script.engine.javascript Static Link
script.engine.mustache Static Link
script.engine.python Static Link

Thread Pool

There are several thread pools. Elastic lists the “important” ones as including: generic, index, search, suggest, get, bulk, percolate, snapshot, warmer, refresh, listener. Some settings are documented, and are listed below.

You can also control the number of processors for a thread pool, which is briefly documented here.

Name Type Notes Doc
threadpool.generic.keep_alive Dynamic Link
threadpool.index.queue_size Dynamic Link
threadpool.index.size Dynamic Link

Transport

Transport allows you to bing to multiple ports on different interfaces. See the transport profiles doc for more info.

Name Type Notes Doc
transport.bind_host Unknown Link
transport.host Unknown Link
transport.ping_schedule Unknown Link
transport.publish_host Unknown Link
transport.publish_port Unknown Link
transport.tcp.compress Unknown Link
transport.tcp.connect_time Unknown Link
transport.tcp.port Unknown Link
transport.tracer.exclude Dynamic Link
transport.tracer.include Dynamic Link

Tribe

There are a lot of options for tribes that vary based on the tribe name. Some info is presented here.

Watcher

Name Type Notes Doc
tribe.blocks.metadata Unknown Link
tribe.blocks.metadata.indices Unknown Link
tribe.blocks.write Unknown Link
tribe.blocks.write.indices Unknown Link
tribe.t1.cluster.name Unknown Link
Name Type Notes Doc
watcher.enabled Unknown Renamed in ES 2.0 Link
watcher.interval Unknown Renamed in ES 2.0 Link
watcher.interval.low Unknown Renamed in ES 2.0 Link
watcher.interval.medium Unknown Renamed in ES 2.0 Link
watcher.interval.high Unknown Renamed in ES 2.0 Link

Managing Elastic watches with Ansible

If you’re an Ansible convert, you want every part of your deployment to be managed by the tasks in your roles.  I recently wanted to manage my Elastic Watcher config with Ansible.  It was enough of a struggle (“learning process”) that I wanted to document it here.

While I was developing my first few watches, I made stand-alone shell scripts that could be run to create/update them.  This worked fine, and even handled the basic auth configuration in front of the master nodes.  Basically it was:

curl -u username -XPUT 'http://hostname/_watcher/watch/watchname' -d '{ ... }'

It would prompt for the user’s password and send the information along.

Ansible can run local commands, but doesn’t like interactive commands.  As such, the password prompt would block the task from completing.  Looking around for other solutions led me to the ‘uri’ module.  By default, it won’t prompt for the password either, but you can use the prompts system.  I chose to use the Ansible vault to store the password to avoid external sharing/management of the password.

The final config looks like this:

- set_fact:
 watch: "{{ lookup('file','watcher.json') }}"

- name: Install watcher for cluster health
 uri:
 delegate_to: 127.0.0.1
 url: "http://hostname//_watcher/watch/watchname"
 method: PUT
 user: "ansible"
 password: "{{ ansible_password }}"
 force_basic_auth: yes
 body: "{{ watch }}"
 run_once: true

The set_fact loads the contents of the file into a variable (otherwise, you’d have to list the entire json in uri’s body section).

By using delegate_to, the PUT will be run on the local machine.  Coupled with run_once, it contacts the cluster and sets the config one time.

On my desktop mac, it originally gave me errors about httplib2, which I had to install:

sudo pip install httplib2

As mentioned before, the variable that contains the basic auth password comes from an ansible vault.

One final – yet important – gotcha.  Ansible seemingly wants to process all data as templates.  Unfortunately, watcher uses the same variable syntax as Ansible, so your watcher definition may include lines with watcher variables:

{{ ctx.payload.status }}

Ansible will think these are its own variables, and you’ll get errors when running the playbook:

fatal: [localhost] => One or more undefined variables: 'ctx' is undefined

The solution is to tell Ansible to ignore that section:

{% raw %}{{ ctx.payload.status }}{% endraw %}

This can be a little tedious, but I didn’t find a way to prevent Ansible from interpreting the string.