{"id":188,"date":"2016-06-06T19:23:01","date_gmt":"2016-06-06T19:23:01","guid":{"rendered":"http:\/\/svops.com\/blog\/?p=188"},"modified":"2019-11-20T20:24:51","modified_gmt":"2019-11-20T20:24:51","slug":"elasticsearch-disk-space-calculations","status":"publish","type":"post","link":"http:\/\/svops.com\/blog\/elasticsearch-disk-space-calculations\/","title":{"rendered":"Elasticsearch disk space calculations"},"content":{"rendered":"<p>Each node provides storage capacity to your cluster. \u00a0Elasticsearch will stop indexing if the nodes start to fill up. \u00a0This is controlled with the\u00a0cluster.routing.allocation.disk.watermark.low parameter. \u00a0By default, no new shards will be allocated when a machine goes above 85% disk space.<\/p>\n<p>Clearly you must manage the disk space when all of your nodes are running, but what happens when a node fails?<\/p>\n<p>Let&#8217;s look at a three-node cluster, setup with three shards and one replica, so data is evenly spread out across the cluster:<\/p>\n<p style=\"padding-left: 30px;\"><a href=\"http:\/\/svops.com\/blog\/wp-content\/uploads\/2016\/06\/Untitled.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-189\" src=\"http:\/\/svops.com\/blog\/wp-content\/uploads\/2016\/06\/Untitled-191x300.jpg\" alt=\"Untitled\" width=\"150\" height=\"236\" srcset=\"http:\/\/svops.com\/blog\/wp-content\/uploads\/2016\/06\/Untitled-191x300.jpg 191w, http:\/\/svops.com\/blog\/wp-content\/uploads\/2016\/06\/Untitled.jpg 459w\" sizes=\"(max-width: 150px) 100vw, 150px\" \/><\/a><\/p>\n<p>If\u00a0each node has 1TB of disk space for data, they would hit the per-node 85% limit at 850GB. \u00a0If one node failed, the 6 total shards would need to be distributed across two nodes. \u00a0 In our example, if we lost node #1, the primary for shard 1 and the replica for shard\u00a03 would be lost. 
\u00a0The replica for shard\u00a01 that is on node #2 would be promoted to primary, but we would then have no replica for either shard 1 or 3. \u00a0Elasticsearch would try to rebuild the replicas on the remaining hosts:<\/p>\n<p style=\"padding-left: 30px;\"><a href=\"http:\/\/svops.com\/blog\/wp-content\/uploads\/2016\/06\/Untitled-1.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-190\" src=\"http:\/\/svops.com\/blog\/wp-content\/uploads\/2016\/06\/Untitled-1-300x243.jpg\" alt=\"Untitled\" width=\"200\" height=\"162\" srcset=\"http:\/\/svops.com\/blog\/wp-content\/uploads\/2016\/06\/Untitled-1-300x243.jpg 300w, http:\/\/svops.com\/blog\/wp-content\/uploads\/2016\/06\/Untitled-1.jpg 607w\" sizes=\"(max-width: 200px) 100vw, 200px\" \/><\/a><\/p>\n<p>This is good on paper, except\u00a0each of the remaining two nodes would need to absorb up to 425GB (half of the failed node&#8217;s 850GB). \u00a0The remaining nodes would then be full, and no new shards could be allocated.<\/p>\n<p>To plan for a node outage, you need to have enough free disk space on each node to reallocate the primary and replica data from the dead node.<\/p>\n<p>This formula will yield the maximum amount of data a node can safely hold:<\/p>\n<pre>(disk per node * .85) * ((node count - 1) \/ node count)<\/pre>\n<p>In my example, we would get:<\/p>\n<pre>( 1TB * .85 ) * ( 2 \/ 3 ) = 566GB<\/pre>\n<p>If your three nodes contained 566GB of data each and one node failed, 283GB of data would be rebuilt on each of the remaining two nodes, putting them at 849GB of used space. \u00a0This is just below the 85% limit of 850GB.<\/p>\n<p>I would pad the number a little and limit the disk space used to 550GB for each node, or 1.65TB of data total across the 3-node cluster. \u00a0This number plays a part in your data retention policy and cluster sizing strategies.<\/p>\n<p>If 1.65TB is too low, you either need to add more space to each node or add more nodes to the cluster. 
\u00a0If you added a 4th similarly-sized node, you&#8217;d get<\/p>\n<pre>( 1TB * .85 ) * ( 3 \/ 4 ) = 637GB<\/pre>\n<p>which would allow roughly 2.5TB of storage across the entire cluster.<\/p>\n<p>The formula shown is based on one replica shard. \u00a0If you had configured your cluster with more replicas (to survive the outage of more than one node), note that the formula is really:<\/p>\n<pre>(space per node * .85) * ((node count - replica count) \/ node count)<\/pre>\n<p>If we had two replicas in our example, we&#8217;d get:<\/p>\n<pre>( 1TB * .85 ) * ( 1 \/ 3 ) = 283GB<\/pre>\n<p>So you would only allow 283GB of data per node if you wanted to survive a 2-node outage in a 3-node cluster.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Each node provides storage capacity to your cluster. \u00a0Elasticsearch will stop indexing if the nodes start to fill up. \u00a0This is controlled with the\u00a0cluster.routing.allocation.disk.watermark.low parameter. \u00a0By default, no new shards will be allocated when a machine goes above 85% disk &hellip; <a href=\"http:\/\/svops.com\/blog\/elasticsearch-disk-space-calculations\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts\/188"}],"collection":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/comments?post=188"}],"version-history":[{"count":5,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts\/188\/revisions"}],"predecessor-version":[{"id":269,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/posts\/188\/revisions\/269"}],"wp:attachment":[{"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/media?parent=188"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/categories?post=188"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/svops.com\/blog\/wp-json\/wp\/v2\/tags?post=188"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}