Replacing a node in a Riak Cluster
The instances that run in my infrastructure get a lifespan of 14 days. This allows me to continually test that I can replace my environment at any point. People always ask me if I follow the same principle for data nodes. I posted previously about replacing nodes in an ElasticSearch cluster; this post details how I replace nodes in a Riak cluster.
NOTE: This post assumes that you have the Riak Control console enabled for Riak. You can find out how to enable that in the post I wrote on configuring Riak.
Riak Control provides the following screens:

* Cluster Health
* Ring Status
* Cluster Management
* Node Management

###Removing a node from the Cluster
To remove a node from the cluster, go to the Cluster Management screen. Find the node you want to replace in the list and click on the Actions toggle. It will reveal actions as follows:
As the node is currently running, I tend to choose the "Allow this node to leave normally" option (if the node had died or was unresponsive, I would usually choose the "force this node to leave" option). Clicking on the Stage button details a plan of what is going to happen:
If the proposed changes look good, click Commit to apply the plan, then watch the partitions drain from the node to be replaced:
When all the partitions have drained, we have a 2 node cluster where the partitions are split 50:50:
We can now destroy the node and let the autoscaling group launch another to replace it.
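
The same stage, plan and commit flow can also be driven from the command line with `riak-admin`. The following is a minimal sketch rather than what was used for this post: it assumes `riak-admin` is on the PATH of a cluster member, and the helper names `leave_commands` and `run` and the node name `riak@10.0.0.12` are hypothetical.

```python
import subprocess


def leave_commands(node, force=False):
    """Build the riak-admin calls that stage, review and commit a node removal."""
    # "leave" lets a running node hand off its partitions before it goes;
    # "force-remove" is the equivalent for a node that has died or is unresponsive.
    action = "force-remove" if force else "leave"
    return [
        ["riak-admin", "cluster", action, node],  # stage the change
        ["riak-admin", "cluster", "plan"],        # show the staged plan
        ["riak-admin", "cluster", "commit"],      # apply the plan
    ]


def run(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical node name. The commands are printed rather than executed
    # so the plan can be reviewed by hand before anything is committed.
    for cmd in leave_commands("riak@10.0.0.12"):
        print(" ".join(cmd))
```

Printing the commands first mirrors the Stage button: nothing changes on the cluster until the commit step is actually run.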
###Adding a new node to the Cluster
This assumes a new node has already been launched and is ready to go into the cluster. Go to the cluster management page in the portal and enter the new node's details. The node name should follow the format riak@<ipaddress>.
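
If the new node's address comes from automation, it can help to build and validate the node name before entering it. This is a small sketch under that assumption; the function name `riak_node_name` and the example address are hypothetical, and the `riak` prefix simply follows the format above.

```python
import ipaddress


def riak_node_name(ip, prefix="riak"):
    """Return a node name of the form riak@<ipaddress>.

    Raises ValueError if `ip` is not a valid IPv4 or IPv6 address.
    """
    address = ipaddress.ip_address(ip.strip())
    return f"{prefix}@{address}"


if __name__ == "__main__":
    # Hypothetical private address of the freshly launched instance.
    print(riak_node_name("10.0.0.15"))
```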
The list of actions that are pending on the cluster:
Commit the changes and watch the partitions rebalance across the cluster:
The cluster will return to being 3 nodes with an equal partition split, and will then show as green again.
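
For completeness, the join can also be sketched with `riak-admin` instead of Riak Control. One difference to be aware of: on the command line, `riak-admin cluster join` is run on the new node and is given the name of a node that is already in the cluster. The helper name `join_commands` and the node name below are hypothetical, and the sketch assumes `riak-admin` is on the new node's PATH.

```python
def join_commands(existing_member):
    """Build the riak-admin calls that stage, review and commit a join.

    These are meant to be run on the NEW node; `existing_member` is the
    name of a node that is already part of the cluster.
    """
    return [
        ["riak-admin", "cluster", "join", existing_member],  # stage the join
        ["riak-admin", "cluster", "plan"],                   # show the staged plan
        ["riak-admin", "cluster", "commit"],                 # apply the plan
        ["riak-admin", "member-status"],                     # check ring ownership per node
    ]


if __name__ == "__main__":
    # Hypothetical name of a node already in the cluster.
    for cmd in join_commands("riak@10.0.0.10"):
        print(" ".join(cmd))
```

Running `riak-admin member-status` again once the transfers finish is one way to confirm that the partitions are split evenly across the 3 nodes.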