RavenDB Sharding – Adding a new shard to an existing cluster, the easy way
Continuing the theme of giving full answers on the blog to interesting questions from the mailing list, we have the following issue.
We have a sharded cluster, and we want to add a new node to it. How do we do it? I'll discuss a few ways in which you can handle this scenario. But first, let us lay out the actual scenario.
We'll use the Customers & Invoices system, and we put the data in the various shards according to the following scheme:
| Customers | Sharded by region |
| Invoices | Sharded by customer |
| Users | Shared database (not sharded) |
We can configure this using the following:
var shards = new Dictionary<string, IDocumentStore>
{
    {"Shared", new DocumentStore {Url = "http://rvn1:8080", DefaultDatabase = "Shared"}},
    {"EU", new DocumentStore {Url = "http://rvn2:8080", DefaultDatabase = "Europe"}},
    {"NA", new DocumentStore {Url = "http://rvn3:8080", DefaultDatabase = "NorthAmerica"}},
};
ShardStrategy shardStrategy = new ShardStrategy(shards)
    .ShardingOn<Company>(company => company.Region, region =>
    {
        switch (region)
        {
            case "USA":
            case "Canada":
                return "NA";
            case "UK":
            case "France":
                return "EU";
            default:
                return "Shared";
        }
    })
    .ShardingOn<Invoice>(invoice => invoice.CompanyId)
    .ShardingOn<User>(user => "Shared");
So far, so good. Now, we have so much work that we can't just have two servers for customers & invoices, we need more. We change the sharding configuration to include 2 new servers, and we get:
var shards = new Dictionary<string, IDocumentStore>
{
    {"Shared", new DocumentStore {Url = "http://rvn1:8080", DefaultDatabase = "Shared"}},
    {"EU", new DocumentStore {Url = "http://rvn2:8080", DefaultDatabase = "Europe"}},
    {"NA", new DocumentStore {Url = "http://rvn3:8080", DefaultDatabase = "NorthAmerica"}},
    {"EU2", new DocumentStore {Url = "http://rvn4:8080", DefaultDatabase = "Europe-2"}},
    {"NA2", new DocumentStore {Url = "http://rvn5:8080", DefaultDatabase = "NorthAmerica-2"}},
};
var shardStrategy = new ShardStrategy(shards);
shardStrategy.ShardResolutionStrategy = new NewServerBiasedShardResolutionStrategy(shards.Keys, shardStrategy)
    .ShardingOn<Company>(company => company.Region, region =>
    {
        switch (region)
        {
            case "USA":
            case "Canada":
                return "NA";
            case "UK":
            case "France":
                return "EU";
            default:
                return "Shared";
        }
    })
    .ShardingOn<Invoice>(invoice => invoice.CompanyId)
    .ShardingOn<User>(user => user.Id, userId => "Shared");
Note that we have a new shard resolution strategy. What is that? This is how we control the lower-level details of the sharding behavior. In this case, we want to control where we'll write new documents.
public class NewServerBiasedShardResolutionStrategy : DefaultShardResolutionStrategy
{
    public NewServerBiasedShardResolutionStrategy(IEnumerable<string> shardIds, ShardStrategy shardStrategy)
        : base(shardIds, shardStrategy)
    {
    }

    public override string GenerateShardIdFor(object entity, ITransactionalDocumentSession sessionMetadata)
    {
        var generatedShardId = base.GenerateShardIdFor(entity, sessionMetadata);
        // Only the regional shards ("NA", "EU") have a "2" counterpart;
        // there is no "Shared2" in the shard dictionary.
        if (entity is Company && generatedShardId != "Shared")
        {
            // Before the cutoff date, always use the new shard; after it,
            // use the new shard for roughly two out of every three companies.
            if (DateTime.Today < new DateTime(2015, 8, 1) ||
                DateTime.Now.Ticks % 3 != 0)
            {
                return generatedShardId + "2";
            }
        }
        return generatedShardId;
    }
}
What is the logic? When we store a new company, the GenerateShardIdFor(entity) method is called. For the next 3 months (until the August 1, 2015 cutoff in the code), we'll create all new companies (and, as a result, their invoices) on the new servers. After the 3 months have passed, we'll still create new companies on the new servers two thirds of the time, versus one third of the time on the old servers.
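To make the two phases easier to see (and to test), here is a minimal sketch of the same decision pulled out into a pure function. The NewShardBias class, the Pick method and its parameters are hypothetical names used for illustration only; they are not part of the RavenDB API, and the clock is passed in rather than read from DateTime.Now.

```csharp
using System;

public static class NewShardBias
{
    // Hypothetical helper, not part of RavenDB: decides whether a new company
    // goes to the old shard (e.g. "NA") or its new counterpart ("NA2").
    public static string Pick(string baseShardId, DateTime now, DateTime cutoff)
    {
        // Phase 1: until the cutoff date, every new company goes to the new shard.
        if (now.Date < cutoff)
            return baseShardId + "2";

        // Phase 2: two out of every three tick values are not divisible by 3,
        // so roughly two thirds of new companies still land on the new shard.
        return now.Ticks % 3 != 0 ? baseShardId + "2" : baseShardId;
    }
}
```

Calling NewShardBias.Pick("NA", DateTime.Now, new DateTime(2015, 8, 1)) would mirror the check inside GenerateShardIdFor, assuming the base strategy returned "NA".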
Note that in this case, we didn't have to modify any data whatsoever. And over time, the data will balance itself out between the servers. In my next post, we'll deal with how we actually split an existing shard into multiple shards.
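As a rough illustration of "over time", the sketch below estimates how long one old/new shard pair takes to reach the same number of companies. It rests on simplifying assumptions that are not in the original scenario: new companies arrive at a constant monthly rate, nothing is deleted, and all documents are about the same size. ShardBalanceEstimate and MonthsUntilBalanced are hypothetical names, not RavenDB APIs.

```csharp
using System;

public static class ShardBalanceEstimate
{
    // Hypothetical back-of-the-envelope helper; not part of RavenDB.
    // existingDocs:    companies already on the old shard when the new one is added.
    // newDocsPerMonth: new companies per month for that region (assumed constant).
    // biasMonths:      initial window in which every new company goes to the new shard.
    public static double MonthsUntilBalanced(
        double existingDocs, double newDocsPerMonth, double biasMonths = 3)
    {
        if (newDocsPerMonth <= 0)
            throw new ArgumentOutOfRangeException(nameof(newDocsPerMonth));

        double docsWrittenInWindow = newDocsPerMonth * biasMonths;

        // The new shard catches up while it still receives every new company.
        if (existingDocs <= docsWrittenInWindow)
            return existingDocs / newDocsPerMonth;

        // After the window the new shard gains 2/3 of the rate and the old one 1/3,
        // so the gap closes at 1/3 of the monthly rate.
        double remainingGap = existingDocs - docsWrittenInWindow;
        return biasMonths + 3.0 * remainingGap / newDocsPerMonth;
    }
}
```

For example, with 600 existing companies and 100 new ones per month, the estimate is 12 months. Under the same assumptions, the new shard keeps receiving two thirds of new companies after that point, so it would start to pull ahead unless the bias is adjusted.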
| Reference: | RavenDB Sharding – Adding a new shard to an existing cluster, the easy way from our NCG partner Oren Eini at the Ayende @ Rahien blog. |


