.NET

RavenDB Sharding – Map/Reduce in a cluster

In my previous post, I introduced RavenDB Sharding and discussed how we can use sharding in RavenDB. We discussed both blind sharding and data driven sharding.

Today I want to introduce another aspect of RavenDB Sharding. The usage of Map/Reduce to gather information from multiple shards.

We start by defining a map/reduce index. In this case, we want to look at the invoice totals per date.

We define the index like this:

public class InvoicesAmountByDate : AbstractIndexCreationTask<Invoice, InvoicesAmountByDate.ReduceResult>
{
    public class ReduceResult
    {
        public decimal Amount { get; set; }
        public DateTime IssuedAt { get; set; }
    }
    public InvoicesAmountByDate()
    {
        Map = invoices =>
              from invoice in invoices
              select new
              {
                  invoice.Amount,
                invoice.IssuedAt
              };
        Reduce = results =>
                 from result in results
                 group result by result.IssuedAt
                 into g
                 select new
                 {
                     Amount = g.Sum(x => x.Amount),
                    IssuedAt = g.Key
                 };
    }
}

And then we execute the following code:

using (var session = documentStore.OpenSession())
{
    var asian = new Company { Name = "Company 1", Region = "Asia" };
    session.Store(asian);
    var middleEastern = new Company { Name = "Company 2", Region = "Middle-East" };
    session.Store(middleEastern);
    var american = new Company { Name = "Company 3", Region = "America" };
    session.Store(american);
    session.Store(new Invoice { CompanyId = american.Id, Amount = 3, IssuedAt = DateTime.Today.AddDays(-1)});
    session.Store(new Invoice { CompanyId = asian.Id, Amount = 5, IssuedAt = DateTime.Today.AddDays(-1) });
    session.Store(new Invoice { CompanyId = middleEastern.Id, Amount = 12, IssuedAt = DateTime.Today });
    session.SaveChanges();
}

We use a three way sharding, based on the region of the company, so we actually have the following document sin three different servers:

First server, Asia:

Second server, Middle East:

Third server, America:

Now, let us see what happen when we use the map/reduce query:

using (var session = documentStore.OpenSession())
{
    var reduceResults = session.Query<InvoicesAmountByDate.ReduceResult, InvoicesAmountByDate>()
        .ToList();
    foreach (var reduceResult in reduceResults)
    {
        string dateStr = reduceResult.IssuedAt.ToString("MMM dd, yyyy", CultureInfo.InvariantCulture);
        Console.WriteLine("{0}: {1}", dateStr, reduceResult.Amount);
    }
    Console.WriteLine();
}

As you can see, again, we make no distinction in our code about using sharding, we just query it normally. The results, however, are quite interesting:

As you can see, we got the correct results, cluster wide.

RavenDB was able to query all the servers in the cluster for their results, reduce them again, and get us the total across all three servers.

And that, my friends, it truly awesome.

Reference: RavenDB Sharding–Map/Reduce in a cluster from our NCG partner Oren Eini at the Ayende @ Rahien blog.

Related Articles

Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments
Back to top button