Review of "RavenDB High Performance"

My Background with RavenDB

Earlier this year I worked on a project setting up RavenDB as a datastore. It was my first experience with a NoSQL database. I learned a lot and made a lot of mistakes along the way, most of which are documented in this series of blog posts.

Despite the challenges I encountered, the first thing that stood out to me was how easy RavenDB was to program against. The API was straightforward and the built-in support for LINQ made querying feel like a natural extension of the .NET language.

At present, Raven is up and running in our production environment with almost zero maintenance required.

Review

RavenDB High Performance starts off with a quick overview of relational databases, identifying some key shortcomings as well as architectural shifts that have led to the surge of "NoSQL" (Not Only SQL) databases in recent years. That is followed by a short summary of documents (from a NoSQL perspective) and then it jumps right into RavenDB features.

Coming to this book with prior experience in Raven, this was exactly the approach I wanted. I appreciated that it got right into the subject at hand without a lengthy introduction or filler content. I found the writing to be straightforward without being dry or boring. It was a quick read overall, but the concepts and techniques took time to sink in.

In some ways, I found the book easier to follow than the online RavenDB documentation. While it's not extensive enough to replace the online docs, I found the organization of the book to be more logical and it did a better job of surfacing the core knowledge you'd need day-to-day.

This title covers RavenDB up through version 2.5, which is ahead of what we're using in production. I learned about a handful of helpful new features, including materialized views, SQL reporting, and the streaming API. I also learned about some useful client API features I had never come across before: session.Advanced.LoadStartingWith(...), UniqueConstraintAttribute, temporal versioning, the cascade delete bundle, bulk insert, and session.Advanced.Lazily/Eagerly.

After covering data modeling and the client API, the book moves on to performance tuning and monitoring. This highlights the key features in RavenDB where admins can see statistics on each database, monitor performance, and quickly isolate issues within the system.

The book then transitions to scaling out. An entire chapter is devoted to sharding, which helped clear up a lot of unknowns for me. In fact, this was one of the more thorough chapters in the book. It contained numerous code examples and illustrations of different sharding strategies as well as how to define a custom strategy.
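To give a rough sense of what a "custom strategy" means here, the sketch below routes a document to a shard by region when one is known and falls back to hashing the document id otherwise. This is only a conceptual illustration in Python, not RavenDB's sharding API and not code from the book; the shard names, the region table, and the function names are all hypothetical.

```python
import hashlib
from typing import Optional

# Hypothetical shard names; a real deployment would map these to server URLs.
SHARDS = ["shard-a", "shard-b", "shard-c"]

# Hypothetical attribute-based routing table (shard by customer region).
REGION_TO_SHARD = {"asia": "shard-a", "europe": "shard-b"}


def shard_by_hash(doc_id: str) -> str:
    """Spread documents across shards by hashing the document id."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]


def shard_for(doc_id: str, region: Optional[str] = None) -> str:
    """Custom strategy: route by region when known, otherwise hash the id."""
    if region is not None and region.lower() in REGION_TO_SHARD:
        return REGION_TO_SHARD[region.lower()]
    return shard_by_hash(doc_id)


if __name__ == "__main__":
    print(shard_for("customers/1", region="Europe"))  # region-based routing
    print(shard_for("customers/2"))                   # hash-based fallback
```

The trade-off is the usual one: attribute-based routing keeps related documents together so that common queries touch a single shard, while hashing spreads load evenly but scatters related documents.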

While we didn't need to make use of sharding for our project, we did take advantage of replication, which is the focus of the next chapter. It covers the different types of topologies (master-slave, master-master, and multi-master), conflict resolution, failover behaviors, failover clustering, and read striping. This is probably the feature I had the most trouble with while implementing the project at work. We tried the master-master topology but ran into serious data issues due to our code implementation. We are now using the master-slave topology. While this chapter doesn't delve into all the intricacies of replication, it highlights the key functionality you need to get up and running.

The following chapter discusses deploying to the cloud. I really wish our project had taken advantage of Database as a Service (DaaS). I spent a good amount of time updating and maintaining the servers myself, whereas the DaaS vendors described in this chapter handle all that headache for you. At least I know for the future.

The next chapter on extending RavenDB provides an overview of triggers, codecs, and server-side tasks. This chapter is heavy on code examples since it highlights more advanced functionality like custom compression and how to hook into different events within Raven.

The final chapter focuses on improving the user experience, taking advantage of facets, suggestions, search highlighting, and real-time updates with SignalR. There are lots of useful code samples here to demonstrate each feature. Unfortunately, technology moving as fast as it does, the section on real-time updates is already out of date. SignalR has gone through a major version change and a lot of the code examples won't work as written, so you'll have to consult the latest SignalR documentation to spin up your own implementation.

Conclusion

As the book states, it is written for people with existing knowledge of RavenDB (some ASP.NET MVC helps too). From that perspective, this is a worthwhile read as it covers the core features with a tight focus on optimization. There is no filler here, just to-the-point explanations with code samples. Aside from the outdated SignalR examples, all the other code samples are functional.

If you're just getting started with RavenDB, I would recommend going through the online tutorials on ravendb.net first and getting the basics down before you tackle this book. While this book is a good reference, it makes assumptions about your knowledge of the system and that could be frustrating to someone brand new to Raven.

Highlights

  • Assumes an existing knowledge of RavenDB
  • Moves quickly from one topic to the next
  • Focuses on optimizations across the data model, API, server, and user experience
  • Explanations of features are clear and well written
  • Contains many useful code samples