Performance is one of most important aspects to take into account when developing web applications. Analyzing how data is fetched from a database is a good starting point when looking for enhancements. In this article, you’ll find examples on how to improve the performance by using aggregate functions and filtering data on a database level.
Some context to begin with
This article is inspired by a real problem I’ve once been tasked with. Dealing with it taught me a lot, and I still have it as a reference in my mind. I think examples are a good learning resource, they can clarify a lot. In this article, I’d like to share with you some examples of using the Active Record query methods.
In order not to introduce domain-specific details, I’ll use a sample application for a library to illustrate examples. It’s all fairly simple, as shown in the diagram below. We have four tables: authors, books, users and rentals. One user can borrow many books and one book can be borrowed by many users, so we need a joining table to model many-to-many relationships. It’s the rentals table in our case. We also store some additional info there, which are the dates of borrowing and return. An author can have many books assigned to their name. A book also has an attribute defining its genre.
User reading statistics
The task was to prepare statistics for a single user, so we could tell how many books from each genre were borrowed. My first thought was to fetch all books which have been borrowed by the user, group them by their genre and then do the mapping, so each genre has a number of books assigned instead of a list. Here is what I came up with:
While this approach works and looks clean, it doesn’t use all the possibilities that Active Record query methods offer. Thanks to them, we’re able to filter and aggregate data on a database level without using raw SQL directly in our code. Operating on a db level also increases our efficiency.
In the example above, we can use the group method instead of the Ruby’s groupby method. It will apply tthe GROUPBY clause to the tSQL query. Furthermore, the mapping and size method can be replaced with a count aggregation function. In the end, we are left with a query that looks like this:
Book.joins(:rentals).where(rentals: { user: user }).group(:genre).count(:books)
It looks even simpler!
Other useful examples
Below you’ll find some other ways of using the query methods which I find worth knowing.
Invitation for inactive users
TASK: Filter users who have never borrowed a book or did it more than a year ago.
We could fetch all users by including the associated rentals and then filter them using the select method.
User.includes(:rentals).select do |user|
user.rentals.empty? || user.rentals.none? { |rental| rental.start_date >= Date.today - 1.year }
end
But, of course, there is no need to fetch everything. By using the query methods, we can filter it out on the db level. First, let’s select users who have borrowed some books in the last year and then exclude them from the final selection.
It’s important to remember one thing while using the left_joins (and outer joins in general). If there are records in the left table (here: authors) that don’t have corresponding records in the right table (here: books), then in result the right table columns will be populated with nil values.
As we also need authors with one book assigned in the system, there are a few more operations to do. We’ll have to do the grouping, counting and adding a condition. Here is how to put it all together:
The condition comes after the aggregation function, so we have to use the HAVING clause, instead of the WHERE clause to specify it.
The Active Record query methods are worth checking when thinking about application performance. They can simplify your code and make it work faster. I hope that the shared examples will help you in exploring the possibilities the query methods have to offer.