Daniel Pericich

Nov 18, 2021

6 min read

Performing Union and Intersection Operations in Ruby on Rails

Photo by Michael Dziedzic on Unsplash

In Ruby on Rails a group of ActiveRecord records is known as an ActiveRecord Relation. This data structure is a glorified array that has certain ActiveRecord properties that make features, such as pagination, accessible. Because of the properties that an ActiveRecord Relation offers, it is important to not perform operations on this collection of records that would transform the Relation to a regular Array object.

For basic querying this is not a problem. Rails offers many methods for querying such as find, find_by and where which return individual, individual and groups of records respectively. These work great if you only want to work on a single ActiveRecord Relation, but what if you are working with multiple Relations?

To avoid dealing with 2 Relations you could write a raw SQL query and place it inside a where method call. While this works, it is not very Rails magic-y. To conform more with the Rails way we can use “merge” and “or” to perform queries that accept multiple ActiveRecord Relations and return the Union or Intersection of the given ActiveRecord Relations.

Compatible Structures

Before we jump into looking at these two ActiveRecord Relation methods, we must first talk about some requirements for using them. The most obvious requirement to use these methods is that both objects you are working with need to be ActiveRecord Relations.

If both objects are Relations, the next thing to check is that the Relations have compatible structures. What is a compatible structure? While both items being Relations is a basic requirement, the type of objects in each collection need to match as well. You can’t do Intersections or Unions between a Relation of User records and a Relation of Products because they are not the same model.

To be able to use these methods you must match Users to Users or Products to Products. Even when you have all Relations of the same type of objects, there are still more issues that can cause your queries to fail. An error that tripped me up when first working with these methods was the following:

# Error during processing: Relation passed to #or must be structurally compatible. Incompatible values: [:joins]

This error can be triggered by trying to work with Relations of different object classes, but my issue was more subtle. Both of my Relations were collections of Order objects, but one had a joins with Product objects. Again, both Relations of objects were based on the Order model, but one of the Relations was actually Order + Product.

To make these Relations compatible I had to join Products on my other Order Relation so that the structures were symmetrical and compatible. Be mindful of this as you implement the following methods as even if one model doesn’t use all joined tables’ values for its queries, you still need to include it to be able to perform Unions and Intersections.

Using OR for Unions

The first ActiveRecord method we will look at is the ‘or’ method. The syntax for this method is very simple, but before diving into this let’s talk about Unions. What is a union? A union is a term that refers to the items returned when comparing two sets of values. The union, unlike the intersection we will look at next, takes two collections and returns all unique values between the two. In visual terms you can think of Unions as:

Figure 1. Diagram showing Union operation

Here we see that there are two sets, Blue and Red, that have their own individual values as well as a set of shared values. The important thing to note with the results of a Union is that the resultant set contains only unique values. While both Blue is a set of { 1, 2, 3, 4 } and Red is a set of { 3, 4, 5, 6 }, our resulting collection will only have the 6 values 1–6 because our result has to be a set. How does the Union apply to our ActiveRecord Relations?

Say we have a search feature where we want to get an ActiveRecord Relation of all Orders where the order is incomplete or the Customer’s last name is “Smith”. Before searching for these conditions, we are already given an ActiveRecord Relation that is the result of querying only Orders made in the last month. Because we are querying off of two separate models, Order and Customer, we can’t write a complex where statement to search for the individual attributes. In this case we will have to have two separate queries which will look like this:

Figure 2. Queries for retrieving our 2 ActiveRecord Relations

Now that we have two ActiveRecord Relations, how do we get back a single ActiveRecord Relation that has either a status of “incomplete” or a Customer’s last name of “Smith?” At first I thought we could leverage the splat operator to put our two collections into an array and filter it by unique record IDs. While this works to join our collections, it breaks our rule of not changing the class of our collection from ActiveRecord Relation to Array.

To join two ActiveRecord Relations into a unique collection we will use the “or” ActiveRecord method. Our operation would like like this:

Figure 3. Attempt to join two ActiveRecord Relations into single Relation

But wait, are these compatible structures? Short answer, no. We used joins to combine our Orders and Customers tables for the Customers’s table, but did not do the same Orders. We need to make the structures symmetrical, which is accomplished by doing the following:

Figure 4. Using symmetrical structures to get Union of 2 ActiveRecord Relations

This looks much better. Running this method will now give us back an ActiveRecord Relation object that is the Union between orders with status of “incomplete” and those with Customers’ last names of “Smith”.

Using Merge for Intersections

Now that we’ve dived into the “or” method and Unions, let’s look at what the “merge” method does. The merge method is an Intersection operation which like the Union acts on two sets of values. Instead of returning all unique values that exist in Set A or Set B, it returns only the values that exist in both Set A and Set B. Here is a diagram to show which values we expect to be returned:

Figure 5. Diagram for results of Intersection operation

Here we return only the set of values, { 3, 4 } that are shared between both sets. Where might we want to use this operation? Going back to our User and Order models, we can use an Intersection to get all Orders that are for the User with the first name of “Amit” and a status of “complete.” This could be useful if we wanted to find details for all complete items for a customer with a specific name.

To get these results as an ActiveRecord Relation we would have the following method:

Figure 6. Performing an Intersection operation on 2 ActiveRecord Relations

Here we use “merge” to return only the records that are in both our orders_results ActiveRecord Relation and our customers_results Relation. This is a very powerful method that complements the Union operation quite well!

Conclusion

Working with Rails can be frustrating sometimes as going against the Rails Way causes some major headaches. Luckily Rails offers plenty of methods that assist in querying and manipulating records the Rails Way. Here we have looked at using “merge” and “or” to retain our records as ActiveRecord relations. There are many other methods that assist in keeping object classes as ActiveRecord objects while manipulating record collections. Let me know in the comments any other tips or tricks you have encountered in your software development!

Notes

https://courses.lumenlearning.com/math4libarts/chapter/union-intersection-and-complement/#:~:text=The%20union%20of%20two%20sets,is%20notated%20A%20%E2%8B%83%20B.&text=The%20intersection%20of%20two%20sets,that%20are%20in%20both%20sets.

https://www.bigbinary.com/blog/rails-5-adds-or-support-in-active-record

https://stackoverflow.com/questions/9540801/combine-two-activerecordrelation-objects

https://apidock.com/rails/v6.1.3.1/ActiveRecord/QueryMethods/or

https://apidock.com/rails/ActiveRecord/SpawnMethods/merge