Is Query Readability Costing Efficiency?

Retrieving all the information you need from a database for your web page could be imagined like you're doing some cooking. Initially, you need to get all the ingredients from your cupboard. There are a few ways of going about this.

You could go back and forth to the cupboard multiple times with a short, scannable ingredient list

However, because your list is easy to read and scan, it doesn't include specific details about each ingredient you want to use. Once you have everything out, you'd realise you only need the self-raising flour, not the plain flour you got out, but now that extra flour is using up your counter space.

You could make your ingredient list more detailed

Now each cupboard trip gets precisely the ingredients you need. You avoid filling your counter space with unnecessary items while saving time when you look through the ingredients later. However, you still need to spend lots of time walking back and forth.

You could keep your list simple but make one trip

Grab everything you could possibly need and worry about sorting through it later. Now you've saved time with one trip. Still, your counter is overflowing with unnecessary items, which you must sort through all over again to find what you actually need.

Or, you could make your list detailed and only make one trip

Now you've got everything you need, all at once, and all it took was a bit of pre-planning. You haven't made any extra trips, you have precisely what you need, and you still have a lot of counter space to go about your work.

Now, use that approach for writing queries

Think of the trips back and forth to the cupboard as a database connection, how you break down your lists and the extra details as your query, and your counter as the memory.

The crucial part of how you retrieve data from your database is the query you write, so if you focus on keeping it simple and separating what you need into multiple queries to make the code more readable, it could be costing you versus a much longer query with specific conditionals for what you need.

For example, on a recent project, we had the following readable but slow query;

User.where(state: 'approved')
    .order(created_at: :desc)
    .includes(:org_invoices)
    .limit(10)

# User Load (5.2ms) ...
# UserOrganisation Load (4.7ms) ...
# Organisation Load (2.4ms) ...
# Invoice Load (6.1ms) ...

# Total: 18.4ms

Due to the multiple table relationships, the query became less performant as the project scaled (and the cupboard analogy becomes even harder to contextualise). So, we rewrote the query to perform a single table join so all the data we'd need was in one place.

User.joins(:org_invoices)
    .where(state: :approved)
    .order('users.created_at desc')
    .limit(10)
    .select('DISTINCT users.*')

# User Load (11.0ms) ...

# Total: 11.0ms

Making more complex or specific database queries can have drawbacks. If you have 100 objects in your table and want to select 90 of them, should you specify the ids of the objects even if they can be expressed in a simple range, or get all of them? Suppose it's a query used only once, like for data migrations. Is it worth an hour rewriting a series of queries when it will only save you a few seconds overall?

Account.where(id: [1..90]) # Account Load (2.1ms)

Account.limit(90) # Account Load (1.4ms)

Account.all # Account Load (1.1ms)

In Conclusion

While over-simplification may save you time reading, understanding what you need could take extra time, and the overall performance may be slower. However, being overly specific when you don't need to could result in you over-engineering a problem that doesn't warrant it. A chosen approach for writing queries should be arrived at on a case-by-case basis rather than with the mindset that one rule fits.