Re: [Rails] Re: Optimizing query

Ruby on Rails Thursday, January 26, 2012

On Thu, Jan 26, 2012 at 6:37 PM, Peter Vandenabeele <peter@vandenabeele.com> wrote:

On Thu, Jan 26, 2012 at 6:15 PM, Linus Pettersson <linus.pettersson@gmail.com> wrote:

Hi

I tested to remove the .order(...) part and indeed, the query time goes down to ~100ms. However, it doesn't help to add indices, at least not as I did :)

add_index :categories, :name

add_index :subcategories, :name

Did some more testing and if I keep the .order... but don't join the products table I get a query that runs at about ~55ms. So the bottleneck seems to be the products table.

The query that I'm running looks like this:

Category.eager_load(:subcategories)
.joins("INNER JOIN products AS p ON resellercategories.id = p.resellercategory_id")

.order("categories.name ASC, subcategories.name ASC")

(Skipping the gender here...)

What I have is Categories and Subcategories. They are related to each other through a Resellercategories table. Products are related to Resellercategories.

So, the reason that I want to join the products as well is because I only want to show categories and subcategories that actually have some products (there are some empty categories/subcategories still).

So the above query is what we came up with in another thread here in the group.

- Maybe there is a better way to check if a category/subcategory has products without joining the entire products table?

It is possible to add a :counter_cache , but then you need to make sure you
use the proper methods for each product that you add or remove from the
association.

Alternative to the default counter cache (from Rails), you could build your own
logic as in:

* has_male_products
* ...

changing you query to ...

Category.eager_load(:subcategories).
where(:has_male_products => true).
order(...)

Then you would need to set the cache on the appropriate categories in
an after_save on the product you are creating/updating/deactivating/(deleting ?).

Both ideas would probably be faster for querying, but certainly more
complex for making sure that cache is always correct.

Sorry to reply to my own post.

TL;DR Is there pagination? Then a smaller set may return much faster.

I was thinking over my reply and may have forgotten a fundamental

aspect ... If you say 2200 categories, 8000 products.

How many entries does you query return ?

(replace .add with .count at the end).

How many do you need ?

What happens when you add .limit(20) to your query ?

By which "primary object" do you want to sort and paginate ?

(I will assume 'Product' in the discussion below).

With the includes that are currently implemented, you may

have to redo the whole query into 2 qeuries ...

1) for fetching the "primary objects"

(e.g. exactly 20 Products, no additional "has_many" data, because

that would increase the number of returned rows for 1 product and

make proper pagination in the database impossible; including

"belongs_to" here is no problem)

2) a second query for fetching eventual "has_many" data

on those 20 "primary products" (is that "Reification" ?)

If the performance problem could be solved by taking the

pagination into account, that would be a _much_ better

solution that building cache columns in this early phase

of your project.

HTH,

Peter

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk@googlegroups.com.
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.

Ruby on Rails

Re: [Rails] Re: Optimizing query

No comments:

Post a Comment

Blog Archive