FiveRuns Blog

On Rails production performance and monitoring

Posts
6 comments

Solving Rails Performance Bottlenecks with RM-Manage

In this post we will highlight how you can use FiveRuns RM-Manage to investigate performance bottlenecks in your production Rails applications. Using the example below, we’ll show you how we used the metrics provided by RM-Manage to improve performance of our app by over an order of magnitude! Of course, YMMV, but armed with this information and RM-Manage, you can learn how to identify performance bottlenecks and resolve them using FiveRuns.

For this purpose our own Mike Perham created a simple blogging application based on Rails 2.0. It’s called Megaphone. We spoke briefly about it in a previous post . Please refer to that article for general configuration of the FiveRuns instrumentation plugin.

Screenshot of Megaphone's main page

For the rest of this article we focus on the Rails Application Browser which is core to RM-Manage. It breaks down requests to your actions and extracts model and view processing times. These metrics are then aggregated over time from your production or development environments.

Baseline

To start analyzing Megaphone we added a large dataset to its database. Initially, we created 10,000 blog posts with 20,000 comments randomly mixed among the posts. Now, when we hit the front page it takes a very long time to display those entries. Let’s look at the RM-Manage Application Browser and see what is happening under the covers:

Megaphone's baseline benchmark metrics as seen in RM-Manage

At the top of this screen shot you can see that the Average Response Time for this action was over 125,000 ms. Wow, over 125 seconds to respond to a request! To identify the most obvious bottlenecks let’s see how we can break down the numbers in the Application Browser screenshot above:

  1. All shown data from the context of the index action of PostsController (which is not displayed in the screenshot) so you can see which actions are taking most of the time.
  2. The vast majority of time is spent in the model layer (84%), indicating we need to do some ActiveRecord or database tuning.
  3. The models User and Category are each called over 9000 times. Looking at the view code, we can see that the code only uses <%= post.<model>.name %>. This indicates that it is an N+1 query problem where we should be using an ActiveRecord :include clause to pull in the data with a join. Note that each call averages less than a millisecond. This signifies a cache hit since a call to the database will almost always take 5-10 msec, minimum.
  4. The model Comment is called almost 10,000 times also but it is hitting the database (since the time is about 10 msec average). Looking at the code, we call <%= post.comments.size %>
  5. Post is only being called a few times but those calls are slow. This indicates we have a slow query which could benefit from an index or some other tuning.

When performance tuning, it’s vital to have a baseline number to compare your results. Since the actual page render took 125 seconds, we’ll use this as our performance baseline.

Tuning

We’ll tackle the biggest problem first since that will get us the greatest speedup. (4) is the biggest problem because 9487 * 10 msec = 94.87 seconds. To solve (4), we have two potential solutions:

  • Add an index to the comment.post_id column so that the count query is much quicker.
  • Add a counter_cache column which holds the number of comments associated with a post.

An index will get us a nice speedup but ultimately still requires 10,000 separate queries. Adding an index will reduce the time dramatically but adding a counter_cache gets us an even greater speedup. This is straightforward. We just create a migration for adding a column to the Posts table:


    t.integer :comments_count, :default => 0

and update the belongs_to line in our Comment model:


    belongs_to :post, :counter_cache => true

For a tip on how to initialize the counter_cache column, please see Mike’s blog entry on this topic.

Check out the speed up! Now we’re at 32 seconds, down from over 125 seconds:

First performance improvements as seen in RM-Manage

Please note that by adding the counter_cache we have removed all invocation of the Comment model from this action. This is because the pre-calculated counter resides inside the Post model.

To solve (3), we modify our code to use the :include option. Before the action contained the following:


  @posts = Post.find(:all, :order => 'posts.created_at DESC')

After the change it looks like this:


  @posts = Post.find(:all, :order => 'posts.created_at DESC', :include => [:author, :category])

An order of magnitude performance improvements as seen in RM-Manage

Now we’re at 13 seconds with the majority of the time in the view layer rendering the set of 10,000 posts. That’s about an order of magnitude performance improvement. But, we can do better.

Production

Until here we have only shown aggregated metrics associated with single requests to an action. That’s useful in development and staging environments. However, the Application Browser becomes even more useful in production environments. Due to its low overhead you can take an in-depth look at your application under real (or artificial) production load.

From a user’s perspective including 10,000 posts on the front page isn’t too useful so we’ll want to trim this down to something reasonable like the latest 20 posts. Here we have launched a number of httperf processes against Megaphone which is now configured to show only the 20 most recent posts on its homepage:

Megaphone performance under load as seen in RM-Manage

As you can see, doing this gets us to the final speedup we need with roughly 66ms per request. Please also note that unlike in the other scenarios the action was invoked 28,156 times. The shown metrics have been aggregated over all these requests.

We have learned from our customers they need to understand what is going on in their production environments. With larger applications it can become difficult to simulate user traffic that mimics the real world load. A certain bottleneck might never expose itself in a staging environment with artificial load.

That’s why we focus on optimizing our Rails instrumentation layer to have minimal impact on the target system so it can be used in production environments. Efficient aggregation is key to that challenge.

Notes

We used a simple example above to show how you can use RM-Manage to improve performance. Of course, the Rails framework offers many capabilities, like built-in caching, which would solve some of the above performance problems.

The benchmark numbers above reflect response times of a single instance Rails application served by Mongrel on a desktop system. In the real world such applications typically run with multiple Mongrel processes behind a load balancer. RM-Manage supports such configurations out-of-the-box. It will aggregate performance metrics across all Mongrel processes running on a particular server. The general steps outlined above still remain in such an environment.

In this scenario we used Ruby 1.8.6 but we also could have used JRuby . We have outlined our RM-Manage support for JRuby in a previous blog post.

Results

In the above example, we’ve taken our blogging system’s index page render from 125 seconds to 0.065 seconds, that’s a 1900x speedup! To do this, we used the RM-Manage Application Browser to find hotspots in our code and employed several strategies to fix the code:

  • database tuning with cache columns,
  • query tuning with ActiveRecord options, and
  • functional tweaks to the application.

Of course, Megaphone is only a simple application that we use to highlight typical Rails performance problems. But, the same principles can be applied to any Rails application. Imagine the positive impact of improving the performance of your business-critical Rails application.

Please comment if you have requests or ideas to make our system even better. If you think RM-Manage could help you, it’s easy to get started and test drive it for yourself. Just sign up for a free 30 day trial.

Bookmark and Share
Continued Discussion

6 responses to this entry

I was checking out your screenshots-the app looks nice-very clean. Mind if I ask, what did you use for your charting solution? I’m creating a hosted load testing tool in rails and looking at graphing/charting solutions, so I’d love to hear how you guys approached this.

Eric Beland Eric Beland said:

on December 21, 2007 at 09:40 AM

Nice suggestion on the counter cache, but at least link to a better article on how to create it. Rails 2.0 recently changed how it works. But then again, the article you linked to was full MySQL syntax, not migrations…

Use migrations: http://josh.the-owens.com/archives/2007/11/03/rails-edge-change-how-to-add-a-counter-cache-to-an-existing-db-table/

Josh Owens Josh Owens said:

on December 28, 2007 at 11:54 AM

Josh, I addressed your concern directly on my site. In short, I believe you missed the point of my tip. Assume your table has one million rows. Your suggestion will pull one million entries into memory and issue one million SELECTs (to determine the count for each row) and one million UPDATEs (to update the row). My posting is a performance tip on how to quickly initialize a counter cache with a single SQL statement, rather than a statement per row.

In short, where your suggestion would require 2,000,001 statements, mine would require 1.

http://www.mikeperham.com/2007/12/17/creating-a-counter_cache-column/#comments

Mike Perham Mike Perham said:

on December 31, 2007 at 10:49 AM

Just to clarify, in my comment, the strikeout at this point was a textile accident: the app looks nice

The app does look nice, I wasn’t trying to strike that out!

Eric Beland Eric Beland said:

on January 01, 2008 at 03:59 PM

Caching is the key to performance. The more you cache the less resources you need to generate content from scratch.

The Speed guy The Speed guy said:

on March 01, 2008 at 08:23 AM

But where is the bottleneck, commonly, for rails apps? Is it just rails making a lot of rb_funcall’s with its nested filters? What gives?

roger pack degree search roger pack degree search said:

on June 25, 2008 at 05:47 PM

Contribute

Continue the conversation and share your thoughts. A name is required. Your e-mail address will not be displayed on the site. Textile formatting may be used in your comments (but will not be rendered in the live comment preview).

→ Posted by You on September 04, 2008 at 02:39 PM

Flickr

FiveRuns tagged photos on Flickr.

  • FiveRuns Booth
  • FiveRuns RailsConf 2008
  • FiveRuns RailsConf 2008
  • FiveRuns Booth
  • IMG_4660
  • FiveRuns RailsConf 2008
  • Interface
  • IMG_4684

See more FiveRuns tagged photos…

Other Categories

Entries are also organized under the following general topic categories.