Ruby on Rails Wednesday, October 8, 2014


On Heroku your most important bottleneck (although not the only one) is the average response time of your requests. You want all of your response times under 500ms, ideally under 200ms.

See this document for an explanation:

This is the most important thing you should worry about.

The performance of the database, and its proximity to the Heroku dynos, are also important, but those can be optimized by getting a bigger database. 

Moving anything long-running into a job queue is definitely the way to go. Generally you do this with a Resque (or Delayed Job) back-end, and in Rails 4 you can use the ActiveJob paradigm to create your job classes. Most of the time jobs use a Redis back-end, which fortunately for you is extremely fast.
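For example, a job class along these lines (a rough sketch; the VoteJob name and the Vote model are just illustrative, not from your app):

    class VoteJob < ActiveJob::Base
      queue_as :default

      def perform(poll_id, user_id, choice)
        # The insert runs on a background worker instead of the web dyno,
        # so the request can return quickly.
        Vote.create!(poll_id: poll_id, user_id: user_id, choice: choice)
      end
    end

    # In the controller action, enqueue instead of inserting inline:
    #   VoteJob.perform_later(params[:poll_id], current_user.id, params[:choice])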

As far as "bulk" operations go, you would have to write some logic yourself. You may want to experiment with using a separate Redis instance (separate from the one keeping track of the job queue) as your temporary data store, then having your jobs do the bulk operations: reading from Redis and moving the data into MySQL.
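Very roughly, that could look something like this (just a sketch; the "pending_votes" list name, the votes table columns, and the drain-every-second cadence are assumptions, not something you've described):

    require 'redis'
    require 'json'

    buffer = Redis.new  # the separate Redis instance used only as a buffer

    # In the request path: push the vote data instead of inserting right away.
    #   buffer.rpush("pending_votes", { poll_id: poll_id, choice: choice }.to_json)

    # In a job scheduled every second or so: drain the list and do one
    # multi-row insert instead of thousands of single-row inserts.
    rows = []
    while (raw = buffer.lpop("pending_votes"))
      rows << JSON.parse(raw)
    end

    unless rows.empty?
      conn = ActiveRecord::Base.connection
      values = rows.map { |r| "(#{r['poll_id'].to_i}, #{conn.quote(r['choice'])})" }
      conn.execute("INSERT INTO votes (poll_id, choice) VALUES #{values.join(', ')}")
    end

The trade-off is that you can lose up to a second's worth of votes if the dyno or Redis dies before the job drains the list, so weigh that against what you save on insert overhead.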

Check out this tool for load testing -- I've found it slightly hard to work with, but it is very powerful:


In particular, use it to measure your average response times on Heroku; you will want to make sure your response times don't degrade at scale. Make sure you also have a good understanding of Heroku's random (aka "dumb") routing and why scale creates request queuing.

-Jason


On Oct 8, 2014, at 9:04 AM, LZ Olem <lzarrefolem@gmail.com> wrote:

I'm developing a polling application that will deal with an average of 1000-2000 votes per second coming from different users. In other words, it'll receive 1k to 2k requests per second, with each request making a DB insert into the table that stores the voting data.

I'm using RoR 4 with MySQL and planning to push it to Heroku or AWS.

What performance issues related to database and the application itself should I be aware of?

How can I address this amount of inserts per second into the database?

EDIT

I was thinking of not inserting into the DB for each request, but instead writing the insert data to a memory stream. Then I would have a scheduled job running every second that would read from this memory stream and generate a bulk insert, avoiding having each insert made atomically. But I cannot think of a nice way to implement this.


----

Jason Fleetwood-Boldt

All material © Jason Fleetwood-Boldt 2014. Public conversations may be turned into blog posts (original poster information will be made anonymous). Email jason@datatravels.com with questions/concerns about this.
