Building a Scalable API in the AWS Cloud

We've been running our Node.js-based API in the AWS cloud for a couple of years, but we've been wanting to scale way up. Rather than trying to beef up our existing VPC (i.e., scale vertically), I made the decision to architect a new solution and rebuild it from the ground up. In this post I'll talk about some of the technologies we leveraged, the challenges we ran into, and the benefits we've already seen.

As you probably guessed from the image above, I saw this as a great opportunity to try Elastic Beanstalk. Elastic Beanstalk is what is known as a PaaS (platform as a service), and it acts as a container for your applications and their respective environments. After creating an "application" you define a set of parameters corresponding to AWS services like EC2, RDS, ELB, etc., and boom, Elastic Beanstalk takes care of creating compute instances, database servers, load balancers, and more! And yes, I know what you're thinking, and it is just as magical as it sounds. As part of this configuration you can set up autoscaling triggers which will scale your fleet up or down based on the metrics you set. It was an absolute joy doing load testing and watching the fleet respond like an accordion based on the amount of CPU being used.
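
If it helps, here is a minimal sketch of what those autoscaling triggers can look like in an .ebextensions config file. The namespaces and option names are standard Elastic Beanstalk settings, but the thresholds and fleet sizes below are purely illustrative, not the values we run with:

    # .ebextensions/autoscaling.config (illustrative values)
    option_settings:
      aws:autoscaling:asg:
        MinSize: 2                      # never drop below two instances
        MaxSize: 10                     # cap the fleet during spikes
      aws:autoscaling:trigger:
        MeasureName: CPUUtilization     # the metric the "accordion" responds to
        Statistic: Average
        Unit: Percent
        UpperThreshold: 70              # add an instance above 70% average CPU
        LowerThreshold: 30              # remove an instance below 30%
        BreachDuration: 5               # minutes the threshold must be breached
        UpperBreachScaleIncrement: 1
        LowerBreachScaleIncrement: -1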

The reason I went with this approach is that we have a very steady load that will only grow with the number of machines we have connected, but on top of that we have a number of highly variable customers who can cause huge spikes for short to medium durations. In the "olden days" we were forced to pay for servers that could handle not only the peak load but also the forecasted load based on our growth. Today, with solutions like Elastic Beanstalk and autoscaling, you pay only for the fleet needed to support the current load, not the worst-case-scenario load. Clearly this is huge for the bottom line.

For this particular application we are leveraging the following AWS services:

  • Elastic Beanstalk
  • CloudFront
  • CloudWatch
  • EC2
  • ElastiCache
  • ELB
  • RDS
  • Route 53
  • S3
  • Trusted Advisor
  • VPC

A really neat aspect of using Elastic Beanstalk is not only the ability to monitor any component of your environment (via CloudWatch) but also to see a summary of things like CPU utilization and network I/O across your entire fleet.

The only major issue I had with this architecture was with WebSockets. Our mobile application uses Socket.IO for all communication between the client and server, and we kept seeing 400 errors on the socket connection. After a lot of reading and chatting with AWS engineers, we found that the proxying handled by the ELB + Nginx was causing the handshake to fail on the socket connection. The fix (take note, it could save you a major headache) was to change the ELB configuration (from the Elastic Beanstalk settings) from HTTP/HTTPS listeners to TCP on port 80 and SSL on port 443. That, combined with removing Nginx from the stack, solved all of the socket issues, and even with hundreds of devices connected to dozens of servers in the fleet, everything stays in sync.
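
For anyone fighting the same battle, that listener change can also be captured in an .ebextensions config file rather than clicked through the console. This is a sketch of the shape of it; the certificate ARN is a placeholder and your instance port may differ:

    # .ebextensions/elb-websockets.config (sketch; certificate ARN is a placeholder)
    option_settings:
      aws:elb:listener:80:
        ListenerProtocol: TCP          # TCP instead of HTTP so the WebSocket upgrade passes through
        InstancePort: 80
        InstanceProtocol: TCP
      aws:elb:listener:443:
        ListenerProtocol: SSL          # SSL instead of HTTPS, terminated at the ELB
        InstancePort: 80
        InstanceProtocol: TCP
        SSLCertificateId: arn:aws:iam::123456789012:server-certificate/my-cert   # placeholder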

In closing, we couldn't have landed on a better method for keeping our API scalable, even if we had spent a lot more dough. The peace of mind that comes from knowing the environment will scale up when it needs to, without any human intervention, is nearly priceless.