Recently, I configured time-based auto-scaling for one of our production applications running in an Elastic Beanstalk environment. Our use case is to scale out EC2 instances every Monday morning and scale in every Friday night.
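To make the setup concrete, here is a minimal sketch of how such a schedule could be expressed as Elastic Beanstalk option settings in the `aws:autoscaling:scheduledaction` namespace. The action names, instance counts, and cron times (in UTC) are placeholder assumptions for illustration, not the values from our environment.

```python
NAMESPACE = "aws:autoscaling:scheduledaction"


def scheduled_action(name, recurrence, min_size, max_size, desired):
    """Build the option settings for one recurring scheduled action."""
    options = {
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        "Recurrence": recurrence,  # cron syntax: minute hour day-of-month month day-of-week
    }
    return [
        {
            "Namespace": NAMESPACE,
            "ResourceName": name,  # the scheduled action's name
            "OptionName": key,
            "Value": str(value),   # Beanstalk option values are strings
        }
        for key, value in options.items()
    ]


def build_option_settings():
    # Hypothetical sizes and times: scale out Monday 06:00, scale in Friday 22:00.
    scale_out = scheduled_action("ScaleOutMonday", "0 6 * * 1", 2, 4, 4)
    scale_in = scheduled_action("ScaleInFriday", "0 22 * * 5", 1, 1, 1)
    return scale_out + scale_in


def apply_settings(environment_name, settings):
    """Push the settings to an environment (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders above work without it

    client = boto3.client("elasticbeanstalk")
    client.update_environment(
        EnvironmentName=environment_name,
        OptionSettings=settings,
    )
```

Calling `apply_settings("my-env", build_option_settings())` would apply both actions, where `my-env` is a made-up environment name. The same options can also be set from the console or a configuration file.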
One such Monday morning, the number of 4XX responses exceeded the threshold we had set for monitoring. This triggered an investigation to see what had gone wrong. We found that as soon as a scale-out event triggers, a new instance is created and placed behind the load balancer. Elastic Beanstalk then runs the container commands defined in the configuration.
In this case, it took 3–5 minutes to complete these container commands. Therefore, even though the new instance had been created and placed behind the load balancer, it was not ready to serve requests for a short period of time. Moreover, we had a few users trying to access the application during this interval. Some requests were served by the already healthy instances and returned 2XX responses, while others were routed to the newly created instance and failed with 4XX and 5XX responses.
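The effect can be reproduced with a toy model. The sketch below assumes plain round-robin routing, two already healthy instances, one request every 10 seconds, and a new instance that needs 4 minutes before it can serve traffic. All of these numbers are assumptions chosen for illustration; only the 3–5 minute range comes from what we observed.

```python
from collections import Counter


def simulate(request_times, healthy_count, new_instance_ready_at):
    """Route requests round-robin over the healthy instances plus one new instance.

    request_times: seconds since the new instance joined the load balancer.
    new_instance_ready_at: second at which the new instance can serve traffic.
    """
    total_instances = healthy_count + 1
    new_instance_index = healthy_count  # the last slot is the new instance
    statuses = Counter()
    for i, t in enumerate(request_times):
        target = i % total_instances
        if target == new_instance_index and t < new_instance_ready_at:
            statuses["4XX/5XX"] += 1  # new instance is still running its commands
        else:
            statuses["2XX"] += 1
    return statuses


# One request every 10 seconds for 10 minutes; new instance is ready after 240 s.
result = simulate(range(0, 600, 10), healthy_count=2, new_instance_ready_at=240)
print(dict(result))
```

In this model, every request that lands on the new instance before it is ready fails, while the rest succeed. That matches the mix of responses we saw in monitoring.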
In this blog post, I am going to share how I fixed this problem for our production system.