In the previous post, we discussed the nature of the application and highlighted some key items it lacked. Now let’s discuss the deployment infrastructure of the application.
The application is deployed in a JBoss domain cluster. There are three nodes in the cluster with a load balancer in front. Each JBoss server has an Apache web server that serves as a reverse proxy. There are a couple of other standalone JBoss servers that run jobs and perform tasks such as document conversions. There are also a couple of identity servers for authentication and authorization, implemented using Shibboleth.
The application is file I/O intensive, so there’s a SAN-based storage volume created just for storing files, and the volume is mounted on each server. There’s also an Elasticsearch instance and a MySQL instance in the background.
So far, so good! Before we jump into the rationale behind using JBoss, domain clusters, etc., let’s talk about the challenges.
Running out of Memory
The first and foremost problem is servers running out of memory. Each server is a bare-metal machine (not a VM) with a 16-core processor and 32 GB of RAM. JVM memory utilization hits 90% every 2 days, and the server stops responding, so a manual restart is required. The servers are hosted with a local infrastructure provider. For a specific set of customers, we had the infrastructure in AWS as well.
Cluster crash
When one of the instances is down, the other two instances keep handling requests. But when you try to bring the failed instance back up, the other two instances also crash. So every time, we’re forced to restart the whole cluster, which eats up a sweet 15 minutes because there is no automated way to do this, as I mentioned in the earlier post.
Running out of Space in SAN volume
The SAN storage volume grows by approximately 1 TB per month, so maintaining it, backing it up, and cleaning it up is a huge mess. It often runs out of space, and the application becomes unresponsive. Increasing the size or reclaiming space is a tedious task.
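To put the growth rate in perspective, here is a minimal back-of-the-envelope sketch. Only the 1 TB per month figure comes from the numbers above; the capacity and usage values, and the function name `months_until_full`, are hypothetical and chosen purely for illustration.

```python
def months_until_full(capacity_tb: float, used_tb: float,
                      growth_tb_per_month: float = 1.0) -> float:
    """Estimate how many months remain before the volume is full."""
    if growth_tb_per_month <= 0:
        raise ValueError("growth must be positive")
    free_tb = max(capacity_tb - used_tb, 0.0)
    return free_tb / growth_tb_per_month


# Hypothetical volume: 10 TB in total with 7.5 TB already used.
remaining = months_until_full(capacity_tb=10.0, used_tb=7.5)
print(f"Roughly {remaining} months of headroom left")
```

On a real server, the capacity and usage figures could come from something like Python’s `shutil.disk_usage` on the mount point, which reports total, used, and free bytes.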
The Load balancer issue
The job of a load balancer is to balance the load across the servers. It inspects all incoming requests and directs each one to a server that is capable of handling it at that point in time.
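As a rough illustration of that idea, here is a minimal sketch of a balancer that skips servers that cannot take requests. It assumes a simple round-robin policy, which may not be what the actual load balancer in this setup used; the `Server` and `RoundRobinBalancer` names and the node names are hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Server:
    name: str
    healthy: bool = True


class RoundRobinBalancer:
    """Hands out healthy servers in rotation (illustrative only)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._rotation = cycle(self.servers)

    def pick(self) -> Server:
        # Look at each server at most once per call, skipping unhealthy ones.
        for _ in range(len(self.servers)):
            server = next(self._rotation)
            if server.healthy:
                return server
        raise RuntimeError("no healthy server available")


nodes = [Server("node1"), Server("node2"), Server("node3")]
balancer = RoundRobinBalancer(nodes)
nodes[1].healthy = False  # simulate one instance being down
picked = [balancer.pick().name for _ in range(4)]
print(picked)  # node2 never appears while it is marked unhealthy
```

Note that consecutive requests from the same client can land on different servers under this policy.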
This application had a specific problem with a particular file upload feature. The front end implemented file upload using a jQuery uploader plugin. When a file is large, the plugin automatically splits it into chunks and uploads them one by one. So a 10 MB file gets uploaded to the server as roughly 7-8 chunks. On the server, after the chunks are uploaded, a background job merges all the chunks and pushes the merged file to the storage area.
So this is how it works in a non-load-balanced environment.
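The chunk-and-merge flow described above can be sketched roughly as follows. This is not the application’s actual code: the chunk size, file names, and helper functions are all assumptions made for illustration, and the sketch assumes every chunk lands in the same directory on a single server.

```python
import tempfile
from pathlib import Path
from typing import List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """What the front-end plugin does conceptually: slice the file."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def save_chunk(upload_dir: Path, index: int, chunk: bytes) -> None:
    # Zero-padded index so that sorting file names gives upload order.
    (upload_dir / f"chunk_{index:05d}.part").write_bytes(chunk)


def merge_chunks(upload_dir: Path, target: Path) -> None:
    """What the background job does conceptually: stitch the parts together."""
    with target.open("wb") as out:
        for part in sorted(upload_dir.glob("chunk_*.part")):
            out.write(part.read_bytes())


CHUNK_SIZE = 1_400_000                 # assumed chunk size in bytes
original = bytes(range(256)) * 40_000  # about 10 MB of sample data

with tempfile.TemporaryDirectory() as tmp:
    upload_dir = Path(tmp) / "upload"
    upload_dir.mkdir()
    chunks = split_into_chunks(original, CHUNK_SIZE)
    for i, chunk in enumerate(chunks):
        save_chunk(upload_dir, i, chunk)
    merged_path = Path(tmp) / "merged.bin"
    merge_chunks(upload_dir, merged_path)
    merged = merged_path.read_bytes()

print(len(chunks), merged == original)
```

The merge step only works because it can see every chunk in one place.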
Let’s put our thinking hats on for some time before we delve into the challenges of this particular functionality in a load-balanced environment in the next post.