What function do "gunicorn workers" serve?

bkm · September 20, 2017, 12:56pm

I have read many posts on the topic of “gunicorn workers” and most of them relate to how to change their numbers in ERPNext.

But what are gunicorn workers? What do they represent? Why would one consider changing their numbers? Are they related to the total number of ERPNext users that can be logged in at one time?

Please help me to understand what they are and why I need to know how they affect ERPNext user experience.

BKM

Yakulu · September 20, 2017, 1:00pm

gunicorn is a Python WSGI HTTP server used by Frappé / ERPNext. You will find more information in the official documentation, for example the section about workers.

felix · September 20, 2017, 1:07pm

And to add, more workers will serve more concurrent users. The nature of the workload will determine the number of workers needed. For most use cases that I see on the forum, 2 workers will be sufficient.

If you need more than that, you will most likely be at the level of needing someone experienced in scaling web apps to know how to scale things further, because there’s not going to be a magic bullet.

The maximum recommended number of gunicorn workers for a system is 2 * (number of cores) + 1. So a 1-core VM should have at most 3 workers, and 2 cores would support 5 workers. However, on a single-server install, keep in mind that Redis, RQ, MariaDB, and nginx also need CPU time, so factor that in when deciding how many workers to use.
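As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name `recommended_workers` and the `reserved_cores` parameter are hypothetical, made up for this example to account for the other services on a single-server install; they are not part of gunicorn or bench. Note that the gunicorn documentation presents this formula as a number to start with rather than a hard limit.

```python
import os


def recommended_workers(cores=None, reserved_cores=0):
    """Apply the 2 * cores + 1 rule of thumb.

    reserved_cores is a hypothetical knob for this sketch: cores you want
    to leave for Redis, RQ, MariaDB, and nginx on a single-server install.
    """
    if cores is None:
        # Fall back to what the OS reports; cpu_count() can return None.
        cores = os.cpu_count() or 1
    # Always count at least one usable core.
    usable = max(cores - reserved_cores, 1)
    return 2 * usable + 1


print(recommended_workers(cores=1))                    # 3
print(recommended_workers(cores=2))                    # 5
print(recommended_workers(cores=4, reserved_cores=1))  # 7
```

Treat the result as an upper bound to test against, not a target; for many installs 2 workers will still be enough.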

bkm · September 20, 2017, 1:12pm

@Yakulu
@felix

Thank you both. After all of my reading, I assumed that gunicorn was some sort of proprietary ERPNext/Frappe function, so all of my research was confined to the forum. It never occurred to me to look for it as a general-purpose process on Linux systems.

It is a much lower level function than I had assumed. Thank you for the pointer to the documentation. I understand why it is important now.

It also means that I would need to make sure it is set to some number from 3 to 5 for my system. I use a VM platform that dynamically ramps up my CPU cores to 2 as needed and doubles my memory as needed. Plus, I have several custom apps running concurrently with ERPNext.

Thanks again for the help in understanding this.

BKM

felix · September 20, 2017, 1:41pm

How do you do this? The reason I’m asking is that if I used something like this, it would potentially save me a lot of money!

bkm · September 20, 2017, 9:35pm

This is on Amazon AWS. It is a service called EC2, which stands for “Elastic Compute Cloud”. It allows you to set up a compute instance with a lower-than-peak configuration, and it automatically adds the required resources to keep your system running optimally during high-usage intervals. It follows a sort of averaging protocol.

As nice as it is, I will probably go back to Google Cloud Services because I have greater control over the virtual servers there and more choices on how I can use the images I create. (Or maybe I just haven’t given AWS enough exploring time).

Hope that helps

BKM

felix · September 21, 2017, 6:26am

Oh - thanks. I thought you meant some sort of platform that automatically resizes your EC2 instances, like moving from m3.medium to m3.large based on workload. That would be cool. It looks like you’re manually sizing right now.

Unfortunately, it looks like the only way to deal with spiky ERPNext workloads is to move Redis, MariaDB, and storage off the single ERPNext server and make a stateless application server, so that you can spin up or shut down as many copies of it as needed at any time. That isn’t simple though.

bkm · September 21, 2017, 12:47pm

Well, on the Amazon AWS system I have an EC2 medium instance running all the time (24/7) and when I set up a training course I have as many as 40 people logged in and doing the same thing at the same time. The EC2 medium instance has always increased the resources enough to keep the class running just as well as if it were only one user.

When I set up a client system, I use GCP (Google Cloud Platform) and I size their CPU resources to meet their peak usage requirements, plus 50% extra memory. Only their monthly service cost is affected. In some cases I show them how performance would suffer by setting up the exact same system with resources scaled for just below average use. I then have them perform some regular daily functions, like Stock Ledger reports or other database-intensive reporting, while another user attempts to create a material transfer. The performance difference is painfully obvious, and the client then agrees to the slightly higher monthly cost rather than risk the pain of being underpowered.

Likewise, I almost never hear about problems from the clients that are configured with the peak-plus resources.

BKM