ERPNext Running Very Slow

keshav · January 3, 2018, 4:25am

Dear All,

I have an instance running ERPNext which has the following configuration
2 Cores, 14GB Ram, 100 GB SSD

There are around 25 users but it has become very slow since the past few days
It takes about 60 seconds to save a Sales Invoice with 10 Items

I have tried restarting the bench and supervisor but it did not help.

We changed the innodb_buffer_pool_size to 9.7GB here is what it looks like now
innodb_buffer_pool_size = 10000M
innodb_log_buffer_size = 8M
innodb_file_per_table = 1
innodb_open_files = 400
innodb_io_capacity = 400
innodb_flush_method = O_DIRECT

Also increased the maxmemory in redis_cache.conf to 1350mb

Here is what the redis_cache.conf

bind 127.0.0.1
port 13000
maxmemory 1350mb
maxmemory-policy allkeys-lru
appendonly no

On checking the web.error.log this is what I found.

[2018-01-02 13:06:42 +0000] [2123] [ERROR] Error handling request /api/method/frappe.async.can_subscribe_doc?sid=e619eed72445b3ac517bddfb55e7b0543c8c42d9c89$
Traceback (most recent call last):
File “/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/gunicorn/workers/sync.py”, line 135, in handle
self.handle_request(listener, req, client, addr)
File “/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/gunicorn/workers/sync.py”, line 176, in handle_request
respiter = self.wsgi(environ, resp.start_response)
File “/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/werkzeug/local.py”, line 228, in application
return ClosingIterator(app(environ, start_response), self.cleanup)
File “/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/werkzeug/wrappers.py”, line 301, in application
return f(*args[:-2] + (request,))(*args[-2:])
File “/home/frappe/frappe-bench/apps/frappe/frappe/app.py”, line 83, in application
response = handle_exception(e)
File “/home/frappe/frappe-bench/apps/frappe/frappe/app.py”, line 131, in handle_exception
if frappe.local.is_ajax or ‘application/json’ in frappe.get_request_header(‘Accept’):
TypeError: argument of type ‘NoneType’ is not iterable

Here are the memory stats of the system:
total used free shared buffers cached
Mem: 13G 1.7G 12G 440K 22M 153M
-/+ buffers/cache: 1.5G 12G
Swap: 4.0G 0B 4.0G

Here are the file system stats:
Filesystem Size Used Avail Use% Mounted on
udev 6.9G 12K 6.9G 1% /dev
tmpfs 1.4G 416K 1.4G 1% /run
/dev/sda1 30G 13G 16G 46% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
none 5.0M 0 5.0M 0% /run/lock
none 6.9G 0 6.9G 0% /run/shm
none 100M 0 100M 0% /run/user
/dev/sdb1 99G 60M 94G 1% /mnt

The slow query log is logging this query repeatedly

# Time: 180103  4:36:51
# User@Host: 1bd3e0294da19198[1bd3e0294da19198] @ localhost []
# Thread_id: 343  Schema: 1bd3e0294da19198  QC_hit: No
# Query_time: 17.492096  Lock_time: 0.000037  Rows_sent: 2  Rows_examined: 432185
# Rows_affected: 0
# Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: Yes
SET timestamp=1514954211;
select name, owner, creation, data from `tabVersion` where `tabVersion`.docname = "Stock Balance" and `tabVersion`.ref_doctype = "Report"
                          order by creation desc limit 0, 10;
# Time: 180103  4:37:41
# User@Host: 1bd3e0294da19198[1bd3e0294da19198] @ localhost []
# Thread_id: 375  Schema: 1bd3e0294da19198  QC_hit: No
# Query_time: 17.127635  Lock_time: 0.000066  Rows_sent: 1  Rows_examined: 432184
# Rows_affected: 0
# Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: Yes
SET timestamp=1514954261;
select name, owner, creation, data from `tabVersion` where `tabVersion`.docname = "Wide Vision" and `tabVersion`.ref_doctype = "Customer"
                          order by creation desc limit 0, 10;

tlie · January 3, 2018, 10:13am

Looks like you need more CPU because the 14GB RAM looks enough. If you are using cloud provider, try upgrade to 4 CPUs for one hour and see if it can fix your problem. If not, you can always return to the original cloud plan.

keshav · January 3, 2018, 11:20am

@tile I tried updating my server to 4 cores but that still did not solve the issue.

I just noticed that even though I have set the innodb_buffer_pool_size = 10000M it has not taken effect after mysql restart. Here is what it shows after running the query:

lasalesi · January 3, 2018, 12:02pm

Agree with @tlie that it looks like CPU shortage. Can you check you load average (e.g. with top or uptime)? Did the issue gradually occur (increasing load) or at once (configuration)?

keshav · January 3, 2018, 12:21pm

Thanks @lasalesi & @tlie for your suggestion but the issue remains even after switching to a 4 cores. The slowdown has occurred gradually but has become more noticeable since a few days.

To me it looks more like a mysql issue. As pointed out earlier we changed the innodb_buffer_pool_size in /etc/mysql/my.cnf to 9.7 G which is about 70% of the RAM and restarted mysql but the changes are not getting picked up.

The innodb_buffer_pool_size query still shows the buffer as 878706688 (value in bytes)

rmehta · January 3, 2018, 3:23pm

Setup slow query log and share your results.

keshav · January 3, 2018, 6:20pm

Here is the log after the creation, followed by cancellation and deletion of a Sales Invoice .

# Time: 180103 18:09:03
# User@Host: 1bd3e0294da19198[1bd3e0294da19198] @ localhost []
# Thread_id: 220  Schema: 1bd3e0294da19198  QC_hit: No
# Query_time: 10.126399  Lock_time: 0.000042  Rows_sent: 2  Rows_examined: 434303
# Rows_affected: 0
# Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: Yes
use 1bd3e0294da19198;
SET timestamp=1515002943;
select name, owner, creation, data from `tabVersion` where `tabVersion`.docname = "INV/MAA/09222" and `tabVersion`.ref_doctype = "Sales Invoice"
                          order by creation desc limit 0, 10;
# Time: 180103 18:11:16
# User@Host: 1bd3e0294da19198[1bd3e0294da19198] @ localhost []
# Thread_id: 260  Schema: 1bd3e0294da19198  QC_hit: No
# Query_time: 10.149511  Lock_time: 0.000030  Rows_sent: 3  Rows_examined: 434306
# Rows_affected: 0
# Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: Yes
SET timestamp=1515003076;
select name, owner, creation, data from `tabVersion` where `tabVersion`.docname = "INV/MAA/09222" and `tabVersion`.ref_doctype = "Sales Invoice"
                          order by creation desc limit 0, 10;
# Time: 180103 18:22:59
# User@Host: 1bd3e0294da19198[1bd3e0294da19198] @ localhost []
# Thread_id: 320  Schema: 1bd3e0294da19198  QC_hit: No
# Query_time: 13.235107  Lock_time: 0.000026  Rows_sent: 0  Rows_examined: 434303
# Rows_affected: 3
SET timestamp=1515003779;
delete from tabVersion where ref_doctype='Sales Invoice' and docname='INV/MAA/09222';

rmehta · January 4, 2018, 2:43am

As a quick fix, disable versioning for sales invoice, since there is a save/submit trail.

Also create a bug report on GitHub. I am surprised docname is not indexed!

Cc @nabinhait

keshav · January 4, 2018, 2:54am

Thanks @rmehta. Will open the github issue and disable versioning for Sales Invoice.

I found the issue with mysql not picking up the settings for innodb_buffer_pool_size , instead of picking it from /etc/mysql/my.cnf it was picking up the value from /etc/mysql/conf.d/frappe.cnf, which I went ahead and changed to 10G.

Now the system is performing much faster. Though the issues indicated by you still remain.

sjain · June 15, 2021, 3:41pm

Version Table is still having the issue and keeps on slowing the system. I have 8.8 Million lines in my table and the system is taking a lot of time to save any document.

Any suggestions would be appreciated.

keshav · June 16, 2021, 5:37am

Please take a look at this post. You can delete the rows in the Version table.

sjain · June 16, 2021, 9:19am

After deletion of data from the version table also there is no effect on slowness. It is still dead slow even saving expense claims is taking more than 30 seconds.

SubhajitDey · June 16, 2021, 9:48am

which version are you using ? How many records in version table ?
Can you make sure there is a index on ref_doctype+docname for version table ?
Regards,
Subhajit

sjain · June 16, 2021, 10:25am

I am using version-13 frappe and erpnext both. I have deleted all the data from Version table already but still loading and data writing speed is insanely slow.

SubhajitDey · June 16, 2021, 10:43am

Sounds like different issue, not because version table.
I haven’t tried v13, but you might wanna check the gunicorn/background worker setting.
Also can use recorder app to find out the performance bottleneck

Jermaine_Gray · June 16, 2021, 10:55am

Am having the smae problem with version 13. I ask for a recommend size document and someone told me to get more CPU and RAM. Am runn the same amount of user in a t2.medium instant in AWS. Am current building a new server with 20 vCPU and 30Gb RAM. am concern this will not fix my problem.

moe01325 · June 16, 2021, 2:17pm

is it running slow all the time or just sometimes? how many users does your system have?
i am currently investigating a similar (?) issue: How to pinpoint CPU spike? - Developers - Discuss Frappe/ERPNext

sjain · June 20, 2021, 7:26am

@rmehta I identified one major issue / bug in connections (Dashboard), if the number of records in Journal Entry are huge then system search becomes too slow as everytime you open a doctype with Journal entry in connections it starts searching for the same record in system and it slow downs the system.

I have experienced the slowness major into Expense Claim, Employee Advance, Purchase Invoice, Sales Invoice (All Doctypes having Journal Entry in Dashboard/Connections)

After commenting Jounal Entry from Dashboard the submission of document showed a tremendous change i.e. from 600+ seconds to 26 seconds.

Please help in understanding the slowness in search?

Below are some references -

32 Core CPU
64 GB RAM
Intel e7 Processor (Physical Server)
Ubuntu 20.04
ERPNext: v13.5.0 (version-13)
Frappe Framework: v13.5.0 (version-13)
Journal Entry Count - 2.3M+
Purchase Invoice - 100k+
Company - 96 (Multi Company)

Not_a_countant · June 20, 2021, 9:58am

The general rule here is not to tag anyone. If you feel someone MUST help you because your problem is very important, then you need to pay for it. If you want ERPNext to fix your issues, here is a link you can sign up. Best wishes.

sjain · June 20, 2021, 10:30am

My intend was not to tag anyone or pinpoint anything. However, I was trying to seek help and have found a major issue and that’s why I tagged. And I wasn’t aware about tagging policy as well. So please take it otherwise.