Preventing Deadlocks [v8]

Hi,

I'm wondering if anyone is facing the same issue as me…

I have been implementing ERPNext for a client with around 100 users.
I have already tuned the database buffers and raised the number of gunicorn workers, but the result is: with more workers, read performance is good but I get many deadlocks when creating documents, while with fewer workers the whole system becomes slow.

My server spec is already:
32 GB RAM and 32 CPUs

In "top" over SSH, RAM usage stays below 12 GB.

Any ideas?

Thanks,
Bobby

I would be interested in how the performance improves, because we are currently negotiating an implementation that may involve 400 concurrent users.

Set up and monitor the slow query log for MariaDB; it will give you a clue about what is causing the issues.
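If you haven't done this before, here is a minimal sketch of enabling it from a MariaDB session (the threshold and log path below are just example values, adjust them for your setup, and mirror the settings in my.cnf if you want them to survive a restart):

```sql
-- Enable the slow query log at runtime (example values, tune to taste)
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 1;  -- log statements slower than 1 second
SET GLOBAL slow_query_log_file = '/var/log/mysql/mariadb-slow.log';
SET GLOBAL log_queries_not_using_indexes = 1;  -- optional: also catch unindexed queries

-- Confirm the settings took effect
SHOW VARIABLES LIKE 'slow_query%';
SHOW VARIABLES LIKE 'long_query_time';
```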

Plug: You can also contact the Frappe team for help under enterprise support. We have a lot of expertise and experience in performance handling! Drop us a mail at hello@erpnext.com

If the problem is deadlocks, log them. https://www.percona.com/blog/2012/09/19/logging-deadlocks-errors/ is one way to do this. There are also other ways to log them.
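Besides the Percona tooling in that link, MariaDB can also write every deadlock to its error log, and you can always dump the most recent one by hand; a quick sketch:

```sql
-- Write every deadlock (not just the latest one) to the MariaDB error log
SET GLOBAL innodb_print_all_deadlocks = ON;

-- Or inspect the most recent deadlock manually: look for the
-- "LATEST DETECTED DEADLOCK" section in this output
SHOW ENGINE INNODB STATUS;
```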

Once problematic queries are identified, they can be targeted. Sometimes the solution is as simple as adding an appropriate index to a table. Sometimes there needs to be some query refactoring.
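To give a concrete (purely hypothetical) example of the simple case: if the slow query log shows a query repeatedly filtering on a column that has no index, the fix can be as small as this:

```sql
-- Hypothetical example: table and column names are placeholders,
-- use whatever your slow query log actually points at
ALTER TABLE `tabSome DocType` ADD INDEX `idx_some_column` (`some_column`);

-- Verify the optimizer now uses the new index
EXPLAIN SELECT * FROM `tabSome DocType` WHERE `some_column` = 'some-value';
```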

@rmehta sure, will drop a support ticket for it

@felix will try the Percona link that you gave

Thanks, will update tomorrow

@felix here is the output from the Percona logger:

localhost 2018-05-17T09:47:55 206794 0 7 1bd3e0294da19198 localhost 1bd3e0294da19198 tabSeries GEN_CLUST_INDEX RECORD X w 0 select current from tabSeries where name='GL' for update
localhost 2018-05-17T09:47:55 206796 0 6 1bd3e0294da19198 localhost w 1 insert into tabPacked Item (serial_no, qty, owner, target_warehouse, actual_qty, page_break, modified_by, item_name, warehouse, docstatus, uom, creation, description, parent, item_code, projected_qty, name, idx, parent_detail_docname, modified, parent_item, parenttype, batch_no, prevdoc_doctype, parentfield) values (NULL, 1, NULL, NULL, 212, 0, NULL, 'GONI PACKING-230X230', 'Gudang Springbed - OCF', 0, 'Pcs', '2018-05-17 16:47:55.134220', '50080002\n

GONI PACKING: 230X230

', 'DN-31480', '50080002-230X230', 212, '4ac2d6271e', 1, 'fe7d96a948', '2018-05-17 16:47:55.134220', 'GPB', 'Delivery Note', '', NULL, 'packed_items')

But I don't think this is something we can change within ERPNext, right?

Naming Series is one of the biggest framework constraints in Frappe/ERPNext - it just doesn't scale. We avoid it completely when building custom things.

In your case, is your DB on SSD drives, or are you using spinning drives? For 100 concurrent users, a move to SSDs will most likely solve your issue.

@felix it's an SSD disk.
May I know what you did about naming series?

When you do an insert into a doctype which uses naming series, it also does an update on tabSeries which stores the increment value.

So, in your log, you’re inserting into Packed Item. When you do that insert, an entry is also made into GL Entry. To make the GL Entry insert, an update is also done on tabSeries to update the GL naming series. You can quickly see how this can spiral out of control as tabSeries is used everywhere and can easily deadlock, even with fast drives and an index on the updated field.
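For reference, the sequence implied by your log looks roughly like this (a sketch of the pattern, not the literal framework code; the 'GL' series name is taken from your log):

```sql
START TRANSACTION;

-- Generating the next name in a series locks that series' row in tabSeries...
SELECT `current` FROM `tabSeries` WHERE `name` = 'GL' FOR UPDATE;
UPDATE `tabSeries` SET `current` = `current` + 1 WHERE `name` = 'GL';

-- ...and the lock is held while the rest of the document
-- (Packed Item rows, GL Entries, etc.) is inserted in the same transaction

COMMIT;  -- only now is the tabSeries row lock released
```

Because the tabSeries row stays locked until the whole transaction commits, concurrent submissions that touch the same series plus other tables in different orders are a classic deadlock setup.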

There's no real way to fix this in stock Frappe/ERPNext beyond just moving to drives which offer faster IOPS to reduce the chance of deadlocks. The ultimate fix would be to get rid of naming series altogether, but that's a major core project and would also upset some customers because they would lose human-readable primary and foreign keys.

If you're using an SSD, test IOPS for the drive (not all SSDs perform well, and if you do something like RAID 5, that'll make things worse). On AWS, for example, you may need to move to provisioned IOPS when using their block storage.

If you test and have very good IOPS, the next step could be to split your MariaDB data files and logs onto different drives, which can help speed things up even more.

But basically, you’re getting into the realm of more difficult solutions now. I’ve seen implementations with many times more concurrent users than you mention without serious deadlock issues, so the drive the database is on would be the first place I would look.

Got it… will check with the ERPNext team for a solution first…

Thanks @felix,
so this is the document naming series increment problem.

Yes, that's a good idea. It could be possible to optimize or batch the query to handle things better. But since that will delve into the realm of stock valuation, it needs to be done carefully. That seems to be Nabin's area of expertise, so contacting/contracting a solution from the Frappe team is another option.

Yeah… I don't really have a solution for this… with so many documents being created, running many workers is difficult… My other idea is to use separate workers for submitting documents and for reading… I think that's a solution, but I don't know how to build it.

@felix do you have another idea?

Adding read-only DB slaves is not going to help solve this issue. Your options are:

  1. Rewrite/optimize the queries to prevent/reduce deadlocks
  2. Increase DB performance to minimize the impact and chance of deadlocks
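On the second point, beyond faster disks there are a couple of InnoDB knobs people commonly look at; the values below are only illustrative, and the second one trades a little durability for shorter commits, so weigh it before touching production:

```sql
-- Check how much memory the buffer pool has; on a dedicated DB server
-- it is usually given a large share of RAM
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- With a value of 2 the log is flushed to disk roughly once per second
-- instead of on every commit, so transactions (and the locks they hold)
-- finish sooner; the trade-off is losing up to ~1 second of committed
-- transactions if the server crashes
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
```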

Are you sure, @felix?
I'm thinking of it because fewer workers is safe for me… no deadlocks…

@codingCoffee can you please look into it?

Well, that works because fewer workers means fewer concurrent processes happening, which reduces the chances of deadlocking. If you and your client are happy with the speed and outcome, then that is a viable solution.

@bobzz_zone have you done any customization on the back-end? Anything you could share?

@codingCoffee yes, I can PM you the SSH credentials…

There is some customization… most of it is in separate apps, but some ERPNext files were also modified to match the client's requests… it should have no impact on performance, though…

Will PM you the logins

Thanks

Updating ERPNext to the latest version solved the problem.

Thanks

I am still facing deadlocks in Version 13. Can someone please suggest a way to resolve this?