ERPNext Foundation ERPNext Cloud Chat Blog Discuss Frappé* Donate

[Performance] 1 Sales Invoice Save makes ~700+ DB calls. ~1000+ DB calls on Submit

performance

#21

Currently the db.get_single_value method already implemented caching, but the majority of the call to get system configuration value is via frappe.db.get_value or the short cut frappe.get, in both these 2 methods call the other method(multi values) get_values_from_single than the below one, so in my opinion we can simply change the db.get_value as following to utilize the caching

 def get_value(self, doctype, filters=None, fieldname="name", ignore=None, as_dict=False,
	debug=False, order_by=None, cache=False):
	if filters is None:
		return get_single_value(doctype, fieldname)
	else:
		ret = self.get_values(doctype, filters, fieldname, ignore, as_dict, debug,
			order_by, cache=cache)

		return ((len(ret[0]) > 1 or as_dict) and ret[0] or ret[0][0]) if ret else None 

  def get_single_value(self, doctype, fieldname, cache=False):
	"""Get property of Single DocType. Cache locally by default
	:param doctype: DocType of the single object whose value is requested
	:param fieldname: `fieldname` of the property whose value is requested

	Example:

		# Get the default value of the company from the Global Defaults doctype.
		company = frappe.db.get_single_value('Global Defaults', 'default_company')
	"""

	value = self.value_cache.setdefault(doctype, {}).get(fieldname)
	if value is not None:
		return value

	val = self.sql("""select value from
		tabSingles where doctype=%s and field=%s""", (doctype, fieldname))
	val = val[0][0] if val else None

	if val=="0" or val=="1":
		# check type
		val = int(val)

	self.value_cache[doctype][fieldname] = val

	return val

What do you think?


#22

Thanks @Sachin_Mane, Himanshu, @sivankar for coming to our office to fix these issues!

We had an awesome sprint and killed the number of calls to 1/4th! I think we have an unbelievable 4X improvement.

This is the true power of open source communities. :muscle:

This big issue… meta was not cached at redis :see_no_evil: https://github.com/frappe/frappe/pull/5948/files


#23

Really amazing team and community … many thanks


#24

This is phenomenal, great cooperation and outcome. Looking forward to the sprint before the conference.


#25

Thank you @rmehta for opportunity to work with you guys. Lot of new things learned. We will write about it in detail. Look forward for many such collaborative sprints.


#26

@rmehta @Sachin_Mane Awesome work. Can someone write a detailed blog/article describing how this improvement was achieved?


#27

Yes. Will summarise changes


#28

Has this improved just the sales invoice or all doctypes?


#29

~4 times lighter and faster ErpNext and more than 4 times happier Community

This article summarise the code changes done as part of the optimisation sprint we had in Frappe office 2 days back…

To start with, following is short ‘Before’ and ‘After’ summary -

Transaction - Submitting Sales Invoice containing 51 Items.

Bench Console

In [3]: si=frappe.get_last_doc('Sales Invoice')

In [4]: len(si.items)
Out[4]: 51
In [2]: %prun -s cumulative si.submit()

Results

Before optimisations
Time Taken : 4.56s
Number of SQL calls : 3625

After optimisations
Time Taken : 1.511s [First Time] and 0.981s [Subsequent Calls]
Number of SQL calls : 1070

It was evident that in a single transaction, same document was getting read from database again and again. get_doc and get_value method was getting called multiple times for the same document and same field.

General approach was to cache repetitively accessed documents in memory to reduce number of database calls. Following is the list of key changes done.

get_cached_doc instead of get_doc

First optimisation done was to cache documents in the local in-memory request level cache. A new request level cache is added in `frappe/__init__py`
def init(site, sites_path=None, new_site=False):
	"""Initialize frappe for the current site. Reset thread locals `frappe.local`"""
       ....
       ....
       ....
       local.document_cache = {}

NOTE -
frappe.local is a request level cache and gets cleared before each request.
frappe.flags is also request level cache

A new method frappe.get_cached_doc is introduced as follows -

def get_cached_doc(*args, **kwargs):
	if args and len(args) > 1 and isinstance(args[1], text_type):
		key = get_document_cache_key(args[0], args[1])
		# local cache
		doc = local.document_cache.get(key)
		if doc:
			return doc

		# redis cache
		doc = cache().hget('document_cache', key)
		if doc:
			doc = get_doc(doc)
			local.document_cache[key] = doc
			return doc

	# database
	doc = get_doc(*args, **kwargs)

	return doc

The above method implements progressive checks in local request cache then in global redis cache and then falls back to database to fetch a document.

the get_doc method is also modified to put the documents retrieved from database into cache as follows -

def get_doc(*args, **kwargs):
	"""Return a `frappe.model.document.Document` object of the given type and name.
         ...
         ...
         ...
	# set in cache
	if args and len(args) > 1:
		key = get_document_cache_key(args[0], args[1])
		local.document_cache[key] = doc                    <-- set in request level cache
		cache().hset('document_cache', key, doc.as_dict()) <-- set in global redis cache

and then there were selective find and replace based on the %prun output analysis. e.g. One sample %prun record showing lot of time spent in get_item_details method.

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
50    0.002    0.000    **1.809**    0.036 get_item_details.py:16(get_item_details)

All such high time taking methods were identified and get_doc calls were replaced with get_cached_doc in the same.

Following is %prun output after this change

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
50    0.002    0.000    **0.529**    0.011 get_item_details.py:19(get_item_details)

get_cached_value instead of get_value

Similar to ```get_doc```, ```get_value``` was another method which was getting called many time for same doctype and field. This was another candidate to be cache enabled.

Following method is introduced which internally calls get_cahed_doc and avoids a database call altogether.

def get_cached_value(doctype, name, fieldname, as_dict=False):
	doc = get_cached_doc(doctype, name)        <-- Gets cached doc. 1 db call saved per call
	if isinstance(fieldname, text_type):
		if as_dict:
			throw('Cannot make dict for single fieldname')
		return doc.get(fieldname)

	values = [doc.get(f) for f in fieldname]
	if as_dict:
		return _dict(zip(fieldname, values))
	return values

Then another selective find and replaced was done after analysing %prun output. Following number of occurrences of get_value were replaced

➜  apps grep -R 'get_cached_value' . | grep -v pyc | wc -l
     186

Fixed Meta caching

Rushabh found out that doctype meta-data was not actually getting saved in redis cache. It was failing to serialise doctype metadata and silently eating up the exception.

A custom serialisation method was introduced to serialize meta data as follows -

	def as_dict(self, no_nulls = False):
		def serialize(doc):
			out = {}
			for key in doc.__dict__:
				value = doc.__dict__.get(key)

				if isinstance(value, (list, tuple)):
					if len(value) > 0 and hasattr(value[0], '__dict__'):
						value = [serialize(d) for d in value]
					else:
						# non standard list object, skip
						continue

				if (isinstance(value, (frappe.text_type, int, float, datetime, list, tuple))
					or (not no_nulls and value is None)):
					out[key] = value

			return out

		return serialize(self)

After the custom serialization, redis started caching it. This was a big improvement reducing number of database calls significantly as it was getting used very frequently.

Although, these changes would improve performance of most of the transactions because of common codebase, we should %prun analyse every transactions to identify most time taking methods and selectively replace get_doc() and get_value() with get_cached_doc() and get_cached_value().


#30

Great work


#31

I wonder how this will impact recommendations on how much memory to allocate to redis cache. Right now, even for many concurrent users (100+), a small amount of memory for redis cache is enough. It looks like that will change with this.


#32

awesome! way to go, guys!


#33

Have you joined the developer group on Telegram. Here is the link:

https://t.me/joinchat/Ad0sykzhT4J47qZmNzg1oQ