[Performance] 1 Sales Invoice Save makes ~700+ DB calls. ~1000+ DB calls on Submit

Makes sense to me.

A design question: currently when get_cached_doc and get_cached_value are called with a bad key they return a DoesNotExist Error rather than None. Is there a reason it works this way? I would think you’d want the two methods to have symmetrical APIs.

Good point. get_cached_value should return None, while get_cached_doc only should return DoesNotExist Error.

In [2]: x = frappe.get_value("Item", "nope", "item_code")
In [3]: print(x)
None
In [4]: x = frappe.get_cached_value("Item", "nope", "item_code")
---------------------------------------------------------------------------
DoesNotExistError

get_cached_value and get_value do not match.

In [5]: frappe.get_doc("Item", "nope")
---------------------------------------------------------------------------
DoesNotExistError

In [6]: frappe.get_cached_doc("Item", "nope")
---------------------------------------------------------------------------
DoesNotExistError  

get_doc performs consistently, though I still think returning None is probably better than DoesNotExistError

1 Like

Well to me the convention is not clear? None is the general nil or no value exception case, whereas DoesNotExistError is thrown when a doctype or an aspect of it fails to load_from_db

Maybe this is too specific a use case, but here’s how I’m regularly checking for an existing document before creating a new one. To me, the first option is more pythonic and more efficient.

x = frappe.get_doc("Item", nope")
if x:
    return x
else:
    frappe.new_doc("Item")
    # set item attributes

As opposed to:

try: 
    x = frappe.get_doc("Item", nope")
except:   # or except DoesNotExistError:
   x = None
finally:
     if x:
         frappe.new_doc("Item")
         # set item attributes
return x
2 Likes

yes to me the 1st is not ambiguous and preferred

The 2nd use case I question since new_doc executes regardless ie even when Item already exists?

Remember that finally is a ‘clean up’ action…

see 8. Errors and Exceptions — Python 2.7.18 documentation

“'Since the finally clause is always executed before leaving the try statement, whether an exception has occurred or not”

Yeah, my mistake. I have edited my post so that it’s correct. You still have to check for x. If you just put in the except portion it might catch the does not exist error but you still want it to catch other validation errors, so I think finally is better spot for it, to ensure it is outside the try/except pattern.

get_cached_value is returning DoesNotExistError because internally it calls get_cached_doc()

I think this should be handled in get_cahced_value to make the api same as get_value

@Sachin_Mane
Currently if sales invoice has 51 items, when saving the invoice, there will be 51 DB calls(insert) on the tabSales Invoice Item child table, because in the parent doc’s insert method, one db_insert is called for each child table record. the extracted code as below

	# children
	for d in self.get_all_children():
		d.db_insert()

the db_insert method only handles one record, but actually the insert into clause supports inserting multi records in one call like below

        insert into table (columns) values(record1), (record2), ...

based on the above analysis, I propose to copy and adapt the db_insert method to support handling multi records like below: model/base_document.py

def db_insert_multi(self, docs):
	"""INSERT multi documents (with valid columns) in the database."""
	def insert_multi():
		try:
			columns = doc_dict_for_insert[0].keys()
			sql = """insert into `tab{doctype}`
					({columns}) values """.format(
					doctype=doctype,
					columns=", ".join(["`" + c + "`" for c in columns]))
			placeholders = ''
			values =[]
			for d in doc_dict_for_insert:
				rec_placeholder = '(%s)' %(", ".join(["%s"] * len(columns)))
				if placeholders :
					placeholders = '%s,%s' %(placeholders , rec_placeholder)
				else:
					placeholders = rec_placeholder
				values.extend(list(d.values()))
			sql = '%s %s' %(sql, placeholders )
			frappe.db.sql(sql, values)
		except Exception as e:
			raise
		for doc in doc_obj_for_insert:
			doc.set("__islocal", False)
	if not docs:
		return
	doctype = docs[0].doctype
	doc_dict_for_insert = []
	doc_obj_for_insert = []
	count = 0
	for doc in docs:
		count += 1
		if not doc.name:
		# name will be set by document class in most cases
			set_new_name(doc)
		if not doc.creation:
			doc.creation = doc.modified = now()
			doc.created_by = doc.modifield_by = frappe.session.user

		d = self.get_valid_dict(doc=doc)
		if doc.doctype != doctype or count > 500:
			insert_multi()
			doc_dict_for_insert = []
			doc_obj_for_insert = []
			count = 0
		doctype= doc.doctype
		doc_dict_for_insert.append(d)
		doc_obj_for_insert.append(doc)
	insert_multi()

the related method needed to be adapted also

def get_valid_dict(self, sanitize=True, convert_dates_to_str=False, doc = None):
	doc = doc or self
	d = frappe._dict()
	for fieldname in doc.meta.get_valid_columns():
		d[fieldname] = doc.get(fieldname)

		# if no need for sanitization and value is None, continue
		if not sanitize and d[fieldname] is None:
			continue

		df = doc.meta.get_field(fieldname)
		if df:
			if df.fieldtype=="Check":
				if d[fieldname]==None:
					d[fieldname] = 0

				elif (not isinstance(d[fieldname], int) or d[fieldname] > 1):
					d[fieldname] = 1 if cint(d[fieldname]) else 0

			elif df.fieldtype=="Int" and not isinstance(d[fieldname], int):
				d[fieldname] = cint(d[fieldname])

			elif df.fieldtype in ("Currency", "Float", "Percent") and not isinstance(d[fieldname], float):
				d[fieldname] = flt(d[fieldname])

			elif df.fieldtype in ("Datetime", "Date", "Time") and d[fieldname]=="":
				d[fieldname] = None

			elif df.get("unique") and cstr(d[fieldname]).strip()=="":
				# unique empty field should be set to None
				d[fieldname] = None

			if isinstance(d[fieldname], list) and df.fieldtype != 'Table':
				frappe.throw(_('Value for {0} cannot be a list').format(_(df.label)))

			if convert_dates_to_str and isinstance(d[fieldname], (datetime.datetime, datetime.time, datetime.timedelta)):
				d[fieldname] = str(d[fieldname])

	return d

finally the call to insert child table records /model/document.py needs to be changed as following

def insert(self, ignore_permissions=None, ignore_links=None, ignore_if_duplicate=False, ignore_mandatory=None):

	# children
	#for d in self.get_all_children():
	#	d.db_insert()

	self.db_insert_multi(self.get_all_children())

this change will reduce the DB calls significantly for docs with a lot of child items, hence improve the performance.

Any Idea?

2 Likes

Lets start a new thread for this. This thread has gone too far. Closing.