Event-based analytics

This is a question directed mostly at the Frappe team…

One of the important functionalities of an ERP is analytics. Right now, we have basic analytics (stock/financial/sales analytics etc) but most of it is document-driven rather than change-driven. What I mean by this is that we have reports that give us $ booked in a certain time period, but we don’t have anything that gives us, for example, the date that X lead had their status change from “open” to “interested”. I’m not even sure that the data for that is stored anywhere - the row is just updated in the DB.

If one were to go about building something like this - and fitting with the customizable/extensible nature of the framework - would you want it built in Frappe, or as a separate application with doc hooks? The logic for storing information is rather trivial, but the design of a feature like this involves a lot of compromise. Do you have the user select which fields should be tracked? Or do you assume more control and track certain fields? Storing a copy of each iteration of a document is definitely too much data, and storing changes is better, but deciding who decides (for lack of a better phrase) which fields need to be tracked is a rather big decision.

The two ways I am toying with is adding a checkbox in docs similar to “hidden” or “report hide” that is “analytics field”, and run an event on validate to track changes made to that field in a different table. The other is a separate app that hooks into a set of documents (imagine one big table with doctype, docfield) and on any event where that field changes, it writes to a db.

From there, you can generate extremely powerful analytics that really empower business both on an individual level as well as an executive/management level. Check out places like www.insightsquared.com for some examples.

@alec_ruizramon1 can you describe a use-case? Are you talking about more reports or something deeper?

More reports - but they need something deeper to happen (unless the framework does this and I have no idea about it).

We have a custom field that’s used to estimate the close date of an opportunity/quotation. If the date moves back, i.e. the salesperson changes the close date to a later date, the opportunity has been “pushed”. By keeping track of when the field changes, you can generate a report of % of opportunities pushed by salesperson, by deal value, and so on.

This can be extensible to many other fields as well. It unlocks a deeper level of analytics than just the current state of documents.

@rmehta The following links has some details of the budgetary control with some forecasting (not excactl BI per se) in technical terms:

http://wiki.idempiere.org/en/Plugin:_Budgetary_Control

Some opensource BI are Pentaho, SpagoBI, KNIME among others, and BPM are Camunda, Bonita BPM among others. But most of them are written in Java.

Maybe this shows some light, I hope. :wink:

@rmehta @zenny this is what SalesForce calls it : https://help.salesforce.com/HTViewHelpDoc?id=tracking_field_history.htm

@rmehta,

This would be a great feature. Basically the idea would be to add a new attribute to fields that indicates whether or not changes to them should be tracked. Then there would be a doctype (or two) for doctype changes. I think two is better. The first (DocChanges) could be to track documents that have changed and the 2nd (FieldChanges) would be to track fields in the document in DocChanges that changed.

DocChanges
Change No (PK)
Doc Type
Doc Name
Date of Change
Time of Change
Type of Change (creation, update or deletion?)
Changed By

FieldChanges
Change No (PK)
Field Name
Old Value
New Value

Doc Types in which the name is the field that changed would update Doc Name in DocChanges as well as the Field Name, name, in FieldChanges.

Updates to the tables would be fired from a hook on update, delete, create etc.

Regards,
cksgb

The changes can be tracked without a hook if we want to edit the save method of the document class. This is all-encompassing BUT will result in large amounts of data.

I have a working prototype of a field changes table…it holds doctype, docname, field name, old value, new value, and date of change. It’s just done in the save method of the document class. If we want to do hooks + an app, things can get more complex but also more customizable - lots of pros and cons to discuss here. I think that it should end up in the framework, though, because if you later decide you need analytics, you’ll need the data to support it - and without the data, a newly-installed analytics app is useless.

@Chude_Osiegbu your DocChanges table can be made unnecessary by tracking the ‘modified’ and ‘modified_by’ fields of any document, along with the field changes above - unless I have missed something! :smiley:

There is already something called version

https://frappe.github.io/frappe/current/models/core/version.html

This saves every change in a JSON

1 Like

That would work great. When look at the Version List on our production server, it only lists versions of Web Page / Blog Page. Is this intended? Or does more need to be done to track versions of an Opportunity? (Or is it something on my end?)

you need to set save_versions = True in the controller.

But maybe this can be a standard property in DocType that users can set via Customize Form.

2 Likes

Where can I find that in the controllers?

Once I find that I can take a look at setting it as a doctype property.

Thanks :smiley:

1 Like

Storing entire doctype changes as a version in a json blob is not going to scale well for most general purpose analytics.

To actually query things, you will have to do a select * where you can at best filter by ref_doctype, then process in python to find the records you’re looking for. If you have a million rows where only 10% of the rows have information you need, you will still need to pull all million rows. (And a million will seem small once you implement versioning).

It will work, however, if you’re planning to load the blob, process it through ETL, and then load it into a data warehouse of some sort, and then you run your analytics off that data warehouse.Also, if you know you will only have a small number of rows, it will work well.

Just be aware of your use case and assumptions.

1 Like

Right.

The logic to only store the changes is trivial - the problem more is whether or not you want to allow the user to select which documents have change history, and more complicated-ly (not a word…) if you want to allow the user to select which fields have changes.

But, if we decide that all fields should have history, except for possibly setup or whichever else, then creating an analytics module with 1:1 doctypes : field history table becomes a lot easier.

The main compromise I see is that, if we give the users no control and store all (or almost all) field histories, they will use more storage, but implementation will be easier and down the road they’ll have a history they can use if they want a certain set of analytics. If we give the users a lot of control, implementation becomes harder but they may see a marginal savings in disk space.

For something like this, I think a happy medium could be module-level or doctype-level control for the user, but not field-level.

1 Like

@felix @rmehta

What do you think about something like this…
https://github.com/aruizramon/frappe-1/commit/c764e874588992d21b193d3916cd8eb9cdd8aa36#diff-a3f70253d5e0e74b60e2c294dbf6c589

plus an app…

https://github.com/aruizramon/frappe_analytics

This stores the fields that have changed when saving a document all into one big table. If the user wants to get analytics from those, they can install the analytics app, which sorts them all into their own table based on the doctype that has been edited. The after_install method will sort them on installation, and thereafter, the app hooks into changed_field entries, sorts them, and deletes them. (are hooks done as a background task? If not, it would be better to run the event as a background task…)

1 Like

@alec_ruizramon1 I think we can add it in Frappe :slight_smile:

Do you want to send a PR?

1 Like

@rmehta I’d love to!

Let me first add support for the child tables, though - right now, they’re being ignored.

This will be a really powerful feature for Frappe!

EDIT: PR here: https://github.com/frappe/frappe/pull/1722

1 Like

Some fields are getting automatically tracked in erpnext. Fields like assigned to in Lead doctype. How is this happening? Is there a database table to log such edit history.

@rmehta Can you please let me know if changes from @alec_ruizramon1 are added into frappe?

Couldn’t find anything in document.py of model directory.

The changes are in an app - they weren’t integrated into Frappe directly.