Using PostgreSQL as a database for ERPNext

Dear All,
This dreaded question had to come up, and here it is:
As ERPNext is doing beautifully, and is predictably going to be deployed in more, and probably also more complex environments, MySQL/MariaDB may not be anymore the right choice of database for all settings.
How do you think of leveraging (relative) database independence by putting a pinch of SQAlchemy into the mix? After all, ERPNext is already doing great in all other areas of scalability (memcached, session management, deployment by gunicorn, Nginx reverse proxy caching, etc.).
Any thoughts?
Thanks a lot for your thoughts -
Chris

On Tue, Sep 2, 2014 at 11:09 PM, Christoph H. Larsen
ch...@gmail.com wrote:

Dear All,
This dreaded question had to come up, and here it is:
As ERPNext is doing beautifully, and is predictably going to be deployed in
more, and probably also more complex environments, MySQL/MariaDB may not be
anymore the right choice of database for all settings.
How do you think of leveraging (relative) database independence by putting a
pinch of SQAlchemy into the mix? After all, ERPNext is already doing great
in all other areas of scalability (memcached, session management, deployment
by gunicorn, Nginx reverse proxy caching, etc.).
Any thoughts?
Thanks a lot for your thoughts -
Chris

Hi Chris!

Thanks a lot for your kind words.

We’ve discussed this a lot in the team. We agree that Postgres could
be faster and a more easy to administer database in production but the
time and energy required might not really be worth it.

We a have our own ORM (database.py in frappe) and use it in most
places. Also, database independence is elusive as even though a
project might use sqlalchemy or Django ORM, the datastore is mostly
decided because of assumptions/feature_requirements of the database
system itself.

Once there is a concrete need for Postgres, we’ll spend a week and
migrate but we won’t have the energy to support a database independent
product.

Thanks,

Pratik
erpnext

I agree with Christoph H. Larsen  you have made a great job and you have a framework well designed and excellent!!!
But i have too agree that frappe must have the Postgresql as database.
Keep the good work!

Those who are suitably motivated can:

  1. Open a GitHub Issue
  2. Suggest a roadmap
  3. Help fix direct sql queries into the ORM (there are a lot of low hanging fruit here)
  4. Extend ORM so that more complex joins can be composed
  5. Write tests

:slight_smile:
best,
Rushabh

5 Likes

This can a seperated proejct:)

Thanks a lot, everyone, for your thoughts.

I entirely agree with you, Pratik, that you do not want to waste your and your team's time to create and maintain database independence.
However, I do not that the terrific growth of ERPNext renders the deployment of MySQL/MariaDB somewhat obsolete, and as you grow even further, there will be moore and more situations, where the current database just does not cut it any more.
Let me give you a concrete use case:
A company has deployed ERPNext, and people work in the field on their notebook computer, which is all fine, as long as there is reliable internet access.
However, what happens, if that is not the case? With PostgreSQL we can start ot think about doing things the DuckDuckGo way: They use PostgreSQL as database backend, yet their employee can work on their datasets offline, and perform delayed, asynchronous replication of the data they worked on (as well as data changed at the central database instance) using bucardo. Pretty cool stuff - I have used it myself in a multi-centric e-health setting.
So, without hte intention to start a flame war, but the thought of replacing MySQL/MariaDB with the hugely more resilient PostgreSQL may be worth a minute, or two. An added goodie may be PostgreSQL's capability to cater for GIS data - something of increasing importance in the logistics world, and something quite routinely demanded in African donor logistics, for instance.

You mentioned the upcoming object-oriented database for ERPNext. While I do not know anything about this specific one, a object-oriented approach is attractive, as you can travel back in time quite easily - something I have appreciated during my work in the Plone world. However, talking zodb, thing do become hugely slow and unwieldy after some time, and the only way to get up to speed again in the zodb world was to implement relstorage (with PostgreSQL), where the Zope database "pickles" are stored in a PostgreSQL database. Means, you have an objct-oriented (zodb-like) database structure, but an SQL database backend that, however, cannot be queried in the usual SQL fashion.
It work, but is a wee bit more complex. If something goes wrong, I am not entirely sure, whether that kind of setup is easier to fix than a straight RMDB backend. .. my experiences have been truly mixed on that account.

Sorry for the long message, but here you go...
Thank you so much for all your tremendous efforts!

Chris

I'm not part of the team of erpnext. So it is just my opinion, the offline feature generally good for those software(personal user, not enterprise application).

Though the data you mentioned has offline feature, please note it will has a extra burden to avoid conflict of business data which will be a burden to manager of system.

MySQL is widely used among enterprise based application. for current phrase of frappe, i think the framework is good at its design of "Quick Enterprise Application" as well as web application. If you use it as Enterprise saas, it is good to go, if you use it for personal service, it will be not that attractive.

Things can't be balanced, it has its priority, for now, i think the frappe still has thing to do:
1. Debug isn't that transparent
2. Documentation
3. Read/Write separation
4. Patch management(I think existing way still has its flaw, especially, its cloud version, whose client has different timezone)

Thanks,
yang
Thanks everyone for sharing your views on this. Will let @Pratik to give point-to-point replies

Quick response: MySQL / MariaDB suffers from an older perception that it is not feature rich. We are yet to come across a realistic use case to switch away from MySQL. GitHub, Facebook, Basecamp and Wikipedia all heavily depend on MySQL for their products and with the MariaDB / XtraDB forks, I think the product is in safe hands.

@YY - yes we agree with documentation, but we are facing with an ever increasing feature list and with higher expectations, we are not sure how to produce. 

There is always the background thought of looking for external funding to speed up development, but that comes with its baggage too. Will be happy to get views of the community on this.

best,
Rushabh
Dear All,

Just a quick comment: When mentioning (and vastly preferring) PostgreSQL, I am not doing so for its feature richness. Instead, I am after standards compliance, robustness, behaviour under heavy load and easy of replication and synchronisation. I think it is in these areas, where PostgreSQL really shines, and what I have seen over and over again in big data setups. So, why not going for a database backend that behaves well under heavy load, is highly scalable and loves SQL standards?

Bests,

Chris
Hello, everyone!

I have a certain interest in all matters get up above, and as a technical user, I understand both the issues discussed on PostgreSQL and on the complexity that is involved in the change of the standard support database process.

I come only offer a light, yes it is possible to support other databases thanks to the goal structure of frappe, and best of all, maintain compatibility with legacy code.

I'm not saying it is easy, but the grace project maturity and the fine work that the team has done, extend frappe is entirely possible.

Is there a specific ORM project called DAL, which belongs to the web2py project, this is a layer with about 10k lines in a single file, which supports many databases, and raw sql.

And I have discussed a lot in the Realize Group, about the need to support offline due to constant problems of providing internet service operators here in Brazil, and we are trying to trace a way to support this, but Maria is still impossible.

In the past there were already pre-written one like frappe CMS, but left him when know frappe.

The web2py DAL's add one more layer friendly to non-technical users to compose reports, without the need to learn SQL, further decreasing the learning curve.

The biggest problem in starting this journey is to create a parser that is capable of re-writing all the content in SQL design for the new base layer data, however there is a python-sqlparser project, which can be of great help, being a small layer on the "frappe.db.sql" function that will translate the raw SQL for objects and thus make them accessible through an object-oriented layer.

If there are more users looking for ways to do this, please to post your ideas here, so we can develop a range of ideas.




Hi Chris, Max and YY,

They use PostgreSQL as database backend, yet their employee can work on their datasets offline, and perform delayed, asynchronous replication of the data they worked on (as well as data changed at the central database instance) using bucardo.

Bucardo looks interesting. They’re planning an implementation for MySQL too.

However, talking zodb, thing do become hugely slow and unwieldy after some time, and the only way to get up to speed again in the zodb world was to implement relstorage (with PostgreSQL), where the Zope database “pickles” are stored in a PostgreSQL database. Means, you have an objct-oriented (zodb-like) database structure, but an SQL database backend that, however, cannot be queried in the usual SQL fashion.

Yes, we’re never going that road. What I meant was that we access
records in frappe using our objects from our ORM, so migration of code
to any other database system would be a little more standard.

I am after standards compliance, robustness, behaviour under heavy load and easy of replication and synchronisation. I think it is in these areas, where PostgreSQL really shines, and what I have seen over and over again in big data setups. So, why not going for a database backend that behaves well under heavy load, is highly scalable and loves SQL standards?

+1. Having used Postgres in the past, I agree that the pg tools are
great. Although MySQL was in middle of the shared hosting boom and has
also seen very large deployments even in companies like GitHub and
Facebook as Rushabh mentioned.

From Max,

I come only offer a light, yes it is possible to support other databases thanks to the goal structure of frappe, and best of all, maintain compatibility with legacy code.

I really like Max’s approach to the issue. Removing the SQL string
building and replacing it with something like DAL seems like a good
idea. Parsing existing sql and converting it to Postgres’ SQL is also
interesting.

I agree with Yang on the part that most required things right now are
in terms of developer tools, documentation being the most important
from his list.

For us, to switch the db for Frappe isn’t much of issue but the real
challenge is maintaining the erpnext project. In the recent past we
took the version 4 project which got a little extended and took a few
months to reach production. A dbstore migration project would more
surgical than this. I am not sure if we (at Web Notes) have the energy
at the moment to go for a long such break from shipping everyday.

In my opinion, maximum effort has to be made in improving the product
and increase adoption. That would also result in growth of the
developer community and then maybe would be the right time go for
something like what Django project did with database migrations,

At the moment, this is highest amount of interest ever expressed on
this issue. It is no doubt that Postgresql could be the database that
Frappe deserves.

Thanks,

Pratik
erpnext

3 Likes
Dear Pratik,

Thank you so much for you reply. Comment inline:

On Sunday, 7 September 2014 09:23:51 UTC+3, Pratik Vyas wrote:
Hi Chris, Max and YY,

> They use PostgreSQL as database backend, yet their employee can work on their datasets offline, and perform delayed, asynchronous replication of the data they worked on (as well as data changed at the central database instance) using bucardo.

Bucardo looks interesting. They're planning an implementation for MySQL too.

> However, talking zodb, thing do become hugely slow and unwieldy after some time, and the only way to get up to speed again in the zodb world was to implement relstorage (with PostgreSQL), where the Zope database "pickles" are stored in a PostgreSQL database. Means, you have an objct-oriented (zodb-like) database structure, but an SQL database backend that, however, cannot be queried in the usual SQL fashion.

Yes, we're never going that road. What I meant was that we access
records in frappe using our objects from our ORM, so migration of code
to any other database system would be a little more standard.
I am relived to hear that. Sorry I misunderstood you in that respect.

> I am after standards compliance, robustness, behaviour under heavy load and easy of replication and synchronisation. I think it is in these areas, where PostgreSQL really shines, and what I have seen over and over again in big data setups. So, why not going for a database backend that behaves well under heavy load, is highly scalable and loves SQL standards?

+1. Having used Postgres in the past, I agree that the pg tools are
great. Although MySQL was in middle of the shared hosting boom and has
also seen very large deployments even in companies like GitHub and
Facebook as Rushabh mentioned.

From Max,
> I come only offer a light, yes it is possible to support other databases thanks to the goal structure of frappe, and best of all, maintain compatibility with legacy code.

I really like Max's approach to the issue. Removing the SQL string
building and replacing it with something like DAL seems like a good
idea. Parsing existing sql and converting it to Postgres' SQL is also
interesting.

I agree with Yang on the part that most required things right now are
in terms of developer tools, documentation being the most important
from his list.

For us, to switch the db for Frappe isn't much of issue but the real
challenge is maintaining the erpnext project. In the recent past we
took the version 4 project which got a little extended and took a few
months to reach production. A dbstore migration project would more
surgical than this. I am not sure if we (at Web Notes) have the energy
at the moment to go for a long such break from shipping everyday.
I fully realise your constraints. The reason why I brought this up is to keep it on the backburner, and, if possible, adapt future developments to an architecture that allows easy/easier migration to PostgreSQL, PLUS the fact that ERPNext has, quite simple, become a formidable application that already sees, and will see, more and more compkex deployments in the future. Hence, Max's approach sound not only brilliant, but like a plan :-).

In my opinion, maximum effort has to be made in improving the product
and increase adoption. That would also result in growth of the
developer community and then maybe would be the right time go for
something like what Django project did with database migrations,
https://www.kickstarter.com/projects/andrewgodwin/schema-migrations-for-django

At the moment, this is highest amount of interest ever expressed on
this issue. It is no doubt that Postgresql could be the database that
Frappe deserves.

Thanks,
Thank YOU ad your team, as well as Luis, Max and YY for this! Stay well, have a good weekend and till soon.

I write comment to similar issue https://github.com/frappe/frappe/issues/1333#issuecomment-202882725

is there anything new? any new consideration or discussion from frappe team?

Hi @christoph_h_larsen

I found this python library [1]. I don’t know if it can be useful for your idea.

[1]

Thanks

@fabyc the most important point to keep in mind, while select a library is the parsing of the SQL and the support of SQL functions.

Peewee ORM, have support for both, but the SQL functions still being database dependants
While DAL don’t support SQL Parsing, but have a real compatibility layer for functions across databases.

Recently I returned to this field, due some issues while loading some millions of Invoices into ERPNext, so I discovered that frappe code, need some pontual optmization for updates and inserts, what turns each insert and update into the database very slow;

Yup. the Frappe ORM is quite slow at scale. Its often easier (but messier) to just do direct queries when you have to deal with tens of thousands of rows (not to mention millions!).

I’ll start working in some optmizations in the Frappe Database, I think pontual optmization can solve this issue of performance!

I dont will optmize all cases, just while importing huge data!

3 Likes

Was recently trying to look into this using a bottom up approach with first porting bench to Postgresql which i was able to do to a certain extent, but then I hit several roadblocks in Frappe framework. I think a top down approach is better where we put an ORM first to make it consistent with Mariadb ( not to impact existing install base ) and then allow other database to be used via this ORM.

2 Likes

Any updates on the Postgresql stuff?
Also @max_morais_dmm, do you still feel Postgresql is the right path. Your comments about slow performance importing invoices concerns me a bit.