
Should the testing framework be re-designed?

Continuing the discussion from What happened to the good old reports?:

just a question from someone with almost 0 expertise in software testing …

why (as your issues imply) is it not good that tests are running on test sites, where consequently the problem of leftover data in the db that you are raising would not be relevant, @cjpit?

Is there any advantage of running tests on non-test instances? Or is there anything else beyond your proposal which I am just not getting?

@vrms Here is a good intro to why you would want to test and the types of tests.

First of all, I know the Frappe team has worked hard on the system, and it is awesome. But in my personal opinion, the thing it’s lacking at the moment is quality. In the last month that I have been involved, I’ve seen the CI pipeline break, code committed that then breaks tests, and code committed that breaks things for all users, e.g. quotes this week on v11. This indicates a fundamental issue with the testing, since test cases, unit tests, integration tests etc. should all ensure that these types of things don’t happen.

In regards to the Frappe framework, it’s not so much about running on a test site, or any site - it’s the fact that a test that fails leaves a whole lot of junk data in the database. The next run of the tests then always fails, because the data already exists in the database and it errors out on names already existing etc. You end up with false failures on tests that have nothing to do with an actual bug.

This means the unit test itself has dependencies - a test is dependent on another test working for it to have the right inputs. Unit tests aren’t meant to be designed that way. As a developer I actually run tests constantly while writing code, and failing tests are part of building code.
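To make that concrete, here is a minimal sketch of the failure mode (the doctype, field values and names are illustrative only, not taken from the actual suite): if the test dies before its cleanup lines run, the record stays behind and the next run fails on a duplicate name rather than on a real bug.

import unittest

import frappe


class TestCustomerLeftovers(unittest.TestCase):
    def test_customer_creation(self):
        # On a rerun after an earlier failure, this insert() raises
        # frappe.DuplicateEntryError - a false failure unrelated to any real bug.
        customer = frappe.get_doc({
            "doctype": "Customer",
            "customer_name": "_Test Leftover Customer",
            "customer_group": "All Customer Groups",
            "territory": "All Territories",
        }).insert()

        self.assertEqual(customer.customer_name, "_Test Leftover Customer")

        # Cleanup only happens if everything above passed; a failing
        # assertion leaves this junk data in the database.
        customer.delete()
        frappe.db.commit()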

That’s my main purpose for proposing some genuine discussion on how the tests have been designed, but unfortunately there hasn’t been much feedback so far.

Some other feedback,
Tests are also a form of living documentation, as they explain how the function or code was intended to be used. This is especially important for getting other developers involved, so they know they can trust the code.

At the moment, the tests seem to be pretty weak.

For example, I wanted to understand the Project and Task code. The first place I went was the tests. I scratched my head for a while trying to understand tasks and the “Depends On” task field. How is that different from “Parent Task”, for example - aren’t they really the same thing?

I could see from the one test that was there that a task could depend on another task. However, when you use the actual GUI and the tree view, the “Depends On” field actually ends up getting used the opposite way: it shows the children that are dependent on the task you’re looking at. Since there was only one test, which was correct? The naming of the field “Depends On” doesn’t match what the system does, and the test doesn’t match the functionality.

So just giving some honest feedback, I think it needs some help.

I had started a separate project to do that, using pytest. It has a lot of great plugins and options, and makes writing tests easy.

I’ve just done a set of tests for the refactoring of the LDAP functionality as an example. Writing tests does take discipline - for example, I wrote more test code than LDAP code - but it makes you think about how you design the code and can make things better. Understanding what the code does will only help any other person who comes along and tries to understand what I was attempting.

Does anyone else have any ideas?


My 2 cents here!

I’m the kind of guy who doesn’t believe that unit tests work for the kind of software we run, like ERPNext.
My major argument is:

  • Unit tests run on the server, but users rely on the browser to access the functionality of ERPNext. So the unit tests can all have passed, we can have 100% code coverage, and a missing bracket in the JavaScript can still make the system inaccessible to anybody.

I believe that the realistic solution is integration tests, where we develop tests with the mentality of what the user plans to do, not what the code is supposed to do, and where the test environment relies on the browser.
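As a rough sketch of what a browser-driven, user-scenario test could look like (this uses Selenium from Python purely for illustration; the URL, credentials and selectors are placeholders, and the Frappe project’s own browser tests use Cypress, as mentioned later in this thread):

from selenium import webdriver
from selenium.webdriver.common.by import By


def test_user_can_log_in_and_reach_the_sales_order_list():
    # Drive a real browser the way a user would, rather than calling
    # server-side code directly.
    driver = webdriver.Chrome()
    try:
        driver.get("http://erpnext.localhost:8000/login")
        driver.find_element(By.ID, "login_email").send_keys("Administrator")
        driver.find_element(By.ID, "login_password").send_keys("admin")
        driver.find_element(By.CSS_SELECTOR, ".btn-login").click()

        # A missing bracket in the JavaScript would surface here, even if
        # every server-side unit test passed.
        driver.get("http://erpnext.localhost:8000/app/sales-order")
        assert "Sales Order" in driver.title
    finally:
        driver.quit()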

Martin Fowler has a nice discussion about integration tests.

Another pro of integration tests is that we can ensure that a Sales Order is working, and we can ensure that a Sales Invoice is working, but without integration tests we can’t ensure that they work together, like in one bug I have seen here

Another point that we need to keep in mind:

  • Tests aren’t made for the developer
  • Tests aren’t made to ensure that the code works
  • Tests are made to ensure that users are fully able to rely on the code you wrote to do what they want to do.

I already started some experiments in that area in the past, like here:

and here

Integration tests can be described in a functional manner that allows users to describe what they want to receive, and we as developers can do our work to ensure that those descriptions pass.

But for any initiative to redesign the testing framework to be successful, it isn’t just an engagement of us developers and project maintainers; it needs the engagement of the whole community, but fundamentally of users, because they are the target of any work we do.


Thank you @max_morais_dmm for your comments, but I think you missed my point - I never said integration tests were not needed; I actually mentioned that integration tests are needed.

Also, integration tests fall down when you look at, for example, the changes made to the GUI for version 12. They would need a complete rewrite.

However, if there were unit tests on all of the underlying APIs/docs, their functionality could be tested (and there is definitely some amount of tests in there, I just think they need fleshing out and concentrating upon so that they actually bring value to the product). Then you could see that, say, the version 12 GUI tests don’t pass the Sales Order scenario, but because of the unit tests underneath, you know the problem lies in the GUI and nowhere else.
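As a rough illustration of the kind of GUI-independent test I mean (the doctype and assertion are only examples of the idea, and it assumes the Frappe test runner provides the site connection): a test that exercises the document API directly, so a failure here points at the model layer and not at the GUI.

import unittest

import frappe


class TestTaskModel(unittest.TestCase):
    def test_task_requires_a_subject(self):
        # Mandatory-field validation lives in the model layer, so this test
        # says nothing about the GUI - and a failing GUI test says nothing about it.
        task = frappe.get_doc({"doctype": "Task", "status": "Open"})
        with self.assertRaises(frappe.MandatoryError):
            task.insert()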

My main point in this discussion is how to build quality into the code and allow the system to handle changes more easily.

Another point that we need to keep in mind:
Tests aren’t made for the developer
Tests aren’t made to ensure that the code works
Tests are made to ensure that users are fully able to rely on the code you wrote to do what they want to do.

I disagree with all of the statements above - as I explained in my previous post, but here is a good link that also shows the advantages of unit testing, which is the opposite of everything you have mentioned here.

The result is that the end user gets a much more stable product. It is the responsibility of the developer to either keep a clean house and spend the extra time writing unit tests, or not - which results in a fragile system that breaks whenever things change, and then the user is the one who has to find the bugs, report them, get frustrated, wait for a fix, etc.
All of that could possibly have been found and fixed early on, when the code was being written. It’s really about the investment of time: if it is put in early in the development phase, that time isn’t needed later on, when fixing things usually takes a lot more investment.

I already started some experiments in that area in the past, like here:

Yeah, that is similar to Cypress/Selenium - which Frappe has started to use (Cypress tests, that is), so they are already on board with integration testing.

But for any initiative to redesign the testing framework to be successful, it isn’t just an engagement of us developers and project maintainers; it needs the engagement of the whole community, but fundamentally of users, because they are the target of any work we do.

I disagree strongly with this statement as well - it doesn’t fall upon the user at all. If I build you a house, but I leave bits of building materials lying around inside, a window that doesn’t close, and a hole in the roof - whose fault is it? The home owner just expects to be able to use their home as a normal person would. The builder did not finish the job properly. It’s not up to the home owner to know how the building was built, or how to fix the hole; they just know that a hole is there, which means they can’t live in the house.
I think of testing the same way: it’s basically the duty of the developer to ensure that what is done is of a high standard, for the present and for the future.

Going back to the reason for this discussion, has anyone used pytest before? I’ve used it quite a lot over the last couple of years and enjoy using it. The problem of database records being left behind after a test is solved elegantly with the below:

In conftest.py:

import os
from unittest.mock import MagicMock

import frappe
import pytest


@pytest.fixture(scope='session')
def db_instance():
    # Connect once per test session and monkey-patch commit, so that
    # records never get permanently written to the database.
    frappe.connect(site=os.environ.get("site"))
    frappe.db.commit = MagicMock()
    yield frappe.db


@pytest.fixture()
def db_transaction(db_instance):
    # Wrap each test in its own transaction and roll it back afterwards.
    db_instance.begin()
    yield db_instance
    db_instance.rollback()

Then I can just use this “db_transaction” parameter in whatever test I wish to write, and I have access to the db, and anything I do in that test will be reverted. In the example below, I test that a placeholder is set in a field, and if it isn’t, the specific error should be raised for the user.

import pytest
from frappe import ValidationError, get_doc


def test_validation_ldap_search_string_needs_to_contain_a_placeholder_for_format(db_transaction):
    with pytest.raises(ValidationError,
                       match="LDAP Search String needs to end with a placeholder, eg sAMAccountName={0}"):
        settings = get_doc("LDAP Settings")
        set_dummy_ldap_settings(settings)  # helper defined elsewhere in these tests
        settings.ldap_search_string = 'notvalid'
        settings.save()

This means that for the whole test session, committing to the database is mocked out and will never actually run. For each test, however, you can create a new database transaction and do your test. You can add records, check that they exist, were updated etc., and once the test has finished, all of those database changes are rolled back. It does this using a form of dependency injection, which is one of the awesome features of pytest.
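A further sketch of the same fixture (not taken from the LDAP tests above; the ToDo doctype is just a convenient example): anything inserted inside the test is visible for its assertions and then rolled back, so repeated runs start from a clean database.

import frappe


def test_inserted_records_are_rolled_back(db_transaction):
    todo = frappe.get_doc({
        "doctype": "ToDo",
        "description": "temporary record created by this test",
    }).insert()

    # Visible inside this test's transaction...
    assert frappe.db.exists("ToDo", todo.name)
    # ...and gone once the db_transaction fixture rolls back, so the next
    # run cannot fail on leftover data.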

This is just one of the ways I was thinking it would be good to re-think and redesign the testing framework.

I appreciate and agree with your thoughts and concerns, @cjpit and @max_morais_dmm, thanks.

Probably, max_morais_dmm, you are referring to ‘business use case’ tests - tests that capture the requirements of an end user’s scenario and context?

My rough thoughts are -

Initially, users will assume and grant quality to an extent. But to sustain ERPNext’s growth and favour, such control can’t be left to chance: due care must be paid to ensure quality and stability, and it is for the user community at large to recognize and support the discipline of a test-driven process.

A PR submission requires a validation test: https://github.com/frappe/erpnext/wiki/Pull-Request-Checklist

The appropriate tests can detect and define breaking changes, pointing to unforeseen impacts of a developer’s changes or a refactoring that violates a public API.

For an urgent hot fix, a test that reproduces and marks the error would help learners see for themselves otherwise forgotten problem cases.

At release time, regression tests are especially valuable and efficient to identify and isolate problems with proposed fixes.

To promote quality among users, one good way is to report test results and test coverage: Help with Writing Tests - Foundation and community invite to help with ‘do nothing’ pass tests