ERPNext OCR App

ERPNext OCR App using https://ocrsdk.com/

Use case: extract data from your scanned invoices.

Sponsored by Grynn.in @Not_a_countant

18 Likes

Hi John,
This app looks really nice. I would like to install it for a test, but it doesn't work :frowning:

I always get errors like:

Command "python setup.py egg_info" failed with error code 1 in /home/frappe/frappe-bench/apps/erpnext_ocr/

Do you have any idea what's going wrong?

Best regards,
Matthias

Hi Matthias,

I believe it has something to do with the Pip version. Let me update the app.

Regards,
John

Please try again.

The linked site is not available.

Hello John,

Thank you. I tried to install it but got some errors. The app is installed, but it doesn't work correctly :frowning:

This is what I did on an Ubuntu 18 server:

As local user (or root):

  1. sudo apt-get install tesseract-ocr

    • OK
  2. sudo -H pip install pytesseract

    • OK
  3. sudo -H pip install pillow

    • OK
  4. sudo apt-get install imagemagick

    • OK
  5. sudo -H pip install wand

    • OK
  6. su frappe

  7. cd frappe-bench

  8. bench get-app https://github.com/jvfiel/ERPNext-OCR.git

    frappe@erpnext_srv:~/frappe-bench$ bench get-app https://github.com/jvfiel/ERPNext-OCR.git
    INFO:bench.app:getting app ERPNext-OCR
    INFO:bench.utils:git clone https://github.com/jvfiel/ERPNext-OCR.git --depth 1 --origin upstream
    Cloning into 'ERPNext-OCR'...
    remote: Enumerating objects: 96, done.
    remote: Counting objects: 100% (96/96), done.
    remote: Compressing objects: 100% (80/80), done.
    remote: Total 96 (delta 15), reused 71 (delta 11), pack-reused 0
    Unpacking objects: 100% (96/96), done.
    ('installing', u'erpnext_ocr')
    INFO:bench.app:installing erpnext_ocr
    INFO:bench.utils:./env/bin/pip install -q -e ./apps/erpnext_ocr
    /home/frappe/frappe-bench/apps/frappe/frappe/build.py:106: UserWarning: Source /home/frappe/frappe-bench/apps/frappe_io/frappe_io/docs does not exists.
    warnings.warn('Source {source} does not exists.'.format(source = source))
    /home/frappe/frappe-bench/apps/frappe/frappe/build.py:106: UserWarning: Source /home/frappe/frappe-bench/apps/foundation/foundation/docs does not exists.
    warnings.warn('Source {source} does not exists.'.format(source = source))
    /home/frappe/frappe-bench/apps/frappe/frappe/build.py:106: UserWarning: Source /home/frappe/frappe-bench/apps/erpnext_ocr/erpnext_ocr/docs does not exists.
    warnings.warn('Source {source} does not exists.'.format(source = source))
    SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at make_build_map (/home/frappe/frappe-bench/apps/frappe/frappe/build.js:193:22)
    at Object.<anonymous> (/home/frappe/frappe-bench/apps/frappe/frappe/build.js:22:17)
    at Module._compile (module.js:653:30)
    at Object.Module._extensions..js (module.js:664:10)
    at Module.load (module.js:566:32)
    at tryModuleLoad (module.js:506:12)
    at Function.Module._load (module.js:498:3)
    at Function.Module.runMain (module.js:694:10)
    at startup (bootstrap_node.js:204:16)
    SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at make_build_map (/home/frappe/frappe-bench/apps/frappe/frappe/build.js:193:22)
    at Object.<anonymous> (/home/frappe/frappe-bench/apps/frappe/frappe/build.js:22:17)
    at Module._compile (module.js:653:30)
    at Object.Module._extensions..js (module.js:664:10)
    at Module.load (module.js:566:32)
    at tryModuleLoad (module.js:506:12)
    at Function.Module._load (module.js:498:3)
    at Function.Module.runMain (module.js:694:10)
    at startup (bootstrap_node.js:204:16)
    Wrote css/frappe-web.css - 65.11 KB
    Wrote js/frappe-web.min.js - 132.77 KB
    Wrote js/control.min.js - 77.1 KB
    Wrote js/dialog.min.js - 116.97 KB
    Wrote css/desk.min.css - 309.05 KB
    Wrote css/frappe-rtl.css - 32.49 KB
    Wrote js/libs.min.js - 1.13 MB
    Wrote js/desk.min.js - 462.89 KB
    Wrote css/module.min.css - 2.08 KB
    Wrote css/form.min.css - 4.47 KB
    Wrote js/form.min.js - 196.71 KB
    Wrote css/list.min.css - 13.36 KB
    Wrote js/list.min.js - 154.85 KB
    Wrote css/report.min.css - 7.89 KB
    Wrote js/report.min.js - 260.58 KB
    Wrote js/web_form.min.js - 247.55 KB
    Wrote css/web_form.css - 24.42 KB
    Wrote js/print_format_v3.min.js - 23.39 KB
    Wrote css/erpnext.css - 8.1 KB
    Wrote js/erpnext-web.min.js - 3.8 KB
    Wrote js/erpnext.min.js - 166.15 KB
    Wrote js/item-dashboard.min.js - 8.13 KB
    INFO:bench.utils:sudo supervisorctl restart frappe-bench-workers: frappe-bench-web:
    frappe-bench-workers:frappe-bench-frappe-schedule: stopped
    frappe-bench-workers:frappe-bench-frappe-long-worker-0: stopped
    frappe-bench-workers:frappe-bench-frappe-default-worker-0: stopped
    frappe-bench-workers:frappe-bench-frappe-short-worker-0: stopped
    frappe-bench-web:frappe-bench-node-socketio: stopped
    frappe-bench-web:frappe-bench-frappe-web: stopped
    frappe-bench-workers:frappe-bench-frappe-schedule: started
    frappe-bench-workers:frappe-bench-frappe-default-worker-0: started
    frappe-bench-workers:frappe-bench-frappe-long-worker-0: started
    frappe-bench-workers:frappe-bench-frappe-short-worker-0: started
    frappe-bench-web:frappe-bench-frappe-web: started
    frappe-bench-web:frappe-bench-node-socketio: started

  9. bench install-app erpnext_ocr

    frappe@erpnext_srv:~/frappe-bench$ bench install-app erpnext_ocr
    Installing erpnext_ocr...
    Updating DocTypes for erpnext_ocr : [========================================]
    Traceback (most recent call last):
    File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
    File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
    File "/home/frappe/frappe-bench/apps/frappe/frappe/utils/bench_helper.py", line 94, in <module>
    main()
    File "/home/frappe/frappe-bench/apps/frappe/frappe/utils/bench_helper.py", line 18, in main
    click.Group(commands=commands)(prog_name='bench')
    File "/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
    File "/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
    File "/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
    File "/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
    File "/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
    File "/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
    File "/home/frappe/frappe-bench/env/local/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/commands/__init__.py", line 24, in _func
    ret = f(frappe._dict(ctx.obj), *args, **kwargs)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/commands/site.py", line 165, in install_app
    install_app(app, verbose=context.verbose)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/installer.py", line 155, in install_app
    sync_fixtures(name)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/utils/fixtures.py", line 24, in sync_fixtures
    ignore_links=True, overwrite=True)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/core/doctype/data_import/data_import.py", line 54, in import_doc
    frappe.modules.import_file.import_file_by_path(f, data_import=True, force=True, pre_process=pre_process, reset_permissions=True)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/modules/import_file.py", line 58, in import_file_by_path
    ignore_version=ignore_version, reset_permissions=reset_permissions)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/modules/import_file.py", line 132, in import_doc
    doc.insert()
    File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 248, in insert
    self.run_post_save_methods()
    File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 889, in run_post_save_methods
    self.run_method("on_update")
    File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 757, in run_method
    out = Document.hook(fn)(self, *args, **kwargs)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 1026, in composer
    return composed(self, method, *args, **kwargs)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 1009, in runner
    add_to_return_value(self, fn(self, *args, **kwargs))
    File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 751, in <lambda>
    fn = lambda self, *args, **kwargs: getattr(self, method)(*args, **kwargs)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/custom/doctype/custom_field/custom_field.py", line 56, in on_update
    validate_fields_for_doctype(self.dt)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/core/doctype/doctype/doctype.py", line 420, in validate_fields_for_doctype
    validate_fields(frappe.get_meta(doctype, cached=False))
    File "/home/frappe/frappe-bench/apps/frappe/frappe/core/doctype/doctype/doctype.py", line 662, in validate_fields
    check_unique_fieldname(d.fieldname)
    File "/home/frappe/frappe-bench/apps/frappe/frappe/core/doctype/doctype/doctype.py", line 448, in check_unique_fieldname
    frappe.throw(_("Fieldname {0} appears multiple times in rows {1}").format(fieldname, ", ".join(duplicates)))
    File "/home/frappe/frappe-bench/apps/frappe/frappe/__init__.py", line 323, in throw
    msgprint(msg, raise_exception=exc, title=title, indicator='red')
    File "/home/frappe/frappe-bench/apps/frappe/frappe/__init__.py", line 309, in msgprint
    _raise_exception()
    File "/home/frappe/frappe-bench/apps/frappe/frappe/__init__.py", line 282, in _raise_exception
    raise raise_exception(encode(msg))
    frappe.exceptions.ValidationError: Fieldname contact_html appears multiple times in rows 40, 48

When I try to use "OCR Read", I get this error log:

Traceback (most recent call last):
File "/home/frappe/frappe-bench/apps/frappe/frappe/app.py", line 62, in application
response = frappe.handler.handle()
File "/home/frappe/frappe-bench/apps/frappe/frappe/handler.py", line 22, in handle
data = execute_cmd(cmd)
File "/home/frappe/frappe-bench/apps/frappe/frappe/handler.py", line 53, in execute_cmd
return frappe.call(method, **frappe.form_dict)
File "/home/frappe/frappe-bench/apps/frappe/frappe/__init__.py", line 939, in call
return fn(*args, **newargs)
File "/home/frappe/frappe-bench/apps/frappe/frappe/handler.py", line 81, in runserverobj
frappe.desk.form.run_method.runserverobj(method, docs=docs, dt=dt, dn=dn, arg=arg, args=args)
File "/home/frappe/frappe-bench/apps/frappe/frappe/desk/form/run_method.py", line 36, in runserverobj
r = doc.run_method(method)
File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 757, in run_method
out = Document.hook(fn)(self, *args, **kwargs)
File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 1026, in composer
return composed(self, method, *args, **kwargs)
File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 1009, in runner
add_to_return_value(self, fn(self, *args, **kwargs))
File "/home/frappe/frappe-bench/apps/frappe/frappe/model/document.py", line 751, in <lambda>
fn = lambda self, *args, **kwargs: getattr(self, method)(*args, **kwargs)
File "/home/frappe/frappe-bench/apps/erpnext_ocr/erpnext_ocr/erpnext_ocr/doctype/ocr_read/ocr_read.py", line 43, in read_image
import pytesseract
ImportError: No module named pytesseract

I'm totally new to Linux and ERPNext. I can't figure out what the problem is :hushed:
Do you have any idea?

Best regards,
Matthias

Try this in the frappe-bench folder.

./env/bin/pip install pytesseract
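
A likely reason this helps: the `sudo -H pip install ...` steps above install into the system Python, while the traceback paths show the site running from the virtualenv under `frappe-bench/env`, so `pytesseract` may simply be missing there. A minimal sketch for checking this, assuming you save it under a hypothetical name such as `check_ocr_deps.py` and run it with `./env/bin/python check_ocr_deps.py` (the module list is my guess at what the app imports, based on the install steps in this thread):

```python
from __future__ import print_function  # keeps the script usable on Python 2.7 and 3

import sys


def module_available(name):
    """Return True if `name` can be imported by the interpreter running this script."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # For a bench site this should point inside frappe-bench/env, not /usr/bin.
    print("interpreter:", sys.executable)
    # Assumed dependency list: pytesseract, pillow (imported as PIL), wand.
    for mod in ("pytesseract", "PIL", "wand"):
        print(mod, "found" if module_available(mod) else "MISSING")
```

Anything reported as MISSING can then be installed with `./env/bin/pip install <package>` from the frappe-bench folder.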

2 Likes

I believe this error is caused by the ERPNext version. Which version are you using?
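
For context, the `ValidationError` at the end of the `bench install-app` traceback comes from Frappe's `check_unique_fieldname`, which, going by its message, rejects a DocType when the same fieldname shows up in more than one row. One plausible reading (an assumption, not confirmed in this thread) is that the app's fixtures add a Custom Field that the installed ERPNext version already defines. Below is a standalone sketch of that kind of check; `find_duplicate_fieldnames` and the sample rows are hypothetical and not Frappe code, and only `contact_html` with rows 40 and 48 is taken from the error above:

```python
from collections import defaultdict


def find_duplicate_fieldnames(fields):
    """Map each fieldname that occurs more than once to the row numbers using it.

    `fields` is a list of dicts with "idx" (row number) and "fieldname" keys.
    """
    rows_by_name = defaultdict(list)
    for field in fields:
        rows_by_name[field["fieldname"]].append(field["idx"])
    return {name: rows for name, rows in rows_by_name.items() if len(rows) > 1}


# Made-up rows for illustration; only contact_html / 40 / 48 mirror the error message.
fields = [
    {"idx": 39, "fieldname": "address_html"},
    {"idx": 40, "fieldname": "contact_html"},
    {"idx": 48, "fieldname": "contact_html"},
]
print(find_duplicate_fieldnames(fields))  # {'contact_html': [40, 48]}
```

If the duplicate really does come from a fixture, comparing the app's Custom Field fixtures against the fields already present on the target DocType would be the place to look.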

For those interested, I created a fork of this project for my company: https://github.com/Monogramm/erpnext_ocr

The fork contains several bug fixes, supports Python 3 (ERPNext 11+), can handle PDF files, and supports languages other than English.
All ABBYY OCR / Sales Invoice code has been removed because it was not working (apparently some code was missing).

Feel free to take a look or contribute.

6 Likes

Hi,
I'm looking for a similar function.
We receive sales orders from our customers, usually by fax or by mail as a PDF file, but the PDF file is more like an image...
I want an app that can extract some fields from this image, such as:

  • customer name
  • delivery address
  • invoice address
  • order lines (article REF, quantity, price)
  • delivery date
    etc.

These fields could then be inserted directly into ERPNext, so we would avoid a lot of tedious typing.
Ideally the app would use deep learning to learn where the fields are, or the user could train the app to find custom fields more quickly.

Maybe someone has already thought about this function and a solution already exists?
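
To make the idea concrete, here is a minimal rule-based sketch (plain regular expressions, not deep learning) of pulling a few of those fields out of text that an OCR step has already produced. The labels `Customer:` and `Delivery date:`, the order-line layout, and the sample text are all made up for illustration; real orders would need patterns per customer layout, and none of this is taken from the erpnext_ocr code:

```python
import re

# Hypothetical labels; adjust to whatever the scanned orders actually contain.
FIELD_PATTERNS = {
    "customer_name": re.compile(r"^Customer:\s*(.+)$", re.MULTILINE),
    "delivery_date": re.compile(r"^Delivery date:\s*(\d{4}-\d{2}-\d{2})$", re.MULTILINE),
}

# Assumed order-line layout: article REF, quantity, unit price, separated by spaces.
ORDER_LINE = re.compile(
    r"^(?P<ref>[A-Z0-9-]+)\s+(?P<qty>\d+)\s+(?P<price>\d+\.\d{2})$",
    re.MULTILINE,
)


def extract_order_fields(text):
    """Return a dict with the header fields (None if not found) and parsed order lines."""
    result = {}
    for key, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[key] = match.group(1).strip() if match else None
    result["lines"] = [
        {"ref": m.group("ref"), "qty": int(m.group("qty")), "price": float(m.group("price"))}
        for m in ORDER_LINE.finditer(text)
    ]
    return result


# Made-up OCR output for demonstration.
sample = """Customer: Example GmbH
Delivery date: 2020-01-31
ART-001 3 19.99
ART-002 10 4.50
"""
print(extract_order_fields(sample))
```

A learned model would replace the fixed patterns, but the output shape (header fields plus a list of order lines) is roughly what would have to be mapped onto an ERPNext document.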

Hi @Fritz_Moser ,

You're certainly not the first one to think about this. This is basically the direction in which we want to go with our OCR fork, and there is an open enhancement request regarding this subject.

Right now we're still polishing the application's interface, performance, and result quality, but after that the next big topic will definitely be generating DocType(s) from the extracted text.
Any help or support on this subject will be greatly appreciated :wink:

3 Likes

You need to apply this on https://frappecloud.com/marketplace

1 Like

Hi, I integrated erpnext_ocr and tried to create an OCR Read record, but I got a "Not permitted" message.
I tried with several accounts (including Administrator) but got the same message.
Has anyone fixed this kind of issue?
Thanks.

I opened an issue on the git repo.

Hi @nicolasjin,

Were you able to implement the OCR-based document read and Elasticsearch functionality? If so, it would be great if you could share the steps, since I am new to ERPNext. By any chance, is this functionality available out of the box in ERPNext now?
Thanks

I was able to install the Tesseract OCR app, but now I am getting the same "Not permitted" error as you. Were you able to fix this issue?