V11: after migrating to Python 3.5, getting "TypeError: Expected object of type bytes or bytearray, got: <class 'str'>" errors on some emails

Installed Apps
ERPNext: v11.1.5 (master)

Frappe Framework: v11.1.5 (master)
Python 3.5

After migrating from Python 2.7 to Python 3.5 last night, I’ve seen three occurrences today where an inbound email caused an error and couldn’t be received/parsed, apparently because of a Python 3 compatibility problem. We have received many more emails without issue today, but no received email should fail to parse.

The errors appear to be in the Python 3 parsing of the emails. Previously we would see this type of error perhaps once in six months. Three errors in less than 18 hours is not good, so I’m contemplating migrating back to Python 2.7 and waiting for these bugs to be fixed.

The traceback is as follows. It is exactly the same error, three times today, for different emails:

email_account.receive
Error
Traceback (most recent call last):
  File "/home/frappe/frappe-bench/apps/frappe/frappe/email/doctype/email_account/email_account.py", line 278, in receive
    communication = self.insert_communication(msg, args=args)
  File "/home/frappe/frappe-bench/apps/frappe/frappe/email/doctype/email_account/email_account.py", line 335, in insert_communication
    email = Email(raw)
  File "/home/frappe/frappe-bench/apps/frappe/frappe/email/receive.py", line 370, in __init__
    self.parse()
  File "/home/frappe/frappe-bench/apps/frappe/frappe/email/receive.py", line 391, in parse
    self.process_part(part)
  File "/home/frappe/frappe-bench/apps/frappe/frappe/email/receive.py", line 443, in process_part
    self.text_content += self.get_payload(part)
  File "/home/frappe/frappe-bench/apps/frappe/frappe/email/receive.py", line 487, in get_payload
    charset = self.get_charset(part)
  File "/home/frappe/frappe-bench/apps/frappe/frappe/email/receive.py", line 482, in get_charset
    charset = chardet.detect(str(part))['encoding']
  File "/home/frappe/frappe-bench/env/lib/python3.5/site-packages/chardet/__init__.py", line 34, in detect
    '{0}'.format(type(byte_str)))
TypeError: Expected object of type bytes or bytearray, got: <class 'str'>
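
My reading of the traceback, with a minimal sketch of one possible workaround: in Python 3, `str(part)` is text rather than bytes, and `chardet.detect()` rejects anything that isn't `bytes` or `bytearray` (that is the check at `chardet/__init__.py` line 34 above). The helper names `detect_charset` and `get_text` below are hypothetical and are not Frappe's actual code. The idea is to trust the charset declared by the MIME part first, and only hand chardet the decoded payload bytes. chardet is imported lazily, so the sketch still runs without it.

```python
import email


def detect_charset(part, default="utf-8"):
    """Pick a charset for a MIME part (hypothetical helper, not Frappe's code)."""
    declared = part.get_content_charset()
    if declared:
        return declared

    # get_payload(decode=True) returns bytes (or None for multipart parts),
    # which is the type chardet.detect() expects.
    payload = part.get_payload(decode=True)
    if not payload:
        return default

    try:
        import chardet  # third-party and optional in this sketch
    except ImportError:
        return default
    return chardet.detect(payload)["encoding"] or default


def get_text(part):
    """Return the part's body as str, replacing undecodable bytes."""
    payload = part.get_payload(decode=True) or b""
    try:
        return payload.decode(detect_charset(part), errors="replace")
    except LookupError:
        # Unknown charset name: fall back to UTF-8 (an assumption).
        return payload.decode("utf-8", errors="replace")


# Sample modelled on the read receipt quoted in this post.
SAMPLE = (
    'Content-Type: text/plain; charset="utf-8"\r\n'
    "Content-Transfer-Encoding: 8bit\r\n"
    "\r\n"
    "a été lu le 14/02/2019 08:10.\r\n"
).encode("utf-8")

msg = email.message_from_bytes(SAMPLE)
print(get_text(msg))
```

Whether this matches how Frappe should fix `get_charset` is for the maintainers to decide; it only shows that passing bytes avoids the `TypeError`.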

I’m happy to submit this as a bug report, and I can also supply the emails that were not received, to help debug why some emails are not being processed correctly.

EDIT: After investigating further, the bug occurs on the same type of read-receipt email from the same customer in France. There are two accented characters in the email, which I’m guessing may be causing the parsing to break.

This is the text from the messages (the same message received at three different times today):

À : XXXX@XXX.fr
Objet : Re: Re : Lu : urgent RQ
Date : 13/02/2019 16:01

a été lu le 14/02/2019 08:10.

Subject from email headers : =?utf-8?Q?Lu=C2=A0:_Lu=C2=A0:_urgent_RQ?=

Either the UTF-8 encoding in the subject is at fault, or the accents are: the grave on the first capital À and the acutes on the "été" near the bottom of the email.
This will need fixing, as it’s likely to break email parsing for many others on fairly simple emails auto-generated by MS Outlook 16.
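
As a quick check on the first guess, the subject header quoted above can be decoded with the standard library alone. This is only a sketch (`decode_subject` is a name made up for illustration), but it suggests the header itself is well-formed: the two `=C2=A0` sequences are UTF-8 non-breaking spaces, which French typography puts before a colon.

```python
from email.header import decode_header, make_header

# Subject copied from the email headers quoted in this post.
RAW_SUBJECT = "=?utf-8?Q?Lu=C2=A0:_Lu=C2=A0:_urgent_RQ?="


def decode_subject(raw):
    """Decode an RFC 2047 encoded-word header into a plain str."""
    return str(make_header(decode_header(raw)))


subject = decode_subject(RAW_SUBJECT)
print(repr(subject))  # 'Lu\xa0: Lu\xa0: urgent RQ'
```

If the subject decodes cleanly like this, the body handling shown in the traceback looks like the more likely place for the failure, though I have not confirmed that.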

@chdecultot Bonjour Charles, as you are French, I imagine this is something that could easily cause a problem for you too. Have you seen this issue, or could you help fix it? I see a lot of PRs for similar issues from Frappe too, as below.
@AdityaHase I’m sorry to trouble you by pinging you directly. I note you are responsible for many Python3 fixes https://github.com/frappe/erpnext/commits?author=adityahase and I hope you can help with fixing this one if at all possible. Thank you

Hi @Julian_Robbins,

Thanks for the heads-up.
Fortunately I haven’t faced this issue yet, even though I’m running Python 3.5 and v11 like you.

I’ll check to see if I face the same kind of error over the next few days and let you know.

Hi Charles Henri

Thanks for your reply. Yes, we received another batch of emails that didn’t get parsed today, so I will try migrating back to Python 2.7 for now.

But yes, it would be good to get this fixed.

Thanks

Julian

Hi again
This fix from rushabh, committed just today, might be linked: https://github.com/frappe/frappe/commit/572edb08ba3b394b203f2a9d41d45adfd6a8760b#commitcomment-32379411

The email issue seemed related to the subject being encoded as UTF-8, and the fix mentions a spelling typo that partly relates to this …