Serving files with non-Latin-1 file names

I need to serve attached files whose file names contain non-Latin-1 characters. I use this code, which I stole from the PDF generator:

frappe.response['filename'] = target_name
frappe.response['filecontent'] = BytesIO(file_content).getvalue()
frappe.response['type'] = 'binary'

Everything works fine when I download the file from a browser on my Linux machine with the dev environment. But when I download a file from a Windows machine and the file name contains non-Latin-1 (non-ASCII?) characters, I get an internal server error (HTTP 500). If the file name is Latin-1 only, everything works just fine.

What am I doing wrong?

Hi, what locale is set on the Linux and the Windows machines?

For example on this ERPNext instance:

frappe@ubuntu:~/frappe-bench$ env | grep LC
LC_ALL=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
frappe@ubuntu:~/frappe-bench$ env | grep LANG
LANG=en_US.utf8

If you can access the server log, please copy and paste the complete traceback here. That would help.

Linux machine locale:

frappe@ubuntu:~/frappe-bench$ env | grep LANG
LANG=en_US.utf8
frappe@ubuntu:~/frappe-bench$ env | grep LC
LC_ALL=en_US.UTF-8
LC_CTYPE=en_US.UTF-8

Windows machine locale:

> Get-Host
CurrentCulture   : ru-RU
CurrentUICulture : ru-RU

Error log:

[2019-10-02 07:48:24 -0700] [1613] [ERROR] Error handling request /api/method/dc_plc.controllers.file_manager.serve_datasheet
Traceback (most recent call last):
  File "/home/frappe/frappe-bench/env/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 135, in handle
    self.handle_request(listener, req, client, addr)
  File "/home/frappe/frappe-bench/env/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 182, in handle_request
    resp.write(item)
  File "/home/frappe/frappe-bench/env/lib/python3.7/site-packages/gunicorn/http/wsgi.py", line 333, in write
    self.send_headers()
  File "/home/frappe/frappe-bench/env/lib/python3.7/site-packages/gunicorn/http/wsgi.py", line 329, in send_headers
    util.write(self.sock, util.to_bytestring(header_str, "ascii"))
  File "/home/frappe/frappe-bench/env/lib/python3.7/site-packages/gunicorn/util.py", line 507, in to_bytestring
    return value.encode(encoding)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 169-174: ordinal not in range(128)
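
The last frame of the traceback can be reproduced outside gunicorn. This is only a minimal sketch: the header line and the Cyrillic file name below are made-up examples, not values from the actual request, and `try_encode` is a hypothetical helper that mimics the `value.encode(encoding)` call.

```python
# Hypothetical header line with a Cyrillic file name (not from the real request).
header_str = 'Content-Disposition: filename="отчёт.pdf"\r\n'


def try_encode(value, encoding="ascii"):
    """Mimic value.encode(encoding) from the traceback; return the error instead of raising."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


result = try_encode(header_str)
print(type(result).__name__, "-", result.reason)

# An ASCII-only name encodes without problems.
print(try_encode('Content-Disposition: filename="report.pdf"\r\n'))
```

Any character outside the ASCII range in a header value would fail the same way, regardless of the client.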

Perhaps specifying ru_RU.UTF-8 as the locale for Windows will resolve your issue?

Here’s a table to choose from: https://docs.moodle.org/dev/Table_of_locales

Thanks! My access to the Windows machine configuration is restricted, so I will try as soon as I can.

EDIT: I tried forcing UTF-8 encoding in the util.py file from the traceback. The file name came out garbled, but at least the file was served.

My take on this X.Y spec: in a locale name such as ru_RU.UTF-8, X refers to the language (or country) character set and Y refers to how that set is encoded. Both should be specified for the data input and output ‘plumbing’ on devices to work and display as expected. UTF-8 plays nicely with Unicode, and Unicode spans programming language environments.
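
To illustrate the X.Y idea, here is a rough sketch in Python. The `split_locale` helper and the sample text are hypothetical and only for illustration. It shows that the same text becomes different bytes under different encodings, and that decoding bytes with the wrong encoding gives garbled text, which may be what happened with the forced UTF-8 header.

```python
def split_locale(name):
    """Split a locale name of the form 'X.Y' into (X, Y). Hypothetical helper."""
    lang, _, codeset = name.partition(".")
    return lang, codeset


lang, codeset = split_locale("ru_RU.UTF-8")
print(lang, codeset)

text = "отчёт"  # hypothetical sample text

utf8_bytes = text.encode(codeset)     # Python accepts "UTF-8" as a codec name
cp1251_bytes = text.encode("cp1251")  # a common single-byte Cyrillic encoding
print(len(utf8_bytes), len(cp1251_bytes))

# Reading UTF-8 bytes as Latin-1 does not fail, but the result is unreadable.
garbled = utf8_bytes.decode("latin-1")
print(garbled == text)
```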

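Thanks for the help again. I solved the problem with urllib.parse.quote():

from urllib.parse import quote

# Percent-encode the name so the response header contains only ASCII characters
frappe.response['filename'] = quote(target_name)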
frappe.response['filecontent'] = BytesIO(file_content).getvalue()
frappe.response['type'] = 'binary'
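
Why this works can be checked in isolation. The sketch below uses a hypothetical file name and shows that the quoted name is pure ASCII and can be decoded back. The `content_disposition` helper is not part of Frappe; it is only an assumption about how a header in the RFC 5987 `filename*=` form could be built by hand if the browser does not decode the percent-encoded name on its own.

```python
from urllib.parse import quote, unquote

target_name = "отчёт.pdf"  # hypothetical example name

quoted = quote(target_name)            # UTF-8 bytes, percent-encoded
header_bytes = quoted.encode("ascii")  # no UnicodeEncodeError this time


def content_disposition(name):
    """Hypothetical helper (not a Frappe API): RFC 5987-style header value."""
    return "attachment; filename*=UTF-8''" + quote(name, safe="")


print(quoted)
print(unquote(quoted) == target_name)
print(content_disposition(target_name))
```

Whether the downloaded file ends up with the decoded name or the percent-encoded one depends on how the header is assembled and on the browser.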

Found another solution here: http://michal.karzynski.pl/blog/2013/06/09/django-nginx-gunicorn-virtualenv-supervisor/

When Supervisor is installed you can give it programs to start and watch by creating configuration files in the /etc/supervisor/conf.d directory. For our hello application we’ll create a file named /etc/supervisor/conf.d/hello.conf with this content:

[program:hello]
command = /webapps/hello_django/bin/gunicorn_start                    ; Command to start app
user = hello                                                          ; User to run as
stdout_logfile = /webapps/hello_django/logs/gunicorn_supervisor.log   ; Where to write log messages
redirect_stderr = true                                                ; Save stderr in the same log
environment=LANG=en_US.UTF-8,LC_ALL=en_US.UTF-8                       ; Set UTF-8 as default encoding

But I didn’t try it since the easier one worked for me.
