Suggestion: retire ERPNext translator and move to Launchpad (or similar)

FunKing · February 10, 2017, 10:14pm

Food for thought: why not leave the translation business to a platform built entirely just for that?

There are lots of crowdsourced translation systems, many of them free for open source projects. With a quick search I’ve found POEditor, Transifex, OneSky, and Crowdin, all with an impressive list of clients. Zanata is itself open source, free, and supported by Red Hat. Launchpad is used by Ubuntu and many other projects, and is probably the first I would try. Of course, there are many more alternatives.

The ERPNext translator is not bad, but it is not as good as the alternatives we could be using. I would seriously consider changing to a better translation platform: it means not only better and faster translations but also more users. I can’t speak for other languages, but the Brazilian translation is currently bad to the point of being unusable for many companies.

FunKing · February 12, 2017, 1:03am

I spent a few hours viewing what those translation platforms have to offer. So far, I’ve really liked Crowdin:

It’s free for FOSS (or US$125+ per month for organizations)
Supports 311 languages
It has an API
Provides machine translation (from Google, Microsoft or Yandex)
It has a “global translation memory”, suggesting translations already done by other projects
Other cool functions: voting, abuse reports, proofreading mode, comments and glossary
Can provide context by text and screenshots
Has thousands of projects and users

It took only a couple of minutes to create my account, start a new project and import the CSV files from the ERPNext Translator. Everything went very smoothly and I was impressed with how much faster and easier the translation process was!

Possible problem: one of the criteria to use the system free of charge is that “You do not have Commercial products related to the Open Source project you are requesting a license for”.

I’ll test drive the others later, but in the meantime, are there any similar translation platforms that anyone can recommend?

rmehta · February 13, 2017, 4:05am

@FunKing thanks for taking the lead on this, we will be happy to support you in any way we can.

We will need to write a connector to upload / download translations.

FunKing · February 13, 2017, 3:25pm

Awesome, @rmehta!

They have a RESTful API, but that requires a key that gives one full access to the project data, so using a private script to download from Crowdin to translate.erpnext.com/files/* would be safer and require no changes to “bench download-translations”.

FunKing · February 13, 2017, 4:16pm

Here are some points of order:

I already have a Crowdin account, called FunKing, that I can share with or give to anyone else responsible for the translations, and we can rename it.
Unfortunately the name erpnext is already being used for another project. I wrote to the project creator but I’m not sure we will ever get a reply, so we might need to choose a different name: frappe_erpnext, erpnext_official, or any other suggestion?
I can write a couple of scripts to import everything from the current translator into Crowdin, but it would be interesting to decide how long the ERPNext Translator will keep running (to avoid losing translations made after the import). Is there a way to “freeze” it, including a message to the users about the Crowdin migration?

petri · February 15, 2017, 7:41am

PoEdit seems to have some sort of Crowdin integration. That’s a nice bonus, although I have not tried the integration myself so cannot say how useful it is.

FWIW, the biggest i18n issue IMHO is not so much the translation platform though, but that the choice/structuring of message ids makes the job of translating things more difficult than it’d have to be. Which is sad and really bad for an international community project. But that’s another issue so I’ll stop here. I think I already raised an issue or two in the tracker about that so I guess that work can continue there.

strixaluco · February 15, 2017, 10:03am

As per this issue, Weblate doesn’t have such a requirement in the list of conditions¹:

Translated content has to be released under free license.
Source code has to be publicly available in a supported version control system.
There is no guarantee for service availability or quality.

¹ — Weblate Pricing

Pau_Rosello_Van_Scho · February 15, 2017, 10:37am

You are totally right! English being the “id” of a string is quite wrong and does not go well when working with multiple languages. Android style where “string ids” are used is a little bit more tedious but so much better.

Is there any reason not to implement this in Frappe? I would help with the implementation

petri · February 16, 2017, 8:01am

Pau, I think if you (and others) also add your thoughts to any tracker issues concerning i18n, at some point we’ll have enough weight that a change will happen. I think the current situation is just a typical case of i18n not having been very well thought out from the beginning.

Fortunately there are things we can do to improve the situation: document a better way to form/choose message ids, and then push for that to be accepted as the recommendation, and perhaps even something that will be enforced (translation/i18n quality controls of some sort).

As for current bad message ids, requesting all translators to accept better ones and change their translations would be so much work that it’s practically impossible. But fortunately it is possible (I believe so, anyway) to write a script that automatically changes message ids in existing source code and translations. For translations, it is easy, but changing the message ids in source code may be more difficult.
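To make the translation half of that idea concrete, here is a minimal sketch. It assumes a translation file is a plain two-column CSV of (message id, translated text), which may not match the real Frappe files exactly, and the mapping from old English-text ids to new symbolic ids (`ID_MAP`) is purely hypothetical. Rows without a mapping are left untouched.

```python
import csv
import io


def remap_message_ids(csv_text, id_map):
    """Return CSV text with the first column (message id) replaced via id_map.

    Rows whose id has no entry in id_map are written back unchanged.
    """
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if row:
            row[0] = id_map.get(row[0], row[0])
        writer.writerow(row)
    return out.getvalue()


# Hypothetical mapping from old ids (English text) to new symbolic ids.
ID_MAP = {"Hello": "greeting.hello", "Save": "action.save"}

# Assumed file layout: message id, translation.
sample = "Hello,Olá\nSave,Salvar\nCancel,Cancelar\n"
remapped = remap_message_ids(sample, ID_MAP)
print(remapped)
```

The same mapping would also have to drive the source-code rewrite, which is the harder part and is not attempted here.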

FunKing · February 17, 2017, 4:12pm

Thanks, everyone!

@petri, even though you can use PoEdit, you’ll be missing all the cool Crowdin features, such as using screenshots for context.

@strixaluco, thanks. I’ve seen Weblate, translatewiki.net, LingoHub, Lokalise and others. So far, I couldn’t find any other service with as many features and as much ease of use as Crowdin.

@Pau_Rosello_Van_Scho & @petri: Unfortunately, i18n is almost always an afterthought. I do believe that scripts to change the JavaScript files, Python source and DocTypes would be trivial, though (I could help with that) - the really hard work, IMO, would be to create the new msg ids and map them to the old ones.

fmorato · February 18, 2017, 4:03pm

For customized self-hosted options, there is also Pootle.

Tryton uses it.

FunKing · March 23, 2017, 8:45pm

OK, I’m finally back, and I’ve been working with the translations again. Besides some known problems (such as missing string IDs, mentioned by @petri), I’ve found others. One of them is that I can’t reliably identify the source strings to translate, and I don’t know how many there are.

The ERPNext translator currently says there are 7,081 strings to translate. I downloaded the latest translations (bench download-translations) and asked for the missing translations for pt-BR (bench get-untranslated pt-BR missing_pt-BR.txt). No matter which language I ask for, and with or without the --all parameter, the result is always “4953 missing translations of 4953”.

Edit: in ~/frappe-bench/apps/frappe/frappe/translate.py, line 575, I added a print statement to see how many strings were found per app. The result:

2710 strings for frappe
2379 strings for erpnext
Total = 5089 strings
After “deduplication”: 4953 strings

The files with the most strings for Frappe and ERPNext that I’ve found are frappe/translations/ja.csv (2732 strings) and erpnext/translations (4991 strings), totaling 7723 (more than the number of strings in the translator)…

Another count (in ~/frappe-bench/apps), obviously wrong (too simplistic), but I wonder why it is so different:

2934 | grep -r "_(" --include="*.py" | grep -v __ | wc -l
2151 | grep -r "__(" --include="*.js" | wc -l
5167 | grep -r '"label":' --include="*.json" | wc -l
10252 | Total

It feels like everything is broken!

Can anyone tell me if there is an effective way to get all English source strings without reinventing the wheel?

Pau_Rosello_Van_Scho · March 23, 2017, 10:37pm

Hello!

I am speaking only from my own experience here, not from verified facts:

The 4953 is the correct number of strings to translate and the bench commands work perfectly for the current state.
The difference between the 4953 and the 7723 might be caused by changes to the original strings. If you had “Hello” and then changed it to “Hello world”, the original “Hello” is not removed from the translation file.
When you arrive at the 10252, you are not taking duplicates into account. A _(“Hello”) found in 5 different files is counted 5 times by your commands, when in reality it should be counted once, which is what bench does.
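The duplicate effect can be illustrated with a rough sketch. This is not how bench extracts strings; the regular expression below is a simplified stand-in that only catches `_("...")` and `__('...')` calls with a plain string literal (no escaped quotes, no multi-line strings), and the file contents are made up.

```python
import re

# Simplified pattern: _("...") or __('...') with a plain string literal.
# The \b keeps names such as __init__( or foo_( from matching.
CALL_RE = re.compile(r"""\b__?\(\s*(["'])(.*?)\1""")


def extract_strings(source):
    """Return every translatable string found in one source text."""
    return [m.group(2) for m in CALL_RE.finditer(source)]


def count_strings(files):
    """files maps filename -> source text. Returns (occurrences, unique)."""
    occurrences = []
    for text in files.values():
        occurrences.extend(extract_strings(text))
    return len(occurrences), len(set(occurrences))


# Made-up sources: "Hello" appears in three files.
files = {
    "a.py": 'frappe.msgprint(_("Hello"))\nfrappe.throw(_("Not allowed"))\n',
    "b.py": 'label = _("Hello")\n',
    "c.js": "frappe.msgprint(__('Hello'));\n",
}
total, unique = count_strings(files)
print(total, unique)  # 4 occurrences, 2 unique strings
```

A grep-and-count approach reports the first number, while a deduplicating extractor reports the second.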

We need the string IDs, but a migration process should be defined to make it as smooth as possible. At some point I might try to do it, or I will help someone who is already doing it.

Regards!

rmehta · March 27, 2017, 7:38am

@kickapoo has created an in-app translation UI that might also be a good replacement for the translate portal.

Then you can translate in-app and just contribute. @kickapoo do you want to take that function out from your custom app and add it as a core feature in Frappe?

FunKing · March 30, 2017, 1:42pm

Thanks, @Pau_Rosello_Van_Scho! You’re right, and the ERPNext translation is in worse shape than I expected: we are translating strings that no longer exist in the program!
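As a rough sketch of how such leftover entries could be spotted before a migration: given the set of strings currently found in the source (however it is obtained) and a translation file, split the rows into those still in use and those that are stale. The two-column CSV layout (source text, translation) is an assumption, and the function name and sample data are hypothetical.

```python
import csv
import io


def split_stale(csv_text, current_sources):
    """Split translation rows into (live, stale).

    A row is stale when its first column (the source string) is no longer
    present in current_sources.
    """
    live, stale = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        (live if row[0] in current_sources else stale).append(row)
    return live, stale


# Made-up data: "Hello" was changed to "Hello world" in the source,
# but the old row is still in the translation file.
current = {"Hello world", "Save"}
translations = "Hello,Olá\nHello world,Olá mundo\nSave,Salvar\n"
live, stale = split_stale(translations, current)
print(len(live), len(stale))  # 2 live rows, 1 stale row
```

Only the live rows would need to be imported into a new platform; the stale ones could be kept aside as translation-memory hints.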

I’d love to help in the migration process in any way that I can.

@rmehta, Crowdin also has an in-context translation tool, including some advanced features and integration into their collaborative portal.