Cutting back l10n / Transifex is a bust
Some bad news: Our localization solution (Transifex) has proven to have such extremely poor file format support that most of our actively used translations have been mangled to hell.
We are cutting back our localization support to only those languages we actually have translators for, as well.
Transifex's file format support for both DTD and Properties files is terribly broken. We cannot use this system any longer, and I'll be completely pulling out - Transifex is not usable at all for either format because of automatic conversion of special characters and escaped Unicode entities that should NOT be touched.
I'm sorry if I wasted anyone's time with this. I'll be working overtime to get as many languages as possible into acceptable shape for the v26 release, using the exports of the languages that have been worked on here, and I do thank everyone for their time. I'll have to look into another solution that actually supports our formats, or set time aside to make a translator application myself.
I'm sad that in 2016, we still have such terribly poor support for relatively simple and straightforward file formats in what could be considered professional frameworks.
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite
Re: Cutting back l10n / Transifex is a bust
That's unfortunate! What is the status of babelzilla being used for localization?
I found one alternative service that sounds promising. Have you heard of it?
http://zanata.org/
https://github.com/zanata/zanata-server
Supported project types are listed in here:
http://docs.zanata.org/en/release/user- ... ect-types/
Re: Cutting back l10n / Transifex is a bust
jumba wrote: That's unfortunate!
Understatement of the year.
jumba wrote: What is the status of babelzilla being used for localization?
We can't. From the start they've been refusing full language packs, and urged me to find a different solution.
Their underlying WTS system is available on GitHub, but I have not been able to get it to work. If someone can help with that, then that would be a good solution - I have server capacity for it.
jumba wrote: I found one alternative service that sounds promising. Have you heard of it?
http://zanata.org/
I haven't heard of it but I'll have to look for alternatives later. The main problem is that most available translation server software doesn't support our language formats.
Re: Cutting back l10n / Transifex is a bust
Hi, I'm Dimitris, the founder of Transifex.
As a heads up, this bug (DTD support for special characters in `<!ENTITY` tags) has been added to our backlog for development. It's a corner-case to us (the majority of the Transifex projects (open source or not) are not using this), so it hasn't surfaced as something urgent to implement. The issue is that Transifex prefers escaped characters (`&amp;` instead of `&`) in the entities. As a workaround until there is a solution, you can run a small script to escape these characters before pushing to Transifex.
Regarding the .properties file, I believe this was a bug in the Windows Transifex client which has been fixed? I'm not sure though, since this was reported back in October and there was a client release since then.
If you decide to go, we'll understand, and feel bad we couldn't have served you guys better.
Re: Cutting back l10n / Transifex is a bust
Thanks for poking your head in here, Dimitris.
glezos wrote: As a heads up, this bug (DTD support for special characters in `<!ENTITY` tags) has been added to our backlog for development. It's a corner-case to us (the majority of the Transifex projects (open source or not) are not using this), so it hasn't surfaced as something urgent to implement. The issue is that Transifex prefers escaped characters (`&amp;` instead of `&`) in the entities. As a workaround until there is a solution, you can run a small script to escape these characters before pushing to Transifex.
The problem is that these characters should be left alone. They should neither be escaped nor unescaped if they occur as parameters inside tags! Doing so in any document with this kind of XML-based structure will cause problems, not just for DTDs.
As I already explained to Nina, you will find both escaped and unescaped tokens in the DTD entities, by design. Any sort of conversion either way will mangle this. You should leave those strings alone!
== Excerpt from my mail to Nina ==
Some strings have escaped entities that would be used literally in XUL windows. e.g.:
Code:
<!ENTITY securityView.privacy.header "Privacy &amp; History">
<!ENTITY button-next-win.label "Next &gt;">
Some strings have unescaped entities because the contents are parameters to be replaced at run-time (calling DTD entities within other DTD entities). e.g.:
Code:
<!ENTITY brandFullName "Pale Moon">
<!ENTITY aboutDialog.title "About &brandFullName;">
(becomes "About Pale Moon")
Some strings have additional unescaped entities because they are literal HTML snippets. e.g.:
Code:
<!ENTITY certerror.introPara1 "You have asked &brandShortName; to connect securely to <b>#1</b>, but we can't confirm that your connection is secure.">
If Transifex's format handler unescapes them before inserting them in the database then the former becomes incorrect in the database.
Escaping upon exporting from the database creates another, bigger, problem in that all run-time parameters become escaped and as such invalid for use in the application.
Just blindly un/escaping all regardless of where in the DTD the strings occur is a bug in your format handler, as explained in my previous mail. You should never, never, ever convert entities inside parameter strings as you are doing.
Even in the case of website translation this would be absolutely wrong!
Code:
<a href="http://example.com/file.php?param=value&param2=value2">
after your translation would become
Code:
<a href="http://example.com/file.php?param=value&amp;param2=value2">
which is dead wrong.
=== end excerpt ===
The problem I indicated, but may not have come across, is that strings inside tags as parameters should never be converted. Only strings between tags (in the actual document body) should be converted.
1) is a literal string inside a tag
Code:
<!ENTITY name "Value &parameter; is OK">
(the common format for Mozilla language DTD files for localization)
2) is a literal string between tags
Code:
<TAG>Value &parameter; is OK</TAG>
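To make the distinction above concrete, here is a minimal Python sketch of what "leave those strings alone" means for a format handler. It is an illustration only, not Transifex's handler or anything Pale Moon ships: the names `parse_entities` and `serialize_entities` are made up for this example, and the regular expression is an assumption that only covers simple one-line declarations like the ones quoted above.

```python
import html
import re

# Assumption: only simple one-line <!ENTITY name "value"> declarations.
ENTITY_RE = re.compile(r'<!ENTITY\s+([\w.\-]+)\s+"([^"]*)">')

SOURCE = '''<!ENTITY securityView.privacy.header "Privacy &amp; History">
<!ENTITY brandFullName "Pale Moon">
<!ENTITY aboutDialog.title "About &brandFullName;">
'''


def parse_entities(text):
    """Return {name: raw value}; values are stored exactly as written."""
    return {m.group(1): m.group(2) for m in ENTITY_RE.finditer(text)}


def serialize_entities(entities):
    """Write values back exactly as stored, with no escaping or unescaping."""
    return "".join(
        f'<!ENTITY {name} "{value}">\n' for name, value in entities.items()
    )


entities = parse_entities(SOURCE)

# Verbatim handling reproduces the source file exactly.
round_trip_ok = serialize_entities(entities) == SOURCE

# Blind unescaping on import leaves a bare "&" in the first string.
unescaped = html.unescape(entities["securityView.privacy.header"])

# Blind escaping on export breaks the entity reference in the third string.
escaped = html.escape(entities["aboutDialog.title"], quote=False)
```

With these inputs `round_trip_ok` is true, `unescaped` contains a bare `&`, and `escaped` contains `&amp;brandFullName;`, which no longer resolves to the brand name at run-time.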
glezos wrote: Regarding the .properties file, I believe this was a bug in the Windows Transifex client which has been fixed?
I never used the client because I could never get it to work; it would either flat out refuse, or it would not be allowed to push source files to the server -- I've done all my file submissions through the web interface directly to your server, after that.
So, this, too, is likely a format handler issue, since this also deals with unintended conversions, e.g. \u0020 for a hard leading or trailing space is stripped, and other \u values are converted to their Unicode/UTF-8 characters. They are in the files as escaped sequences for a damned good reason.
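As a rough illustration of the .properties problem, the Python sketch below contrasts a lossy handler with a verbatim one. The sample keys and both function names are hypothetical, and the parsing is deliberately simplified (it only splits on the first `=`); it is not a reconstruction of what Transifex actually does internally.

```python
# Hypothetical .properties lines; the keys are made up for illustration.
LINES = [
    r"list.separator=\u0020-\u0020",
    r"cafe.label=Caf\u00e9",
]


def naive_value(line):
    # Lossy: decode the \uXXXX escapes, then trim surrounding whitespace.
    raw = line.split("=", 1)[1]
    return raw.encode("ascii").decode("unicode_escape").strip()


def verbatim_value(line):
    # Lossless: hand back the text after the first "=" untouched.
    return line.split("=", 1)[1]


naive = [naive_value(line) for line in LINES]
verbatim = [verbatim_value(line) for line in LINES]
```

In the lossy variant the hard leading and trailing spaces around the separator disappear and the escape in the second value is replaced by the literal character; the verbatim variant returns both values exactly as they appear in the file.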
glezos wrote: If you decide to go, we'll understand, and feel bad we couldn't have served you guys better.
Considering you can neither offer proper DTD nor properties support, and having run into the additional issue that equally-named entities in different locations in a monolithic file (which has been required because your client never worked) are equalized to a single entry (and context or uniqueness of entries is ignored; that is an essential flaw!), Transifex simply doesn't work for us.
100% match repetitions: entities with the same tag don't necessarily have to be translated to the same target string if they end up in a different file or context...
The latter especially forces me to go through and compare against the previous sources and reference material to correct, a rather tedious and time-consuming task given the volume. I do not want to ever do that more than once, so whatever we end up using will have to be able to handle it flawlessly after this disaster.
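The repetition problem can be sketched in a few lines of Python. Everything in this example is hypothetical (file paths, entity name, placeholder translations, function names); it only shows why keying strings by entity name alone collapses entries that a (file, name) key keeps apart.

```python
# Hypothetical entries: (source file, entity name, translated string).
STRINGS = [
    ("browser/preferences.dtd", "close.label", "translation A"),
    ("browser/downloads.dtd", "close.label", "translation B"),
]


def store_by_name(strings):
    # Keyed by entity name only: a later entry silently overwrites an earlier one.
    table = {}
    for _file, name, text in strings:
        table[name] = text
    return table


def store_by_context(strings):
    # Keyed by (file, name): identically named entities stay separate.
    return {(file, name): text for file, name, text in strings}


by_name = store_by_name(STRINGS)
by_context = store_by_context(STRINGS)
```

Here `by_name` ends up with a single entry holding only the second translation, while `by_context` keeps both, which is the behaviour a replacement tool would need.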