Database for "dtd" files

riiis
Lunatic
Posts: 471
Joined: 2014-05-17, 15:51
Location: USA

Unread post by riiis » 2016-09-29, 23:47

I've assembled the raw data to create a hierarchical database for Pale Moon's dtd/localization files (including the "dtd" files in the 30+ Language Packs). The following is what this database might look like:
[Image: FileName Database]
The "FileName Database" contains a record for each "ENTITY" included in that file. The database does not include the ENTITY statements themselves, but merely a pointer or link to those ENTITY statements, which are kept in another database file. ENTITY statements are obtained either from the "ENTITY" database (for the default "en-US" locale) or from the "Locale" database (for the Language Packs).
[Image: ENTITY Database]
[Image: Locale Database]
The "EntityName" is both the record key field for a specific "ENTITY" and part of the "ENTITY" data itself. Thus, this EntityName or key field must be unique. The "EntityName" key remains unique even when the same "EntityName" is used in multiple "dtd" files, as long as its string value is the same in all of these files. That is, the same "EntityName" cannot be used to say one thing in one dtd file (like "chocolate") and something quite different in another dtd file (like "vanilla"). There are a number of such conflicting duplicate keys in Pale Moon's dtd files. Fortunately, these duplicates mostly appear in a small number of pairs of files (such as "colors.dtd" and "fonts.dtd"). The EntityNames would have to be changed in one or both of these files; alternatively, one or both of these files could be left out of the database.
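
The uniqueness check described above could be automated along these lines. This is only a rough sketch: the function name `find_conflicts`, the regular expression, and the sample strings are assumptions made for illustration, and the pattern only handles double-quoted values.

```python
import re
from collections import defaultdict

# Matches declarations such as: <!ENTITY name "value">
# Parameter entities (<!ENTITY % name SYSTEM "...">) are deliberately not matched.
ENTITY_RE = re.compile(r'<!ENTITY\s+([\w.\-]+)\s+"([^"]*)"\s*>', re.DOTALL)

def find_conflicts(files):
    """files maps a dtd file name to its text.

    Returns the EntityNames that carry more than one distinct string value,
    together with the files in which each value appears.
    """
    seen = defaultdict(dict)  # EntityName -> {value: [file names]}
    for file_name, text in files.items():
        for name, value in ENTITY_RE.findall(text):
            seen[name].setdefault(value, []).append(file_name)
    return {name: values for name, values in seen.items() if len(values) > 1}

# Hypothetical sample data, not copied from the real Pale Moon files.
sample = {
    "colors.dtd": '<!ENTITY window.title "Colors">\n<!ENTITY ok.label "OK">',
    "fonts.dtd": '<!ENTITY window.title "Fonts">\n<!ENTITY ok.label "OK">',
}
conflicts = find_conflicts(sample)
print(conflicts)
```

Here "ok.label" is reused with the same value in both files and stays a valid key, while "window.title" is reported because its value differs between the two files.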

Obtaining the data for this database is administratively very simple, taking just a few minutes for each locale. And the database can be kept in LibreOffice or Microsoft Office, instead of a fancier or more expensive solution. To create the database data:
1. Obtain the text editor "Notepad++". Add the "Column sorting" and "Combine" plugins to Notepad++. Also obtain the "Bandizip Archive Manager" if you intend to open "omni.ja" files. For Language Packs or other "xpi" files, most other archive managers will work.
2. Copy the locale files from the Language Pack to a separate folder. There are two groups of locale files in each Language Pack (corresponding to the two "omni.ja" files in Pale Moon). Delete every file in this new folder that is not a "dtd" file.
3. Drag and drop the folder with your "dtd" files onto an open Notepad++ window. All "dtd" files should now be open in Notepad++. Optionally, clean up the files by replacing all double spaces (<space><space>) in all open files with a single space. Repeat until there are "0" instances of double spaces. Then click "Combine" on the Plugins menu to combine all the dtd files into one new file. Then click "Column sorting" to sort the file. Delete all lines that are not "ENTITY" lines. You're basically done at this point, except for the issue of "ENTITY" statements that occupied more than one line prior to the sort. "ENTITY" statements must be terminated by "> and sorting line by line separates this termination from the start of the statement. It's usually easier to find and fix these truncated "ENTITY" statements before moving your combined dtd file to a spreadsheet (which is the next step).
4. The Notepad++ file is now a one-column list of ENTITY statements. Copy the list and paste it into a spreadsheet. Next, edit the ENTITY statements in Notepad++ until all that remains on each line is the "EntityName", followed by one space, followed by one quote mark, followed by the ENTITY value (there should be no "> terminations). Next, replace the space and quote mark on each line with a tab. Then paste this edited ENTITY list into an empty column to the right of the ENTITY statements pasted previously. The "EntityName" should now be split out into its own column, followed by the ENTITY string value, also in its own column.
5. Next repeat these steps for the remaining group of locale files.
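
For anyone who would rather script the steps above than do them by hand in Notepad++, a rough equivalent might look like the following. It is a sketch under assumptions: the helper names `dtd_to_rows` and `rows_to_tsv` and the sample text are made up for illustration, and the pattern only handles double-quoted values (it would also pick up declarations that sit inside comments).

```python
import re

# Matches declarations such as <!ENTITY name "value">, including values that
# span several lines. Parameter entities (<!ENTITY % ...>) are not matched.
ENTITY_RE = re.compile(r'<!ENTITY\s+([\w.\-]+)\s+"([^"]*)"\s*>', re.DOTALL)

def dtd_to_rows(text):
    """Return sorted (EntityName, value) pairs from the combined dtd text."""
    rows = []
    for name, value in ENTITY_RE.findall(text):
        # Collapse runs of whitespace, including line breaks in multi-line values.
        rows.append((name, " ".join(value.split())))
    return sorted(rows)

def rows_to_tsv(rows):
    """One record per line, tab-separated, ready to paste into a spreadsheet."""
    return "\n".join(f"{name}\t{value}" for name, value in rows)

# Hypothetical sample, not taken from a real Language Pack.
sample_dtd = '''<!-- sample file -->
<!ENTITY  quit.label   "Quit">
<!ENTITY about.text "A long value that
   continues on a second line">
'''
rows = dtd_to_rows(sample_dtd)
print(rows_to_tsv(rows))
```

Because the statements are parsed before they are sorted, multi-line ENTITY statements are not cut apart, which avoids the manual repair mentioned in step 3.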
Last edited by riiis on 2016-09-30, 14:10, edited 1 time in total.

Moonchild
Pale Moon guru
Posts: 35474
Joined: 2011-08-28, 17:27
Location: Motala, SE

Re: Database for "dtd" files

Unread post by Moonchild » 2016-09-30, 00:39

...and then you end up with data in a spreadsheet which is not meant to be used as a database, no matter how many people end up doing this :P

Also... What exactly is your suggestion/request? Because I fail to see the relevance here.
"Sometimes, the best way to get what you want is to be a good person." -- Louis Rossmann
"Seek wisdom, not knowledge. Knowledge is of the past; wisdom is of the future." -- Native American proverb
"Linux makes everything difficult." -- Lyceus Anubite

Re: Database for "dtd" files

Unread post by riiis » 2016-09-30, 14:19

To fix the issues you may be having with maintaining the 33 Language Packs at GitHub, I was suggesting database software: LibreOffice "Base" (not the spreadsheet software, LibreOffice "Calc") or Microsoft "Access" (not the spreadsheet software, Microsoft "Excel"). I don't have any experience with "Base". But "Access" is sufficient, many times over, for your apparent need for a Localization Database, and I assume "Base" would also be more than sufficient. Additional forms and/or reports, and additional virtual databases, will also have to be written. But once that is done, a well-designed database, given your limited record volume and activity, should require little time or maintenance.

At the click of a button, a Localization Database can write-to-file (perfectly formatted and as frequently as needed) the over 270,000 data records and over 8600 dtd and properties files, for the 33 Language Packs available on GitHub. A Localization Database can also provide translators simple lists of just those items requiring translation (such as English strings in the left-hand column, and space for entering their translated strings in the right-hand column).
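
As an illustration of the translator list described above, here is a small sketch. The function name `translation_worklist` and the sample strings are hypothetical, and it assumes the strings have already been loaded into simple EntityName-to-string mappings.

```python
def translation_worklist(en_us, locale):
    """Return (EntityName, English string, current locale string) rows.

    en_us and locale map EntityName -> string. A row is listed when the locale
    has no entry, or when its entry is identical to the English string.
    Identical strings can be legitimate (brand names, "OK"), so the result is
    a list for a translator to review, not a list of confirmed errors.
    """
    rows = []
    for name in sorted(en_us):
        current = locale.get(name, "")
        if current == "" or current == en_us[name]:
            rows.append((name, en_us[name], current))
    return rows

# Hypothetical sample data.
en_us = {"quit.label": "Quit", "open.label": "Open", "new.label": "New Tab"}
de = {"quit.label": "Beenden", "open.label": "Open"}

# English in the left-hand column, the locale's current string on the right.
for name, english, current in translation_worklist(en_us, de):
    print(f"{name}\t{english}\t{current}")
```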

Fixing Language Pack Localizations

Unread post by riiis » 2016-10-01, 17:20

The "dtd" files in the 33 Language Packs available at GitHub and palemoon.org contain perhaps thousands of erroneous ENTITY records. The records are erroneous because they provide strings in English where the strings should be in a language other than English. These erroneous records result from failsafe procedures apparently in effect for Language Packs (for Firefox as well as for Pale Moon). That is, they result from deliberate procedures to prevent the Pale Moon browser from malfunctioning, by substituting English strings for non-English strings that, for whatever reason, may be unavailable. Actually, the deliberate failsafe procedures are much worse than this. When new strings are needed, they are first added to the 33 Language Packs' "dtd" files in English. Translators are then expected (or hoped) to translate these hundreds of new erroneous "dtd" file records into the Language Packs' respective languages. The sad thing is that these failsafe procedures, and the thousands of erroneous records they create in the Language Packs' "dtd" files, are totally unnecessary. And the solution to Language Pack issues is not to make Localizations "not a priority". Instead, the solution is to make Localizations "not a problem". To that end, my suggestion consists of the following steps:

1. Rename all the Language Pack "dtd" files to something else. For example, rename "browser.dtd" to say "browserLP.dtd".
2. Copy the "dtd" files from Tycho's default "en-US" locale to each Language Pack. For example, each Language Pack will now have both a "browser.dtd" file (with English strings) and a "browserLP.dtd" file (with non-English strings).
3. Add the non-English "dtd" files to the browser files which use these Entities. For example, change the appropriate section of "browser.xul" from this:

<!DOCTYPE window [
<!ENTITY % brandDTD SYSTEM "chrome://branding/locale/brand.dtd" >
%brandDTD;
<!ENTITY % browserDTD SYSTEM "chrome://browser/locale/browser.dtd" >
%browserDTD;
<!ENTITY % palemoonDTD SYSTEM "chrome://browser/locale/palemoon.dtd" >
%palemoonDTD;
<!ENTITY % baseMenuDTD SYSTEM "chrome://browser/locale/baseMenuOverlay.dtd" >
%baseMenuDTD;
<!ENTITY % charsetDTD SYSTEM "chrome://browser/locale/charsetMenu.dtd" >
%charsetDTD;
<!ENTITY % textcontextDTD SYSTEM "chrome://global/locale/textcontext.dtd" >
%textcontextDTD;
<!ENTITY % customizeToolbarDTD SYSTEM "chrome://global/locale/customizeToolbar.dtd">
%customizeToolbarDTD;
<!ENTITY % placesDTD SYSTEM "chrome://browser/locale/places/places.dtd">
%placesDTD;
<!ENTITY % aboutHomeDTD SYSTEM "chrome://browser/locale/aboutHome.dtd">
%aboutHomeDTD;
]>

To perhaps this:

<!DOCTYPE window [
<!ENTITY % brandDTD SYSTEM "chrome://branding/locale/brand.dtd" >
%brandDTD;
<!ENTITY % browserlpDTD SYSTEM "chrome://browser/locale/browserLP.dtd" >
%browserlpDTD;
<!ENTITY % palemoonlpDTD SYSTEM "chrome://browser/locale/palemoonLP.dtd" >
%palemoonlpDTD;
<!ENTITY % baseMenulpDTD SYSTEM "chrome://browser/locale/baseMenuOverlayLP.dtd" >
%baseMenulpDTD;
<!ENTITY % charsetlpDTD SYSTEM "chrome://browser/locale/charsetMenuLP.dtd" >
%charsetlpDTD;
<!ENTITY % textcontextlpDTD SYSTEM "chrome://global/locale/textcontextLP.dtd" >
%textcontextlpDTD;
<!ENTITY % customizeToolbarlpDTD SYSTEM "chrome://global/locale/customizeToolbarLP.dtd">
%customizeToolbarlpDTD;
<!ENTITY % placeslpDTD SYSTEM "chrome://browser/locale/places/placesLP.dtd">
%placeslpDTD;
<!ENTITY % aboutHomelpDTD SYSTEM "chrome://browser/locale/aboutHomeLP.dtd">
%aboutHomelpDTD;
<!ENTITY % browserDTD SYSTEM "chrome://browser/locale/browser.dtd" >
%browserDTD;
<!ENTITY % palemoonDTD SYSTEM "chrome://browser/locale/palemoon.dtd" >
%palemoonDTD;
<!ENTITY % baseMenuDTD SYSTEM "chrome://browser/locale/baseMenuOverlay.dtd" >
%baseMenuDTD;
<!ENTITY % charsetDTD SYSTEM "chrome://browser/locale/charsetMenu.dtd" >
%charsetDTD;
<!ENTITY % textcontextDTD SYSTEM "chrome://global/locale/textcontext.dtd" >
%textcontextDTD;
<!ENTITY % customizeToolbarDTD SYSTEM "chrome://global/locale/customizeToolbar.dtd">
%customizeToolbarDTD;
<!ENTITY % placesDTD SYSTEM "chrome://browser/locale/places/places.dtd">
%placesDTD;
<!ENTITY % aboutHomeDTD SYSTEM "chrome://browser/locale/aboutHome.dtd">
%aboutHomeDTD;
]>

4. If necessary, add empty files to "en-US" corresponding to the renamed files in the Language Packs ("browserLP.dtd", "palemoonLP.dtd", etc.). Also, add empty "...LP.dtd" files to the Language Packs corresponding to the new "dtd" files added to "en-US".

The browser will now search for non-English strings first (among the "...LP.dtd" files). Finding none there (as is always the case for "en-US"), the browser will fall back to the ENTITY strings in "en-US". All 33 Language Packs will now be good to go in the Tycho version of Pale Moon. New strings needing translation could be translated at any time in the future, or not at all. Localizations are no longer a problem, as well as no longer a priority.
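
This fallback relies on the XML 1.0 rule that when the same entity is declared more than once, the first declaration is binding and later ones are ignored. The sketch below only models that rule on plain strings; it does not invoke the browser's parser, and the `resolve` function and the German sample strings are assumptions for illustration.

```python
import re

# Matches declarations such as <!ENTITY name "value"> (double-quoted values only).
ENTITY_RE = re.compile(r'<!ENTITY\s+([\w.\-]+)\s+"([^"]*)"\s*>', re.DOTALL)

def resolve(dtd_texts):
    """Resolve entities from dtd texts given in load order; the first declaration wins."""
    resolved = {}
    for text in dtd_texts:
        for name, value in ENTITY_RE.findall(text):
            resolved.setdefault(name, value)  # later duplicates are ignored
    return resolved

# Hypothetical contents of the two files in a German Language Pack.
browser_lp = '<!ENTITY quit.label "Beenden">'                            # browserLP.dtd
browser_en = '<!ENTITY quit.label "Quit">\n<!ENTITY new.label "New Tab">'  # browser.dtd

strings = resolve([browser_lp, browser_en])
print(strings)
```

The translated "quit.label" takes precedence because "browserLP.dtd" is loaded first, while the untranslated "new.label" falls through to the English string.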

New Tobin Paradigm

Re: Database for "dtd" files

Unread post by New Tobin Paradigm » 2016-10-01, 18:52

Those langpacks are invalid for tycho.

We are also not going to rearchitect l10n. Nor are we gonna rely on an office suite which is a piss poor solution at best.

Re: Database for "dtd" files

Unread post by riiis » 2016-10-01, 21:06

Matt A Tobin wrote: Those langpacks are invalid for tycho.
Your reply is somewhat curious.

Pale Moon 27 (Tycho) uses the exact same DTD format and processes as Pale Moon 26, as Firefox extensions, and indeed as the existing Language Packs. If the DTD format and processes in Pale Moon 27 were different from those in Pale Moon 26, no Firefox or Pale Moon add-ons currently available would run in Tycho, not just Language Packs. Yet many Firefox and Pale Moon extensions for Pale Moon 26 clearly do run in Tycho.

Approximately 177 entities were added to Tycho since Pale Moon 26, while 3178 entities remained the same. By adding the 177 new entities to the existing Language Packs' DTD files (and doing the same for new strings in the properties files), these Language Packs will absolutely work in Tycho.
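
A comparison of this kind can be reproduced with a few set operations. The sketch below is an assumption about how such counts might be derived, not the procedure actually used for the figures above; `compare_versions` and the sample strings are hypothetical. It also reports removed and changed entities, which the counts above do not cover.

```python
def compare_versions(old, new):
    """old and new map EntityName -> English string for two releases."""
    common = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "unchanged": sorted(n for n in common if old[n] == new[n]),
        "changed": sorted(n for n in common if old[n] != new[n]),
    }

# Hypothetical sample data, not the real Pale Moon 26 / 27 strings.
pm26 = {"quit.label": "Quit", "open.label": "Open", "old.label": "Old"}
pm27 = {"quit.label": "Quit", "open.label": "Open File", "sync.label": "Sync"}

report = compare_versions(pm26, pm27)
for category, names in report.items():
    print(category, len(names), names)
```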

New Tobin Paradigm

Re: Database for "dtd" files

Unread post by New Tobin Paradigm » 2016-10-01, 21:26

There is no base to work from but Tycho en-US, because there are no valid langpacks for Tycho and trying to mangle ones from previous versions isn't practical. I think you underestimate what has actually changed and what Tycho is.

Re: Database for "dtd" files

Unread post by Moonchild » 2016-10-01, 22:06

Having a master database for a localization effort is a good idea, as long as it is handled responsibly and any changes to it are properly version-tracked with blame logs.
In database form, this can't be done with source code repository tracking, though, so a different tool will be needed for it.

I'm moving this thread to the localization board, where it belongs. Also, please take note of the announcement I made there.
