Internationalization and Localization

Overview

This set of documents is based on these fundamental definitions:

  • Internationalization is designing a software application so that it can be adapted to various languages and regions without engineering changes.

  • Localization is adapting software for a specific region or language by translating text and adding locale-specific components for different markets worldwide. Localization can be performed multiple times, once for each locale, using the flexibility and infrastructure provided by internationalization (which is preferably performed once only).

Note

These terms are frequently abbreviated to i18n and L10n respectively. The number in each abbreviation is the count of letters omitted between the first and the last letter of the word.

In some cases, internationalization is simple. For example, making a US application accessible to Australian or British users may require little more than a few spelling corrections. But to make a US application usable by Japanese users, or a Korean application usable by German users, the software must not only operate in different languages, but also use different input techniques and presentation conventions.

APS tries to make internationalization as painless as possible for developers.

For more details about internationalization and localization, refer to other sources, such as Wikipedia.

Supported Charsets

The only supported charset is UTF-8. All data across the APS framework is kept in UTF-8, on both input and output.

Definitions

Locale Name

A locale name can define either a language only in the form ll, or language and country in the form ll_CC. For example:

Name     Description
en       English
en_US    English used in United States
en_CA    English used in Canada
de       German
de_AT    German used in Austria

  • The two-letter primary code (ll) is defined by the ISO 639 language specification.

  • The two-letter subcode (CC) is interpreted according to the ISO 3166 country specification.

The language part is always written in lower case and the country part in upper case. The separator is an underscore (“_”).
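The naming rules above can be checked mechanically. The following Python sketch is an illustration only, not part of APS; the function name parse_locale_name is hypothetical. It verifies the shape of a name (ll or ll_CC) and does not check that the codes are actually registered in ISO 639 or ISO 3166.

```python
import re

# ll (lowercase language), optionally followed by _CC (uppercase country).
LOCALE_NAME = re.compile(r"([a-z]{2})(?:_([A-Z]{2}))?")


def parse_locale_name(name):
    """Return (language, country) for a well-formed locale name, else None.

    country is None when the name carries a language only.
    """
    match = LOCALE_NAME.fullmatch(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_locale_name("en"))     # ('en', None)
print(parse_locale_name("de_AT"))  # ('de', 'AT')
print(parse_locale_name("en-us"))  # None: wrong separator and case
```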

Language Code in the HTTP Header

Browsers send the names of the languages they accept in the Accept-Language field of the HTTP header using the format generally defined by RFC 1766. For example:

Code     Description
ar       Arabic
en-au    English used in Australia
en-ca    English used in Canada
ar-aa    Arabic (Unitag)

Both the language and the country parts are in lower case. The separator is a dash (“-”).
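Because the HTTP form differs from the locale name form in case and separator, a conversion step is needed somewhere between the two. The following Python sketch shows one possible conversion. It is not the APS implementation, and the function name accept_language_to_locales is hypothetical. It assumes tags of the form ll or ll-cc only, and it honors the optional q weights that HTTP allows in this header.

```python
def accept_language_to_locales(header):
    """Convert an Accept-Language value into locale names, best match first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag or tag == "*":
            continue  # skip empty items and the wildcard
        quality = 1.0  # HTTP default when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # unreadable weight: rank it last
        language, _, country = tag.partition("-")
        name = language.lower()
        if country:
            name += "_" + country.upper()
        # Sort by descending quality, then by original position.
        weighted.append((-quality, position, name))
    return [name for _, _, name in sorted(weighted)]


print(accept_language_to_locales("en-au, ar;q=0.8, en-ca;q=0.9"))
# ['en_AU', 'en_CA', 'ar']
```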

Translation String

A translation string is a literal in a source file that can be translated. In some cases (JavaScript and HTML code), to make a string translatable, you need to add a hook to it. Such hooks tell the system: “This string should be translated into the end user’s language, if a translation for this string is available in that language.” It is the responsibility of the package developer to mark translation strings. The system can translate only those strings for which it has a translation in the required language.
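The "translate if available" rule amounts to a lookup with a fallback to the original string. The following Python sketch illustrates that behavior only. The CATALOGS dictionary and the translate function are hypothetical and do not represent the APS runtime.

```python
# Hypothetical in-memory catalogs: locale name -> {translation string: translation}.
CATALOGS = {
    "es_ES": {"Diskspace - Usage Only": "Espacio en disco - Solamente Uso"},
}


def translate(text, locale):
    """Return the translation of text for locale, or text itself if none exists."""
    catalog = CATALOGS.get(locale, {})
    # An empty translation means "not translated yet", so it is treated as missing.
    return catalog.get(text) or text


print(translate("Diskspace - Usage Only", "es_ES"))  # translated
print(translate("Action", "es_ES"))                  # no entry: original returned
print(translate("Action", "de"))                     # no catalog: original returned
```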

Message File

A message file is a plain-text file that represents a single language and contains all available translation strings and their translations in that language. A message file name has the .po extension.

This format is used by many well-known projects.

Please refer to the Gettext Manual for more details.

Translation Entry

A PO file is made up of many entries, with each entry describing the relationship between an original untranslated string and its corresponding translation:

  • msgid "Original translation string is here"

  • msgstr "Translated string is here"

For example:

msgid "Diskspace - Usage Only"
msgstr "Espacio en disco - Solamente Uso"

A typical PO file entry has the following schematic structure as defined at The Format of PO Files:

#  translator-comments
#. extracted-comments
#: reference...
#, flag...
#| msgid previous-untranslated-string
msgid "untranslated-string"
msgstr "translated-string"

The #: comment contains references to the places in the source code where the translation string appears.
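To make the entry structure concrete, the following Python sketch reads one simple (non-plural) entry into a dictionary. It is a simplified illustration, not a complete PO parser: it keeps only #: references, ignores the other comment kinds, and decodes escapes with ast.literal_eval, which approximates the PO rules for common escapes such as \n and \". The names parse_entry and unquote are hypothetical.

```python
import ast

ENTRY = '''\
#: actionlog/templates/object_action_list.html:7 txpermissions/forms.py:18
msgid "User"
msgstr ""
'''


def unquote(quoted):
    """Decode one double-quoted PO string (approximation via Python literals)."""
    return ast.literal_eval(quoted.strip())


def parse_entry(text):
    """Parse one simple (non-plural) PO entry into a dict."""
    entry = {"references": [], "msgid": "", "msgstr": ""}
    current = None  # keyword whose string is being continued
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#:"):
            entry["references"].extend(line[2:].split())
        elif line.startswith("#") or not line:
            continue  # other comment kinds are ignored in this sketch
        elif line.startswith('"') and current:
            entry[current] += unquote(line)  # continuation line
        else:
            keyword, _, rest = line.partition(" ")
            if keyword in ("msgid", "msgstr"):
                current = keyword
                entry[current] = unquote(rest)
    return entry


print(parse_entry(ENTRY))
```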

Plural Forms

Optionally, an entry can define plural forms. In addition to msgid, such an entry carries a msgid_plural string, and the translator provides one or more indexed msgstr translations, because in some languages the plural form depends on the actual number of objects passed into the string as a parameter. For example, the following PO entry provides three translated messages in Russian:

msgid "Found __itemsCount__ item"
msgid_plural "Found __itemsCount__ items"
msgstr[0] "Найден __itemsCount__ элемент"
msgstr[1] "Найдено __itemsCount__ элемента"
msgstr[2] "Найдено __itemsCount__ элементов"
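The index in msgstr[N] is chosen from the item count by a language-specific rule. The following Python sketch implements the rule commonly used for Russian in gettext catalogs and applies it to the three messages above. How APS itself evaluates the rule and substitutes __itemsCount__ is not described here, so the plain string replacement and the function names are assumptions made for illustration.

```python
def russian_plural_index(n):
    """Return the msgstr index for count n, per the common Russian gettext rule."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ... (but not 11)
    if 2 <= n % 10 <= 4 and not 10 <= n % 100 < 20:
        return 1  # 2-4, 22-24, ... (but not 12-14)
    return 2      # everything else: 0, 5-20, 25-30, ...


FORMS = [
    "Найден __itemsCount__ элемент",
    "Найдено __itemsCount__ элемента",
    "Найдено __itemsCount__ элементов",
]


def found_message(n):
    """Pick the plural form for n and fill in the count (illustrative substitution)."""
    return FORMS[russian_plural_index(n)].replace("__itemsCount__", str(n))


for count in (1, 3, 5, 11, 21):
    print(found_message(count))
```

In gettext-based catalogs, a rule of this kind is normally declared once per language in the PO file header rather than in application code.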

A PO file is composed of multiple PO entries whose layout (before the messages are translated) looks as follows:

# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2010-06-08 10:12+0300\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: actionlog/templates/object_action_list.html:7 txpermissions/forms.py:18
msgid "User"
msgstr ""

#: actionlog/templates/object_action_list.html:8
msgid "Action"
msgstr ""

#: foo/templates/bar.html:180
msgid "{0} result"
msgid_plural "{0} results"
msgstr[0] ""
msgstr[1] ""
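A file in this state has empty msgstr values for every entry except the header. The following Python sketch lists the entries that still need translation. It assumes the simple layout shown above: entries separated by blank lines and each string on a single line. The function name untranslated_msgids is hypothetical, and the "Usuario" translation in the sample is illustrative.

```python
import re

BLOCK_SPLIT = re.compile(r"\n\s*\n")
MSGID = re.compile(r'^msgid "(.*)"$', re.MULTILINE)
MSGSTR = re.compile(r'^msgstr(?:\[\d+\])? "(.*)"$', re.MULTILINE)

PO_TEXT = '''\
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

msgid "User"
msgstr "Usuario"

msgid "Action"
msgstr ""

msgid "{0} result"
msgid_plural "{0} results"
msgstr[0] ""
msgstr[1] ""
'''


def untranslated_msgids(po_text):
    """Return the msgid of each entry in which every msgstr is still empty."""
    missing = []
    for block in BLOCK_SPLIT.split(po_text.strip()):
        ids = MSGID.findall(block)
        if not ids or ids[0] == "":
            continue  # not an entry, or the header entry (msgid "")
        if all(value == "" for value in MSGSTR.findall(block)):
            missing.append(ids[0])
    return missing


print(untranslated_msgids(PO_TEXT))  # ['Action', '{0} result']
```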

Translation Strings in APS Packages

The following text elements of APS applications are translation strings:

  • In Metadata Descriptor (APP-META.xml), user-visible properties:

    • Application Name

    • Homepage

    • Software Vendor Information

    • Software Packager Information

    • Presentation

    • License Agreement

  • In JSON Schema - user-visible attributes:

  • In PHP scripts based on APS PHP runtime, the elements that are used for creating JSON attributes:

    • Titles

    • Descriptions

    • Enum titles

  • In HTML and JavaScript files based on APS JavaScript SDK:

    • Some default strings in widgets have a predefined translation for some languages mentioned later. They are not translation strings, but if it is necessary to provide a custom translation, these predefined strings can be added to PO files manually.

    • For custom translations, use the function _("translation_string") from the JavaScript runtime, which marks the translation string for extraction when PO files are created. The same function translates the string on the fly in the user’s browser.

The aps msgmake command extracts the translation strings into PO message files, one file per language. Each file gives translators a convenient place to provide the translations of the translation strings in the target language. This process relies on the GNU gettext toolset.

Once the translators have filled out the message files, the whole project must be compiled into the APS package. The APS functions and localization mechanism on the hosting management systems translate the translation strings on the fly into available languages, according to users’ language preferences.

Development Workflow

If you have completed all the steps required to internationalize your package, translating it into a particular language requires the additional steps shown in the following chart:

[Figure: i18n workflow chart (i18n-workflow.png)]

  1. Create the PO file in the i18n folder by applying the aps msgmake command to the package folder. For example, to start localizing the VPScloud package into Spanish, enter:

    $ aps msgmake -l es_ES VPScloud
    

    This will create the es_ES.po file inside the i18n folder.

  2. Fill in the msgstr strings inside the es_ES.po file in Spanish.

  3. Build the package:

    $ aps build VPScloud
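When several locales are maintained, steps 1 and 3 can be scripted. The following Python sketch only assembles and runs the two commands shown above. It assumes the aps tool is available on the PATH and uses no options other than those in the steps. The function names are hypothetical.

```python
import subprocess


def localization_commands(package, locale):
    """Return the two aps invocations from the workflow, in order."""
    return [
        ["aps", "msgmake", "-l", locale, package],  # step 1: create i18n/<locale>.po
        ["aps", "build", package],                  # step 3: build the package
    ]


def run_step(command):
    """Run one command and raise CalledProcessError if it fails."""
    subprocess.run(command, check=True)


make_po, build = localization_commands("VPScloud", "es_ES")
print(" ".join(make_po))  # aps msgmake -l es_ES VPScloud
print(" ".join(build))    # aps build VPScloud
```

Call run_step(make_po) first, fill in the msgstr strings by hand (step 2), and then call run_step(build).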