diff doc/i18n.txt @ 902:09cc3627654c experimental-inline
Sync `experimental/inline` branch with [source:trunk@1126].
author | cmlenz |
---|---|
date | Fri, 23 Apr 2010 21:08:26 +0000 |
parents | 1837f39efd6f |
children |
--- a/doc/i18n.txt
+++ b/doc/i18n.txt
@@ -4,15 +4,15 @@
 Internationalization and Localization
 =====================================
 
-Genshi provides basic supporting infrastructure for internationalizing
-and localizing templates. That includes functionality for extracting localizable
-strings from templates, as well as a template filter that can apply translations
-to templates as they get rendered.
+Genshi provides comprehensive supporting infrastructure for internationalizing
+and localizing templates. That includes functionality for extracting
+localizable strings from templates, as well as a template filter and special
+directives that can apply translations to templates as they get rendered.
 
 This support is based on `gettext`_ message catalogs and the `gettext Python
-module`_. The extraction process can be used from the API level, or through the
-front-ends implemented by the `Babel`_ project, for which Genshi provides a
-plugin.
+module`_. The extraction process can be used from the API level, or through
+the front-ends implemented by the `Babel`_ project, for which Genshi provides
+a plugin.
 
 .. _`gettext`: http://www.gnu.org/software/gettext/
 .. _`gettext python module`: http://docs.python.org/lib/module-gettext.html
@@ -28,18 +28,19 @@
 ======
 
 The simplest way to internationalize and translate templates would be to wrap
-all localizable strings in a ``gettext()`` function call (which is often aliased
-to ``_()`` for brevity). In that case, no extra template filter is required.
+all localizable strings in a ``gettext()`` function call (which is often
+aliased to ``_()`` for brevity). In that case, no extra template filter is
+required.
 
 .. code-block:: genshi
 
   <p>${_("Hello, world!")}</p>
 
-However, this approach results in significant “character noise” in templates,
+However, this approach results in significant “character noise” in templates,
 making them harder to read and preview.
 
 The ``genshi.filters.Translator`` filter allows you to get rid of the
-explicit `gettext`_ function calls, so you can continue to just write:
+explicit `gettext`_ function calls, so you can (often) just continue to write:
 
 .. code-block:: genshi
 
@@ -48,23 +49,28 @@
 This text will still be extracted and translated as if you had wrapped it in a
 ``_()`` call.
 
-.. note:: For parameterized or pluralizable messages, you need to continue using
-          the appropriate ``gettext`` functions.
+.. note:: For parameterized or pluralizable messages, you need to use the
+          special `template directives`_ described below, or use the
+          corresponding ``gettext`` function in embedded Python expressions.
 
-You can control which tags should be ignored by this process; for example, it
+You can control which tags should be ignored by this process; for example, it
 doesn't really make sense to translate the content of the HTML
 ``<script></script>`` element. Both ``<script>`` and ``<style>`` are excluded
 by default.
 
 Attribute values can also be automatically translated. The default is to
-consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``,
-``summary``, and ``title``, which is a list that makes sense for HTML documents.
-Of course, you can tell the translator to use a different set of attribute
-names, or none at all.
+consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``,
+``summary``, and ``title``, which is a list that makes sense for HTML
+documents. Of course, you can tell the translator to use a different set of
+attribute names, or none at all.
 
-In addition, you can control automatic translation in your templates using the
-``xml:lang`` attribute. If the value of that attribute is a literal string, the
-contents and attributes of the element will be ignored:
+----------------
+Language Tagging
+----------------
+
+You can control automatic translation in your templates using the ``xml:lang``
+attribute. If the value of that attribute is a literal string, the contents and
+attributes of the element will be ignored:
 
 .. code-block:: genshi
 
@@ -81,6 +87,256 @@
   </html>
 
 
+.. _`template directives`:
+
+Template Directives
+===================
+
+Sometimes localizable strings in templates may contain dynamic parameters, or
+they may depend on the numeric value of some variable to choose a proper
+plural form. Sometimes the strings contain embedded markup, such as tags for
+emphasis or hyperlinks, and you don't want to rely on the people doing the
+translations to know the syntax and escaping rules of HTML and XML.
+
+In those cases the simple text extraction and translation process described
+above is not sufficient. You could just use ``gettext`` API functions in
+embedded Python expressions for parameters and pluralization, but that does
+not help when messages contain embedded markup. Genshi provides special
+template directives for internationalization that attempt to provide a
+comprehensive solution for this problem space.
+
+To enable these directives, you'll need to register them with the templates
+they are used in. You can do this by adding them manually via
+``Template.add_directives(namespace, factory)`` (where ``namespace`` would be
+“http://genshi.edgewall.org/i18n” and ``factory`` would be an instance of the
+``Translator`` class). Or you can just call the ``Translator.setup(template)``
+method, which both registers the directives and adds the translation
+filter.
+
+After the directives have been registered with the template engine on the
+Python side of your application, you need to declare the corresponding
+directive namespace in all markup templates that use them. For example:
+
+.. code-block:: genshi
+
+  <html xmlns:py="http://genshi.edgewall.org/"
+        xmlns:i18n="http://genshi.edgewall.org/i18n">
+    …
+  </html>
+
+These directives only make sense in the context of `markup templates`_. For
+`text templates`_, you can just use the corresponding ``gettext`` API calls as needed.
+
+.. note:: The internationalization directives are still somewhat experimental
+          and have some known issues. However, the attribute language they
+          implement should be stable and is not subject to change
+          substantially in future versions.
+
+.. _`markup templates`: xml-templates.html
+.. _`text templates`: text-templates.html
+
+--------
+Messages
+--------
+
+``i18n:msg``
+------------
+
+This is the basic directive for defining localizable text passages that
+contain parameters and/or markup.
+
+For example, consider the following template snippet:
+
+.. code-block:: genshi
+
+  <p>
+    Please visit <a href="${site.url}">${site.name}</a> for help.
+  </p>
+
+Without further annotation, the translation filter would treat this sentence
+as two separate messages (“Please visit” and “for help”), and the translator
+would have no control over the position of the link in the sentence.
+
+However, when you use the Genshi internationalization directives, you simply
+add an ``i18n:msg`` attribute to the enclosing ``<p>`` element:
+
+.. code-block:: genshi
+
+  <p i18n:msg="name">
+    Please visit <a href="${site.url}">${site.name}</a> for help.
+  </p>
+
+Genshi is then able to identify the text in the ``<p>`` element as a single
+message for translation purposes. You'll see the following string in your
+message catalog::
+
+  Please visit [1:%(name)s] for help.
+
+The ``<a>`` element with its attribute has been replaced by a part in square
+brackets, which does not include the tag name or the attributes of the element.
+
+The value of the ``i18n:msg`` attribute is a comma-separated list of parameter
+names, which serve as simplified aliases for the actual Python expressions the
+message contains. The order of the parameter names in the list must correspond
+to the order of the expressions in the text. In this example, there is only
+one parameter: its alias for translation is “name”, while the corresponding
+expression is ``${site.name}``.
+
+The translator now has complete control over the structure of the sentence. He
+or she certainly does need to make sure that any bracketed parts are not
+removed, and that the ``name`` parameter is preserved correctly. But those are
+things that can be easily checked by validating the message catalogs. The
+important thing is that the translator can change the sentence structure, and
+has no way to break the application by forgetting to close a tag, for example.
+
+So the German translator of this snippet might decide to translate it to::
+
+  Um Hilfe zu erhalten, besuchen Sie bitte [1:%(name)s]
+
+The resulting output would then be:
+
+.. code-block:: xml
+
+  <p>
+    Um Hilfe zu erhalten, besuchen Sie bitte
+    <a href="http://example.com/">Example</a>
+  </p>
+
+Messages may contain multiple tags, and they may also be nested. For example:
+
+.. code-block:: genshi
+
+  <p i18n:msg="name">
+    <i>Please</i> visit <b>the site <a href="${site.url}">${site.name}</a></b>
+    for help.
+  </p>
+
+This would result in the following message ID::
+
+  [1:Please] visit [2:the site [3:%(name)s]] for help.
+
+Again, the translator has full control over the structure of the sentence. So
+the German translation could actually look like this::
+
+  Um Hilfe zu erhalten besuchen Sie [1:bitte]
+  [3:%(name)s], [2:das ist eine Web-Site]
+
+Genshi would recompose that into the following output:
+
+.. code-block:: xml
+
+  <p>
+    Um Hilfe zu erhalten besuchen Sie <i>bitte</i>
+    <a href="http://example.com/">Example</a>, <b>das ist eine Web-Site</b>
+  </p>
+
+Note how the translation has changed the order and even the nesting of the
+tags.
+
+.. warning:: Please note that ``i18n:msg`` directives do not support other
+             nested directives. Directives commonly change the structure of
+             the generated markup dynamically, which often would result in the
+             structure of the text changing, thus making translation as a
+             single message ineffective.
+
+``i18n:choose``, ``i18n:singular``, ``i18n:plural``
+---------------------------------------------------
+
+Translatable strings that vary based on some number of objects, such as “You
+have 1 new message” or “You have 3 new messages”, present their own challenge,
+in particular when you consider that different languages have different rules
+for pluralization. For example, while English and most western languages have
+two plural forms (one for ``n=1`` and another for ``n!=1``), Welsh has five
+different plural forms, while Hungarian only has one.
+
+The ``gettext`` framework has long supported this via the ``ngettext()``
+family of functions. You specify two default messages, one singular and one
+plural, and the number of items. The translations, however, may contain any
+number of plural forms for the message, depending on how many are commonly
+used in the language. ``ngettext`` will choose the correct plural form of the
+translated message based on the specified number of items.
+
+Genshi provides a variant of the ``i18n:msg`` directive described above that
+allows choosing the proper plural form based on the numeric value of a given
+variable. The pluralization support is implemented in a set of three
+directives that must be used together: ``i18n:choose``, ``i18n:singular``, and
+``i18n:plural``.
+
+The ``i18n:choose`` directive is used to set up the context of the message: it
+simply wraps the singular and plural variants.
+
+The value of this directive is split into two parts: the first is the
+*numeral*, a Python expression that evaluates to a number to determine which
+plural form should be chosen. The second part, separated by a semicolon, lists
+the parameter names. This part is equivalent to the value of the ``i18n:msg``
+directive.
+
+For example:
+
+.. code-block:: genshi
+
+  <p i18n:choose="len(messages); num">
+    <i18n:singular>You have <b>${len(messages)}</b> new message.</i18n:singular>
+    <i18n:plural>You have <b>${len(messages)}</b> new messages.</i18n:plural>
+  </p>
+
+All three directives can be used either as elements or attributes. So the above
+example could also be written as follows:
+
+.. code-block:: genshi
+
+  <i18n:choose numeral="len(messages)" params="num">
+    <p i18n:singular="">You have <b>${len(messages)}</b> new message.</p>
+    <p i18n:plural="">You have <b>${len(messages)}</b> new messages.</p>
+  </i18n:choose>
+
+When used as an element, the two parts of the ``i18n:choose`` value are split
+into two different attributes: ``numeral`` and ``params``. The
+``i18n:singular`` and ``i18n:plural`` directives do not require or support any
+value (or any extra attributes).
+
+--------------------
+Comments and Domains
+--------------------
+
+``i18n:comment``
+----------------
+
+The ``i18n:comment`` directive can be used to supply a comment for the
+translator. For example, if a template snippet is not easily understood
+outside of its context, you can add a translator comment to help the
+translator understand in what context the message will be used:
+
+.. code-block:: genshi
+
+  <p i18n:msg="name" i18n:comment="Link to the relevant support site">
+    Please visit <a href="${site.url}">${site.name}</a> for help.
+  </p>
+
+This comment will be extracted together with the message itself, and will
+commonly be placed alongside the message in the message catalog, so that it is
+easily visible to the person doing the translation.
+
+This directive has no impact on how the template is rendered, and is ignored
+outside of the extraction process.
+
+``i18n:domain``
+---------------
+
+In larger projects, message catalogs are commonly split up into different
+*domains*. For example, you might have a core application domain, and then
+separate domains for extensions or libraries.
+
+Genshi provides a directive called ``i18n:domain`` that lets you choose the
+translation domain for a particular scope. For example:
+
+.. code-block:: genshi
+
+  <div i18n:domain="examples">
+    <p>Hello, world!</p>
+  </div>
+
+
 Extraction
 ==========
 
@@ -90,7 +346,11 @@
 as well as strings in ``gettext()`` calls in embedded Python code. See the API
 documentation for details on how to use this method directly.
 
-This functionality is integrated into the message extraction framework provided
+-----------------
+Babel Integration
+-----------------
+
+This functionality is integrated with the message extraction framework provided
 by the `Babel`_ project. Babel provides a command-line interface as well as
 commands that can be used from ``setup.py`` scripts using `Setuptools`_ or
 `Distutils`_.
@@ -123,10 +383,6 @@
 If all goes well, running the extraction with Babel should create a POT file
 containing the strings from your Genshi templates and your Python source files.
 
-.. note:: Genshi currently does not support “translator comments”, i.e. text in
-          template comments that would get added to the POT file. This support
-          may or may not be added in future versions.
-
 
 ---------------------
 Configuration Options
@@ -166,9 +422,10 @@
 the ``include_attrs`` list are extracted. If this option is disabled, only
 strings in ``gettext`` function calls are extracted.
 
-.. note:: If you disable this option, it's not necessary to add the translation
-          filter as described above. You only need to make sure that the
-          template has access to the ``gettext`` functions it uses.
+.. note:: If you disable this option, and do not make use of the
+          internationalization directives, it's not necessary to add the
+          translation filter as described above. You only need to make sure
+          that the template has access to the ``gettext`` functions it uses.
 
 
 Translation
@@ -193,22 +450,24 @@
   template = MarkupTemplate("...")
   template.filters.insert(0, Translator(translations.ugettext))
 
-If you're using `TemplateLoader`, you should specify a callback function in
-which you add the filter:
+The ``Translator`` class also provides the convenience method ``setup()``,
+which will both add the filter and register the i18n directives:
 
 .. code-block:: python
 
   from genshi.filters import Translator
-  from genshi.template import TemplateLoader
-
-  def template_loaded(template):
-      template.filters.insert(0, Translator(translations.ugettext))
+  from genshi.template import MarkupTemplate
 
-  loader = TemplateLoader('templates', callback=template_loaded)
-  template = loader.load("...")
+  template = MarkupTemplate("...")
+  translator = Translator(translations.ugettext)
+  translator.setup(template)
 
-This approach ensures that the filter is not added everytime the template is
-loaded, and thus being applied multiple times.
+.. warning:: If you're using ``TemplateLoader``, you should specify a
+             `callback function`_ in which you add the filter. That ensures
+             that the filter is not added every time the template is rendered,
+             thereby being applied multiple times.
+
+.. _`callback function`: loader.html#callback-interface
 
 
 Related Considerations
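
The bracketed message format documented in the ``i18n:msg`` section of this diff can be illustrated outside Genshi. The following is a minimal sketch, not Genshi's actual implementation: the helper name ``recompose``, the ``tags`` mapping, and the way markup is stored as (opening, closing) string pairs are all assumptions made for illustration. It shows how a translated message such as the German example above could be turned back into markup, including reordered and re-nested groups.

```python
import re

# Matches either an opening marker such as "[2:" or a closing "]".
_MARKER = re.compile(r"\[(\d+):|\]")


def recompose(message, tags, params):
    """Expand ``[N:...]`` groups in a translated message (hypothetical helper).

    ``tags`` maps each group number to an (opening, closing) markup pair and
    ``params`` supplies the values for ``%(name)s`` placeholders.
    """
    parts = []
    pending = []  # closing tags of the groups that are currently open
    pos = 0
    for match in _MARKER.finditer(message):
        # Substitute parameters only in the text between markers.
        parts.append(message[pos:match.start()] % params)
        if match.group(1) is not None:
            opening, closing = tags[int(match.group(1))]
            parts.append(opening)
            pending.append(closing)
        else:
            parts.append(pending.pop())
        pos = match.end()
    parts.append(message[pos:] % params)
    return "".join(parts)


# Markup taken from the nested example in the diff; group numbers follow
# the order in which the elements appear in the original template.
tags = {
    1: ("<i>", "</i>"),
    2: ("<b>", "</b>"),
    3: ('<a href="http://example.com/">', "</a>"),
}
german = ("Um Hilfe zu erhalten besuchen Sie [1:bitte] "
          "[3:%(name)s], [2:das ist eine Web-Site]")
print(recompose(german, tags, {"name": "Example"}))
```

The sketch deliberately ignores escaping of parameter values, literal ``%`` or ``]`` characters in the text, and unbalanced brackets, all of which a real implementation would have to handle.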