diff doc/i18n.txt @ 902:09cc3627654c experimental-inline

Sync `experimental/inline` branch with [source:trunk@1126].
author cmlenz
date Fri, 23 Apr 2010 21:08:26 +0000
parents 1837f39efd6f
children
line wrap: on
line diff
--- a/doc/i18n.txt
+++ b/doc/i18n.txt
@@ -4,15 +4,15 @@
 Internationalization and Localization
 =====================================
 
-Genshi provides basic supporting infrastructure for internationalizing
-and localizing templates. That includes functionality for extracting localizable 
-strings from templates, as well as a template filter that can apply translations 
-to templates as they get rendered.
+Genshi provides comprehensive supporting infrastructure for internationalizing
+and localizing templates. That includes functionality for extracting
+localizable strings from templates, as well as a template filter and special
+directives that can apply translations to templates as they get rendered.
 
 This support is based on `gettext`_ message catalogs and the `gettext Python 
-module`_. The extraction process can be used from the API level, or through the
-front-ends implemented by the `Babel`_ project, for which Genshi provides a
-plugin.
+module`_. The extraction process can be used from the API level, or through
+the front-ends implemented by the `Babel`_ project, for which Genshi provides
+a plugin.
 
 .. _`gettext`: http://www.gnu.org/software/gettext/
 .. _`gettext python module`: http://docs.python.org/lib/module-gettext.html
@@ -28,18 +28,19 @@
 ======
 
 The simplest way to internationalize and translate templates would be to wrap
-all localizable strings in a ``gettext()`` function call (which is often aliased 
-to ``_()`` for brevity). In that case, no extra template filter is required.
+all localizable strings in a ``gettext()`` function call (which is often
+aliased to ``_()`` for brevity). In that case, no extra template filter is
+required.
 
 .. code-block:: genshi
 
   <p>${_("Hello, world!")}</p>
 
-However, this approach results in significant “character noise” in templates, 
+However, this approach results in significant “character noise” in templates,
 making them harder to read and preview.
 
 The ``genshi.filters.Translator`` filter allows you to get rid of the 
-explicit `gettext`_ function calls, so you can continue to just write:
+explicit `gettext`_ function calls, so you can (often) just continue to write:
 
 .. code-block:: genshi
 
@@ -48,23 +49,28 @@
 This text will still be extracted and translated as if you had wrapped it in a
 ``_()`` call.
 
-.. note:: For parameterized or pluralizable messages, you need to continue using
-          the appropriate ``gettext`` functions.
+.. note:: For parameterized or pluralizable messages, you need to use the
+          special `template directives`_ described below, or use the
+          corresponding ``gettext`` function in embedded Python expressions.
 
-You can control which tags should be ignored by this process; for example, it  
+You can control which tags should be ignored by this process; for example, it
 doesn't really make sense to translate the content of the HTML 
 ``<script></script>`` element. Both ``<script>`` and ``<style>`` are excluded
 by default.
 
 Attribute values can also be automatically translated. The default is to 
-consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``, 
-``summary``, and ``title``, which is a list that makes sense for HTML documents. 
-Of course, you can tell the translator to use a different set of attribute 
-names, or none at all.
+consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``,
+``summary``, and ``title``, which is a list that makes sense for HTML
+documents.  Of course, you can tell the translator to use a different set of
+attribute names, or none at all.
 
-In addition, you can control automatic translation in your templates using the
-``xml:lang`` attribute. If the value of that attribute is a literal string, the
-contents and attributes of the element will be ignored:
+----------------
+Language Tagging
+----------------
+
+You can control automatic translation in your templates using the ``xml:lang``
+attribute. If the value of that attribute is a literal string, the contents and
+attributes of the element will be ignored:
 
 .. code-block:: genshi
 
@@ -81,6 +87,256 @@
   </html>
 
 
+.. _`template directives`:
+
+Template Directives
+===================
+
+Sometimes localizable strings in templates may contain dynamic parameters, or
+they may depend on the numeric value of some variable to choose a proper
+plural form. Sometimes the strings contain embedded markup, such as tags for
+emphasis or hyperlinks, and you don't want to rely on the people doing the
+translations to know the syntax and escaping rules of HTML and XML.
+
+In those cases the simple text extraction and translation process described
+above is not sufficient. You could just use ``gettext`` API functions in
+embedded Python expressions for parameters and pluralization, but that does
+not help when messages contain embedded markup. Genshi provides special
+template directives for internationalization that attempt to provide a
+comprehensive solution for this problem space.
+
+To enable these directives, you'll need to register them with the templates
+they are used in. You can do this by adding them manually via the
+``Template.add_directives(namespace, factory)`` (where ``namespace`` would be
+“http://genshi.edgewall.org/i18n” and ``factory`` would be an instance of the
+``Translator`` class). Or you can just call the ``Translator.setup(template)``
+class method, which both registers the directives and adds the translation
+filter.
+
+After the directives have been registered with the template engine on the
+Python side of your application, you need to declare the corresponding
+directive namespace in all markup templates that use them. For example:
+
+.. code-block:: genshi
+
+  <html xmlns:py="http://genshi.edgewall.org/"
+        xmlns:i18n="http://genshi.edgewall.org/i18n/">
+    …
+  </html>
+
+These directives only make sense in the context of `markup templates`_. For
+`text templates`_, you can just use the corresponding ``gettext`` API calls as needed.
+
+.. note:: The internationalization directives are still somewhat experimental
+          and have some known issues. However, the attribute language they
+          implement should be stable and is not subject to change
+          substantially in future versions.
+
+.. _`markup templates`: xml-templates.html
+.. _`text templates`: text-templates.html
+
+--------
+Messages
+--------
+
+``i18n:msg``
+------------
+
+This is the basic directive for defining localizable text passages that
+contain parameters and/or markup.
+
+For example, consider the following template snippet:
+
+.. code-block:: genshi
+
+  <p>
+    Please visit <a href="${site.url}">${site.name}</a> for help.
+  </p>
+
+Without further annotation, the translation filter would treat this sentence
+as two separate messages (“Please visit” and “for help”), and the translator
+would have no control over the position of the link in the sentence.
+
+However, when you use the Genshi internationalization directives, you simply
+add an ``i18n:msg`` attribute to the enclosing ``<p>`` element:
+
+.. code-block:: genshi
+
+  <p i18n:msg="name">
+    Please visit <a href="${site.url}">${site.name}</a> for help.
+  </p>
+
+Genshi is then able to identify the text in the ``<p>`` element as a single
+message for translation purposes. You'll see the following string in your
+message catalog::
+
+  Please visit [1:%(name)s] for help.
+
+The `<a>` element with its attribute has been replaced by a part in square
+brackets, which does not include the tag name or the attributes of the element.
+
+The value of the ``i18n:msg`` attribute is a comma-separated list of parameter
+names, which serve as simplified aliases for the actual Python expressions the
+message contains. The order of the paramer names in the list must correspond
+to the order of the expressions in the text. In this example, there is only
+one parameter: its alias for translation is “name”, while the corresponding
+expression is ``${site.name}``.
+
+The translator now has complete control over the structure of the sentence. He
+or she certainly does need to make sure that any bracketed parts are not
+removed, and that the ``name`` parameter is preserved correctly. But those are
+things that can be easily checked by validating the message catalogs. The
+important thing is that the translator can change the sentence structure, and
+has no way to break the application by forgetting to close a tag, for example.
+
+So if the German translator of this snippet decided to translate it to::
+
+  Um Hilfe zu erhalten, besuchen Sie bitte [1:%(name)s]
+
+The resulting output might be:
+
+.. code-block:: xml
+
+  <p>
+    Um Hilfe zu erhalten, besuchen Sie bitte
+    <a href="http://example.com/">Example</a>
+  </p>
+
+Messages may contain multiple tags, and they may also be nested. For example:
+
+.. code-block:: genshi
+
+  <p i18n:msg="name">
+    <i>Please</i> visit <b>the site <a href="${site.url}">${site.name}</a></b>
+    for help.
+  </p>
+
+This would result in the following message ID::
+
+  [1:Please] visit [2:the site [3:%(name)s]] for help.
+
+Again, the translator has full control over the structure of the sentence. So
+the German translation could actually look like this::
+
+  Um Hilfe zu erhalten besuchen Sie [1:bitte]
+  [3:%(name)s], [2:das ist eine Web-Site]
+
+Which Genshi would recompose into the following outout:
+
+.. code-block:: xml
+
+  <p>
+    Um Hilfe zu erhalten besuchen Sie <i>bitte</i>
+    <a href="http://example.com/">Example</a>, <b>das ist eine Web-Site</b>
+  </p>
+
+Note how the translation has changed the order and even the nesting of the
+tags.
+
+.. warning:: Please note that ``i18n:msg`` directives do not support other
+             nested directives. Directives commonly change the structure of
+             the generated markup dynamically, which often would result in the
+             structure of the text changing, thus making translation as a
+             single message ineffective.
+
+``i18n:choose``, ``i18n:singular``, ``i18n:plural``
+---------------------------------------------------
+
+Translatable strings that vary based on some number of objects, such as “You
+have 1 new message” or “You have 3 new messages”, present their own challenge,
+in particular when you consider that different languages have different rules
+for pluralization. For example, while English and most western languages have
+two plural forms (one for ``n=1`` and an other for ``n<>1``), Welsh has five
+different plural forms, while Hungarian only has one.
+
+The ``gettext`` framework has long supported this via the ``ngettext()``
+family of functions. You specify two default messages, one singular and one
+plural, and the number of items. The translations however may contain any
+number of plural forms for the message, depending on how many are commonly
+used in the language. ``ngettext`` will choose the correct plural form of the
+translated message based on the specified number of items.
+
+Genshi provides a variant of the ``i18n:msg`` directive described above that
+allows choosing the proper plural form based on the numeric value of a given
+variable. The pluralization support is implemented in a set of three
+directives that must be used together: ``i18n:choose``, ``i18n:singular``, and
+``i18n:plural``.
+
+The ``i18n:choose`` directive is used to set up the context of the message: it
+simply wraps the singular and plural variants.
+
+The value of this directive is split into two parts: the first is the
+*numeral*, a Python expression that evaluates to a number to determine which
+plural form should be chosen. The second part, separated by a semicolon, lists
+the parameter names. This part is equivalent to the value of the ``i18n:msg``
+directive.
+
+For example:
+
+.. code-block:: genshi
+
+  <p i18n:choose="len(messages); num">
+    <i18n:singular>You have <b>${len(messages)}</b> new message.</i18n:singular>
+    <i18n:plural>You have <b>${len(messages)}</b> new messages.</i18n:plural>
+  </p>
+
+All three directives can be used either as elements or attribute. So the above
+example could also be written as follows:
+
+.. code-block:: genshi
+
+  <i18n:choose numeral="len(messages)" params="num">
+    <p i18n:singular="">You have <b>${len(messages)}</b> new message.</p>
+    <p i18n:plural="">You have <b>${len(messages)}</b> new messages.</p>
+  </i18n:choose>
+
+When used as an element, the two parts of the ``i18n:choose`` value are split
+into two different attributes: ``numeral`` and ``params``. The
+``i18n:singular`` and ``i18n:plural`` directives do not require or support any
+value (or any extra attributes).
+
+--------------------
+Comments and Domains
+--------------------
+
+``i18n:comment``
+----------------
+
+The ``i18n:comment`` directive can be used to supply a comment for the
+translator. For example, if a template snippet is not easily understood
+outside of its context, you can add a translator comment to help the
+translator understand in what context the message will be used:
+
+.. code-block:: genshi
+
+  <p i18n:msg="name" i18n:comment="Link to the relevant support site">
+    Please visit <a href="${site.url}">${site.name}</a> for help.
+  </p>
+
+This comment will be extracted together with the message itself, and will
+commonly be placed along the message in the message catalog, so that it is
+easily visible to the person doing the translation.
+
+This directive has no impact on how the template is rendered, and is ignored
+outside of the extraction process.
+
+``i18n:domain``
+---------------
+
+In larger projects, message catalogs are commonly split up into different
+*domains*. For example, you might have a core application domain, and then
+separate domains for extensions or libraries.
+
+Genshi provides a directive called ``i18n:domain`` that lets you choose the
+translation domain for a particular scope. For example:
+
+.. code-block:: genshi
+
+  <div i18n:domain="examples">
+    <p>Hello, world!</p>
+  </div>
+
+
 Extraction
 ==========
 
@@ -90,7 +346,11 @@
 as well as strings in ``gettext()`` calls in embedded Python code. See the API
 documentation for details on how to use this method directly.
 
-This functionality is integrated into the message extraction framework provided
+-----------------
+Babel Integration
+-----------------
+
+This functionality is integrated with the message extraction framework provided
 by the `Babel`_ project. Babel provides a command-line interface as well as 
 commands that can be used from ``setup.py`` scripts using `Setuptools`_ or 
 `Distutils`_.
@@ -123,10 +383,6 @@
 If all goes well, running the extraction with Babel should create a POT file
 containing the strings from your Genshi templates and your Python source files.
 
-.. note:: Genshi currently does not support “translator comments”, i.e. text in 
-          template comments that would get added to the POT file. This support
-          may or may not be added in future versions.
-
 
 ---------------------
 Configuration Options
@@ -166,9 +422,10 @@
 the ``include_attrs`` list are extracted. If this option is disabled, only 
 strings in ``gettext`` function calls are extracted.
 
-.. note:: If you disable this option, it's not necessary to add the translation
-          filter as described above. You only need to make sure that the
-          template has access to the ``gettext`` functions it uses.
+.. note:: If you disable this option, and do not make use of the
+          internationalization directives, it's not necessary to add the
+          translation filter as described above. You only need to make sure
+          that the template has access to the ``gettext`` functions it uses.
 
 
 Translation
@@ -193,22 +450,24 @@
   template = MarkupTemplate("...")
   template.filters.insert(0, Translator(translations.ugettext))
 
-If you're using `TemplateLoader`, you should specify a callback function in 
-which you add the filter:
+The ``Translator`` class also provides the convenience method ``setup()``,
+which will both add the filter and register the i18n directives:
 
 .. code-block:: python
 
   from genshi.filters import Translator
-  from genshi.template import TemplateLoader
-  
-  def template_loaded(template):
-      template.filters.insert(0, Translator(translations.ugettext))
+  from genshi.template import MarkupTemplate
   
-  loader = TemplateLoader('templates', callback=template_loaded)
-  template = loader.load("...")
+  template = MarkupTemplate("...")
+  translator = Translator(translations.ugettext)
+  translator.setup(template)
 
-This approach ensures that the filter is not added everytime the template is 
-loaded, and thus being applied multiple times.
+.. warning:: If you're using ``TemplateLoader``, you should specify a
+            `callback function`_ in which you add the filter. That ensures
+            that the filter is not added everytime the template is rendered,
+            thereby being applied multiple times.
+
+.. _`callback function`: loader.html#callback-interface
 
 
 Related Considerations
Copyright (C) 2012-2017 Edgewall Software