diff doc/i18n.txt @ 528:f38ce008ab0a

Integrated [http://babel.edgewall.org/ Babel] message extraction plugin, and added I18n doc page.
author cmlenz
date Wed, 20 Jun 2007 09:48:55 +0000
parents
children 2a6cf641cb5e
line wrap: on
line diff
new file mode 100644
--- /dev/null
+++ b/doc/i18n.txt
@@ -0,0 +1,237 @@
+.. -*- mode: rst; encoding: utf-8 -*-
+
+=====================================
+Internationalization and Localization
+=====================================
+
+Genshi provides basic supporting infrastructure for internationalizing
+and localizing templates. That includes functionality for extracting localizable 
+strings from templates, as well as a template filter that can apply translations 
+to templates as they get rendered.
+
+This support is based on `gettext`_ message catalogs and the `gettext Python 
+module`_. The extraction process can be used from the API level, or through the
+front-ends implemented by the `Babel`_ project, for which Genshi provides a
+plugin.
+
+.. _`gettext`: http://www.gnu.org/software/gettext/
+.. _`gettext python module`: http://docs.python.org/lib/module-gettext.html
+.. _`babel`: http://babel.edgewall.org/
+
+
+.. contents:: Contents
+   :depth: 2
+.. sectnum::
+
+
+Basics
+======
+
+The simplest way to internationalize and translate templates would be to wrap
+all localizable strings in a ``gettext()`` function call (which is often aliased 
+to ``_()`` for brevity). In that case, no extra template filter is required.
+
+.. code-block:: genshi
+
+  <p>${_("Hello, world!")}</p>
+
+However, this approach results in significant “character noise” in templates, 
+making them harder to read and preview.
+
+The ``genshi.filters.Translator`` filter allows you to get rid of the 
+explicit `gettext`_ function calls, so you can continue to just write:
+
+.. code-block:: genshi
+
+  <p>Hello, world!</p>
+
+This text will still be extracted and translated as if you had wrapped it in a
+``_()`` call.
+
+.. note:: For parameterized or pluralizable messages, you need to continue using
+          the appropriate ``gettext`` functions.
+
+You can control which tags should be ignored by this process; for example, it  
+doesn't really make sense to translate the content of the HTML 
+``<script></script>`` element. Both ``<script>`` and ``<style>`` are excluded
+by default.
+
+Attribute values can also be automatically translated. The default is to 
+consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``, 
+``summary``, and ``title``, which is a list that makes sense for HTML documents. 
+Of course, you can tell the translator to use a different set of attribute 
+names, or none at all.
+
+In addition, you can control automatic translation in your templates using the
+``xml:lang`` attribute. If the value of that attribute is a literal string, the
+contents and attributes of the element will be ignored:
+
+.. code-block:: genshi
+
+  <p xml:lang="en">Hello, world!</p>
+
+On the other hand, if the value of the ``xml:lang`` attribute contains a Python
+expression, the element contents and attributes are still considered for 
+automatic translation:
+
+.. code-block:: genshi
+
+  <html xml:lang="$locale">
+    ...
+  </html>
+
+
+Extraction
+==========
+
+The ``Translator`` class provides a class method called ``extract``, which is
+a generator yielding all localizable strings found in a template or markup 
+stream. This includes both literal strings in text nodes and attribute values,
+as well as strings in ``gettext()`` calls in embedded Python code. See the API
+documentation for details on how to use this method directly.
+
+This functionality is integrated into the message extraction framework provided
+by the `Babel`_ project. Babel provides a command-line interface as well as 
+commands that can be used from ``setup.py`` scripts using `Setuptools`_ or 
+`Distutils`_.
+
+.. _`setuptools`: http://peak.telecommunity.com/DevCenter/setuptools
+.. _`distutils`: http://docs.python.org/dist/dist.html
+
+The first thing you need to do to make Babel extract messages from Genshi 
+templates is to let Babel know which files are Genshi templates. This is done
+using a “mapping configuration”, which can be stored in a configuration file,
+or specified directly in your ``setup.py``.
+
+In a configuration file, the mapping may look like this:
+
+.. code-block:: ini
+
+  # Python souce
+  [python:**.py]
+
+  # Genshi templates
+  [genshi:**/templates/**.html]
+  include_attrs = title
+
+  [genshi:**/templates/**.txt]
+  template_class = genshi.template.TextTemplate
+  encoding = latin-1
+
+Please consult the Babel documentation for details on configuration.
+
+If all goes well, running the extraction with Babel should create a POT file
+containing the strings from your Genshi templates and your Python source files.
+
+.. note:: Genshi currently does not support “translator comments”, i.e. text in 
+          template comments that would get added to the POT file. This support
+          may or may not be added in future versions.
+
+
+---------------------
+Configuration Options
+---------------------
+
+The Genshi extraction plugin for Babel supports the following options:
+
+``template_class``
+------------------
+The concrete ``Template`` class that the file should be loaded with. Specify
+the package/module name and the class name, separated by a colon.
+
+The default is to use ``genshi.template:MarkupTemplate``, and you'll want to
+set it to ``genshi.template:TextTemplate`` for `text templates`_.
+
+.. _`text templates`: text-templates.html
+
+``encoding``
+------------------
+The encoding of the template file. This is only used for text templates. The
+default is to assume “utf-8”.
+
+``include_attrs``
+------------------
+Comma-separated list of attribute names that should be considered to have 
+localizable values. Only used for markup templates.
+
+``include_tags``
+------------------
+Comma-separated list of tag names that should be ignored. Only used for markup 
+templates.
+
+
+Translation
+===========
+
+If you have prepared MO files for use with Genshi using the appropriate tools,
+you can access the message catalogs with the `gettext Python module`_. You'll
+probably want to create a ``gettext.GNUTranslations`` instance, and make the
+translation functions it provides available to your templates by putting them
+in the template context.
+
+The ``Translator`` filter needs to be added to the filters of the template
+(applying it as a stream filter will likely not have the desired effect).
+Furthermore it needs to be the first filter in the list, including the internal
+filters that Genshi adds itself:
+
+.. code-block:: python
+
+  from genshi.filters import Translator
+  from genshi.template import MarkupTemplate
+  
+  template = MarkupTemplate("...")
+  template.filters.insert(0, Translator(translations.ugettext))
+
+If you're using `TemplateLoader`, you should specify a callback function in 
+which you add the filter:
+
+.. code-block:: python
+
+  from genshi.filters import Translator
+  from genshi.template import TemplateLoader
+  
+  def template_loaded(template):
+      template.filters.insert(0, , Translator(translations.ugettext))
+  
+  loader = TemplateLoader('templates', callback=template_loaded)
+  template = loader.load("...")
+
+This approach ensures that the filter is not added everytime the template is 
+loaded, and thus being applied multiple times.
+
+
+Related Considerations
+======================
+
+If you intend to produce an application that is fully prepared for an 
+international audience, there are a couple of other things to keep in mind:
+
+-------
+Unicode
+-------
+
+Use ``unicode`` internally, not encoded bytestrings. Only encode/decode where
+data enters or exits the system. This means that your code works with characters
+and not just with bytes, which is an important distinction for example when 
+calculating the length of a piece of text. When you need to decode/encode, it's
+probably a good idea to use UTF-8.
+
+-------------
+Date and Time
+-------------
+
+If your application uses datetime information that should be displayed to users 
+in different timezones, you should try to work with UTC (universal time) 
+internally. Do the conversion from and to "local time" when the data enters or 
+exits the system. Make use the Python `datetime`_ module and the third-party 
+`pytz`_ package.
+
+--------------------------
+Formatting and Locale Data
+--------------------------
+
+Make sure you check out the functionality provided by the `Babel`_ project for 
+things like number and date formatting, locale display strings, etc.
+
+.. _`datetime`: http://docs.python.org/lib/module-datetime.html
+.. _`pytz`: http://pytz.sourceforge.net/
Copyright (C) 2012-2017 Edgewall Software