view doc/catalogs.txt @ 59:d9c6ea20904b trunk
Fix 2nd typo of [58].
author | palgarvio |
---|---|
date | Fri, 08 Jun 2007 11:35:06 +0000 |
.. -*- mode: rst; encoding: utf-8 -*-

=============================
Working with Message Catalogs
=============================

.. contents:: Contents
   :depth: 2
.. sectnum::


Introduction
============

The ``gettext`` translation system enables you to mark any strings used in your
application as subject to localization, by wrapping them in functions such as
``gettext(str)`` and ``ngettext(singular, plural, num)``. For brevity, the
``gettext`` function is often aliased to ``_(str)``, so you can write:

.. code-block:: python

    print(_("Hello"))

instead of just:

.. code-block:: python

    print("Hello")

to make the string "Hello" localizable.

Message catalogs are collections of translations for such localizable messages
used in an application. They are commonly stored in PO (Portable Object) and MO
(Machine Object) files, the formats of which are defined by the GNU `gettext`_
tools and the GNU `translation project`_.

.. _`gettext`: http://www.gnu.org/software/gettext/
.. _`translation project`: http://sourceforge.net/projects/translation

The general procedure for building message catalogs looks something like this:

 * use a tool (such as ``xgettext``) to extract localizable strings from the
   code base and write them to a POT (PO Template) file.
 * make a copy of the POT file for a specific locale (for example, "en_US")
   and start translating the messages
 * use a tool such as ``msgfmt`` to compile the locale PO file into a binary
   MO file
 * later, when code changes make it necessary to update the translations, you
   regenerate the POT file and merge the changes into the various
   locale-specific PO files, for example using ``msgmerge``

Python provides the `gettext module`_ as part of the standard library, which
enables applications to work with appropriately generated MO files.

.. _`gettext module`: http://docs.python.org/lib/module-gettext.html

As ``gettext`` provides a solid and well supported foundation for translating
application messages, Babel does not reinvent the wheel, but rather reuses this
infrastructure, and makes it easier to build message catalogs for Python
applications.


Message Extraction
==================

Babel provides functionality similar to that of the ``xgettext`` program,
except that only extraction from Python source files is built-in, while support
for other file formats can be added using a simple extension mechanism.

Unlike ``xgettext``, which is usually invoked once for every file, the routines
for message extraction in Babel operate on directories. While the per-file
approach of ``xgettext`` works nicely with projects using a ``Makefile``,
Python projects rarely use ``make``, and thus a different mechanism is needed
for extracting messages from the heterogeneous collection of source files that
many Python projects are composed of.

When message extraction is based on directories instead of individual files,
there needs to be a way to configure which files should be treated in which
manner. For example, while many projects may contain ``.html`` files, some of
those files may be static HTML files that don't contain localizable messages,
while others may be `Django`_ templates, and still others may contain `Genshi`_
markup templates. Some projects may even mix HTML files for different template
languages (for whatever reason). Therefore the way in which messages are
extracted from source files cannot depend on the file extension alone, but
needs to be controllable in a precise manner.

.. _`Django`: http://www.djangoproject.com/
.. _`Genshi`: http://genshi.edgewall.org/

Babel accepts a configuration file to specify this mapping of files to
extraction methods, which is described below.


.. _`mapping`:

-------------------------------------------
Extraction Method Mapping and Configuration
-------------------------------------------

The mapping of extraction methods to files in Babel is done via a configuration
file. This file maps extended glob patterns to the names of the extraction
methods, and can also set various options for each pattern (which options are
available depends on the specific extraction method).

For example, the following configuration adds extraction of messages from both
Genshi markup templates and text templates:

.. code-block:: ini

    # Extraction from Python source files

    [python: foobar/**.py]

    # Extraction from Genshi HTML and text templates

    [genshi: foobar/**/templates/**.html]
    ignore_tags = script,style
    include_attrs = alt title summary

    [genshi: foobar/**/templates/**.txt]
    template_class = genshi.template.text:TextTemplate
    encoding = ISO-8859-15

The configuration file syntax is based on the format commonly found in ``.INI``
files on Windows systems, and as supported by the ``ConfigParser`` module in
the Python standard library. Section names (the strings enclosed in square
brackets) specify both the name of the extraction method, and the extended glob
pattern to specify the files that this extraction method should be used for,
separated by a colon. The options in the sections are passed to the extraction
method. Which options are available is specific to the extraction method used.

The extended glob patterns used in this configuration are similar to the glob
patterns provided by most shells. A single asterisk (``*``) is a wildcard for
any number of characters (except for the pathname component separator "/"),
while a question mark (``?``) only matches a single character. In addition,
two consecutive asterisk characters (``**``) can be used to make the wildcard
match any directory level, so the pattern ``**.txt`` matches any file with the
extension ``.txt`` in any directory.

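
The wildcard rules just described can be illustrated with a small sketch that
translates an extended glob pattern into a regular expression. This is not
Babel's own matching code: the helper names ``glob_to_regex`` and ``matches``
are hypothetical, and the treatment of ``**/`` as "zero or more directories" is
an assumption made for this example.

.. code-block:: python

    import re

    def glob_to_regex(pattern):
        """Translate an extended glob pattern into a compiled regex.

        Hypothetical helper for illustration; not part of Babel.
        """
        parts = []
        i = 0
        while i < len(pattern):
            if pattern.startswith("**/", i):
                # Assumption: "**/" stands for zero or more directory levels.
                parts.append("(?:.*/)?")
                i += 3
            elif pattern.startswith("**", i):
                # "**" may cross the "/" separator.
                parts.append(".*")
                i += 2
            elif pattern[i] == "*":
                # "*" matches any run of characters except "/".
                parts.append("[^/]*")
                i += 1
            elif pattern[i] == "?":
                # "?" matches exactly one character other than "/".
                parts.append("[^/]")
                i += 1
            else:
                parts.append(re.escape(pattern[i]))
                i += 1
        return re.compile("".join(parts) + r"\Z")

    def matches(pattern, path):
        """Return True if the whole path matches the pattern."""
        return glob_to_regex(pattern).match(path) is not None

With this sketch, ``matches("**.txt", "docs/notes/readme.txt")`` is true, while
``matches("*.txt", "docs/readme.txt")`` is false because a single asterisk does
not cross a directory separator.
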
Lines that start with a ``#`` or ``;`` character are ignored and can be used
for comments. Empty lines are ignored as well.

.. note:: If you're performing message extraction using the command Babel
          provides for integration into ``setup.py`` scripts (see below), you
          can also provide this configuration in a different way, namely as a
          keyword argument to the ``setup()`` function.


----------
Front-Ends
----------

Babel provides two different front-ends to access its functionality for working
with message catalogs:

 * A `Command-line interface <cmdline.html>`_, and
 * `Integration with distutils/setuptools <setup.html>`_

Which one you choose depends on the nature of your project. For most modern
Python projects, the distutils/setuptools integration is probably more
convenient.


--------------------------
Writing Extraction Methods
--------------------------

(TODO: write)


Extended ``Translations`` Class
===============================

Many web-based applications are composed of a variety of different components
(possibly using some kind of plugin system), and some of those components may
provide their own message catalogs that need to be integrated into the larger
system.

To support this usage pattern, Babel provides a ``Translations`` class that is
derived from the ``GNUTranslations`` class in the ``gettext`` module. This
class adds a ``merge()`` method that takes another ``Translations`` instance,
and merges its contents into the catalog:

.. code-block:: python

    translations = Translations.load('main')
    translations.merge(Translations.load('plugin1'))
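
The idea behind merging can be sketched using only the standard library. The
``DictTranslations`` class below is a hypothetical in-memory stand-in, not
Babel's ``Translations`` class: it keeps its messages in a plain dictionary
instead of loading an MO file. In this sketch, entries from the merged catalog
replace existing entries with the same message ID; whether Babel resolves such
conflicts the same way is an assumption.

.. code-block:: python

    import gettext

    class DictTranslations(gettext.NullTranslations):
        """Hypothetical dictionary-backed catalog, for illustration only."""

        def __init__(self, catalog=None):
            super().__init__()
            self._catalog = dict(catalog or {})

        def gettext(self, message):
            # Fall back to the untranslated message if no entry exists.
            return self._catalog.get(message, message)

        def merge(self, other):
            # Copy the other catalog's entries into this one.
            self._catalog.update(other._catalog)
            return self

    # Catalog of the main application and of one plugin (made-up data).
    main = DictTranslations({"Hello": "Hallo"})
    plugin = DictTranslations({"Settings": "Einstellungen"})

    main.merge(plugin)
    _ = main.gettext
    print(_("Hello"))
    print(_("Settings"))

After the merge, a single ``_`` function serves messages from both the
application and the plugin, which is the usage pattern the ``merge()`` method
is meant to support.
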