comparison doc/i18n.txt @ 528:f38ce008ab0a

Integrated [http://babel.edgewall.org/ Babel] message extraction plugin, and added I18n doc page.
author cmlenz
date Wed, 20 Jun 2007 09:48:55 +0000
parents
children 2a6cf641cb5e
526:bd13c96cbfe4 528:f38ce008ab0a
1 .. -*- mode: rst; encoding: utf-8 -*-
2
3 =====================================
4 Internationalization and Localization
5 =====================================
6
7 Genshi provides basic supporting infrastructure for internationalizing
8 and localizing templates. That includes functionality for extracting localizable
9 strings from templates, as well as a template filter that can apply translations
10 to templates as they get rendered.
11
12 This support is based on `gettext`_ message catalogs and the `gettext Python
13 module`_. The extraction process can be used from the API level, or through the
14 front-ends implemented by the `Babel`_ project, for which Genshi provides a
15 plugin.
16
17 .. _`gettext`: http://www.gnu.org/software/gettext/
18 .. _`gettext python module`: http://docs.python.org/lib/module-gettext.html
19 .. _`babel`: http://babel.edgewall.org/
20
21
22 .. contents:: Contents
23    :depth: 2
24 .. sectnum::
25
26
27 Basics
28 ======
29
30 The simplest way to internationalize and translate templates would be to wrap
31 all localizable strings in a ``gettext()`` function call (which is often aliased
32 to ``_()`` for brevity). In that case, no extra template filter is required.
33
34 .. code-block:: genshi
35
36   <p>${_("Hello, world!")}</p>
37
38 However, this approach results in significant “character noise” in templates,
39 making them harder to read and preview.
40
41 The ``genshi.filters.Translator`` filter allows you to get rid of the
42 explicit `gettext`_ function calls, so you can continue to just write:
43
44 .. code-block:: genshi
45
46   <p>Hello, world!</p>
47
48 This text will still be extracted and translated as if you had wrapped it in a
49 ``_()`` call.
50
51 .. note:: For parameterized or pluralizable messages, you need to continue using
52           the appropriate ``gettext`` functions.
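
As a rough illustration of why such messages still need explicit calls, the following sketch uses the standard library's ``gettext.NullTranslations`` as a stand-in catalog (an assumption; a real application would load a ``GNUTranslations`` instance). It is written for Python 3, where ``gettext`` and ``ngettext`` return text; on Python 2 the equivalents are ``ugettext`` and ``ungettext``. The helper names ``greeting`` and ``file_count`` are hypothetical.

```python
import gettext

# Stand-in catalog that returns messages untranslated (assumption for this sketch).
translations = gettext.NullTranslations()
_ = translations.gettext
ngettext = translations.ngettext


def greeting(name):
    # Parameterized: translate the format string first, then substitute.
    return _("Hello, %(name)s!") % {"name": name}


def file_count(n):
    # Pluralizable: the catalog chooses the singular or plural form based on n.
    return ngettext("%(num)d file", "%(num)d files", n) % {"num": n}


print(greeting("Alice"))
print(file_count(1))
print(file_count(3))
```

Substituting the parameters after the lookup keeps the message ID stable, so the translator sees one string with a placeholder rather than one string per value.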
53
54 You can control which tags should be ignored by this process; for example, it
55 doesn't really make sense to translate the content of the HTML
56 ``<script></script>`` element. Both ``<script>`` and ``<style>`` are excluded
57 by default.
58
59 Attribute values can also be automatically translated. The default is to
60 consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``,
61 ``summary``, and ``title``, which is a list that makes sense for HTML documents.
62 Of course, you can tell the translator to use a different set of attribute
63 names, or none at all.
64
65 In addition, you can control automatic translation in your templates using the
66 ``xml:lang`` attribute. If the value of that attribute is a literal string, the
67 contents and attributes of the element will be ignored:
68
69 .. code-block:: genshi
70
71   <p xml:lang="en">Hello, world!</p>
72
73 On the other hand, if the value of the ``xml:lang`` attribute contains a Python
74 expression, the element contents and attributes are still considered for
75 automatic translation:
76
77 .. code-block:: genshi
78
79   <html xml:lang="$locale">
80     ...
81   </html>
82
83
84 Extraction
85 ==========
86
87 The ``Translator`` class provides a class method called ``extract``, which is
88 a generator yielding all localizable strings found in a template or markup
89 stream. This includes both literal strings in text nodes and attribute values,
90 as well as strings in ``gettext()`` calls in embedded Python code. See the API
91 documentation for details on how to use this method directly.
92
93 This functionality is integrated into the message extraction framework provided
94 by the `Babel`_ project. Babel provides a command-line interface as well as
95 commands that can be used from ``setup.py`` scripts using `Setuptools`_ or
96 `Distutils`_.
97
98 .. _`setuptools`: http://peak.telecommunity.com/DevCenter/setuptools
99 .. _`distutils`: http://docs.python.org/dist/dist.html
100
101 The first thing you need to do to make Babel extract messages from Genshi
102 templates is to let Babel know which files are Genshi templates. This is done
103 using a “mapping configuration”, which can be stored in a configuration file,
104 or specified directly in your ``setup.py``.
105
106 In a configuration file, the mapping may look like this:
107
108 .. code-block:: ini
109
110   # Python source
111   [python:**.py]
112
113   # Genshi templates
114   [genshi:**/templates/**.html]
115   include_attrs = title
116
117   [genshi:**/templates/**.txt]
118   template_class = genshi.template:TextTemplate
119   encoding = latin-1
120
121 Please consult the Babel documentation for details on configuration.
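
When the mapping is kept in ``setup.py`` instead, Babel reads it from the ``message_extractors`` keyword argument of ``setup()``. The sketch below follows the configuration file example; the package directory name ``myapp`` is hypothetical, and the option values are carried over from that example rather than being recommended settings.

```python
# Mapping for Babel's ``message_extractors`` setup() keyword.
# "myapp" is a hypothetical package directory used only for illustration.
message_extractors = {
    "myapp": [
        # (filename pattern, extraction method, options or None)
        ("**.py", "python", None),
        ("**/templates/**.html", "genshi", {"include_attrs": "title"}),
        ("**/templates/**.txt", "genshi", {
            "template_class": "genshi.template:TextTemplate",
            "encoding": "latin-1",
        }),
    ],
}

# In setup.py the dictionary would be handed to setup(), for example:
#   setup(name="myapp", message_extractors=message_extractors)
for pattern, method, options in message_extractors["myapp"]:
    print(pattern, method, options or {})
```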
122
123 If all goes well, running the extraction with Babel should create a POT file
124 containing the strings from your Genshi templates and your Python source files.
125
126 .. note:: Genshi currently does not support “translator comments”, i.e. text in
127           template comments that would get added to the POT file. This support
128           may or may not be added in future versions.
129
130
131 ---------------------
132 Configuration Options
133 ---------------------
134
135 The Genshi extraction plugin for Babel supports the following options:
136
137 ``template_class``
138 ------------------
139 The concrete ``Template`` class that the file should be loaded with. Specify
140 the package/module name and the class name, separated by a colon.
141
142 The default is to use ``genshi.template:MarkupTemplate``, and you'll want to
143 set it to ``genshi.template:TextTemplate`` for `text templates`_.
144
145 .. _`text templates`: text-templates.html
146
147 ``encoding``
148 ------------------
149 The encoding of the template file. This is only used for text templates. The
150 default is to assume “utf-8”.
151
152 ``include_attrs``
153 ------------------
154 Comma-separated list of attribute names that should be considered to have
155 localizable values. Only used for markup templates.
156
157 ``ignore_tags``
158 ------------------
159 Comma-separated list of tag names that should be ignored. Only used for markup
160 templates.
161
162
163 Translation
164 ===========
165
166 If you have prepared MO files for use with Genshi using the appropriate tools,
167 you can access the message catalogs with the `gettext Python module`_. You'll
168 probably want to create a ``gettext.GNUTranslations`` instance, and make the
169 translation functions it provides available to your templates by putting them
170 in the template context.
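
A minimal sketch of that setup, using only the standard library. The ``locale`` directory, the ``messages`` domain, and the helper name ``load_translations`` are assumptions for illustration. Passing ``fallback=True`` makes ``gettext.translation`` return a ``NullTranslations`` object when no MO file is found, so the sketch runs even without catalogs. It targets Python 3; on Python 2 the unicode-returning methods are ``ugettext`` and ``ungettext``.

```python
import gettext


def load_translations(locale, localedir="locale", domain="messages"):
    """Return a translations object for ``locale``.

    Looks for <localedir>/<locale>/LC_MESSAGES/<domain>.mo and falls back
    to untranslated messages if no catalog exists.
    """
    return gettext.translation(domain, localedir=localedir,
                               languages=[locale], fallback=True)


translations = load_translations("de")

# Data to hand to the template, e.g. template.generate(**context)
context = {
    "_": translations.gettext,
    "ngettext": translations.ngettext,
}
print(context["_"]("Hello, world!"))
```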
171
172 The ``Translator`` filter needs to be added to the filters of the template
173 (applying it as a stream filter will likely not have the desired effect).
174 Furthermore it needs to be the first filter in the list, including the internal
175 filters that Genshi adds itself:
176
177 .. code-block:: python
178
179   from genshi.filters import Translator
180   from genshi.template import MarkupTemplate
181
182   template = MarkupTemplate("...")
183   template.filters.insert(0, Translator(translations.ugettext))
184
185 If you're using ``TemplateLoader``, you should specify a callback function in
186 which you add the filter:
187
188 .. code-block:: python
189
190   from genshi.filters import Translator
191   from genshi.template import TemplateLoader
192
193   def template_loaded(template):
194       template.filters.insert(0, Translator(translations.ugettext))
195
196   loader = TemplateLoader('templates', callback=template_loaded)
197   template = loader.load("...")
198
199 This approach ensures that the filter is added only once per template, rather
200 than every time the template is loaded, which would cause it to be applied multiple times.
201
202
203 Related Considerations
204 ======================
205
206 If you intend to produce an application that is fully prepared for an
207 international audience, there are a couple of other things to keep in mind:
208
209 -------
210 Unicode
211 -------
212
213 Use ``unicode`` internally, not encoded bytestrings. Only encode/decode where
214 data enters or exits the system. This means that your code works with characters
215 and not just with bytes, which is an important distinction for example when
216 calculating the length of a piece of text. When you need to decode/encode, it's
217 probably a good idea to use UTF-8.
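
The difference between characters and bytes is easy to see in a short example. It is written for Python 3, where ``str`` plays the role of Python 2's ``unicode``; the sample text is arbitrary.

```python
# UTF-8 bytes as they might arrive from a request or a file.
raw = b"na\xc3\xafve caf\xc3\xa9"

# Decode once, at the boundary where data enters the system.
text = raw.decode("utf-8")

print(len(raw))    # number of bytes
print(len(text))   # number of characters

# Work on text internally, and encode again only on output.
encoded = text.upper().encode("utf-8")
print(encoded)
```

The two lengths differ because each accented letter occupies two bytes in UTF-8 but counts as a single character once decoded.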
218
219 -------------
220 Date and Time
221 -------------
222
223 If your application uses datetime information that should be displayed to users
224 in different timezones, you should try to work with UTC (universal time)
225 internally. Do the conversion from and to "local time" when the data enters or
226 exits the system. Make use of the Python `datetime`_ module and the third-party
227 `pytz`_ package.
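
A small sketch of the "UTC inside, local time at the edges" idea. To stay free of dependencies it uses a fixed UTC+2 offset from the standard library's ``datetime.timezone`` as a stand-in for a real time zone (an assumption; a ``pytz`` zone object would normally take its place and would also handle daylight saving time). The helper names ``to_utc`` and ``to_local`` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for the user's time zone: a fixed offset with no DST rules.
USER_TZ = timezone(timedelta(hours=2))


def to_utc(local_dt):
    """Convert an aware local datetime to UTC when it enters the system."""
    return local_dt.astimezone(timezone.utc)


def to_local(utc_dt, tz=USER_TZ):
    """Convert a stored UTC datetime to the user's zone for display."""
    return utc_dt.astimezone(tz)


entered = datetime(2007, 6, 20, 11, 48, 55, tzinfo=USER_TZ)
stored = to_utc(entered)

print(stored.isoformat())
print(to_local(stored).isoformat())
```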
228
229 --------------------------
230 Formatting and Locale Data
231 --------------------------
232
233 Make sure you check out the functionality provided by the `Babel`_ project for
234 things like number and date formatting, locale display strings, etc.
235
236 .. _`datetime`: http://docs.python.org/lib/module-datetime.html
237 .. _`pytz`: http://pytz.sourceforge.net/