.. -*- mode: rst; encoding: utf-8 -*-

=====================================
Internationalization and Localization
=====================================

Genshi provides basic supporting infrastructure for internationalizing
and localizing templates. That includes functionality for extracting localizable
strings from templates, as well as a template filter that can apply translations
to templates as they get rendered.

This support is based on `gettext`_ message catalogs and the `gettext Python
module`_. The extraction process can be used from the API level, or through the
front-ends implemented by the `Babel`_ project, for which Genshi provides a
plugin.

.. _`gettext`: http://www.gnu.org/software/gettext/
.. _`gettext python module`: http://docs.python.org/lib/module-gettext.html
.. _`babel`: http://babel.edgewall.org/


.. contents:: Contents
   :depth: 2
.. sectnum::


Basics
======

The simplest way to internationalize and translate templates would be to wrap
all localizable strings in a ``gettext()`` function call (which is often aliased
to ``_()`` for brevity). In that case, no extra template filter is required.

.. code-block:: genshi

  <p>${_("Hello, world!")}</p>

However, this approach results in significant “character noise” in templates,
making them harder to read and preview.

The ``genshi.filters.Translator`` filter allows you to get rid of the
explicit `gettext`_ function calls, so you can continue to just write:

.. code-block:: genshi

  <p>Hello, world!</p>

This text will still be extracted and translated as if you had wrapped it in a
``_()`` call.

.. note:: For parameterized or pluralizable messages, you need to continue using
          the appropriate ``gettext`` functions.
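
As a rough illustration of the note above, the following sketch shows the kind
of calls that parameterized and pluralizable messages rely on. It uses only the
standard library ``gettext`` module on Python 3; ``NullTranslations`` stands in
for a real catalog (so messages come back untranslated), and the helper names
``greeting`` and ``file_count`` are hypothetical.

```python
import gettext

# NullTranslations performs no lookups; a real application would load a
# message catalog instead (assumption made to keep the sketch self-contained).
translations = gettext.NullTranslations()
_ = translations.gettext
ngettext = translations.ngettext


def greeting(name):
    # Parameterized message: translate the pattern first, then substitute.
    return _("Hello, %(name)s!") % {"name": name}


def file_count(n):
    # Pluralizable message: ngettext selects the singular or plural form for n.
    return ngettext("%(num)d file", "%(num)d files", n) % {"num": n}
```

Translating the pattern before substituting values keeps the message identifier
stable, so the same string can be looked up in the catalog regardless of the
values filled in at run time.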

You can control which tags should be ignored by this process; for example, it
doesn't really make sense to translate the content of the HTML
``<script></script>`` element. Both ``<script>`` and ``<style>`` are excluded
by default.

Attribute values can also be automatically translated. The default is to
consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``,
``summary``, and ``title``, which is a list that makes sense for HTML documents.
Of course, you can tell the translator to use a different set of attribute
names, or none at all.

In addition, you can control automatic translation in your templates using the
``xml:lang`` attribute. If the value of that attribute is a literal string, the
contents and attributes of the element will be ignored:

.. code-block:: genshi

  <p xml:lang="en">Hello, world!</p>

On the other hand, if the value of the ``xml:lang`` attribute contains a Python
expression, the element contents and attributes are still considered for
automatic translation:

.. code-block:: genshi

  <html xml:lang="$locale">
    ...
  </html>


Extraction
==========

The ``Translator`` class provides a class method called ``extract``, which is
a generator yielding all localizable strings found in a template or markup
stream. This includes both literal strings in text nodes and attribute values,
as well as strings in ``gettext()`` calls in embedded Python code. See the API
documentation for details on how to use this method directly.

This functionality is integrated into the message extraction framework provided
by the `Babel`_ project. Babel provides a command-line interface as well as
commands that can be used from ``setup.py`` scripts using `Setuptools`_ or
`Distutils`_.

.. _`setuptools`: http://peak.telecommunity.com/DevCenter/setuptools
.. _`distutils`: http://docs.python.org/dist/dist.html

The first thing you need to do to make Babel extract messages from Genshi
templates is to let Babel know which files are Genshi templates. This is done
using a “mapping configuration”, which can be stored in a configuration file,
or specified directly in your ``setup.py``.

In a configuration file, the mapping may look like this:

.. code-block:: ini

  # Python source
  [python:**.py]

  # Genshi templates
  [genshi:**/templates/**.html]
  include_attrs = title

  [genshi:**/templates/**.txt]
  template_class = genshi.template:TextTemplate
  encoding = latin-1

Please consult the Babel documentation for details on configuration.
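
When the mapping is specified in ``setup.py`` instead, Babel's setuptools
integration reads it from a ``message_extractors`` keyword argument to
``setup()``. The sketch below only builds that dictionary so that it can be
inspected on its own; the package name ``myapp`` is a placeholder, the option
values simply mirror the configuration file example, and the exact behaviour
should be checked against the Babel version in use.

```python
# Hypothetical package name; replace it with the package that holds your
# templates and Python sources.
PACKAGE = "myapp"

# Each entry is a tuple of (filename pattern, extraction method, options).
# Options are a dict of strings, or None when the defaults are fine.
MESSAGE_EXTRACTORS = {
    PACKAGE: [
        ("**.py", "python", None),
        ("**/templates/**.html", "genshi", {"include_attrs": "title"}),
        ("**/templates/**.txt", "genshi", {
            "template_class": "genshi.template:TextTemplate",
            "encoding": "latin-1",
        }),
    ],
}

# In setup.py this would be passed along the lines of:
#   setup(..., message_extractors=MESSAGE_EXTRACTORS)
```

Keeping the mapping in a module-level name like this makes it easy to reuse or
check, but placing the dictionary inline in the ``setup()`` call works just as
well.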

If all goes well, running the extraction with Babel should create a POT file
containing the strings from your Genshi templates and your Python source files.

.. note:: Genshi currently does not support “translator comments”, i.e. text in
          template comments that would get added to the POT file. This support
          may or may not be added in future versions.


---------------------
Configuration Options
---------------------

The Genshi extraction plugin for Babel supports the following options:

``template_class``
------------------
The concrete ``Template`` class that the file should be loaded with. Specify
the package/module name and the class name, separated by a colon.

The default is to use ``genshi.template:MarkupTemplate``, and you'll want to
set it to ``genshi.template:TextTemplate`` for `text templates`_.

.. _`text templates`: text-templates.html

``encoding``
------------
The encoding of the template file. This is only used for text templates. The
default is to assume “utf-8”.

``include_attrs``
-----------------
Comma-separated list of attribute names that should be considered to have
localizable values. Only used for markup templates.

``ignore_tags``
---------------
Comma-separated list of tag names that should be ignored. Only used for markup
templates.


Translation
===========

If you have prepared MO files for use with Genshi using the appropriate tools,
you can access the message catalogs with the `gettext Python module`_. You'll
probably want to create a ``gettext.GNUTranslations`` instance, and make the
translation functions it provides available to your templates by putting them
in the template context.
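
One possible way to obtain such an instance and prepare the context data is
sketched below, using only the standard library on Python 3. The domain name
``messages``, the ``locale`` directory and the language ``de`` are assumptions
chosen for illustration. On Python 2, the Unicode variants of these functions
are named ``ugettext`` and ``ungettext``.

```python
import gettext

# Hypothetical domain, directory and language. With fallback=True, a
# NullTranslations object (which returns messages unchanged) is used when no
# matching MO file is found, instead of raising an error.
translations = gettext.translation(
    "messages", localedir="locale", languages=["de"], fallback=True
)

# Expose the translation functions to templates under their usual names.
context = {
    "_": translations.gettext,
    "ngettext": translations.ngettext,
}
```

The resulting dictionary can then be supplied as template data when rendering,
for example as keyword arguments to the template's ``generate()`` method.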

The ``Translator`` filter needs to be added to the filters of the template
(applying it as a stream filter will likely not have the desired effect).
Furthermore, it needs to be the first filter in the list, ahead of the internal
filters that Genshi adds itself:

.. code-block:: python

  from genshi.filters import Translator
  from genshi.template import MarkupTemplate

  # ``translations`` is assumed to be a gettext.GNUTranslations instance
  template = MarkupTemplate("...")
  template.filters.insert(0, Translator(translations.ugettext))

If you're using ``TemplateLoader``, you should specify a callback function in
which you add the filter:

.. code-block:: python

  from genshi.filters import Translator
  from genshi.template import TemplateLoader

  def template_loaded(template):
      # Invoked by the loader after it has initialized the template
      template.filters.insert(0, Translator(translations.ugettext))

  loader = TemplateLoader('templates', callback=template_loaded)
  template = loader.load("...")

This approach ensures that the filter is not added every time the template is
loaded, which would cause it to be applied multiple times.


Related Considerations
======================

If you intend to produce an application that is fully prepared for an
international audience, there are a couple of other things to keep in mind:

-------
Unicode
-------

Use ``unicode`` internally, not encoded bytestrings. Only encode/decode where
data enters or exits the system. This means that your code works with characters
and not just with bytes, which is an important distinction, for example, when
calculating the length of a piece of text. When you need to decode/encode, it's
probably a good idea to use UTF-8.
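
The difference between counting characters and counting bytes can be seen in
the following small sketch. It is written for Python 3, where the ``str`` type
plays the role of Python 2's ``unicode``; the sample text is arbitrary.

```python
# Text held internally as a Unicode string ("naïve café").
text = "na\u00efve caf\u00e9"

# Encode only at the boundary, when the data leaves the system.
encoded = text.encode("utf-8")

# len() on the string counts characters; on the encoded form it counts
# bytes, and each non-ASCII character here takes two bytes in UTF-8.
char_count = len(text)
byte_count = len(encoded)

# Decode again when the data enters the system.
decoded = encoded.decode("utf-8")
```

Here ``char_count`` is 10 while ``byte_count`` is 12, so any length check or
truncation performed on the encoded bytes would give a different result than
one performed on the text itself.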

-------------
Date and Time
-------------

If your application uses datetime information that should be displayed to users
in different timezones, you should try to work with UTC (universal time)
internally. Do the conversion from and to "local time" when the data enters or
exits the system. Make use of the Python `datetime`_ module and the third-party
`pytz`_ package.
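
A minimal sketch of this pattern, using only the standard library on Python 3,
might look as follows. The fixed ``+02:00`` display zone and the sample dates
are assumptions for illustration; a named timezone, such as those provided by
pytz, would additionally account for daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Store and compute with timezone-aware UTC values.
created_utc = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)

# Hypothetical display zone with a fixed +02:00 offset.
display_zone = timezone(timedelta(hours=2))

# Convert only at the edge, when the value is shown to the user.
shown = created_utc.astimezone(display_zone)

# Convert user input back to UTC as soon as it enters the system.
entered = datetime(2024, 1, 15, 14, 0, tzinfo=display_zone)
stored = entered.astimezone(timezone.utc)
```

Both ``shown`` and ``stored`` refer to the same instant as ``created_utc``;
only the representation differs, which is why comparisons and arithmetic are
best done on the UTC values.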

--------------------------
Formatting and Locale Data
--------------------------

Make sure you check out the functionality provided by the `Babel`_ project for
things like number and date formatting, locale display strings, etc.

.. _`datetime`: http://docs.python.org/lib/module-datetime.html
.. _`pytz`: http://pytz.sourceforge.net/