comparison doc/i18n.txt @ 902:09cc3627654c experimental-inline
Sync `experimental/inline` branch with [source:trunk@1126].
author | cmlenz |
---|---|
date | Fri, 23 Apr 2010 21:08:26 +0000 |
parents | 1837f39efd6f |
children |
830:de82830f8816 | 902:09cc3627654c |
---|---|
2 | 2 |
3 ===================================== | 3 ===================================== |
4 Internationalization and Localization | 4 Internationalization and Localization |
5 ===================================== | 5 ===================================== |
6 | 6 |
7 Genshi provides basic supporting infrastructure for internationalizing | 7 Genshi provides comprehensive supporting infrastructure for internationalizing |
8 and localizing templates. That includes functionality for extracting localizable | 8 and localizing templates. That includes functionality for extracting |
9 strings from templates, as well as a template filter that can apply translations | 9 localizable strings from templates, as well as a template filter and special |
10 to templates as they get rendered. | 10 directives that can apply translations to templates as they get rendered. |
11 | 11 |
12 This support is based on `gettext`_ message catalogs and the `gettext Python | 12 This support is based on `gettext`_ message catalogs and the `gettext Python |
13 module`_. The extraction process can be used from the API level, or through the | 13 module`_. The extraction process can be used from the API level, or through |
14 front-ends implemented by the `Babel`_ project, for which Genshi provides a | 14 the front-ends implemented by the `Babel`_ project, for which Genshi provides |
15 plugin. | 15 a plugin. |
16 | 16 |
17 .. _`gettext`: http://www.gnu.org/software/gettext/ | 17 .. _`gettext`: http://www.gnu.org/software/gettext/ |
18 .. _`gettext python module`: http://docs.python.org/lib/module-gettext.html | 18 .. _`gettext python module`: http://docs.python.org/lib/module-gettext.html |
19 .. _`babel`: http://babel.edgewall.org/ | 19 .. _`babel`: http://babel.edgewall.org/ |
20 | 20 |
26 | 26 |
27 Basics | 27 Basics |
28 ====== | 28 ====== |
29 | 29 |
30 The simplest way to internationalize and translate templates would be to wrap | 30 The simplest way to internationalize and translate templates would be to wrap |
31 all localizable strings in a ``gettext()`` function call (which is often aliased | 31 all localizable strings in a ``gettext()`` function call (which is often |
32 to ``_()`` for brevity). In that case, no extra template filter is required. | 32 aliased to ``_()`` for brevity). In that case, no extra template filter is |
33 required. | |
33 | 34 |
34 .. code-block:: genshi | 35 .. code-block:: genshi |
35 | 36 |
36 <p>${_("Hello, world!")}</p> | 37 <p>${_("Hello, world!")}</p> |
37 | 38 |
38 However, this approach results in significant “character noise” in templates, | 39 However, this approach results in significant “character noise” in templates, |
39 making them harder to read and preview. | 40 making them harder to read and preview. |
40 | 41 |
41 The ``genshi.filters.Translator`` filter allows you to get rid of the | 42 The ``genshi.filters.Translator`` filter allows you to get rid of the |
42 explicit `gettext`_ function calls, so you can continue to just write: | 43 explicit `gettext`_ function calls, so you can (often) just continue to write: |
43 | 44 |
44 .. code-block:: genshi | 45 .. code-block:: genshi |
45 | 46 |
46 <p>Hello, world!</p> | 47 <p>Hello, world!</p> |
47 | 48 |
48 This text will still be extracted and translated as if you had wrapped it in a | 49 This text will still be extracted and translated as if you had wrapped it in a |
49 ``_()`` call. | 50 ``_()`` call. |
50 | 51 |
51 .. note:: For parameterized or pluralizable messages, you need to continue using | 52 .. note:: For parameterized or pluralizable messages, you need to use the |
52 the appropriate ``gettext`` functions. | 53 special `template directives`_ described below, or use the |
53 | 54 corresponding ``gettext`` function in embedded Python expressions. |
54 You can control which tags should be ignored by this process; for example, it | 55 |
56 You can control which tags should be ignored by this process; for example, it | |
55 doesn't really make sense to translate the content of the HTML | 57 doesn't really make sense to translate the content of the HTML |
56 ``<script></script>`` element. Both ``<script>`` and ``<style>`` are excluded | 58 ``<script></script>`` element. Both ``<script>`` and ``<style>`` are excluded |
57 by default. | 59 by default. |
58 | 60 |
59 Attribute values can also be automatically translated. The default is to | 61 Attribute values can also be automatically translated. The default is to |
60 consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``, | 62 consider the attributes ``abbr``, ``alt``, ``label``, ``prompt``, ``standby``, |
61 ``summary``, and ``title``, which is a list that makes sense for HTML documents. | 63 ``summary``, and ``title``, which is a list that makes sense for HTML |
62 Of course, you can tell the translator to use a different set of attribute | 64 documents. Of course, you can tell the translator to use a different set of |
63 names, or none at all. | 65 attribute names, or none at all. |
64 | 66 |
65 In addition, you can control automatic translation in your templates using the | 67 ---------------- |
66 ``xml:lang`` attribute. If the value of that attribute is a literal string, the | 68 Language Tagging |
67 contents and attributes of the element will be ignored: | 69 ---------------- |
70 | |
71 You can control automatic translation in your templates using the ``xml:lang`` | |
72 attribute. If the value of that attribute is a literal string, the contents and | |
73 attributes of the element will be ignored: | |
68 | 74 |
69 .. code-block:: genshi | 75 .. code-block:: genshi |
70 | 76 |
71 <p xml:lang="en">Hello, world!</p> | 77 <p xml:lang="en">Hello, world!</p> |
72 | 78 |
77 .. code-block:: genshi | 83 .. code-block:: genshi |
78 | 84 |
79 <html xml:lang="$locale"> | 85 <html xml:lang="$locale"> |
80 ... | 86 ... |
81 </html> | 87 </html> |
88 | |
89 | |
90 .. _`template directives`: | |
91 | |
92 Template Directives | |
93 =================== | |
94 | |
95 Sometimes localizable strings in templates may contain dynamic parameters, or | |
96 they may depend on the numeric value of some variable to choose a proper | |
97 plural form. Sometimes the strings contain embedded markup, such as tags for | |
98 emphasis or hyperlinks, and you don't want to rely on the people doing the | |
99 translations to know the syntax and escaping rules of HTML and XML. | |
100 | |
101 In those cases the simple text extraction and translation process described | |
102 above is not sufficient. You could just use ``gettext`` API functions in | |
103 embedded Python expressions for parameters and pluralization, but that does | |
104 not help when messages contain embedded markup. Genshi provides special | |
105 template directives for internationalization that attempt to provide a | |
106 comprehensive solution for this problem space. | |
107 | |
108 To enable these directives, you'll need to register them with the templates | |
109 they are used in. You can do this by adding them manually via the | |
110 ``Template.add_directives(namespace, factory)`` method (where ``namespace`` would be | |
111 “http://genshi.edgewall.org/i18n” and ``factory`` would be an instance of the | |
112 ``Translator`` class). Or you can just call the ``Translator.setup(template)`` | |
113 class method, which both registers the directives and adds the translation | |
114 filter. | |
115 | |
116 After the directives have been registered with the template engine on the | |
117 Python side of your application, you need to declare the corresponding | |
118 directive namespace in all markup templates that use them. For example: | |
119 | |
120 .. code-block:: genshi | |
121 | |
122 <html xmlns:py="http://genshi.edgewall.org/" | |
123 xmlns:i18n="http://genshi.edgewall.org/i18n"> | |
124 … | |
125 </html> | |
126 | |
127 These directives only make sense in the context of `markup templates`_. For | |
128 `text templates`_, you can just use the corresponding ``gettext`` API calls as needed. | |
129 | |
130 .. note:: The internationalization directives are still somewhat experimental | |
131 and have some known issues. However, the attribute language they | |
132 implement should be stable and is not subject to change | |
133 substantially in future versions. | |
134 | |
135 .. _`markup templates`: xml-templates.html | |
136 .. _`text templates`: text-templates.html | |
137 | |
138 -------- | |
139 Messages | |
140 -------- | |
141 | |
142 ``i18n:msg`` | |
143 ------------ | |
144 | |
145 This is the basic directive for defining localizable text passages that | |
146 contain parameters and/or markup. | |
147 | |
148 For example, consider the following template snippet: | |
149 | |
150 .. code-block:: genshi | |
151 | |
152 <p> | |
153 Please visit <a href="${site.url}">${site.name}</a> for help. | |
154 </p> | |
155 | |
156 Without further annotation, the translation filter would treat this sentence | |
157 as two separate messages (“Please visit” and “for help”), and the translator | |
158 would have no control over the position of the link in the sentence. | |
159 | |
160 However, when you use the Genshi internationalization directives, you simply | |
161 add an ``i18n:msg`` attribute to the enclosing ``<p>`` element: | |
162 | |
163 .. code-block:: genshi | |
164 | |
165 <p i18n:msg="name"> | |
166 Please visit <a href="${site.url}">${site.name}</a> for help. | |
167 </p> | |
168 | |
169 Genshi is then able to identify the text in the ``<p>`` element as a single | |
170 message for translation purposes. You'll see the following string in your | |
171 message catalog:: | |
172 | |
173 Please visit [1:%(name)s] for help. | |
174 | |
175 The ``<a>`` element with its attribute has been replaced by a part in square | |
176 brackets, which does not include the tag name or the attributes of the element. | |
177 | |
178 The value of the ``i18n:msg`` attribute is a comma-separated list of parameter | |
179 names, which serve as simplified aliases for the actual Python expressions the | |
180 message contains. The order of the parameter names in the list must correspond | |
181 to the order of the expressions in the text. In this example, there is only | |
182 one parameter: its alias for translation is “name”, while the corresponding | |
183 expression is ``${site.name}``. | |
184 | |
185 The translator now has complete control over the structure of the sentence. He | |
186 or she certainly does need to make sure that any bracketed parts are not | |
187 removed, and that the ``name`` parameter is preserved correctly. But those are | |
188 things that can be easily checked by validating the message catalogs. The | |
189 important thing is that the translator can change the sentence structure, and | |
190 has no way to break the application by forgetting to close a tag, for example. | |
191 | |
192 Suppose the German translator of this snippet decided to translate it to:: | |
193 | |
194 Um Hilfe zu erhalten, besuchen Sie bitte [1:%(name)s] | |
195 | |
196 The resulting output might be: | |
197 | |
198 .. code-block:: xml | |
199 | |
200 <p> | |
201 Um Hilfe zu erhalten, besuchen Sie bitte | |
202 <a href="http://example.com/">Example</a> | |
203 </p> | |
204 | |
205 Messages may contain multiple tags, and they may also be nested. For example: | |
206 | |
207 .. code-block:: genshi | |
208 | |
209 <p i18n:msg="name"> | |
210 <i>Please</i> visit <b>the site <a href="${site.url}">${site.name}</a></b> | |
211 for help. | |
212 </p> | |
213 | |
214 This would result in the following message ID:: | |
215 | |
216 [1:Please] visit [2:the site [3:%(name)s]] for help. | |
217 | |
218 Again, the translator has full control over the structure of the sentence. So | |
219 the German translation could actually look like this:: | |
220 | |
221 Um Hilfe zu erhalten besuchen Sie [1:bitte] | |
222 [3:%(name)s], [2:das ist eine Web-Site] | |
223 | |
224 Which Genshi would recompose into the following output: | |
225 | |
226 .. code-block:: xml | |
227 | |
228 <p> | |
229 Um Hilfe zu erhalten besuchen Sie <i>bitte</i> | |
230 <a href="http://example.com/">Example</a>, <b>das ist eine Web-Site</b> | |
231 </p> | |
232 | |
233 Note how the translation has changed the order and even the nesting of the | |
234 tags. | |
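To make the bracket notation more concrete, here is a minimal sketch of how a translated message of this shape could be expanded back into markup. It is not Genshi's implementation: the `recompose()` function, the `tags` mapping from group number to an opening and closing tag, and the `params` dictionary are all assumptions made for illustration. The sketch also does not handle literal `[`, `]` or `%` characters in the message text.

```python
import re

# Matches the start of a numbered group ("[3:") or the end of a group ("]").
_TOKEN = re.compile(r"\[(\d+):|\]")


def recompose(message, tags, params):
    """Expand ``[N:...]`` groups in a translated message into markup."""
    out = []
    stack = []  # closing tags of the groups that are currently open
    pos = 0
    for match in _TOKEN.finditer(message):
        # Plain text between tokens: substitute %(name)s style parameters.
        out.append(message[pos:match.start()] % params)
        pos = match.end()
        if match.group(1) is not None:
            open_tag, close_tag = tags[int(match.group(1))]
            out.append(open_tag)
            stack.append(close_tag)
        else:
            if not stack:
                raise ValueError("unbalanced ']' in message")
            out.append(stack.pop())
    if stack:
        raise ValueError("unclosed '[' in message")
    out.append(message[pos:] % params)
    return "".join(out)
```

Because the group numbers are looked up independently of their position, the translation is free to reorder or un-nest the groups, as in the German example above.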
235 | |
236 .. warning:: Please note that ``i18n:msg`` directives do not support other | |
237 nested directives. Directives commonly change the structure of | |
238 the generated markup dynamically, which often would result in the | |
239 structure of the text changing, thus making translation as a | |
240 single message ineffective. | |
241 | |
242 ``i18n:choose``, ``i18n:singular``, ``i18n:plural`` | |
243 --------------------------------------------------- | |
244 | |
245 Translatable strings that vary based on some number of objects, such as “You | |
246 have 1 new message” or “You have 3 new messages”, present their own challenge, | |
247 in particular when you consider that different languages have different rules | |
248 for pluralization. For example, while English and most western languages have | |
249 two plural forms (one for ``n=1`` and another for ``n<>1``), Welsh has five | |
250 different plural forms, while Hungarian only has one. | |
251 | |
252 The ``gettext`` framework has long supported this via the ``ngettext()`` | |
253 family of functions. You specify two default messages, one singular and one | |
254 plural, and the number of items. The translations however may contain any | |
255 number of plural forms for the message, depending on how many are commonly | |
256 used in the language. ``ngettext`` will choose the correct plural form of the | |
257 translated message based on the specified number of items. | |
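As a rough illustration of the `ngettext()` call pattern, the sketch below uses the standard library `gettext` module. `NullTranslations` has no catalog, so it simply falls back to the English rule (singular for exactly one item, plural otherwise); a loaded catalog would apply the plural rule of the target language instead. The `new_messages()` helper is a hypothetical name.

```python
import gettext

# Stand-in for a real catalog: applies the default English plural rule.
translations = gettext.NullTranslations()


def new_messages(count):
    # Pass both default forms plus the number of items; the translations
    # object decides which form applies.
    template = translations.ngettext(
        "You have %(num)d new message.",
        "You have %(num)d new messages.",
        count,
    )
    return template % {"num": count}
```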
258 | |
259 Genshi provides a variant of the ``i18n:msg`` directive described above that | |
260 allows choosing the proper plural form based on the numeric value of a given | |
261 variable. The pluralization support is implemented in a set of three | |
262 directives that must be used together: ``i18n:choose``, ``i18n:singular``, and | |
263 ``i18n:plural``. | |
264 | |
265 The ``i18n:choose`` directive is used to set up the context of the message: it | |
266 simply wraps the singular and plural variants. | |
267 | |
268 The value of this directive is split into two parts: the first is the | |
269 *numeral*, a Python expression that evaluates to a number to determine which | |
270 plural form should be chosen. The second part, separated by a semicolon, lists | |
271 the parameter names. This part is equivalent to the value of the ``i18n:msg`` | |
272 directive. | |
273 | |
274 For example: | |
275 | |
276 .. code-block:: genshi | |
277 | |
278 <p i18n:choose="len(messages); num"> | |
279 <i18n:singular>You have <b>${len(messages)}</b> new message.</i18n:singular> | |
280 <i18n:plural>You have <b>${len(messages)}</b> new messages.</i18n:plural> | |
281 </p> | |
282 | |
283 All three directives can be used either as elements or attributes. So the above | |
284 example could also be written as follows: | |
285 | |
286 .. code-block:: genshi | |
287 | |
288 <i18n:choose numeral="len(messages)" params="num"> | |
289 <p i18n:singular="">You have <b>${len(messages)}</b> new message.</p> | |
290 <p i18n:plural="">You have <b>${len(messages)}</b> new messages.</p> | |
291 </i18n:choose> | |
292 | |
293 When used as an element, the two parts of the ``i18n:choose`` value are split | |
294 into two different attributes: ``numeral`` and ``params``. The | |
295 ``i18n:singular`` and ``i18n:plural`` directives do not require or support any | |
296 value (or any extra attributes). | |
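The two-part attribute value can be pictured with the following sketch, which splits a value such as `len(messages); num` into the numeral expression and the list of parameter names. This is only an illustration of the syntax described above, not the parsing code Genshi uses; `split_choose()` is a hypothetical helper, and it assumes the numeral expression itself contains no semicolon.

```python
def split_choose(value):
    """Split an ``i18n:choose`` attribute value into (numeral, params)."""
    # Everything before the first semicolon is the numeral expression,
    # everything after it is a comma-separated list of parameter names.
    numeral, _sep, params = value.partition(";")
    names = [name.strip() for name in params.split(",") if name.strip()]
    return numeral.strip(), names
```

In the element form, the same two pieces would simply arrive as the separate `numeral` and `params` attributes.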
297 | |
298 -------------------- | |
299 Comments and Domains | |
300 -------------------- | |
301 | |
302 ``i18n:comment`` | |
303 ---------------- | |
304 | |
305 The ``i18n:comment`` directive can be used to supply a comment for the | |
306 translator. For example, if a template snippet is not easily understood | |
307 outside of its context, you can add a translator comment to help the | |
308 translator understand in what context the message will be used: | |
309 | |
310 .. code-block:: genshi | |
311 | |
312 <p i18n:msg="name" i18n:comment="Link to the relevant support site"> | |
313 Please visit <a href="${site.url}">${site.name}</a> for help. | |
314 </p> | |
315 | |
316 This comment will be extracted together with the message itself, and will | |
317 commonly be placed along the message in the message catalog, so that it is | |
318 easily visible to the person doing the translation. | |
319 | |
320 This directive has no impact on how the template is rendered, and is ignored | |
321 outside of the extraction process. | |
322 | |
323 ``i18n:domain`` | |
324 --------------- | |
325 | |
326 In larger projects, message catalogs are commonly split up into different | |
327 *domains*. For example, you might have a core application domain, and then | |
328 separate domains for extensions or libraries. | |
329 | |
330 Genshi provides a directive called ``i18n:domain`` that lets you choose the | |
331 translation domain for a particular scope. For example: | |
332 | |
333 .. code-block:: genshi | |
334 | |
335 <div i18n:domain="examples"> | |
336 <p>Hello, world!</p> | |
337 </div> | |
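On the Python side, separate domains usually correspond to separate catalogs. The sketch below shows that idea with the standard library `gettext` module only; how the catalogs are then handed to Genshi is not shown here. The `load_domain()` helper, the `messages` domain name and the empty locale directory are assumptions for illustration, and `fallback=True` makes `gettext.translation()` return a pass-through `NullTranslations` object when no catalog file is found.

```python
import gettext
import tempfile


def load_domain(domain, localedir, language):
    # One catalog per domain, e.g. <localedir>/<language>/LC_MESSAGES/<domain>.mo
    return gettext.translation(
        domain, localedir=localedir, languages=[language], fallback=True
    )


# An empty directory stands in for a real locale directory, so both
# lookups fall back to NullTranslations and return messages unchanged.
with tempfile.TemporaryDirectory() as empty_localedir:
    core = load_domain("messages", empty_localedir, "de")
    examples = load_domain("examples", empty_localedir, "de")
```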
82 | 338 |
83 | 339 |
84 Extraction | 340 Extraction |
85 ========== | 341 ========== |
86 | 342 |
88 a generator yielding all localizable strings found in a template or markup | 344 a generator yielding all localizable strings found in a template or markup |
89 stream. This includes both literal strings in text nodes and attribute values, | 345 stream. This includes both literal strings in text nodes and attribute values, |
90 as well as strings in ``gettext()`` calls in embedded Python code. See the API | 346 as well as strings in ``gettext()`` calls in embedded Python code. See the API |
91 documentation for details on how to use this method directly. | 347 documentation for details on how to use this method directly. |
92 | 348 |
93 This functionality is integrated into the message extraction framework provided | 349 ----------------- |
350 Babel Integration | |
351 ----------------- | |
352 | |
353 This functionality is integrated with the message extraction framework provided | |
94 by the `Babel`_ project. Babel provides a command-line interface as well as | 354 by the `Babel`_ project. Babel provides a command-line interface as well as |
95 commands that can be used from ``setup.py`` scripts using `Setuptools`_ or | 355 commands that can be used from ``setup.py`` scripts using `Setuptools`_ or |
96 `Distutils`_. | 356 `Distutils`_. |
97 | 357 |
98 .. _`setuptools`: http://peak.telecommunity.com/DevCenter/setuptools | 358 .. _`setuptools`: http://peak.telecommunity.com/DevCenter/setuptools |
120 | 380 |
121 Please consult the Babel documentation for details on configuration. | 381 Please consult the Babel documentation for details on configuration. |
122 | 382 |
123 If all goes well, running the extraction with Babel should create a POT file | 383 If all goes well, running the extraction with Babel should create a POT file |
124 containing the strings from your Genshi templates and your Python source files. | 384 containing the strings from your Genshi templates and your Python source files. |
125 | |
126 .. note:: Genshi currently does not support “translator comments”, i.e. text in | |
127 template comments that would get added to the POT file. This support | |
128 may or may not be added in future versions. | |
129 | 385 |
130 | 386 |
131 --------------------- | 387 --------------------- |
132 Configuration Options | 388 Configuration Options |
133 --------------------- | 389 --------------------- |
164 Whether text outside explicit ``gettext`` function calls should be extracted. | 420 Whether text outside explicit ``gettext`` function calls should be extracted. |
165 By default, any text nodes not inside ignored tags, and values of attributes in | 421 By default, any text nodes not inside ignored tags, and values of attributes in |
166 the ``include_attrs`` list are extracted. If this option is disabled, only | 422 the ``include_attrs`` list are extracted. If this option is disabled, only |
167 strings in ``gettext`` function calls are extracted. | 423 strings in ``gettext`` function calls are extracted. |
168 | 424 |
169 .. note:: If you disable this option, it's not necessary to add the translation | 425 .. note:: If you disable this option, and do not make use of the |
170 filter as described above. You only need to make sure that the | 426 internationalization directives, it's not necessary to add the |
171 template has access to the ``gettext`` functions it uses. | 427 translation filter as described above. You only need to make sure |
428 that the template has access to the ``gettext`` functions it uses. | |
172 | 429 |
173 | 430 |
174 Translation | 431 Translation |
175 =========== | 432 =========== |
176 | 433 |
191 from genshi.template import MarkupTemplate | 448 from genshi.template import MarkupTemplate |
192 | 449 |
193 template = MarkupTemplate("...") | 450 template = MarkupTemplate("...") |
194 template.filters.insert(0, Translator(translations.ugettext)) | 451 template.filters.insert(0, Translator(translations.ugettext)) |
195 | 452 |
196 If you're using `TemplateLoader`, you should specify a callback function in | 453 The ``Translator`` class also provides the convenience method ``setup()``, |
197 which you add the filter: | 454 which will both add the filter and register the i18n directives: |
198 | 455 |
199 .. code-block:: python | 456 .. code-block:: python |
200 | 457 |
201 from genshi.filters import Translator | 458 from genshi.filters import Translator |
202 from genshi.template import TemplateLoader | 459 from genshi.template import MarkupTemplate |
203 | 460 |
204 def template_loaded(template): | 461 template = MarkupTemplate("...") |
205 template.filters.insert(0, Translator(translations.ugettext)) | 462 translator = Translator(translations.ugettext) |
206 | 463 translator.setup(template) |
207 loader = TemplateLoader('templates', callback=template_loaded) | 464 |
208 template = loader.load("...") | 465 .. warning:: If you're using ``TemplateLoader``, you should specify a |
209 | 466 `callback function`_ in which you add the filter. That ensures |
210 This approach ensures that the filter is not added every time the template is | 467 that the filter is not added every time the template is rendered, |
211 loaded, and thus being applied multiple times. | 468 thereby being applied multiple times. |
469 | |
470 .. _`callback function`: loader.html#callback-interface | |
212 | 471 |
213 | 472 |
214 Related Considerations | 473 Related Considerations |
215 ====================== | 474 ====================== |
216 | 475 |