.. doc/messages.txt @ 126:b12c6a776c44
   Split docs on date and number formatting.
   author: cmlenz
   date: Mon, 18 Jun 2007 15:15:31 +0000
   parents: doc/catalogs.txt@a4389948c992
   children: 24b5de939850

.. -*- mode: rst; encoding: utf-8 -*-

=============================
Working with Message Catalogs
=============================

.. contents:: Contents
   :depth: 2
.. sectnum::


Introduction
============

The ``gettext`` translation system enables you to mark any strings used in your
application as subject to localization, by wrapping them in functions such as
``gettext(str)`` and ``ngettext(singular, plural, num)``. For brevity, the
``gettext`` function is often aliased to ``_(str)``, so you can write:

.. code-block:: python

    print(_("Hello"))

instead of just:

.. code-block:: python

    print("Hello")

to make the string "Hello" localizable.
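
The aliasing described above can be tried out with nothing but Python's
standard ``gettext`` module. This is a minimal sketch: ``NullTranslations`` is
a stand-in catalog that returns every message unchanged, so the snippet runs
without any MO file, and the message strings are made up for illustration.

```python
import gettext

# A catalog with no translations: every lookup returns the original message.
translations = gettext.NullTranslations()

# The conventional short alias for the lookup function.
_ = translations.gettext
ngettext = translations.ngettext

greeting = _("Hello")

# ngettext picks the singular or plural form based on the number.
one_file = ngettext("%d file", "%d files", 1) % 1
three_files = ngettext("%d file", "%d files", 3) % 3

print(greeting)
print(one_file)
print(three_files)
```

With a real catalog in place of ``NullTranslations``, the same calls would
return the translated strings instead.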

Message catalogs are collections of translations for such localizable messages
used in an application. They are commonly stored in PO (Portable Object) and MO
(Machine Object) files, the formats of which are defined by the GNU `gettext`_
tools and the GNU `translation project`_.

.. _`gettext`: http://www.gnu.org/software/gettext/
.. _`translation project`: http://sourceforge.net/projects/translation

The general procedure for building message catalogs looks something like this:

 * use a tool (such as ``xgettext``) to extract localizable strings from the
   code base and write them to a POT (PO Template) file
 * make a copy of the POT file for a specific locale (for example, "en_US")
   and start translating the messages
 * use a tool such as ``msgfmt`` to compile the locale PO file into a binary
   MO file
 * later, when code changes make it necessary to update the translations,
   regenerate the POT file and merge the changes into the various
   locale-specific PO files, for example using ``msgmerge``
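
The tool-driven steps above could be scripted roughly as follows. This is only
a sketch: the helper name ``catalog_commands`` and the default file names are
hypothetical, and the function merely builds the command lines for the GNU
tools rather than running them, so it works even where the tools are not
installed.

```python
def catalog_commands(sources, pot="messages.pot", po="en_US.po", mo="en_US.mo"):
    """Return the argument lists for the extract, compile and merge steps.

    The file names are illustrative defaults, not a convention required by
    the GNU tools.
    """
    return {
        # Step 1: extract localizable strings into a POT file.
        "extract": ["xgettext", "-o", pot] + list(sources),
        # Step 3: compile the translated PO file into a binary MO file.
        "compile": ["msgfmt", "-o", mo, po],
        # Step 4: merge a regenerated POT file into the existing PO file.
        "merge": ["msgmerge", "--update", po, pot],
    }


commands = catalog_commands(["app.py", "views.py"])
for step in ("extract", "compile", "merge"):
    print(step, "->", " ".join(commands[step]))
```

Each list can be handed to ``subprocess.check_call`` to run the step. Step 2
(copying the POT file for a locale) needs no special tool; a plain file copy
such as ``shutil.copyfile`` would do.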

Python provides the `gettext module`_ as part of the standard library, which
enables applications to work with appropriately generated MO files.

.. _`gettext module`: http://docs.python.org/lib/module-gettext.html

As ``gettext`` provides a solid and well-supported foundation for translating
application messages, Babel does not reinvent the wheel, but rather reuses this
infrastructure, and makes it easier to build message catalogs for Python
applications.


Message Extraction
==================

Babel provides functionality similar to that of the ``xgettext`` program,
except that only extraction from Python source files is built-in, while support
for other file formats can be added using a simple extension mechanism.

Unlike ``xgettext``, which is usually invoked once for every file, the routines
for message extraction in Babel operate on directories. While the per-file
approach of ``xgettext`` works nicely with projects using a ``Makefile``,
Python projects rarely use ``make``, and thus a different mechanism is needed
for extracting messages from the heterogeneous collection of source files that
many Python projects are composed of.

When message extraction is based on directories instead of individual files,
there needs to be a way to configure which files should be treated in which
manner. For example, while many projects may contain ``.html`` files, some of
those files may be static HTML files that don't contain localizable messages,
while others may be `Django`_ templates, and still others may contain `Genshi`_
markup templates. Some projects may even mix HTML files for different template
languages (for whatever reason). Therefore the way in which messages are
extracted from source files cannot depend on the file extension alone, but
needs to be controllable in a precise manner.

.. _`Django`: http://www.djangoproject.com/
.. _`Genshi`: http://genshi.edgewall.org/

Babel accepts a configuration file to specify this mapping of files to
extraction methods, which is described below.


.. _`mapping`:

-------------------------------------------
Extraction Method Mapping and Configuration
-------------------------------------------

The mapping of extraction methods to files in Babel is done via a configuration
file. This file maps extended glob patterns to the names of the extraction
methods, and can also set various options for each pattern (which options are
available depends on the specific extraction method).

For example, the following configuration extracts messages from Python source
files as well as from Genshi markup templates and text templates:

.. code-block:: ini

    # Extraction from Python source files

    [python: foobar/**.py]

    # Extraction from Genshi HTML and text templates

    [genshi: foobar/**/templates/**.html]
    ignore_tags = script,style
    include_attrs = alt title summary

    [genshi: foobar/**/templates/**.txt]
    template_class = genshi.template.text:TextTemplate
    encoding = ISO-8859-15

The configuration file syntax is based on the format commonly found in ``.INI``
files on Windows systems, and as supported by the ``ConfigParser`` module in
the Python standard library. Section names (the strings enclosed in square
brackets) specify both the name of the extraction method and the extended glob
pattern selecting the files that this extraction method should be used for,
separated by a colon. The options in the sections are passed to the extraction
method. Which options are available is specific to the extraction method used.
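
To make the section-name convention concrete, here is a small sketch that reads
such a configuration with the standard ``configparser`` module (the Python 3
name of ``ConfigParser``) and splits each section name at the first colon. The
function ``read_mapping`` is a hypothetical helper written for this
illustration, not Babel's own parser, which may differ in its details.

```python
import configparser

CONFIG = """\
# Extraction from Python source files
[python: foobar/**.py]

# Extraction from Genshi HTML templates
[genshi: foobar/**/templates/**.html]
ignore_tags = script,style
include_attrs = alt title summary
"""


def read_mapping(text):
    """Return a list of (pattern, method, options) tuples."""
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    mapping = []
    for section in parser.sections():
        # "method: pattern" -> split only at the first colon
        method, pattern = [part.strip() for part in section.split(":", 1)]
        mapping.append((pattern, method, dict(parser.items(section))))
    return mapping


for pattern, method, options in read_mapping(CONFIG):
    print(method, pattern, options)
```

Each tuple pairs a glob pattern with the extraction method to use and the
options that would be handed to that method.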

The extended glob patterns used in this configuration are similar to the glob
patterns provided by most shells. A single asterisk (``*``) is a wildcard for
any number of characters (except for the pathname component separator "/"),
while a question mark (``?``) only matches a single character. In addition,
two consecutive asterisk characters (``**``) can be used to make the wildcard
match any directory level, so the pattern ``**.txt`` matches any file with the
extension ``.txt`` in any directory.
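
One way to picture these rules is as a translation from glob pattern to regular
expression. The sketch below is a simplified illustration under stated
assumptions, not Babel's actual matching code: it assumes that ``?`` does not
match the "/" separator and that ``**`` matches any run of characters including
"/". The helper names are hypothetical.

```python
import re


def glob_to_regex(pattern):
    """Translate an extended glob pattern into a compiled regular expression."""
    translated = []
    # "**" is listed first so that it is not split into two single asterisks.
    for part in re.split(r"(\*\*|\*|\?)", pattern):
        if part == "**":
            translated.append(".*")      # any characters, across directories
        elif part == "*":
            translated.append("[^/]*")   # any characters within one component
        elif part == "?":
            translated.append("[^/]")    # one character (assumed not to be "/")
        else:
            translated.append(re.escape(part))
    return re.compile("".join(translated) + r"\Z")


def matches(pattern, path):
    """Return True if the whole path matches the glob pattern."""
    return glob_to_regex(pattern).match(path) is not None


print(matches("**.txt", "docs/a/readme.txt"))
print(matches("*.txt", "docs/readme.txt"))
```

Under these assumptions ``**.txt`` matches a text file at any depth, while
``*.txt`` only matches one whose path contains no directory separator.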

Lines that start with a ``#`` or ``;`` character are ignored and can be used
for comments. Empty lines are ignored, too.

.. note:: if you're performing message extraction using the command Babel
   provides for integration into ``setup.py`` scripts (see below), you
   can also provide this configuration in a different way, namely as a
   keyword argument to the ``setup()`` function.


----------
Front-Ends
----------

Babel provides two different front-ends to access its functionality for working
with message catalogs:

 * A `Command-line interface <cmdline.html>`_, and
 * `Integration with distutils/setuptools <setup.html>`_

Which one you choose depends on the nature of your project. For most modern
Python projects, the distutils/setuptools integration is probably more
convenient.


--------------------------
Writing Extraction Methods
--------------------------

Adding new methods for extracting localizable messages is easy. First, you'll
need to implement a function that complies with the following interface:

.. code-block:: python

    def extract_xxx(fileobj, keywords, comment_tags, options):
        """Extract messages from XXX files.

        :param fileobj: the file-like object the messages should be extracted
                        from
        :param keywords: a list of keywords (i.e. function names) that should
                         be recognized as translation functions
        :param comment_tags: a list of translator tags to search for and
                             include in the results
        :param options: a dictionary of additional options (optional)
        :return: an iterator over ``(lineno, funcname, message, comments)``
                 tuples
        :rtype: ``iterator``
        """

.. note:: Any strings in the tuples produced by this function must be either
   ``unicode`` objects, or ``str`` objects using plain ASCII characters.
   That means that if sources contain strings using other encodings, it
   is the job of the extractor implementation to do the decoding to
   ``unicode`` objects.
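
To show what an implementation of this interface might look like, here is a
sketch for a made-up file format in which every ``key = value`` line holds a
localizable value and ``#`` lines are comments. The format, the function name
``extract_props`` and the ``encoding`` option are all hypothetical. The sketch
also assumes that ``fileobj`` yields bytes, and it uses ``None`` as the
function name because this toy format has no translation function calls.

```python
import io


def extract_props(fileobj, keywords, comment_tags, options):
    """Extract messages from a hypothetical ``key = value`` file format."""
    encoding = options.get("encoding", "utf-8")
    pending = []  # translator comments seen since the last message
    for lineno, raw in enumerate(fileobj, 1):
        # Decode here so that only text strings leave the extractor.
        line = raw.decode(encoding).strip()
        if line.startswith("#"):
            text = line[1:].strip()
            for tag in comment_tags:
                if text.startswith(tag):
                    pending.append(text[len(tag):].strip())
                    break
            continue
        if "=" in line:
            _key, message = line.split("=", 1)
            yield lineno, None, message.strip(), pending
        # Comments only apply to the message that directly follows them.
        pending = []


SAMPLE = b"# NOTE: shown in the menu\nmenu.manual = Manual\n\nmenu.quit = Quit\n"
for entry in extract_props(io.BytesIO(SAMPLE), [], ["NOTE:"], {}):
    print(entry)
```

The extractor is a generator, which satisfies the requirement to return an
iterator over ``(lineno, funcname, message, comments)`` tuples.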

Next, you should register that function as an entry point. This requires your
``setup.py`` script to use `setuptools`_, and your package to be installed with
the necessary metadata. If that's taken care of, add something like the
following to your ``setup.py`` script:

.. code-block:: python

    from setuptools import setup

    setup(
        ...,
        # register the extraction method under the name "xxx"
        entry_points = """
        [babel.extractors]
        xxx = your.package:extract_xxx
        """,
    )

That is, add your extraction method to the entry point group
``babel.extractors``, where the name of the entry point is the name that people
will use to reference the extraction method, and the value is the module and
the name of the function (separated by a colon) implementing the actual
extraction.

.. _`setuptools`: http://peak.telecommunity.com/DevCenter/setuptools
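
The ``module:function`` notation used for the entry point value can be resolved
with a few lines of standard-library code. The helper ``load_callable`` below
is a hypothetical illustration of the notation only; it is not how setuptools
or Babel look up entry points. A standard-library function stands in for a
real extractor so that the example runs anywhere.

```python
import importlib
import os.path


def load_callable(spec):
    """Resolve a ``package.module:name`` string to the object it names."""
    module_name, _, attribute = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# "os.path:join" stands in for something like "your.package:extract_xxx".
join = load_callable("os.path:join")
print(join("locale", "de"))
```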

Comment Tags And Translator Comments Explanation
.................................................

First of all, what are comment tags? Comment tags are text markers that are
searched for in comments (and only in comments) placed directly before
`python gettext`_ calls, as shown in the following example:

.. _`python gettext`: http://docs.python.org/lib/module-gettext.html

.. code-block:: python

    # NOTE: This is a comment about `Foo Bar`
    _('Foo Bar')

The comment tag for the above example would be ``NOTE:``, and the translator
comment for that tag would be ``This is a comment about `Foo Bar```.

The resulting output in the catalog template would be something like::

    #. This is a comment about `Foo Bar`
    #: main.py:2
    msgid "Foo Bar"
    msgstr ""
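
The relationship between the tagged comment, the call and the resulting
catalog entry can be sketched with the standard ``tokenize`` module. This is
an illustration only, not Babel's extractor: both helper names are
hypothetical, only a comment on the line directly above a ``_("...")`` call is
considered, and quotes inside messages are not escaped.

```python
import ast
import io
import tokenize


def extract_with_comments(source, comment_tags=("NOTE:",)):
    """Yield (lineno, message, comments) for each ``_("...")`` call."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    tagged = {}  # line number -> comment text with the tag stripped
    for tok in tokens:
        if tok.type == tokenize.COMMENT:
            text = tok.string.lstrip("#").strip()
            for tag in comment_tags:
                if text.startswith(tag):
                    tagged[tok.start[0]] = text[len(tag):].strip()
    for i, tok in enumerate(tokens[:-2]):
        is_call = (tok.type == tokenize.NAME and tok.string == "_"
                   and tokens[i + 1].string == "("
                   and tokens[i + 2].type == tokenize.STRING)
        if is_call:
            lineno = tok.start[0]
            message = ast.literal_eval(tokens[i + 2].string)
            comment = tagged.get(lineno - 1)  # only the line directly above
            comments = [comment] if comment else []
            yield lineno, message, comments


def to_po(filename, lineno, message, comments):
    """Render one catalog template entry (no escaping of quotes)."""
    lines = ["#. %s" % comment for comment in comments]
    lines.append("#: %s:%d" % (filename, lineno))
    lines.append('msgid "%s"' % message)
    lines.append('msgstr ""')
    return "\n".join(lines)


SOURCE = "# NOTE: This is a comment about `Foo Bar`\n_('Foo Bar')\n"
for found in extract_with_comments(SOURCE):
    print(to_po("main.py", *found))
```

Run on the two-line example above, the sketch should print an entry of the
same shape as the catalog template excerpt.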

Now, you might ask, why would I need that?

Consider this simple case: you have a menu item called “Manual”. You know what
it means, but when translators see it they will wonder whether you meant:

1. a document or help manual, or
2. a manual process?

This is the simplest case where a translation comment such as
“The installation manual” helps to clarify the situation and makes a translator
more productive.

**More examples of the need for translation comments**

Real world examples are best. This is a discussion over the use of the word
“Forward” in Northern Sotho:

  “When you go forward. You go ‘Pele’, but when you forward the document,
  you ‘Fetišetša pele’. So if you just say forward, we don’t know what you are
  talking about.
  It is better if it's in a sentence. But in this case i think we will use ‘pele’
  because on the string no. 86 and 88 there is “show previous page in history”
  and “show next page in history”.”

Was the translators' guess correct? I think so, but it makes it so much easier
if they don’t need to be super `sleuths`_ as well as translators.

.. _`sleuths`: http://www.thefreedictionary.com/sleuth


*Explanation Borrowed From:* `Wordforge`_

.. _`Wordforge`: http://www.wordforge.org/static/translation_comments.html

**Note**: Translator comments are currently only supported in Python source
code.