comparison doc/messages.txt @ 124:98dcabc99308 trunk

Split docs on date and number formatting.
author cmlenz
date Mon, 18 Jun 2007 15:15:31 +0000
parents doc/catalogs.txt@781ee9295757
children 14fe2a8fb842
123:31ca37101a24 124:98dcabc99308
1 .. -*- mode: rst; encoding: utf-8 -*-
2
3 =============================
4 Working with Message Catalogs
5 =============================
6
7 .. contents:: Contents
8 :depth: 2
9 .. sectnum::
10
11
12 Introduction
13 ============
14
15 The ``gettext`` translation system enables you to mark any strings used in your
16 application as subject to localization, by wrapping them in functions such as
17 ``gettext(str)`` and ``ngettext(singular, plural, num)``. For brevity, the
18 ``gettext`` function is often aliased to ``_(str)``, so you can write:
19
20 .. code-block:: python
21
22 print _("Hello")
23
24 instead of just:
25
26 .. code-block:: python
27
28 print "Hello"
29
30 to make the string "Hello" localizable.
31
32 Message catalogs are collections of translations for such localizable messages
33 used in an application. They are commonly stored in PO (Portable Object) and MO
34 (Machine Object) files, the formats of which are defined by the GNU `gettext`_
35 tools and the GNU `translation project`_.
36
37 .. _`gettext`: http://www.gnu.org/software/gettext/
38 .. _`translation project`: http://sourceforge.net/projects/translation
39
40 The general procedure for building message catalogs looks something like this:
41
42 * use a tool (such as ``xgettext``) to extract localizable strings from the
43 code base and write them to a POT (PO Template) file.
44 * make a copy of the POT file for a specific locale (for example, "en_US")
45 and start translating the messages
46 * use a tool such as ``msgfmt`` to compile the locale PO file into a binary
47 MO file
48 * later, when code changes make it necessary to update the translations, you
49 regenerate the POT file and merge the changes into the various
50 locale-specific PO files, for example using ``msgmerge``
51
52 Python provides the `gettext module`_ as part of the standard library, which
53 enables applications to work with appropriately generated MO files.
54
55 .. _`gettext module`: http://docs.python.org/lib/module-gettext.html
56
57 As ``gettext`` provides a solid and well-supported foundation for translating
58 application messages, Babel does not reinvent the wheel, but rather reuses this
59 infrastructure, and makes it easier to build message catalogs for Python
60 applications.
61
62
63 Message Extraction
64 ==================
65
66 Babel provides functionality similar to that of the ``xgettext`` program,
67 except that only extraction from Python source files is built-in, while support
68 for other file formats can be added using a simple extension mechanism.
69
70 Unlike ``xgettext``, which is usually invoked once for every file, the routines
71 for message extraction in Babel operate on directories. While the per-file
72 approach of ``xgettext`` works nicely with projects using a ``Makefile``,
73 Python projects rarely use ``make``, and thus a different mechanism is needed
74 for extracting messages from the heterogeneous collection of source files that
75 many Python projects are composed of.
76
77 When message extraction is based on directories instead of individual files,
78 there needs to be a way to configure which files should be treated in which
79 manner. For example, while many projects may contain ``.html`` files, some of
80 those files may be static HTML files that don't contain localizable messages,
81 while others may be `Django`_ templates, and still others may contain `Genshi`_
82 markup templates. Some projects may even mix HTML files for different template
83 languages (for whatever reason). Therefore, the way in which messages are
84 extracted from source files cannot depend on the file extension alone, but
85 needs to be controllable in a precise manner.
86
87 .. _`Django`: http://www.djangoproject.com/
88 .. _`Genshi`: http://genshi.edgewall.org/
89
90 Babel accepts a configuration file to specify this mapping of files to
91 extraction methods, which is described below.
92
93
94 .. _`mapping`:
95
96 -------------------------------------------
97 Extraction Method Mapping and Configuration
98 -------------------------------------------
99
100 The mapping of extraction methods to files in Babel is done via a configuration
101 file. This file maps extended glob patterns to the names of the extraction
102 methods, and can also set various options for each pattern (which options are
103 available depends on the specific extraction method).
104
105 For example, the following configuration adds extraction of messages from both
106 Genshi markup templates and text templates:
107
108 .. code-block:: ini
109
110 # Extraction from Python source files
111
112 [python: foobar/**.py]
113
114 # Extraction from Genshi HTML and text templates
115
116 [genshi: foobar/**/templates/**.html]
117 ignore_tags = script,style
118 include_attrs = alt title summary
119
120 [genshi: foobar/**/templates/**.txt]
121 template_class = genshi.template.text:TextTemplate
122 encoding = ISO-8859-15
123
124 The configuration file syntax is based on the format commonly found in ``.INI``
125 files on Windows systems, and as supported by the ``ConfigParser`` module in
126 the Python standard library. Section names (the strings enclosed in square
127 brackets) specify both the name of the extraction method, and the extended glob
128 pattern to specify the files that this extraction method should be used for,
129 separated by a colon. The options in the sections are passed to the extraction
130 method. Which options are available is specific to the extraction method used.
131
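To make the section-name convention concrete, here is a small sketch that reads such a configuration with the standard library parser (named ``configparser`` in Python 3) and splits each section name at the first colon. The helper ``parse_mapping`` is a hypothetical name used for illustration only; it is not how Babel itself is implemented.

```python
from configparser import ConfigParser

CONFIG = """\
# Extraction from Python source files
[python: foobar/**.py]

# Extraction from Genshi HTML and text templates
[genshi: foobar/**/templates/**.html]
ignore_tags = script,style
include_attrs = alt title summary
"""


def parse_mapping(text):
    """Return a list of (pattern, method, options) tuples.

    Hypothetical helper: splits each section name into the extraction
    method and the glob pattern, and collects the section's options.
    """
    parser = ConfigParser(interpolation=None)
    parser.read_string(text)
    mapping = []
    for section in parser.sections():
        method, pattern = (part.strip() for part in section.split(":", 1))
        mapping.append((pattern, method, dict(parser.items(section))))
    return mapping


for pattern, method, options in parse_mapping(CONFIG):
    print(method, pattern, options)
```
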
132 The extended glob patterns used in this configuration are similar to the glob
133 patterns provided by most shells. A single asterisk (``*``) is a wildcard for
134 any number of characters (except for the pathname component separator "/"),
135 while a question mark (``?``) only matches a single character. In addition,
136 two consecutive asterisk characters (``**``) can be used to make the wildcard
137 match any directory level, so the pattern ``**.txt`` matches any file with the
138 extension ``.txt`` in any directory.
139
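The matching rules above can be approximated by translating a pattern into a regular expression. The following is a simplified sketch under that assumption, with hypothetical function names; Babel's own matcher may treat edge cases differently (for instance, this version requires at least one directory level between the two slashes in ``foobar/**/templates``).

```python
import re


def extended_glob_to_regex(pattern):
    """Translate an extended glob pattern into a compiled regex (sketch)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")        # crosses directory separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # stays within one path component
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")      # exactly one non-separator character
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def matches(pattern, path):
    """Return True if the whole path matches the extended glob pattern."""
    return extended_glob_to_regex(pattern).match(path) is not None


print(matches("**.txt", "docs/a/readme.txt"))
print(matches("*.txt", "docs/readme.txt"))
```

Paths are assumed to use "/" as the separator, as in the configuration examples.
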
140 Lines that start with a ``#`` or ``;`` character are ignored and can be used
141 for comments. Empty lines are ignored, too.
142
143 .. note:: If you're performing message extraction using the command Babel
144 provides for integration into ``setup.py`` scripts (see below), you
145 can also provide this configuration in a different way, namely as a
146 keyword argument to the ``setup()`` function.
147
148
149 ----------
150 Front-Ends
151 ----------
152
153 Babel provides two different front-ends to access its functionality for working
154 with message catalogs:
155
156 * A `Command-line interface <cmdline.html>`_, and
157 * `Integration with distutils/setuptools <setup.html>`_
158
159 Which one you choose depends on the nature of your project. For most modern
160 Python projects, the distutils/setuptools integration is probably more
161 convenient.
162
163
164 --------------------------
165 Writing Extraction Methods
166 --------------------------
167
168 Adding new methods for extracting localizable messages is easy. First, you'll
169 need to implement a function that complies with the following interface:
170
171 .. code-block:: python
172
173 def extract_xxx(fileobj, keywords, comment_tags, options):
174 """Extract messages from XXX files.
175
176 :param fileobj: the file-like object the messages should be extracted
177 from
178 :param keywords: a list of keywords (i.e. function names) that should
179 be recognized as translation functions
180 :param comment_tags: a list of translator tags to search for and
181 include in the results
182 :param options: a dictionary of additional options (optional)
183 :return: an iterator over ``(lineno, funcname, message, comments)``
184 tuples
185 :rtype: ``iterator``
186 """
187
188 .. note:: Any strings in the tuples produced by this function must be either
189 ``unicode`` objects, or ``str`` objects using plain ASCII characters.
190 That means that if sources contain strings using other encodings, it
191 is the job of the extractor implementation to do the decoding to
192 ``unicode`` objects.
193
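As an illustration of this interface, here is a sketch of an extractor for a made-up plain-text format in which every non-empty line is a message and a preceding ``#`` line carrying a comment tag becomes a translator comment. The format, the name ``extract_lines`` and the ``encoding`` option are assumptions for the example. Because the format has no translation functions, the sketch ignores ``keywords`` and yields ``None`` as the function name, which is also an assumption.

```python
import io


def extract_lines(fileobj, keywords, comment_tags, options):
    """Extract messages from a hypothetical one-message-per-line format."""
    encoding = options.get("encoding", "utf-8")
    comments = []
    for lineno, raw in enumerate(fileobj, 1):
        # Decode bytes so that only text strings are handed back.
        line = raw.decode(encoding) if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line:
            comments = []          # a blank line detaches pending comments
        elif line.startswith("#"):
            text = line[1:].strip()
            for tag in comment_tags:
                if text.startswith(tag):
                    comments.append(text[len(tag):].strip())
                    break
        else:
            yield lineno, None, line, comments
            comments = []


source = io.BytesIO(b"# NOTE: Menu item\nManual\n\nForward\n")
for entry in extract_lines(source, [], ["NOTE:"], {}):
    print(entry)
```
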
194 Next, you should register that function as an entry point. This requires your
195 ``setup.py`` script to use `setuptools`_, and your package to be installed with
196 the necessary metadata. If that's taken care of, add something like the
197 following to your ``setup.py`` script:
198
199 .. code-block:: python
200
201 setup(...
202
203 entry_points = """
204 [babel.extractors]
205 xxx = your.package:extract_xxx
206 """,
207
208 That is, add your extraction method to the entry point group
209 ``babel.extractors``, where the name of the entry point is the name that people
210 will use to reference the extraction method, and the value is the module and
211 the name of the function (separated by a colon) implementing the actual
212 extraction.
213
214 .. _`setuptools`: http://peak.telecommunity.com/DevCenter/setuptools
215
216 Comment Tags and Translator Comments
217 ....................................
218
219 First of all, what are comment tags? Comment tags are excerpts of text to
220 search for in comments, and only in comments, right before the `python gettext`_
221 calls, as shown in the following example:
222
223 .. _`python gettext`: http://docs.python.org/lib/module-gettext.html
224
225 .. code-block:: python
226
227 # NOTE: This is a comment about `Foo Bar`
228 _('Foo Bar')
229
230 The comment tag for the above example would be ``NOTE:``, and the translator
231 comment for that tag would be ``This is a comment about `Foo Bar```.
232
233 The resulting output in the catalog template would be something like::
234
235 #. This is a comment about `Foo Bar`
236 #: main.py:2
237 msgid "Foo Bar"
238 msgstr ""
239
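The step from a tagged comment to the catalog entry above can be sketched with the standard ``tokenize`` module. This is a simplified illustration, not Babel's extraction code: the helper names are hypothetical, a comment is attached to the line directly below it, and the message ID is written without any escaping.

```python
import io
import tokenize


def find_translator_comments(source, tag="NOTE:"):
    """Map the line number following each tagged comment to its text."""
    found = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            text = tok.string.lstrip("#").strip()
            if text.startswith(tag):
                found[tok.start[0] + 1] = text[len(tag):].strip()
    return found


def format_entry(filename, lineno, msgid, comment=None):
    """Render one catalog template entry (no escaping, for illustration)."""
    lines = []
    if comment:
        lines.append("#. " + comment)
    lines.append("#: %s:%d" % (filename, lineno))
    lines.append('msgid "%s"' % msgid)
    lines.append('msgstr ""')
    return "\n".join(lines)


source = "# NOTE: This is a comment about `Foo Bar`\n_('Foo Bar')\n"
comments = find_translator_comments(source)
print(format_entry("main.py", 2, "Foo Bar", comments.get(2)))
```
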
240 Now, you might ask, why would I need that?
241
242 Consider this simple case: you have a menu item called “Manual”. You know what
243 it means, but when translators see this, they will wonder whether you meant:
244
245 1. a document or help manual, or
246 2. a manual process?
247
248 This is the simplest case where a translation comment such as
249 “The installation manual” helps to clarify the situation and makes a translator
250 more productive.
251
252 **More examples of the need for translation comments**
253
254 Real-world examples are best. This is a discussion about the use of the word
255 “Forward” in Northern Sotho:
256
257 “When you go forward. You go ‘Pele’, but when you forward the document,
258 you ‘Fetišetša pele’. So if you just say forward, we don’t know what you are
259 talking about.
260 It is better if it's in a sentence. But in this case i think we will use ‘pele’
261 because on the string no. 86 and 88 there is “show previous page in history”
262 and “show next page in history”.”
263
264 Was the translator’s guess correct? I think so, but it makes it so much easier
265 if they don’t need to be super `sleuths`_ as well as translators.
266
267 .. _`sleuths`: http://www.thefreedictionary.com/sleuth
268
269
270 *Explanation Borrowed From:* `Wordforge`_
271
272 .. _`Wordforge`: http://www.wordforge.org/static/translation_comments.html
273
274 **Note**: Translator comments are currently only supported in Python source
275 code.
276
Copyright (C) 2012-2017 Edgewall Software