annotate babel/messages/extract.py @ 179:31beb381d62f trunk
added 'N_' (gettext noop) to the extractor's default keywords
fixes #25
author | pjenvey |
---|---|
date | Wed, 27 Jun 2007 22:43:26 +0000 |
parents | e1199c0fb3bf |
children | 4052570f109d |
rev | line source |
---|---|
1 | 1 # -*- coding: utf-8 -*- |
2 # | |
3 # Copyright (C) 2007 Edgewall Software | |
4 # All rights reserved. | |
5 # | |
6 # This software is licensed as described in the file COPYING, which | |
7 # you should have received as part of this distribution. The terms | |
8 # are also available at http://babel.edgewall.org/wiki/License. | |
9 # | |
10 # This software consists of voluntary contributions made by many | |
11 # individuals. For the exact contribution history, see the revision | |
12 # history and logs, available at http://babel.edgewall.org/log/. | |
13 | |
14 """Basic infrastructure for extracting localizable messages from source files. | |
15 | |
16 This module defines an extensible system for collecting localizable message | |
17 strings from a variety of sources. A native extractor for Python source files | |
18 is built in; extractors for other sources can be added using very simple plugins. | |
19 | |
20 The main entry points into the extraction functionality are the functions | |
21 `extract_from_dir` and `extract_from_file`. | |
22 """ | |
23 | |
24 import os | |
44 | 25 try: |
26 set | |
27 except NameError: | |
28 from sets import Set as set | |
1 | 29 import sys |
162 | 30 from tokenize import generate_tokens, COMMENT, NAME, OP, STRING |
1 | 31 |
164 | 32 from babel.util import parse_encoding, pathmatch, relpath |
1 | 33 |
34 __all__ = ['extract', 'extract_from_dir', 'extract_from_file'] | |
35 __docformat__ = 'restructuredtext en' | |
36 | |
37 GROUP_NAME = 'babel.extractors' | |
38 | |
12 | 39 DEFAULT_KEYWORDS = { |
10 | 40 '_': None, |
41 'gettext': None, | |
42 'ngettext': (1, 2), | |
43 'ugettext': None, | |
44 'ungettext': (1, 2), | |
45 'dgettext': (2,), | |
46 'dngettext': (2, 3), | |
179 | 47 'N_': None |
10 | 48 } |
1 | 49 |
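The keyword table above maps each translation function name to the 1-based positions of its localizable arguments. The following is a minimal, standalone sketch of how such a table can be applied to a call's argument list. `select_messages` is a hypothetical helper, not part of Babel, and treating `None` as "the first argument" is an assumption made for illustration only.

```python
# Subset of the table above, copied for a self-contained example.
KEYWORDS = {
    '_': None,
    'ngettext': (1, 2),
    'dgettext': (2,),
    'dngettext': (2, 3),
}

def select_messages(funcname, args, keywords=KEYWORDS):
    """Return the localizable arguments of one call (hypothetical helper)."""
    spec = keywords[funcname]
    if spec is None:
        # Assumption for this sketch: no index spec means the first argument.
        return args[0]
    # Indices in the table are 1-based, hence the "- 1".
    picked = tuple(args[index - 1] for index in spec)
    # A single selected string is unwrapped; several stay a tuple.
    return picked[0] if len(picked) == 1 else picked

print(select_messages('dngettext', ('messages', 'apple', 'apples', 3)))
print(select_messages('dgettext', ('messages', 'Hello')))
```

The unwrapping of one-element results mirrors what `extract` does further down in this file (source lines 221 to 227).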
62 | 50 DEFAULT_MAPPING = [('**.py', 'python')] |
1 | 51 |
47 | 52 def extract_from_dir(dirname=os.getcwd(), method_map=DEFAULT_MAPPING, |
53 options_map=None, keywords=DEFAULT_KEYWORDS, | |
84 | 54 comment_tags=(), callback=None): |
1 | 55 """Extract messages from any source files found in the given directory. |
56 | |
57 This function generates tuples of the form: | |
58 | |
82 | 59 ``(filename, lineno, message, comments)`` |
1 | 60 |
44 | 61 Which extraction method is used per file is determined by the `method_map` |
62 parameter, which maps extended glob patterns to extraction method names. | |
63 For example, the following is the default mapping: | |
1 | 64 |
62 | 65 >>> method_map = [ |
66 ... ('**.py', 'python') | |
67 ... ] | |
1 | 68 |
69 This basically says that files with the filename extension ".py" at any | |
70 level inside the directory should be processed by the "python" extraction | |
44 | 71 method. Files that don't match any of the mapping patterns are ignored. See |
72 the documentation of the `pathmatch` function for details on the pattern | |
73 syntax. | |
1 | 74 |
62 | 75 The following extended mapping would also use the "genshi" extraction |
76 method on any file in a "templates" subdirectory: | |
1 | 77 |
62 | 78 >>> method_map = [ |
79 ... ('**/templates/**.*', 'genshi'), | |
80 ... ('**.py', 'python') | |
81 ... ] | |
44 | 82 |
83 The dictionary provided by the optional `options_map` parameter augments | |
62 | 84 these mappings. It uses extended glob patterns as keys, and the values are |
85 dictionaries mapping option names to option values (both strings). | |
44 | 86 |
87 The glob patterns of the `options_map` do not necessarily need to be the | |
62 | 88 same as those used in the method mapping. For example, while all files in |
89 the ``templates`` folders in an application may be Genshi templates, the | |
44 | 90 options for those files may differ based on extension: |
91 | |
92 >>> options_map = { | |
93 ... '**/templates/**.txt': { | |
144 | 94 ... 'template_class': 'genshi.template:TextTemplate', |
44 | 95 ... 'encoding': 'latin-1' |
96 ... }, | |
97 ... '**/templates/**.html': { | |
98 ... 'include_attrs': '' | |
99 ... } | |
1 | 100 ... } |
101 | |
102 :param dirname: the path to the directory to extract messages from | |
62 | 103 :param method_map: a list of ``(pattern, method)`` tuples that maps |
104 extended glob patterns to extraction method names | |
44 | 105 :param options_map: a dictionary of additional options (optional) |
12 | 106 :param keywords: a dictionary mapping keywords (i.e. names of functions |
107 that should be recognized as translation functions) to | |
108 tuples that specify which of their arguments contain | |
109 localizable strings | |
84 | 110 :param comment_tags: a list of tags of translator comments to search for |
111 and include in the results | |
47 | 112 :param callback: a function that is called for every file that messages are |
113 extracted from, just before the extraction itself is | |
75 | 114 performed; the function is passed the filename, the name |
115 of the extraction method and the options dictionary as | |
116 positional arguments, in that order | |
1 | 117 :return: an iterator over ``(filename, lineno, message, comments)`` tuples |
118 :rtype: ``iterator`` | |
44 | 119 :see: `pathmatch` |
1 | 120 """ |
44 | 121 if options_map is None: |
122 options_map = {} | |
56 | 123 |
44 | 124 absname = os.path.abspath(dirname) |
125 for root, dirnames, filenames in os.walk(absname): | |
126 for subdir in dirnames: | |
127 if subdir.startswith('.') or subdir.startswith('_'): | |
128 dirnames.remove(subdir) | |
154 | 129 dirnames.sort() |
130 filenames.sort() | |
44 | 131 for filename in filenames: |
132 filename = relpath( | |
133 os.path.join(root, filename).replace(os.sep, '/'), | |
134 dirname | |
135 ) | |
62 | 136 for pattern, method in method_map: |
44 | 137 if pathmatch(pattern, filename): |
138 filepath = os.path.join(absname, filename) | |
139 options = {} | |
140 for opattern, odict in options_map.items(): | |
141 if pathmatch(opattern, filename): | |
142 options = odict | |
47 | 143 if callback: |
57 | 144 callback(filename, method, options) |
80 | 145 for lineno, message, comments in \ |
146 extract_from_file(method, filepath, | |
147 keywords=keywords, | |
84 | 148 comment_tags=comment_tags, |
80 | 149 options=options): |
150 yield filename, lineno, message, comments | |
57 | 151 break |
1 | 152 |
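The selection logic in `extract_from_dir` can be summarized as: the first `method_map` pattern that matches a file decides the extraction method, while for `options_map` every pattern is tried and the last match overwrites earlier ones. The sketch below illustrates just that rule. It is not Babel's code: `pick_method` and `pick_options` are hypothetical names, and `fnmatch.fnmatchcase` is used as a stand-in for `babel.util.pathmatch`, whose pattern semantics may differ.

```python
from fnmatch import fnmatchcase  # stand-in for babel.util.pathmatch (assumption)

def pick_method(filename, method_map):
    """Return the method of the first matching pattern, or None (hypothetical)."""
    for pattern, method in method_map:
        if fnmatchcase(filename, pattern):
            return method  # first match wins, like the "break" in the loop above
    return None  # files that match no pattern are ignored

def pick_options(filename, options_map):
    """Return the options of the last matching pattern, or {} (hypothetical)."""
    options = {}
    for pattern, odict in options_map.items():
        if fnmatchcase(filename, pattern):
            options = odict  # a later match replaces an earlier one
    return options

method_map = [
    ('**/templates/**.*', 'genshi'),
    ('**.py', 'python'),
]
options_map = {
    '**/templates/**.txt': {'encoding': 'latin-1'},
    '**/templates/**.html': {'include_attrs': ''},
}

for name in ('app/templates/helper.py', 'app/models.py', 'README.txt'):
    print(name, pick_method(name, method_map))
print(pick_options('app/templates/mail.txt', options_map))
```

Because order matters, placing the more specific `templates` pattern before `**.py` is what keeps a Python file inside a `templates` directory from being handled by the "python" method.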
12 | 153 def extract_from_file(method, filename, keywords=DEFAULT_KEYWORDS, |
84 | 154 comment_tags=(), options=None): |
1 | 155 """Extract messages from a specific file. |
156 | |
157 This function returns a list of tuples of the form: | |
158 | |
159 ``(lineno, message, comments)`` | |
160 | |
161 :param filename: the path to the file to extract messages from | |
162 :param method: a string specifying the extraction method (e.g. "python") | |
12 | 163 :param keywords: a dictionary mapping keywords (i.e. names of functions |
164 that should be recognized as translation functions) to | |
165 tuples that specify which of their arguments contain | |
166 localizable strings | |
84 | 167 :param comment_tags: a list of translator tags to search for and include |
168 in the results | |
1 | 169 :param options: a dictionary of additional options (optional) |
170 :return: the list of extracted messages | |
171 :rtype: `list` | |
172 """ | |
173 fileobj = open(filename, 'U') | |
174 try: | |
84 | 175 return list(extract(method, fileobj, keywords, comment_tags, options)) |
1 | 176 finally: |
177 fileobj.close() | |
178 | |
84 | 179 def extract(method, fileobj, keywords=DEFAULT_KEYWORDS, comment_tags=(), |
80 | 180 options=None): |
1 | 181 """Extract messages from the given file-like object using the specified |
182 extraction method. | |
183 | |
184 This function returns a list of tuples of the form: | |
185 | |
80 | 186 ``(lineno, message, comments)`` |
1 | 187 |
188 The implementation dispatches the actual extraction to plugins, based on the | |
189 value of the ``method`` parameter. | |
190 | |
191 >>> source = '''# foo module | |
192 ... def run(argv): | |
193 ... print _('Hello, world!') | |
194 ... ''' | |
10 | 195 |
1 | 196 >>> from StringIO import StringIO |
197 >>> for message in extract('python', StringIO(source)): | |
198 ... print message | |
164 | 199 (3, u'Hello, world!', []) |
1 | 200 |
201 :param method: a string specifying the extraction method (e.g. "python") | |
202 :param fileobj: the file-like object the messages should be extracted from | |
12 | 203 :param keywords: a dictionary mapping keywords (i.e. names of functions |
204 that should be recognized as translation functions) to | |
205 tuples that specify which of their arguments contain | |
206 localizable strings | |
84 | 207 :param comment_tags: a list of translator tags to search for and include |
208 in the results | |
1 | 209 :param options: a dictionary of additional options (optional) |
210 :return: the list of extracted messages | |
211 :rtype: `list` | |
212 :raise ValueError: if the extraction method is not registered | |
213 """ | |
12 | 214 from pkg_resources import working_set |
215 | |
1 | 216 for entry_point in working_set.iter_entry_points(GROUP_NAME, method): |
217 func = entry_point.load(require=True) | |
84 | 218 results = func(fileobj, keywords.keys(), comment_tags, |
219 options=options or {}) | |
220 for lineno, funcname, messages, comments in results: | |
10 | 221 if isinstance(messages, (list, tuple)): |
222 msgs = [] | |
12 | 223 for index in keywords[funcname]: |
224 msgs.append(messages[index - 1]) | |
10 | 225 messages = tuple(msgs) |
226 if len(messages) == 1: | |
227 messages = messages[0] | |
80 | 228 yield lineno, messages, comments |
10 | 229 return |
12 | 230 |
1 | 231 raise ValueError('Unknown extraction method %r' % method) |
232 | |
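`extract` looks up the extractor for a method name, calls it, and narrows each `(lineno, funcname, messages, comments)` result to `(lineno, messages, comments)`; an unknown method name ends in a `ValueError`. The sketch below shows that dispatch shape in isolation. It is an illustration under stated assumptions: a plain dictionary (`REGISTRY`) stands in for the `pkg_resources` entry point lookup used above, `extract_lines` is a made-up toy extractor, and the keyword index handling is left out.

```python
import io

def extract_lines(fileobj, keywords, comment_tags, options):
    """Toy extractor (hypothetical): every non-empty line is a message."""
    for lineno, line in enumerate(fileobj, 1):
        text = line.strip()
        if text:
            yield lineno, '_', text, []

# Assumption: a dict replaces the 'babel.extractors' entry point group.
REGISTRY = {'lines': extract_lines}

def extract(method, fileobj, keywords=None, comment_tags=(), options=None):
    """Dispatch to the extractor registered under `method` (sketch only)."""
    func = REGISTRY.get(method)
    if func is None:
        raise ValueError('Unknown extraction method %r' % method)
    results = func(fileobj, list(keywords or {}), comment_tags, options or {})
    for lineno, funcname, messages, comments in results:
        # The function name is dropped from what callers receive.
        yield lineno, messages, comments

for item in extract('lines', io.StringIO('Hello\n\nBye\n')):
    print(item)
```

Since `extract` is a generator here, as in the original, the `ValueError` for an unknown method surfaces only once iteration starts.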
84 | 233 def extract_nothing(fileobj, keywords, comment_tags, options): |
57 | 234 """Pseudo extractor that does not actually extract anything, but simply |
235 returns an empty list. | |
236 """ | |
237 return [] | |
238 | |
84 | 239 def extract_python(fileobj, keywords, comment_tags, options): |
1 | 240 """Extract messages from Python source code. |
241 | |
164 | 242 :param fileobj: the seekable, file-like object the messages should be |
243 extracted from | |
1 | 244 :param keywords: a list of keywords (i.e. function names) that should be |
245 recognized as translation functions | |
84 | 246 :param comment_tags: a list of translator tags to search for and include |
247 in the results | |
1 | 248 :param options: a dictionary of additional options (optional) |
81 | 249 :return: an iterator over ``(lineno, funcname, message, comments)`` tuples |
1 | 250 :rtype: ``iterator`` |
251 """ | |
252 funcname = None | |
253 lineno = None | |
254 buf = [] | |
255 messages = [] | |
80 | 256 translator_comments = [] |
1 | 257 in_args = False |
80 | 258 in_translator_comments = False |
1 | 259 |
164 | 260 encoding = parse_encoding(fileobj) or options.get('encoding', 'ascii') |
261 | |
1 | 262 tokens = generate_tokens(fileobj.readline) |
263 for tok, value, (lineno, _), _, _ in tokens: | |
264 if funcname and tok == OP and value == '(': | |
265 in_args = True | |
80 | 266 elif tok == COMMENT: |
92 | 267 # Strip the comment token from the line |
164 | 268 value = value.decode(encoding)[1:].strip() |
147 | 269 if in_translator_comments and \ |
93 | 270 translator_comments[-1][0] == lineno - 1: |
92 | 271 # We're already inside a translator comment, continue appending |
272 # XXX: Should we check if the programmer keeps adding the | |
273 # comment_tag for every comment line??? probably not! | |
93 | 274 translator_comments.append((lineno, value)) |
92 | 275 continue |
276 # If execution reaches this point, let's see if comment line | |
277 # starts with one of the comment tags | |
85 | 278 for comment_tag in comment_tags: |
92 | 279 if value.startswith(comment_tag): |
147 | 280 in_translator_comments = True |
92 | 281 comment = value[len(comment_tag):].strip() |
93 | 282 translator_comments.append((lineno, comment)) |
92 | 283 break |
1 | 284 elif funcname and in_args: |
285 if tok == OP and value == ')': | |
80
116e34b8cefa
Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents:
75
diff
changeset
|
286 in_args = in_translator_comments = False |
1 | 287 if buf: |
288 messages.append(''.join(buf)) | |
289 del buf[:] | |
290 if filter(None, messages): | |
291 if len(messages) > 1: | |
292 messages = tuple(messages) | |
293 else: | |
294 messages = messages[0] | |
93 | 295 # Comments don't apply unless they immediately precede the |
296 # message | |
297 if translator_comments and \ | |
298 translator_comments[-1][0] < lineno - 1: | |
299 translator_comments = [] | |
300 | |
301 yield (lineno, funcname, messages, | |
302 [comment[1] for comment in translator_comments]) | |
1 | 303 funcname = lineno = None |
304 messages = [] | |
80 | 305 translator_comments = [] |
1 | 306 elif tok == STRING: |
164 | 307 # Unwrap quotes in a safe manner, maintaining the string's |
308 # encoding | |
309 # https://sourceforge.net/tracker/?func=detail&atid=355470&aid=617979&group_id=5470 | |
310 value = eval('# coding=%s\n%s' % (encoding, value), | |
311 {'__builtins__':{}}, {}) | |
312 if isinstance(value, str): | |
313 value = value.decode(encoding) | |
314 buf.append(value) | |
1 | 315 elif tok == OP and value == ',': |
316 messages.append(''.join(buf)) | |
317 del buf[:] | |
318 elif funcname: | |
319 funcname = None | |
320 elif tok == NAME and value in keywords: | |
321 funcname = value |
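
The token loop above is Python 2 code, and the annotate view drops its indentation, so the control flow is hard to follow. As a rough illustration only, the Python 3 sketch below re-creates the same state machine in simplified form. The function name `extract_sketch`, its default `keywords` and `comment_tags`, and the use of `ast.literal_eval` in place of the `eval` call are assumptions made for this example and are not part of Babel. Nested calls, byte strings, and source encodings are not handled.

```python
import ast
import io
from tokenize import generate_tokens, COMMENT, NAME, OP, STRING


def extract_sketch(source, keywords=("_", "gettext"), comment_tags=("NOTE:",)):
    """Yield (lineno, funcname, messages, comments) for each keyword call.

    Hypothetical, simplified re-creation of the loop shown above.
    """
    funcname = None
    in_args = False
    buf, messages, comments = [], [], []
    tokens = generate_tokens(io.StringIO(source).readline)
    for tok, value, (lineno, _), _, _ in tokens:
        if funcname and not in_args and tok == OP and value == "(":
            # A keyword name followed by "(" opens the argument list.
            in_args = True
        elif tok == COMMENT:
            text = value[1:].strip()
            if comments and comments[-1][0] == lineno - 1:
                # The previous line was a collected comment: keep appending.
                comments.append((lineno, text))
                continue
            for tag in comment_tags:
                if text.startswith(tag):
                    # A tagged comment starts a fresh comment block.
                    comments = [(lineno, text[len(tag):].strip())]
                    break
        elif funcname and in_args:
            if tok == OP and value == ")":
                if buf:
                    messages.append("".join(buf))
                    del buf[:]
                # Comments apply only if they end on the line just above.
                if comments and comments[-1][0] < lineno - 1:
                    comments = []
                if any(messages):
                    result = tuple(messages) if len(messages) > 1 else messages[0]
                    yield lineno, funcname, result, [c[1] for c in comments]
                funcname, in_args = None, False
                messages, comments = [], []
            elif tok == STRING:
                # Unquote the literal without executing arbitrary code.
                buf.append(ast.literal_eval(value))
            elif tok == OP and value == ",":
                messages.append("".join(buf))
                del buf[:]
        elif funcname:
            # The keyword name was not followed by "(": not a call.
            funcname = None
        elif tok == NAME and value in keywords:
            funcname = value


source = '''
# NOTE: Shown on the login page
msg = _("Hello")
x = ngettext("apple", "apples", n)
'''
for item in extract_sketch(source, keywords=("_", "ngettext")):
    print(item)
# (3, '_', 'Hello', ['Shown on the login page'])
# (4, 'ngettext', ('apple', 'apples'), [])
```

A single string argument is yielded as a plain string and several arguments as a tuple, mirroring the `len(messages) > 1` branch in the original. A tagged comment separated from the call by a blank line is discarded, which is the behaviour the "immediately precede" check implements.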