annotate babel/messages/extract.py @ 340:ce83b4f77114 trunk
added some newlines to extract and jslexer to stay consistent with the rest of the sourcecode.
author | aronacher |
---|---|
date | Sat, 14 Jun 2008 19:00:35 +0000 |
parents | 93a896111488 |
children | 9c718e8af219 |
rev | line source |
---|---|
1 | 1 # -*- coding: utf-8 -*- |
2 # | |
3 # Copyright (C) 2007 Edgewall Software | |
4 # All rights reserved. | |
5 # | |
6 # This software is licensed as described in the file COPYING, which | |
7 # you should have received as part of this distribution. The terms | |
8 # are also available at http://babel.edgewall.org/wiki/License. | |
9 # | |
10 # This software consists of voluntary contributions made by many | |
11 # individuals. For the exact contribution history, see the revision | |
12 # history and logs, available at http://babel.edgewall.org/log/. | |
13 | |
14 """Basic infrastructure for extracting localizable messages from source files. | |
15 | |
16 This module defines an extensible system for collecting localizable message | |
17 strings from a variety of sources. A native extractor for Python source files | |
18 is built in; extractors for other sources can be added using very simple plugins. | |
19 | |
20 The main entry points into the extraction functionality are the functions | |
21 `extract_from_dir` and `extract_from_file`. | |
22 """ | |
23 | |
24 import os | |
44 | 25 try: |
26 set | |
27 except NameError: | |
28 from sets import Set as set | |
1 | 29 import sys |
162 | 30 from tokenize import generate_tokens, COMMENT, NAME, OP, STRING |
1 | 31 |
164
e1199c0fb3bf
made the python extractor detect source file encodings from the magic encoding
pjenvey
parents:
162
diff
changeset
|
32 from babel.util import parse_encoding, pathmatch, relpath |
338
b39145076d8a
Stripping of comment tags is optional now. If enabled it will strip the tags from all lines of a comment now.
aronacher
parents:
329
diff
changeset
|
33 from textwrap import dedent |
1 | 34 |
35 __all__ = ['extract', 'extract_from_dir', 'extract_from_file'] | |
36 __docformat__ = 'restructuredtext en' | |
37 | |
38 GROUP_NAME = 'babel.extractors' | |
39 | |
12
e6ba3e878b10
* Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents:
10
diff
changeset
|
40 DEFAULT_KEYWORDS = { |
10
4130d9c6cb34
Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents:
1
diff
changeset
|
41 '_': None, |
42 'gettext': None, | |
43 'ngettext': (1, 2), | |
44 'ugettext': None, | |
45 'ungettext': (1, 2), | |
46 'dgettext': (2,), | |
47 'dngettext': (2, 3), | |
179
31beb381d62f
added 'N_' (gettext noop) to the extractor's default keywords
pjenvey
parents:
164
diff
changeset
|
48 'N_': None |
10 | 49 } |
1 | 50 |
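As a rough illustration of how the index tuples in `DEFAULT_KEYWORDS` can be read, the sketch below picks the localizable arguments out of an already-parsed call. It assumes argument positions are 1-based and that `None` means "the first argument"; `select_messages` is a hypothetical helper written for this note, not a function of this module.

```python
# Abbreviated copy of the mapping above, for a self-contained example.
DEFAULT_KEYWORDS = {
    '_': None,
    'ngettext': (1, 2),
    'dgettext': (2,),
}


def select_messages(funcname, args, keywords=DEFAULT_KEYWORDS):
    """Return the localizable arguments of a call, or None for non-keywords."""
    if funcname not in keywords:
        return None
    # Assumption: a spec of None is shorthand for "first argument only".
    spec = keywords[funcname] or (1,)
    # Assumption: positions in the spec are 1-based.
    picked = tuple(args[i - 1] for i in spec if i <= len(args))
    return picked[0] if len(picked) == 1 else picked


print(select_messages('ngettext', ['%d file', '%d files', 3]))
print(select_messages('dgettext', ['mydomain', 'Hello']))
```

Under these assumptions `dgettext` skips its first argument (the domain), and the plural functions yield a pair of singular and plural forms.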
62
2df27f49c320
The order of extraction methods is now preserved (see #10).
cmlenz
parents:
57
diff
changeset
|
51 DEFAULT_MAPPING = [('**.py', 'python')] |
1 | 52 |
222 | 53 empty_msgid_warning = ( |
54 '%s: warning: Empty msgid. It is reserved by GNU gettext: gettext("") ' | |
55 'returns the header entry with meta information, not the empty string.') | |
56 | |
338 | 57 |
58 def _strip_comment_tags(comments, tags): | |
59 """Helper function for `extract` that strips comment tags from strings | |
60 in a list of comment lines. This function operates in-place. | |
61 """ | |
62 def _strip(line): | |
63 for tag in tags: | |
64 if line.startswith(tag): | |
65 return line[len(tag):].strip() | |
66 return line | |
67 comments[:] = map(_strip, comments) | |
68 | |
340
ce83b4f77114
added some newlines to extract and jslexer to stay consistent with the rest of the sourcecode.
aronacher
parents:
339
diff
changeset
|
69 |
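The helper above can be tried in isolation. The following is a minimal Python 3 sketch of the same idea, with a list comprehension in place of `map`; the sample comments and the `NOTE:` tag are made up for illustration.

```python
def strip_comment_tags(comments, tags):
    """Strip a leading tag from every comment line, modifying the list in place."""
    def _strip(line):
        for tag in tags:
            if line.startswith(tag):
                return line[len(tag):].strip()
        return line
    # Slice assignment keeps the same list object, so callers see the change.
    comments[:] = [_strip(line) for line in comments]


comments = ['NOTE: shown in the title bar', 'second line']
alias = comments
strip_comment_tags(comments, ['NOTE:'])
print(alias)  # ['shown in the title bar', 'second line']
```

Because the list is updated through slice assignment rather than rebinding, any other reference to the same list (here `alias`) observes the stripped lines.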
47
f8469ab4b257
Support passing extraction method mapping and options from the frontends (see #4). No distutils/setuptools keyword supported yet, but the rest seems to be working okay.
cmlenz
parents:
44
diff
changeset
|
70 def extract_from_dir(dirname=os.getcwd(), method_map=DEFAULT_MAPPING, |
71 options_map=None, keywords=DEFAULT_KEYWORDS, | |
338 | 72 comment_tags=(), callback=None, strip_comment_tags=False): |
1 | 73 """Extract messages from any source files found in the given directory. |
224
0a71b675fc48
Fix for message extractors which return `None` as the gettext call.
palgarvio
parents:
223
diff
changeset
|
74 |
1 | 75 This function generates tuples of the form: |
224 | 76 |
82
540bb484f6e0
Missed some param's documentation regarding translator comments.
palgarvio
parents:
81
diff
changeset
|
77 ``(filename, lineno, message, comments)`` |
224 | 78 |
44 | 79 Which extraction method is used per file is determined by the `method_map` |
80 parameter, which maps extended glob patterns to extraction method names. | |
81 For example, the following is the default mapping: | |
224 | 82 |
62 | 83 >>> method_map = [ |
84 ... ('**.py', 'python') | |
85 ... ] | |
224 | 86 |
1 | 87 This basically says that files with the filename extension ".py" at any |
88 level inside the directory should be processed by the "python" extraction | |
44 | 89 method. Files that don't match any of the mapping patterns are ignored. See |
90 the documentation of the `pathmatch` function for details on the pattern | |
91 syntax. | |
224 | 92 |
62 | 93 The following extended mapping would also use the "genshi" extraction |
94 method on any file in a "templates" subdirectory: | |
224 | 95 |
62 | 96 >>> method_map = [ |
97 ... ('**/templates/**.*', 'genshi'), | |
98 ... ('**.py', 'python') | |
99 ... ] | |
224 | 100 |
44 | 101 The dictionary provided by the optional `options_map` parameter augments |
62 | 102 these mappings. It uses extended glob patterns as keys, and the values are |
103 dictionaries mapping option names to option values (both strings). | |
224 | 104 |
44 | 105 The glob patterns of the `options_map` do not necessarily need to be the |
62 | 106 same as those used in the method mapping. For example, while all files in |
107 the ``templates`` folders in an application may be Genshi applications, the | |
44 | 108 options for those files may differ based on extension: |
224 | 109 |
44 | 110 >>> options_map = { |
111 ... '**/templates/**.txt': { | |
144 | 112 ... 'template_class': 'genshi.template:TextTemplate', |
44 | 113 ... 'encoding': 'latin-1' |
114 ... }, | |
115 ... '**/templates/**.html': { | |
116 ... 'include_attrs': '' | |
117 ... } | |
1 | 118 ... } |
224 | 119 |
1 | 120 :param dirname: the path to the directory to extract messages from |
62 | 121 :param method_map: a list of ``(pattern, method)`` tuples that map |
122 extended glob patterns to extraction method names | |
44 | 123 :param options_map: a dictionary of additional options (optional) |
12 | 124 :param keywords: a dictionary mapping keywords (i.e. names of functions |
125 that should be recognized as translation functions) to | |
126 tuples that specify which of their arguments contain | |
127 localizable strings | |
84
3ae316b58231
Some cosmetic changes for the new translator comments support.
cmlenz
parents:
82
diff
changeset
|
128 :param comment_tags: a list of tags of translator comments to search for |
129 and include in the results | |
47 | 130 :param callback: a function that is called for every file that messages are |
131 extracted from, just before the extraction itself is | |
75 | 132 performed; the function is passed the filename, the name |
133 of the extraction method and the options dictionary as | |
134 positional arguments, in that order | |
338 | 135 :param strip_comment_tags: a flag that if set to `True` causes all comment |
136 tags to be removed from the collected comments. | |
1 | 137 :return: an iterator over ``(filename, lineno, message, comments)`` tuples |
138 :rtype: ``iterator`` | |
44 | 139 :see: `pathmatch` |
1 | 140 """ |
44 | 141 if options_map is None: |
142 options_map = {} | |
56
f40fc143439c
Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents:
54
diff
changeset
|
143 |
44 | 144 absname = os.path.abspath(dirname) |
145 for root, dirnames, filenames in os.walk(absname): | |
146 for subdir in dirnames: | |
147 if subdir.startswith('.') or subdir.startswith('_'): | |
148 dirnames.remove(subdir) | |
154
31478eb3fb9e
The default ordering of messages in generated POT files, which is based on the order those messages are found when walking the source tree, is no longer subject to differences between platforms; directory and file names are now always sorted alphabetically.
cmlenz
parents:
147
diff
changeset
|
149 dirnames.sort() |
150 filenames.sort() | |
44 | 151 for filename in filenames: |
152 filename = relpath( | |
153 os.path.join(root, filename).replace(os.sep, '/'), | |
154 dirname | |
155 ) | |
62 | 156 for pattern, method in method_map: |
44 | 157 if pathmatch(pattern, filename): |
158 filepath = os.path.join(absname, filename) | |
159 options = {} | |
160 for opattern, odict in options_map.items(): | |
161 if pathmatch(opattern, filename): | |
162 options = odict | |
47 | 163 if callback: |
57
d930a3dfbf3d
* The `extract_messages` distutils command now operators on configurable input directories again, instead of the complete current directory. The `input_dirs` default to the package directories.
cmlenz
parents:
56
diff
changeset
|
164 callback(filename, method, options) |
80
116e34b8cefa
Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents:
75
diff
changeset
|
165 for lineno, message, comments in \ |
338 | 166 extract_from_file(method, filepath, |
167 keywords=keywords, | |
168 comment_tags=comment_tags, | |
169 options=options, | |
170 strip_comment_tags= | |
171 strip_comment_tags): | |
80 | 172 yield filename, lineno, message, comments |
57 | 173 break |
1 | 174 |
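The walk in `extract_from_dir` prunes directories whose names start with `.` or `_` by calling `dirnames.remove()` while iterating over `dirnames`. In Python, removing an item from a list during iteration makes the loop skip the element that follows it, so two adjacent matches are not both removed. The sketch below shows one possible alternative that rebuilds the list in place; `walk_visible` is a hypothetical name and the directory names are invented for the demonstration.

```python
import os
import tempfile


def walk_visible(top):
    """Yield (root, filenames), skipping '.'- and '_'-prefixed directories."""
    for root, dirnames, filenames in os.walk(top):
        # Slice assignment mutates the very list os.walk consults next,
        # so pruned directories are never descended into.
        dirnames[:] = sorted(d for d in dirnames
                             if not d.startswith(('.', '_')))
        yield root, sorted(filenames)


def demo():
    with tempfile.TemporaryDirectory() as top:
        for name in ('.git', '.hg', '_build', 'pkg'):
            os.mkdir(os.path.join(top, name))
        return [os.path.relpath(root, top) for root, _ in walk_visible(top)]


print(demo())  # ['.', 'pkg']
```

Sorting inside the same statement also keeps the traversal order independent of the platform, which is what the `dirnames.sort()` and `filenames.sort()` calls in the listing are for.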
340 | 175 |
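The matching loop in `extract_from_dir` can be summarised as: the first `method_map` pattern that matches a file decides its extraction method, and among the matching `options_map` patterns the one iterated last supplies the options. The sketch below imitates that logic under a loud assumption: it uses the standard library's `fnmatch.fnmatchcase` as a stand-in for `pathmatch`, whose `**` semantics are not the same. `pick_method` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def pick_method(filename, method_map, options_map=None):
    """Return (method, options) for the first matching pattern, else None."""
    options_map = options_map or {}
    for pattern, method in method_map:
        # Stand-in matcher: the real code calls babel.util.pathmatch here.
        if fnmatchcase(filename, pattern):
            options = {}
            for opattern, odict in options_map.items():
                if fnmatchcase(filename, opattern):
                    options = odict  # a later match overrides an earlier one
            return method, options  # first method pattern wins
    return None  # files matching no pattern are ignored


method_map = [('**/templates/**.*', 'genshi'), ('**.py', 'python')]
options_map = {'**/templates/**.txt': {'encoding': 'latin-1'}}

print(pick_method('pkg/templates/mail.txt', method_map, options_map))
print(pick_method('pkg/templates/helper.py', method_map, options_map))
print(pick_method('pkg/views.py', method_map, options_map))
```

Because the list is ordered, a `.py` file inside a `templates` directory goes to the "genshi" method in this example; swapping the two entries would send it to "python" instead.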
12 | 176 def extract_from_file(method, filename, keywords=DEFAULT_KEYWORDS, |
338 | 177 comment_tags=(), options=None, strip_comment_tags=False): |
1 | 178 """Extract messages from a specific file. |
224 | 179 |
1 | 180 This function returns a list of tuples of the form: |
224 | 181 |
1 | 182 ``(lineno, message, comments)`` |
224 | 183 |
1 | 184 :param filename: the path to the file to extract messages from |
185 :param method: a string specifying the extraction method (e.g. "python") | |
12 | 186 :param keywords: a dictionary mapping keywords (i.e. names of functions |
187 that should be recognized as translation functions) to | |
188 tuples that specify which of their arguments contain | |
189 localizable strings | |
84 | 190 :param comment_tags: a list of translator tags to search for and include |
191 in the results | |
338 | 192 :param strip_comment_tags: a flag that if set to `True` causes all comment |
193 tags to be removed from the collected comments. | |
1 | 194 :param options: a dictionary of additional options (optional) |
195 :return: the list of extracted messages | |
196 :rtype: `list` | |
197 """ | |
198 fileobj = open(filename, 'U') | |
199 try: | |
338 | 200 return list(extract(method, fileobj, keywords, comment_tags, options, |
201 strip_comment_tags)) | |
1 | 202 finally: |
203 fileobj.close() | |
204 | |
340 | 205 |
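`extract_from_file` wraps the extractor's result in `list(...)` inside the `try` block, so that everything is read before `finally` closes the file. A sketch of the same pattern in current Python might look as follows; the `'U'` open mode used in the listing was removed in Python 3.11, and text mode already applies universal newlines by default. `toy_extract` and `extract_from_path` are hypothetical names, and the toy extractor merely reports non-empty lines.

```python
import os
import tempfile


def toy_extract(fileobj):
    """Hypothetical lazy extractor: yields (lineno, text) for non-empty lines."""
    for lineno, line in enumerate(fileobj, 1):
        if line.strip():
            yield lineno, line.rstrip('\n')


def extract_from_path(filename, extractor=toy_extract):
    # Default newline handling in text mode translates '\r\n' to '\n'.
    with open(filename, encoding='utf-8') as fileobj:
        # Consume the generator while the file is still open.
        return list(extractor(fileobj))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'sample.txt')
        with open(path, 'w', encoding='utf-8', newline='') as f:
            f.write('first\r\n\r\nthird\r\n')
        return extract_from_path(path)


print(demo())  # [(1, 'first'), (3, 'third')]
```

Returning the bare generator instead of a list would defer reading until after the `with` block had closed the file, which is the failure the eager `list(...)` call avoids.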
84 | 206 def extract(method, fileobj, keywords=DEFAULT_KEYWORDS, comment_tags=(), |
338 | 207 options=None, strip_comment_tags=False): |
1 | 208 """Extract messages from the given file-like object using the specified |
209 extraction method. | |
224 | 210 |
1 | 211 This function returns a list of tuples of the form: |
224 | 212 |
80 | 213 ``(lineno, message, comments)`` |
224 | 214 |
1 | 215 The implementation dispatches the actual extraction to plugins, based on the |
216 value of the ``method`` parameter. | |
224 | 217 |
1 | 218 >>> source = '''# foo module |
219 ... def run(argv): | |
220 ... print _('Hello, world!') | |
221 ... ''' | |
224 | 222 |
1 | 223 >>> from StringIO import StringIO |
224 >>> for message in extract('python', StringIO(source)): | |
225 ... print message | |
164 | 226 (3, u'Hello, world!', []) |
224 | 227 |
250
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
228 :param method: a string specifying the extraction method (e.g. "python"); |
229 if this is a simple name, the extraction function will be | |
230 looked up by entry point; if it is an explicit reference | |
329
35c19c01e4b5
Allow extraction method specification to use a dot instead of the colon for separating module and function names. See #105.
cmlenz
parents:
322
diff
changeset
|
231 to a function (of the form ``package.module:funcname`` or |
232 ``package.module.funcname``), the corresponding function | |
233 will be imported and used | |
1 | 234 :param fileobj: the file-like object the messages should be extracted from |
12 | 235 :param keywords: a dictionary mapping keywords (i.e. names of functions |
236 that should be recognized as translation functions) to | |
237 tuples that specify which of their arguments contain | |
238 localizable strings | |
84 | 239 :param comment_tags: a list of translator tags to search for and include |
240 in the results | |
1 | 241 :param options: a dictionary of additional options (optional) |
338 | 242 :param strip_comment_tags: a flag that if set to `True` causes all comment |
243 tags to be removed from the collected comments. | |
1 | 244 :return: the list of extracted messages |
245 :rtype: `list` | |
246 :raise ValueError: if the extraction method is not registered | |
247 """ | |
322 | 248 func = None |
329 | 249 if ':' in method or '.' in method: |
250 if ':' not in method: | |
251 lastdot = method.rfind('.') | |
252 module, attrname = method[:lastdot], method[lastdot + 1:] | |
253 else: | |
254 module, attrname = method.split(':', 1) | |
255 func = getattr(__import__(module, {}, {}, [attrname]), attrname) | |
256 elif '.' in method: | |
257 parts = method.split('.') | |
258 clsname | |
250 | 259 if ':' in method: |
260 module, clsname = method.split(':', 1) | |
261 func = getattr(__import__(module, {}, {}, [clsname]), clsname) | |
262 else: | |
263 try: | |
264 from pkg_resources import working_set | |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
265 except ImportError: |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
266 # pkg_resources is not available, so we resort to looking up the |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
267 # builtin extractors directly |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
268 builtin = {'ignore': extract_nothing, 'python': extract_python} |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
269 func = builtin.get(method) |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
270 else: |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
271 for entry_point in working_set.iter_entry_points(GROUP_NAME, |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
272 method): |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
273 func = entry_point.load(require=True) |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
274 break |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
275 if func is None: |
6c06570af1b9
Soften dependency on setuptools. Extraction methods can now be referenced using a special section in the mapping configuration, mapping short names to fully-qualified function references.
cmlenz
parents:
224
diff
changeset
|
276 raise ValueError('Unknown extraction method %r' % method) |

    results = func(fileobj, keywords.keys(), comment_tags,
                   options=options or {})
    for lineno, funcname, messages, comments in results:
        if funcname:
            spec = keywords[funcname] or (1,)
        else:
            spec = (1,)
        if not isinstance(messages, (list, tuple)):
            messages = [messages]
        if not messages:
            continue

        # Validate the messages against the keyword's specification
        msgs = []
        invalid = False
        # last_index is 1 based like the keyword spec
        last_index = len(messages)
        for index in spec:
            if last_index < index:
                # Not enough arguments
                invalid = True
                break
            message = messages[index - 1]
            if message is None:
                invalid = True
                break
            msgs.append(message)
        if invalid:
            continue

        first_msg_index = spec[0] - 1
        if not messages[first_msg_index]:
            # An empty string msgid isn't valid, emit a warning
            where = '%s:%i' % (hasattr(fileobj, 'name') and \
                                   fileobj.name or '(unknown)', lineno)
            print >> sys.stderr, empty_msgid_warning % where
            continue

        messages = tuple(msgs)
        if len(messages) == 1:
            messages = messages[0]

        if strip_comment_tags:
            _strip_comment_tags(comments, comment_tags)

        yield lineno, messages, comments


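The two steps performed above (resolving a method reference to a callable, then checking extracted arguments against a keyword spec) can be sketched in isolation. This is a minimal Python 3 illustration, not Babel's API: `resolve_method` and `select_messages` are hypothetical names, `importlib` stands in for the `__import__` call, the entry-point lookup is left out, and the empty-msgid warning is omitted.

```python
import importlib
import os.path


def resolve_method(method):
    """Resolve 'package.module:attr' or 'package.module.attr' to an object."""
    if ':' in method:
        module_name, attr_name = method.split(':', 1)
    elif '.' in method:
        # No colon: everything after the last dot is the attribute name.
        module_name, _, attr_name = method.rpartition('.')
    else:
        raise ValueError('Unknown extraction method %r' % method)
    return getattr(importlib.import_module(module_name), attr_name)


def select_messages(messages, spec=(1,)):
    """Pick the 1-based argument positions listed in `spec`.

    Returns None when an argument is missing or is None (meaning it was
    not a string literal), which are the "invalid" cases in the loop above.
    """
    if not isinstance(messages, (list, tuple)):
        messages = [messages]
    picked = []
    for index in spec:
        if index > len(messages) or messages[index - 1] is None:
            return None
        picked.append(messages[index - 1])
    return picked[0] if len(picked) == 1 else tuple(picked)


join_colon = resolve_method('os.path:join')
join_dotted = resolve_method('os.path.join')
plural = select_messages(('one', 'many', None), (1, 2))
```

Both spellings resolve to the same function object, and a spec of `(1, 2)` keeps only the singular and plural strings of an `ngettext`-style call.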
def extract_nothing(fileobj, keywords, comment_tags, options):
    """Pseudo extractor that does not actually extract anything, but simply
    returns an empty list.
    """
    return []


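Any callable with the same four-argument signature can serve as an extraction method. As a rough sketch, assuming a made-up file format in which translatable lines start with `msg:`, a custom extractor might look like the following; `extract_marked_lines` and the `prefix` option are hypothetical and exist only for this illustration.

```python
import io


def extract_marked_lines(fileobj, keywords, comment_tags, options):
    """Hypothetical extractor: each line starting with 'msg:' is a message."""
    prefix = options.get('prefix', 'msg:')
    for lineno, line in enumerate(fileobj, 1):
        line = line.strip()
        if line.startswith(prefix):
            # funcname is None because no translation function is involved;
            # the caller then falls back to the default spec (1,).
            yield lineno, None, line[len(prefix):].strip(), []


source = io.StringIO("title\nmsg: Hello\n\nmsg: Goodbye\n")
results = list(extract_marked_lines(source, [], [], {}))
```

Each result is a `(lineno, funcname, message, comments)` tuple, the shape that the extraction loop above unpacks.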
def extract_python(fileobj, keywords, comment_tags, options):
    """Extract messages from Python source code.

    :param fileobj: the seekable, file-like object the messages should be
                    extracted from
    :param keywords: a list of keywords (i.e. function names) that should be
                     recognized as translation functions
    :param comment_tags: a list of translator tags to search for and include
                         in the results
    :param options: a dictionary of additional options (optional)
    :return: an iterator over ``(lineno, funcname, message, comments)`` tuples
    :rtype: ``iterator``
    """
    funcname = lineno = message_lineno = None
    call_stack = -1
    buf = []
    messages = []
    translator_comments = []
    in_def = in_translator_comments = False
    comment_tag = None

    encoding = parse_encoding(fileobj) or options.get('encoding', 'iso-8859-1')

    tokens = generate_tokens(fileobj.readline)
    for tok, value, (lineno, _), _, _ in tokens:
        if call_stack == -1 and tok == NAME and value in ('def', 'class'):
            in_def = True
        elif tok == OP and value == '(':
            if in_def:
                # Avoid false positives for declarations such as:
                # def gettext(arg='message'):
                in_def = False
                continue
            if funcname:
                message_lineno = lineno
                call_stack += 1
        elif in_def and tok == OP and value == ':':
            # End of a class definition without parens
            in_def = False
            continue
        elif call_stack == -1 and tok == COMMENT:
            # Strip the comment token from the line
            value = value.decode(encoding)[1:].strip()
            if in_translator_comments and \
                    translator_comments[-1][0] == lineno - 1:
                # We're already inside a translator comment, continue appending
                translator_comments.append((lineno, value))
                continue
            # If execution reaches this point, let's see if comment line
            # starts with one of the comment tags
            for comment_tag in comment_tags:
                if value.startswith(comment_tag):
                    in_translator_comments = True
                    translator_comments.append((lineno, value))
                    break
        elif funcname and call_stack == 0:
            if tok == OP and value == ')':
                if buf:
                    messages.append(''.join(buf))
                    del buf[:]
                else:
                    messages.append(None)

                if len(messages) > 1:
                    messages = tuple(messages)
                else:
                    messages = messages[0]
                # Comments don't apply unless they immediately precede the
                # message
                if translator_comments and \
                        translator_comments[-1][0] < message_lineno - 1:
                    translator_comments = []

                yield (message_lineno, funcname, messages,
                       [comment[1] for comment in translator_comments])

                funcname = lineno = message_lineno = None
                call_stack = -1
                messages = []
                translator_comments = []
                in_translator_comments = False
            elif tok == STRING:
                # Unwrap quotes in a safe manner, maintaining the string's
                # encoding
                # https://sourceforge.net/tracker/?func=detail&atid=355470&
                # aid=617979&group_id=5470
                value = eval('# coding=%s\n%s' % (encoding, value),
                             {'__builtins__':{}}, {})
                if isinstance(value, str):
                    value = value.decode(encoding)
                buf.append(value)
            elif tok == OP and value == ',':
                if buf:
                    messages.append(''.join(buf))
                    del buf[:]
                else:
                    messages.append(None)
        elif call_stack > 0 and tok == OP and value == ')':
            call_stack -= 1
        elif funcname and call_stack == -1:
            funcname = None
        elif tok == NAME and value in keywords:
            funcname = value


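The core of the token loop above can be tried on its own under Python 3. The sketch below is a simplified variant rather than the function itself: `extract_calls` is a hypothetical name, it skips the `def`/`class` guard and translator comments, it uses `ast.literal_eval` in place of the `eval` call, it always yields a tuple of arguments, and it assumes plain (non-bytes, non-f-string) literals.

```python
import ast
import io
from tokenize import generate_tokens, NAME, OP, STRING


def extract_calls(source, keywords):
    """Yield (lineno, funcname, messages) for calls such as _('text')."""
    funcname = message_lineno = None
    depth = -1  # -1 means "not inside a tracked call"
    buf, messages = [], []
    tokens = generate_tokens(io.StringIO(source).readline)
    for tok, value, (lineno, _), _, _ in tokens:
        if tok == OP and value == '(':
            if funcname:
                if depth == -1:
                    message_lineno = lineno
                depth += 1
        elif funcname and depth == 0:
            if tok == OP and value in (',', ')'):
                # An argument ended: keep the joined literal, or None if
                # the argument was not made of string literals.
                messages.append(''.join(buf) if buf else None)
                del buf[:]
                if value == ')':
                    yield message_lineno, funcname, tuple(messages)
                    funcname, depth, messages = None, -1, []
            elif tok == STRING:
                buf.append(ast.literal_eval(value))
        elif depth > 0 and tok == OP and value == ')':
            depth -= 1
        elif funcname and depth == -1:
            # The keyword was not followed by '(' so it is not a call.
            funcname = None
        elif tok == NAME and value in keywords:
            funcname = value


src = "print(_('Hello'))\nngettext('one', 'many', n)\n"
found = list(extract_calls(src, {'_', 'ngettext'}))
```

The `None` in the second result marks the non-literal argument `n`; the keyword spec check in the caller decides whether that position matters.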
def extract_javascript(fileobj, keywords, comment_tags, options):
    """Extract messages from JavaScript source code.

    :param fileobj: the seekable, file-like object the messages should be
                    extracted from
    :param keywords: a list of keywords (i.e. function names) that should be
                     recognized as translation functions
    :param comment_tags: a list of translator tags to search for and include
                         in the results
    :param options: a dictionary of additional options (optional)
    :return: an iterator over ``(lineno, funcname, message, comments)`` tuples
    :rtype: ``iterator``
    """
    from babel.messages.jslexer import tokenize, unquote_string
    funcname = message_lineno = None
    messages = []
    last_argument = None
    translator_comments = []
    encoding = options.get('encoding', 'utf-8')
    last_token = None
    call_stack = -1

    for token in tokenize(fileobj.read().decode(encoding)):
        if token.type == 'operator' and token.value == '(':
            if funcname:
                message_lineno = token.lineno
                call_stack += 1

        elif call_stack == -1 and token.type == 'linecomment':
            value = token.value[2:].strip()
            if translator_comments and \
               translator_comments[-1][0] == token.lineno - 1:
                translator_comments.append((token.lineno, value))
                continue

            for comment_tag in comment_tags:
                if value.startswith(comment_tag):
                    translator_comments.append((token.lineno, value.strip()))
                    break

        elif token.type == 'multilinecomment':
            # only one multi-line comment may precede a translation
            translator_comments = []
            value = token.value[2:-2].strip()
            for comment_tag in comment_tags:
                if value.startswith(comment_tag):
                    lines = value.splitlines()
                    if lines:
                        lines[0] = lines[0].strip()
                        lines[1:] = dedent('\n'.join(lines[1:])).splitlines()
                        for offset, line in enumerate(lines):
                            translator_comments.append((token.lineno + offset,
                                                        line))
                    break

        elif funcname and call_stack == 0:
            if token.type == 'operator' and token.value == ')':
                if last_argument is not None:
                    messages.append(last_argument)
                if len(messages) > 1:
                    messages = tuple(messages)
                elif messages:
                    messages = messages[0]
                else:
                    messages = None

                # Comments don't apply unless they immediately precede the
                # message
                if translator_comments and \
                   translator_comments[-1][0] < message_lineno - 1:
                    translator_comments = []

                if messages is not None:
                    yield (message_lineno, funcname, messages,
                           [comment[1] for comment in translator_comments])

                funcname = message_lineno = last_argument = None
                translator_comments = []
                messages = []
                call_stack = -1

            elif token.type == 'string':
                last_argument = unquote_string(token.value)

            elif token.type == 'operator' and token.value == ',':
                if last_argument is not None:
                    messages.append(last_argument)
                    last_argument = None
                else:
                    messages.append(None)

        elif call_stack > 0 and token.type == 'operator' \
             and token.value == ')':
            call_stack -= 1

        elif funcname and call_stack == -1:
            funcname = None

        elif call_stack == -1 and token.type == 'name' and \
             token.value in keywords and \
             (last_token is None or last_token.type != 'name' or
              last_token.value != 'function'):
            funcname = token.value

        last_token = token
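
The multi-line comment branch above strips the `/*` and `*/` delimiters, trims the first line, and removes only the indentation shared by the remaining lines. A small Python 3 sketch of that normalization follows; `comment_lines` is a hypothetical helper name, and the tag check is left out.

```python
from textwrap import dedent


def comment_lines(comment, first_lineno):
    """Split a /* ... */ comment into (lineno, text) pairs."""
    value = comment[2:-2].strip()
    lines = value.splitlines()
    if lines:
        lines[0] = lines[0].strip()
        # Continuation lines lose only their common leading whitespace,
        # so relative indentation inside the comment is preserved.
        lines[1:] = dedent('\n'.join(lines[1:])).splitlines()
    return [(first_lineno + offset, line)
            for offset, line in enumerate(lines)]


comment = "/* NOTE: first line\n     second line\n       indented more */"
pairs = comment_lines(comment, 10)
```

Recording a line number per comment line is what lets the later check compare the last comment line with `message_lineno - 1`.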