annotate babel/messages/extract.py @ 138:2071e375cf29

Genshi extraction method has moved to Genshi project. Closes #13.
author cmlenz
date Wed, 20 Jun 2007 10:02:04 +0000
parents bfe7357a4754
children a5914ba672d1
rev   line source
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
1 # -*- coding: utf-8 -*-
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
2 #
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
3 # Copyright (C) 2007 Edgewall Software
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
4 # All rights reserved.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
5 #
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
6 # This software is licensed as described in the file COPYING, which
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
7 # you should have received as part of this distribution. The terms
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
8 # are also available at http://babel.edgewall.org/wiki/License.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
9 #
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
10 # This software consists of voluntary contributions made by many
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
11 # individuals. For the exact contribution history, see the revision
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
12 # history and logs, available at http://babel.edgewall.org/log/.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
13
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
14 """Basic infrastructure for extracting localizable messages from source files.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
15
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
16 This module defines an extensible system for collecting localizable message
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
17 strings from a variety of sources. A native extractor for Python source files
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
18 is built in; extractors for other sources can be added using very simple plugins.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
19
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
20 The main entry points into the extraction functionality are the functions
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
21 `extract_from_dir` and `extract_from_file`.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
22 """
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
23
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
24 import os
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
25 try:
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
26 set
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
27 except NameError:
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
28 from sets import Set as set
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
29 import sys
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
30 from tokenize import generate_tokens, NAME, OP, STRING, COMMENT
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
31
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
32 from babel.util import pathmatch, relpath
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
33
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
34 __all__ = ['extract', 'extract_from_dir', 'extract_from_file']
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
35 __docformat__ = 'restructuredtext en'
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
36
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
37 GROUP_NAME = 'babel.extractors'
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
38
12
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
39 DEFAULT_KEYWORDS = {
10
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
40 '_': None,
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
41 'gettext': None,
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
42 'ngettext': (1, 2),
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
43 'ugettext': None,
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
44 'ungettext': (1, 2),
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
45 'dgettext': (2,),
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
46 'dngettext': (2, 3),
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
47 }
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
48
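Each value in the keyword table above is either `None` or a tuple of 1-based argument positions that says which call arguments hold translatable strings. The following is only a sketch of that idea: `pick_messages` is a hypothetical helper, not part of Babel, and treating `None` as "first argument only" is an assumption made for the example.

```python
# Hypothetical helper illustrating 1-based keyword indices; not Babel's API.
KEYWORDS = {
    '_': None,
    'ngettext': (1, 2),
    'dgettext': (2,),
    'dngettext': (2, 3),
}

def pick_messages(funcname, args, keywords=KEYWORDS):
    """Return the translatable argument(s) of a call to `funcname`."""
    # Assumption: a spec of None means "the first argument".
    indices = keywords[funcname] or (1,)
    picked = tuple(args[index - 1] for index in indices)
    # A single message is returned bare, several as a tuple.
    return picked[0] if len(picked) == 1 else picked

print(pick_messages('_', ('Hello',)))
print(pick_messages('dngettext', ('domain', 'one item', '%d items', 3)))
```

With `(2, 3)` the domain argument in position 1 is skipped and only the singular and plural strings are kept.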
62
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
49 DEFAULT_MAPPING = [('**.py', 'python')]
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
50
47
76381d4b3635 Support passing extraction method mapping and options from the frontends (see #4). No distutils/setuptools keyword supported yet, but the rest seems to be working okay.
cmlenz
parents: 44
diff changeset
51 def extract_from_dir(dirname=os.getcwd(), method_map=DEFAULT_MAPPING,
76381d4b3635 Support passing extraction method mapping and options from the frontends (see #4). No distutils/setuptools keyword supported yet, but the rest seems to be working okay.
cmlenz
parents: 44
diff changeset
52 options_map=None, keywords=DEFAULT_KEYWORDS,
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
53 comment_tags=(), callback=None):
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
54 """Extract messages from any source files found in the given directory.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
55
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
56 This function generates tuples of the form:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
57
82
0b5d604399b8 Missed some param's documentation regarding translator comments.
palgarvio
parents: 81
diff changeset
58 ``(filename, lineno, message, comments)``
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
59
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
60 Which extraction method is used per file is determined by the `method_map`
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
61 parameter, which maps extended glob patterns to extraction method names.
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
62 For example, the following is the default mapping:
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
63
62
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
64 >>> method_map = [
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
65 ... ('**.py', 'python')
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
66 ... ]
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
67
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
68 This basically says that files with the filename extension ".py" at any
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
69 level inside the directory should be processed by the "python" extraction
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
70 method. Files that don't match any of the mapping patterns are ignored. See
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
71 the documentation of the `pathmatch` function for details on the pattern
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
72 syntax.
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
73
62
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
74 The following extended mapping would also use the "genshi" extraction
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
75 method on any file in a "templates" subdirectory:
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
76
62
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
77 >>> method_map = [
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
78 ... ('**/templates/**.*', 'genshi'),
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
79 ... ('**.py', 'python')
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
80 ... ]
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
81
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
82 The dictionary provided by the optional `options_map` parameter augments
62
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
83 these mappings. It uses extended glob patterns as keys, and the values are
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
84 dictionaries mapping options names to option values (both strings).
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
85
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
86 The glob patterns of the `options_map` do not necessarily need to be the
62
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
87 same as those used in the method mapping. For example, while all files in
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
88 the ``templates`` folders in an application may be Genshi templates, the
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
89 options for those files may differ based on extension:
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
90
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
91 >>> options_map = {
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
92 ... '**/templates/**.txt': {
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
93 ... 'template_class': 'genshi.template.text.TextTemplate',
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
94 ... 'encoding': 'latin-1'
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
95 ... },
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
96 ... '**/templates/**.html': {
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
97 ... 'include_attrs': ''
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
98 ... }
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
99 ... }
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
100
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
101 :param dirname: the path to the directory to extract messages from
62
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
102 :param method_map: a list of ``(pattern, method)`` tuples that map
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
103 extended glob patterns to extraction method names
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
104 :param options_map: a dictionary of additional options (optional)
12
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
105 :param keywords: a dictionary mapping keywords (i.e. names of functions
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
106 that should be recognized as translation functions) to
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
107 tuples that specify which of their arguments contain
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
108 localizable strings
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
109 :param comment_tags: a list of tags of translator comments to search for
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
110 and include in the results
47
76381d4b3635 Support passing extraction method mapping and options from the frontends (see #4). No distutils/setuptools keyword supported yet, but the rest seems to be working okay.
cmlenz
parents: 44
diff changeset
111 :param callback: a function that is called for every file that messages are
76381d4b3635 Support passing extraction method mapping and options from the frontends (see #4). No distutils/setuptools keyword supported yet, but the rest seems to be working okay.
cmlenz
parents: 44
diff changeset
112 extracted from, just before the extraction itself is
75
05502363c925 Fixed MIME type of new doc page.
cmlenz
parents: 62
diff changeset
113 performed; the function is passed the filename, the name
05502363c925 Fixed MIME type of new doc page.
cmlenz
parents: 62
diff changeset
114 of the extraction method and the options dictionary as
05502363c925 Fixed MIME type of new doc page.
cmlenz
parents: 62
diff changeset
115 positional arguments, in that order
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
116 :return: an iterator over ``(filename, lineno, message, comments)`` tuples
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
117 :rtype: ``iterator``
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
118 :see: `pathmatch`
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
119 """
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
120 if options_map is None:
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
121 options_map = {}
56
27fba894d3ca Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 54
diff changeset
122
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
123 absname = os.path.abspath(dirname)
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
124 for root, dirnames, filenames in os.walk(absname):
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
125 for subdir in dirnames[:]:  # iterate over a copy; dirnames is pruned in place
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
126 if subdir.startswith('.') or subdir.startswith('_'):
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
127 dirnames.remove(subdir)
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
128 for filename in filenames:
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
129 filename = relpath(
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
130 os.path.join(root, filename).replace(os.sep, '/'),
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
131 dirname
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
132 )
62
84d400066b71 The order of extraction methods is now preserved (see #10).
cmlenz
parents: 57
diff changeset
133 for pattern, method in method_map:
44
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
134 if pathmatch(pattern, filename):
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
135 filepath = os.path.join(absname, filename)
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
136 options = {}
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
137 for opattern, odict in options_map.items():
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
138 if pathmatch(opattern, filename):
818646bcd90b Some work towards #4.
cmlenz
parents: 36
diff changeset
139 options = odict
47
76381d4b3635 Support passing extraction method mapping and options from the frontends (see #4). No distutils/setuptools keyword supported yet, but the rest seems to be working okay.
cmlenz
parents: 44
diff changeset
140 if callback:
57
a6183d300a6e * The `extract_messages` distutils command now operators on configurable input directories again, instead of the complete current directory. The `input_dirs` default to the package directories.
cmlenz
parents: 56
diff changeset
141 callback(filename, method, options)
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
142 for lineno, message, comments in \
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
143 extract_from_file(method, filepath,
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
144 keywords=keywords,
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
145 comment_tags=comment_tags,
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
146 options=options):
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
147 yield filename, lineno, message, comments
57
a6183d300a6e * The `extract_messages` distutils command now operators on configurable input directories again, instead of the complete current directory. The `input_dirs` default to the package directories.
cmlenz
parents: 56
diff changeset
148 break
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
149
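The per-file resolution done by `extract_from_dir` (first matching pattern in `method_map` wins, and any matching `options_map` entry supplies the options) can be sketched on its own. This is a simplified stand-in, not Babel's code: `glob_to_regex` and `resolve` are hypothetical names, and the glob translation only approximates what `babel.util.pathmatch` is documented to do.

```python
import re

def glob_to_regex(pattern):
    """Translate a simplified extended glob into a compiled regex.

    Assumption: '**/' spans zero or more directories, '**' spans directories
    plus a name, '*' stays within one path segment, and '?' matches one
    non-slash character.
    """
    symbols = {
        '?': '[^/]',
        '*': '[^/]+',
        '**/': '(?:.+/)*?',
        '**': '(?:.+/)*?[^/]+',
    }
    buf = []
    # Odd positions of the split result are the wildcard tokens.
    for idx, part in enumerate(re.split(r'(\*\*/|\*\*|\*|\?)', pattern)):
        if idx % 2:
            buf.append(symbols[part])
        elif part:
            buf.append(re.escape(part))
    return re.compile(''.join(buf) + '$')

def resolve(filename, method_map, options_map=None):
    """Return (method, options) for the first matching pattern, else None."""
    for pattern, method in method_map:
        if glob_to_regex(pattern).match(filename):
            options = {}
            for opattern, odict in (options_map or {}).items():
                if glob_to_regex(opattern).match(filename):
                    options = odict
            return method, options
    return None  # files that match no pattern are ignored

method_map = [('**/templates/**.*', 'genshi'), ('**.py', 'python')]
options_map = {'**/templates/**.txt': {'encoding': 'latin-1'}}

print(resolve('app/templates/mail.txt', method_map, options_map))
print(resolve('app/templates/helper.py', method_map, options_map))
print(resolve('README', method_map, options_map))
```

Because `method_map` is an ordered list, a `.py` file inside a `templates` directory resolves to the first entry here rather than to `python`, which is why the order of the tuples matters.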
12
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
150 def extract_from_file(method, filename, keywords=DEFAULT_KEYWORDS,
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
151 comment_tags=(), options=None):
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
152 """Extract messages from a specific file.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
153
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
154 This function returns a list of tuples of the form:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
155
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
156 ``(lineno, message, comments)``
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
157
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
158 :param filename: the path to the file to extract messages from
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
159 :param method: a string specifying the extraction method (e.g. "python")
12
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
160 :param keywords: a dictionary mapping keywords (i.e. names of functions
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
161 that should be recognized as translation functions) to
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
162 tuples that specify which of their arguments contain
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
163 localizable strings
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
164 :param comment_tags: a list of translator tags to search for and include
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
165 in the results
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
166 :param options: a dictionary of additional options (optional)
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
167 :return: the list of extracted messages
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
168 :rtype: `list`
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
169 """
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
170 fileobj = open(filename, 'U')
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
171 try:
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
172 return list(extract(method, fileobj, keywords, comment_tags, options))
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
173 finally:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
174 fileobj.close()
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
175
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
176 def extract(method, fileobj, keywords=DEFAULT_KEYWORDS, comment_tags=(),
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
177 options=None):
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
178 """Extract messages from the given file-like object using the specified
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
179 extraction method.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
180
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
181 This function generates tuples of the form:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
182
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
183 ``(lineno, message, comments)``
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
184
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
185 The implementation dispatches the actual extraction to plugins, based on the
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
186 value of the ``method`` parameter.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
187
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
188 >>> source = '''# foo module
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
189 ... def run(argv):
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
190 ... print _('Hello, world!')
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
191 ... '''
10
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
192
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
193 >>> from StringIO import StringIO
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
194 >>> for message in extract('python', StringIO(source)):
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
195 ... print message
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
196 (3, 'Hello, world!', [])
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
197
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
198 :param method: a string specifying the extraction method (e.g. "python")
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
199 :param fileobj: the file-like object the messages should be extracted from
12
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
200 :param keywords: a dictionary mapping keywords (i.e. names of functions
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
201 that should be recognized as translation functions) to
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
202 tuples that specify which of their arguments contain
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
203 localizable strings
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
204 :param comment_tags: a list of translator tags to search for and include
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
205 in the results
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
206 :param options: a dictionary of additional options (optional)
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
207 :return: an iterator over the extracted messages
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
208 :rtype: ``iterator``
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
209 :raise ValueError: if the extraction method is not registered
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
210 """
12
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
211 from pkg_resources import working_set
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
212
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
213 for entry_point in working_set.iter_entry_points(GROUP_NAME, method):
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
214 func = entry_point.load(require=True)
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
215 results = func(fileobj, keywords.keys(), comment_tags,
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
216 options=options or {})
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
217 for lineno, funcname, messages, comments in results:
10
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
218 if isinstance(messages, (list, tuple)):
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
219 msgs = []
12
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
220 for index in keywords[funcname]:
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
221 msgs.append(messages[index - 1])
10
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
222 messages = tuple(msgs)
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
223 if len(messages) == 1:
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
224 messages = messages[0]
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
225 yield lineno, messages, comments
10
b24987f7318d Both Babel's [source:trunk/babel/catalog/frontend.py frontend] and [source:trunk/babel/catalog/extract.py extract] now handle keyword indices. Also added an extra boolean flag so that the default keywords defined by Babel are not included in the keywords to search for when extracting strings.
palgarvio
parents: 1
diff changeset
226 return
12
a2c54ef107c2 * Removed pkg_resources/setuptools requirement from various places.
cmlenz
parents: 10
diff changeset
227
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
228 raise ValueError('Unknown extraction method %r' % method)
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
229
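The keyword-index handling at the end of `extract` above can be sketched in isolation. This is a minimal illustration written for Python 3, not Babel's own code: `select_messages` is a hypothetical helper name, and it assumes the unwrapping of a single selected argument happens only when the extractor returned a list or tuple.

```python
def select_messages(messages, indices):
    """Pick the 1-based argument positions named by ``indices``.

    ``select_messages`` is a hypothetical helper, not part of Babel.
    """
    if isinstance(messages, (list, tuple)):
        picked = tuple(messages[index - 1] for index in indices)
        # A single selected argument is unwrapped to a plain string.
        return picked[0] if len(picked) == 1 else picked
    # A plain string message is passed through unchanged.
    return messages


print(select_messages(('one apple', '%d apples', 'n'), (1, 2)))
print(select_messages(('Hello', 'ignored'), (1,)))
```

The indices are 1-based because they name argument positions the way a keyword specification such as `ngettext: (1, 2)` does, hence the `index - 1` when indexing the tuple.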
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
230 def extract_nothing(fileobj, keywords, comment_tags, options):
57
a6183d300a6e * The `extract_messages` distutils command now operators on configurable input directories again, instead of the complete current directory. The `input_dirs` default to the package directories.
cmlenz
parents: 56
diff changeset
231 """Pseudo extractor that does not actually extract anything, but simply
a6183d300a6e * The `extract_messages` distutils command now operators on configurable input directories again, instead of the complete current directory. The `input_dirs` default to the package directories.
cmlenz
parents: 56
diff changeset
232 returns an empty list.
a6183d300a6e * The `extract_messages` distutils command now operators on configurable input directories again, instead of the complete current directory. The `input_dirs` default to the package directories.
cmlenz
parents: 56
diff changeset
233 """
a6183d300a6e * The `extract_messages` distutils command now operators on configurable input directories again, instead of the complete current directory. The `input_dirs` default to the package directories.
cmlenz
parents: 56
diff changeset
234 return []
a6183d300a6e * The `extract_messages` distutils command now operators on configurable input directories again, instead of the complete current directory. The `input_dirs` default to the package directories.
cmlenz
parents: 56
diff changeset
235
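`extract_nothing` shows the calling convention every extraction function follows: it receives `(fileobj, keywords, comment_tags, options)` and returns an iterable of results. The sketch below, written for Python 3, follows the same convention and yields `(lineno, funcname, message, comments)` tuples as described in the `extract_python` docstring. `extract_lines` is a hypothetical extractor invented for illustration. Treating every non-empty line as a message is an assumption, not something Babel ships.

```python
import io


def extract_lines(fileobj, keywords, comment_tags, options):
    """Hypothetical extractor: treat every non-empty line as one message."""
    for lineno, line in enumerate(fileobj, 1):
        text = line.strip()
        if text:
            # No translation function is involved, so funcname is None
            # and there are no translator comments.
            yield lineno, None, text, []


sample = io.StringIO("Hello\n\nGoodbye\n")
print(list(extract_lines(sample, {}, [], {})))
```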
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
236 def extract_python(fileobj, keywords, comment_tags, options):
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
237 """Extract messages from Python source code.
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
238
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
239 :param fileobj: the file-like object the messages should be extracted from
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
240 :param keywords: a list of keywords (i.e. function names) that should be
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
241 recognized as translation functions
84
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
242 :param comment_tags: a list of translator tags to search for and include
4ff9cc26c11b Some cosmetic changes for the new translator comments support.
cmlenz
parents: 82
diff changeset
243 in the results
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
244 :param options: a dictionary of additional options (optional)
81
1e89661e6b26 Fixed and added some documentation about the translator comments implemented in [81].
palgarvio
parents: 80
diff changeset
245 :return: an iterator over ``(lineno, funcname, message, comments)`` tuples
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
246 :rtype: ``iterator``
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
247 """
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
248 funcname = None
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
249 lineno = None
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
250 buf = []
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
251 messages = []
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
252 translator_comments = []
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
253 in_args = False
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
254 in_translator_comments = False
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
255
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
256 tokens = generate_tokens(fileobj.readline)
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
257 for tok, value, (lineno, _), _, _ in tokens:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
258 if funcname and tok == OP and value == '(':
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
259 in_args = True
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
260 elif tok == COMMENT:
92
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
261 # Strip the comment token from the line
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
262 value = value[1:].strip()
93
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
263 if in_translator_comments is True and \
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
264 translator_comments[-1][0] == lineno - 1:
92
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
265 # We're already inside a translator comment, continue appending
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
266 # XXX: Should we check if the programmer keeps adding the
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
267 # comment_tag for every comment line??? probably not!
93
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
268 translator_comments.append((lineno, value))
92
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
269 continue
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
270 # If execution reaches this point, let's see if comment line
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
271 # starts with one of the comment tags
85
dc260efaed34 Fixed de-pluralization bug introduced in [85] regarding the extraction of translator comments.
palgarvio
parents: 84
diff changeset
272 for comment_tag in comment_tags:
92
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
273 if value.startswith(comment_tag):
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
274 if in_translator_comments is not True:
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
275 in_translator_comments = True
92
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
276 comment = value[len(comment_tag):].strip()
93
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
277 translator_comments.append((lineno, comment))
92
5bac3678e60d Fixed bug introduced in [92], bad use of `lstrip()`. Added a unittest to test multiple translator comment tags.
palgarvio
parents: 91
diff changeset
278 break
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
279 elif funcname and in_args:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
280 if tok == OP and value == ')':
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
281 in_args = in_translator_comments = False
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
282 if buf:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
283 messages.append(''.join(buf))
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
284 del buf[:]
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
285 if filter(None, messages):
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
286 if len(messages) > 1:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
287 messages = tuple(messages)
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
288 else:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
289 messages = messages[0]
93
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
290 # Comments don't apply unless they immediately precede the
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
291 # message
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
292 if translator_comments and \
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
293 translator_comments[-1][0] < lineno - 1:
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
294 translator_comments = []
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
295
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
296 yield (lineno, funcname, messages,
1ce6692ed625 Commiting patch provided by pjenvey: Translator comments don't apply unless they immediately preceed the message.
palgarvio
parents: 92
diff changeset
297 [comment[1] for comment in translator_comments])
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
298 funcname = lineno = None
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
299 messages = []
80
9c84b9fa5d30 Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 75
diff changeset
300 translator_comments = []
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
301 elif tok == STRING:
36
56b931b23d5b Fix for #8: fix extraction of strings from Python source using prefixes ('u' or 'r') or triple quotes.
cmlenz
parents: 12
diff changeset
302 # Unwrap quotes in a safe manner
56b931b23d5b Fix for #8: fix extraction of strings from Python source using prefixes ('u' or 'r') or triple quotes.
cmlenz
parents: 12
diff changeset
303 buf.append(eval(value, {'__builtins__':{}}, {}))
1
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
304 elif tok == OP and value == ',':
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
305 messages.append(''.join(buf))
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
306 del buf[:]
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
307 elif funcname:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
308 funcname = None
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
309 elif tok == NAME and value in keywords:
f71ca60f2a4a Import of initial code base.
cmlenz
parents:
diff changeset
310 funcname = value
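The token-driven state machine in `extract_python` can be tried out with a reduced sketch. The version below is an approximation for Python 3 under stated assumptions: `extract_calls` is a hypothetical name, translator comments are left out entirely, and `ast.literal_eval` is swapped in for the restricted `eval` call the original uses to unwrap string literals.

```python
import ast
import io
from token import NAME, OP, STRING
from tokenize import generate_tokens


def extract_calls(source, keywords):
    """Yield ``(lineno, funcname, messages)`` for keyword calls in ``source``.

    Hypothetical, simplified helper: translator comments are not handled.
    """
    funcname = None
    in_args = False
    buf = []        # adjacent string literals forming one argument
    messages = []   # completed arguments of the current call
    readline = io.StringIO(source).readline
    for tok, value, (lineno, _), _, _ in generate_tokens(readline):
        if funcname and tok == OP and value == '(':
            in_args = True
        elif funcname and in_args:
            if tok == OP and value == ')':
                if buf:
                    messages.append(''.join(buf))
                if any(messages):
                    found = tuple(messages) if len(messages) > 1 else messages[0]
                    # lineno is the line of the closing parenthesis.
                    yield lineno, funcname, found
                funcname, in_args, buf, messages = None, False, [], []
            elif tok == STRING:
                # Unwrap quotes and prefixes without evaluating arbitrary code.
                buf.append(ast.literal_eval(value))
            elif tok == OP and value == ',':
                messages.append(''.join(buf))
                buf = []
        elif funcname:
            # The keyword was not followed by '(', so it is not a call.
            funcname = None
        elif tok == NAME and value in keywords:
            funcname = value


code = "print(_('Hello'))\nx = ngettext('one', 'many', n)\n"
for item in extract_calls(code, {'_', 'ngettext'}):
    print(item)
```

Non-string arguments such as `n` are skipped rather than recorded, so only the string literal arguments appear in the yielded tuple.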