annotate babel/messages/pofile.py @ 161:212d8469ec8c trunk

Slightly simplified CLI-frontend class.
author cmlenz
date Thu, 21 Jun 2007 16:12:38 +0000
parents 0a01e8cd26d0
children 5d32098d8352
rev   line source
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
1 # -*- coding: utf-8 -*-
2 #
3 # Copyright (C) 2007 Edgewall Software
4 # All rights reserved.
5 #
6 # This software is licensed as described in the file COPYING, which
7 # you should have received as part of this distribution. The terms
8 # are also available at http://babel.edgewall.org/wiki/License.
9 #
10 # This software consists of voluntary contributions made by many
11 # individuals. For the exact contribution history, see the revision
12 # history and logs, available at http://babel.edgewall.org/log/.
13
14 """Reading and writing of files in the ``gettext`` PO (portable object)
15 format.
16
17 :see: `The Format of PO Files
18 <http://www.gnu.org/software/gettext/manual/gettext.html#PO-Files>`_
19 """
20
5
132526dcd074 * The creation-date header in generated PO files now includes the timezone offset.
cmlenz
parents: 1
diff changeset
21 from datetime import date, datetime
134
58b729b647f3 More fixes for Windows compatibility:
cmlenz
parents: 120
diff changeset
22 import os
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
23 import re
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
24 try:
25     set
26 except NameError:
27     from sets import Set as set
103
dacfbaf0d1e0 Implement wrapping of header comments in PO(T) output. Related to #14.
cmlenz
parents: 102
diff changeset
28 from textwrap import wrap
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
29
30 from babel import __version__ as VERSION
56
f40fc143439c Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 55
diff changeset
31 from babel.messages.catalog import Catalog
97
4e5c9dc57f1d Renamed `LOCAL` to `LOCALTZ`.
cmlenz
parents: 96
diff changeset
32 from babel.util import LOCALTZ
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
33
158
0a01e8cd26d0 Minor cleanup in the `pofile` module.
cmlenz
parents: 149
diff changeset
34 __all__ = ['unescape', 'denormalize', 'read_po', 'escape', 'normalize',
35            'write_po']
161
212d8469ec8c Slightly simplified CLI-frontend class.
cmlenz
parents: 158
diff changeset
36 __docformat__ = 'restructuredtext en'
158
0a01e8cd26d0 Minor cleanup in the `pofile` module.
cmlenz
parents: 149
diff changeset
37
38 def unescape(string):
39     r"""Reverse the escaping done by `escape` on the given string.
40
41     >>> print unescape('"Say:\\n \\"hello, world!\\"\\n"')
42     Say:
43      "hello, world!"
44     <BLANKLINE>
45
46     :param string: the string to unescape
47     :return: the unescaped string
48     :rtype: `str` or `unicode`
49     """
50     return string[1:-1].replace('\\\\', '\\') \
51                        .replace('\\t', '\t') \
52                        .replace('\\r', '\r') \
53                        .replace('\\n', '\n') \
54                        .replace('\\"', '\"')
55
56 def denormalize(string):
57     r"""Reverse the normalization done by the `normalize` function.
58
59     >>> print denormalize(r'''""
60     ... "Say:\n"
61     ... " \"hello, world!\"\n"''')
62     Say:
63      "hello, world!"
64     <BLANKLINE>
65
66     >>> print denormalize(r'''""
67     ... "Say:\n"
68     ... " \"Lorem ipsum dolor sit "
69     ... "amet, consectetur adipisicing"
70     ... " elit, \"\n"''')
71     Say:
72      "Lorem ipsum dolor sit amet, consectetur adipisicing elit, "
73     <BLANKLINE>
74
75     :param string: the string to denormalize
76     :return: the denormalized string
77     :rtype: `unicode` or `str`
78     """
79     if string.startswith('""'):
80         lines = []
81         for line in string.splitlines()[1:]:
82             lines.append(unescape(line))
83         return ''.join(lines)
84     else:
85         return unescape(string)
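A minimal sketch of the two helpers above, ported to Python 3 so the doctest behaviour can be tried on a current interpreter. The port itself is an assumption (the listed source targets Python 2), and `unescape_single_pass` is a hypothetical variant that is not part of the annotated file. It is included because the chained `replace` calls process escapes one kind at a time, so an escaped backslash followed by `n` is turned into a newline.

```python
import re


def unescape(string):
    """Strip the surrounding quotes and undo PO escapes, as in the listing."""
    return (string[1:-1]
            .replace('\\\\', '\\')
            .replace('\\t', '\t')
            .replace('\\r', '\r')
            .replace('\\n', '\n')
            .replace('\\"', '"'))


def denormalize(string):
    """Join a multi-line PO string (first line is "") into a single value."""
    if string.startswith('""'):
        return ''.join(unescape(line) for line in string.splitlines()[1:])
    return unescape(string)


# Hypothetical alternative: decode every escape in one left-to-right pass,
# so the result of one substitution is never re-read as another escape.
_ESCAPES = {'\\\\': '\\', '\\t': '\t', '\\r': '\r', '\\n': '\n', '\\"': '"'}


def unescape_single_pass(string):
    return re.sub(r'\\[\\trn"]', lambda m: _ESCAPES[m.group(0)], string[1:-1])


print(denormalize('""\n"Say:\\n"\n" \\"hello, world!\\"\\n"'))
```

For ordinary input both variants agree. They differ only when the quoted text contains an escaped backslash directly before `t`, `r`, `n` or `"`.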
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
86
87 def read_po(fileobj):
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
88     """Read messages from a ``gettext`` PO (portable object) file from the given
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
89     file-like object and return a `Catalog`.
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
90
91     >>> from StringIO import StringIO
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
92     >>> buf = StringIO('''
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
93     ... #: main.py:1
94     ... #, fuzzy, python-format
95     ... msgid "foo %(name)s"
96     ... msgstr ""
21
cd9aa202568e Change pot header's first line, "Translations Template for %%(project)s." instead of "SOME DESCRIPTIVE TITLE.". '''`project`''' and '''`version`''' now default to '''PROJECT''' and '''VERSION''' respectively. Fixed a bug regarding '''Content-Transfer-Encoding''', it shouldn't be the charset, and we're defaulting to `8bit` untill someone complains.
palgarvio
parents: 17
diff changeset
97     ...
94
96037779b518 Updated `read_po` to add user comments besides just auto comments.
palgarvio
parents: 84
diff changeset
98     ... # A user comment
99     ... #. An auto comment
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
100     ... #: main.py:3
101     ... msgid "bar"
102     ... msgid_plural "baz"
103     ... msgstr[0] ""
104     ... msgstr[1] ""
105     ... ''')
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
106     >>> catalog = read_po(buf)
104
395704fda00b Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction.
cmlenz
parents: 103
diff changeset
107     >>> catalog.revision_date = datetime(2007, 4, 1)
108
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
109     >>> for message in catalog:
67
7b2fcd6d6d26 Enhance catalog to also manage the MIME headers.
cmlenz
parents: 64
diff changeset
110     ...     if message.id:
111     ...         print (message.id, message.string)
105
c62b68a0b65e `Message`, `read_po` and `write_po` now all handle user/auto comments correctly.
palgarvio
parents: 104
diff changeset
112     ...         print ' ', (message.locations, message.flags)
113     ...         print ' ', (message.user_comments, message.auto_comments)
149
d62c63280e81 Respect charset specified in PO headers in `read_po()`. Fixes #17.
cmlenz
parents: 134
diff changeset
114     (u'foo %(name)s', '')
115       ([(u'main.py', 1)], set([u'fuzzy', u'python-format']))
105
c62b68a0b65e `Message`, `read_po` and `write_po` now all handle user/auto comments correctly.
palgarvio
parents: 104
diff changeset
116       ([], [])
149
d62c63280e81 Respect charset specified in PO headers in `read_po()`. Fixes #17.
cmlenz
parents: 134
diff changeset
117     ((u'bar', u'baz'), ('', ''))
118       ([(u'main.py', 3)], set([]))
119       ([u'A user comment'], [u'An auto comment'])
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
120
121     :param fileobj: the file-like object to read the PO file from
122     :return: a `Catalog` holding the messages read from the file
123     :rtype: `Catalog`
124     """
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
125     catalog = Catalog()
126
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
127     messages = []
128     translations = []
129     locations = []
130     flags = []
105
c62b68a0b65e `Message`, `read_po` and `write_po` now all handle user/auto comments correctly.
palgarvio
parents: 104
diff changeset
131     user_comments = []
132     auto_comments = []
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
133     in_msgid = in_msgstr = False
134
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
135     def _add_message():
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
136         translations.sort()
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
137         if len(messages) > 1:
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
138             msgid = tuple([denormalize(m) for m in messages])
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
139         else:
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
140             msgid = denormalize(messages[0])
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
141         if len(translations) > 1:
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
142             string = tuple([denormalize(t[1]) for t in translations])
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
143         else:
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
144             string = denormalize(translations[0][1])
105
c62b68a0b65e `Message`, `read_po` and `write_po` now all handle user/auto comments correctly.
palgarvio
parents: 104
diff changeset
145         catalog.add(msgid, string, list(locations), set(flags),
108
0fee7666cccb Fixed a bug introduced in [106].
palgarvio
parents: 106
diff changeset
146                     list(auto_comments), list(user_comments))
84
3ae316b58231 Some cosmetic changes for the new translator comments support.
cmlenz
parents: 80
diff changeset
147         del messages[:]; del translations[:]; del locations[:]
105
c62b68a0b65e `Message`, `read_po` and `write_po` now all handle user/auto comments correctly.
palgarvio
parents: 104
diff changeset
148         del flags[:]; del auto_comments[:]; del user_comments[:]
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
149
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
150     for line in fileobj.readlines():
149
d62c63280e81 Respect charset specified in PO headers in `read_po()`. Fixes #17.
cmlenz
parents: 134
diff changeset
151         line = line.strip().decode(catalog.charset)
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
152         if line.startswith('#'):
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
153             in_msgid = in_msgstr = False
154             if messages:
155                 _add_message()
156             if line[1:].startswith(':'):
157                 for location in line[2:].lstrip().split():
158                     filename, lineno = location.split(':', 1)
159                     locations.append((filename, int(lineno)))
160             elif line[1:].startswith(','):
161                 for flag in line[2:].lstrip().split(','):
162                     flags.append(flag.strip())
163             elif line[1:].startswith('.'):
164                 # These are called auto-comments
165                 comment = line[2:].strip()
166                 if comment:
167                     # Just check that we're not adding empty comments
168                     auto_comments.append(comment)
120
1741953aafd8 Added tests for `new_catalog` distutils command.
cmlenz
parents: 108
diff changeset
169             else:
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
170                 # These are called user comments
120
1741953aafd8 Added tests for `new_catalog` distutils command.
cmlenz
parents: 108
diff changeset
171                 user_comments.append(line[1:].strip())
104
395704fda00b Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction.
cmlenz
parents: 103
diff changeset
172         else:
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
173             if line.startswith('msgid_plural'):
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
174                 in_msgid = True
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
175                 msg = line[12:].lstrip()
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
176                 messages.append(msg)
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
177             elif line.startswith('msgid'):
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
178                 in_msgid = True
179                 if messages:
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
180                     _add_message()
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
181                 messages.append(line[5:].lstrip())
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
182             elif line.startswith('msgstr'):
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
183                 in_msgid = False
184                 in_msgstr = True
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
185                 msg = line[6:].lstrip()
186                 if msg.startswith('['):
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
187                     idx, msg = msg[1:].split(']', 1)
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
188                     translations.append([int(idx), msg.lstrip()])
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
189                 else:
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
190                     translations.append([0, msg])
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
191             elif line.startswith('"'):
192                 if in_msgid:
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
193                     messages[-1] += u'\n' + line.rstrip()
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
194                 elif in_msgstr:
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
195                     translations[-1][1] += u'\n' + line.rstrip()
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
196
197     if messages:
64
ef318245cfe5 `read_po` now returns a `Catalog`.
cmlenz
parents: 56
diff changeset
198         _add_message()
199     return catalog
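The comment branch of the loop above dispatches on the character that follows `#`. The sketch below isolates that dispatch in Python 3. `classify_comment` is a hypothetical helper written for illustration; it is not part of the annotated module, which appends to its `locations`, `flags`, `auto_comments` and `user_comments` lists directly.

```python
def classify_comment(line):
    """Return (kind, payload) for one stripped PO comment line.

    Mirrors the checks in the listing: '#:' locations, '#,' flags,
    '#.' auto comments, anything else a user comment.
    """
    body = line[1:]
    if body.startswith(':'):
        locations = []
        for location in line[2:].split():
            # Same split as the listing: text before the first colon is
            # the file name, the rest must be an integer line number.
            filename, lineno = location.split(':', 1)
            locations.append((filename, int(lineno)))
        return 'locations', locations
    if body.startswith(','):
        return 'flags', [flag.strip() for flag in line[2:].split(',')]
    if body.startswith('.'):
        return 'auto_comment', line[2:].strip()
    return 'user_comment', body.strip()


for sample in ('#: main.py:1 util.py:42',
               '#, fuzzy, python-format',
               '#. An auto comment',
               '# A user comment'):
    print(classify_comment(sample))
```

Because the location is split at the first colon, a file name that itself contains a colon would make `int()` raise `ValueError`; the sketch keeps that behaviour rather than guessing at a fix.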
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
200
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
201 WORD_SEP = re.compile('('
202     r'\s+|'                                 # any whitespace
203     r'[^\s\w]*\w+[a-zA-Z]-(?=\w+[a-zA-Z])|' # hyphenated words
204     r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w)'   # em-dash
205 ')')
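To see what `WORD_SEP` produces, the following Python 3 sketch splits a sample string with it. The pattern is copied from the listing; `split_chunks` is a hypothetical helper name, and dropping empty strings is an assumption about how a wrapping routine would consume the result, not something this part of the listing shows.

```python
import re

# Pattern copied from the listing above.
WORD_SEP = re.compile('('
    r'\s+|'                                 # any whitespace
    r'[^\s\w]*\w+[a-zA-Z]-(?=\w+[a-zA-Z])|' # hyphenated words
    r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w)'   # em-dash
    ')')


def split_chunks(text):
    """Split text at WORD_SEP matches, keeping the separators themselves.

    The pattern is wrapped in one capturing group, so re.split returns
    the matched separators alongside the text between them; empty
    strings between adjacent matches are dropped here.
    """
    return [chunk for chunk in WORD_SEP.split(text) if chunk]


print(split_chunks('a well-known fix'))
```

Whitespace runs are kept as chunks of their own, and a hyphenated word breaks after the hyphen, so a line wrapper can break at either point without discarding any characters.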
206
207 def escape(string):
208     r"""Escape the given string so that it can be included in double-quoted
209     strings in ``PO`` files.
210
211     >>> escape('''Say:
212     ...  "hello, world!"
213     ... ''')
214     '"Say:\\n \\"hello, world!\\"\\n"'
215
216     :param string: the string to escape
217     :return: the escaped string
218     :rtype: `str` or `unicode`
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
219 """
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
220 return '"%s"' % string.replace('\\', '\\\\') \
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
221 .replace('\t', '\\t') \
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
222 .replace('\r', '\\r') \
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
223 .replace('\n', '\\n') \
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
224 .replace('\"', '\\"')
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
225
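The order of the replacements in `escape` matters: the backslash has to be doubled first, otherwise the backslashes introduced by the later replacements would be doubled as well. The following is a minimal Python 3 restatement of the function, not part of this revision; `escape_wrong_order` is a hypothetical counterexample added only for illustration.

```python
def escape(string):
    """Python 3 restatement of escape(): backslashes are doubled first."""
    return '"%s"' % (string.replace('\\', '\\\\')
                           .replace('\t', '\\t')
                           .replace('\r', '\\r')
                           .replace('\n', '\\n')
                           .replace('"', '\\"'))


def escape_wrong_order(string):
    """Hypothetical variant that handles the backslash last (incorrect)."""
    return '"%s"' % (string.replace('\n', '\\n')
                           .replace('\\', '\\\\'))


sample = 'a\nb'
print(escape(sample))              # newline becomes backslash + n
print(escape_wrong_order(sample))  # the new backslash is doubled again
```

With the correct order, a newline becomes the two characters backslash and `n`; with the wrong order it becomes two backslashes followed by `n`, which a PO reader would decode as a literal backslash and the letter `n`.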
226 def normalize(string, width=76):
106
2cd83f77cc98 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`.
cmlenz
parents: 105
diff changeset
227     r"""Convert a string into a format that is appropriate for .po files.
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
228
229     >>> print normalize('''Say:
230     ...  "hello, world!"
231     ... ''', width=None)
232     ""
233     "Say:\n"
234     " \"hello, world!\"\n"
235
236     >>> print normalize('''Say:
237     ...  "Lorem ipsum dolor sit amet, consectetur adipisicing elit, "
238     ... ''', width=32)
239     ""
240     "Say:\n"
241     " \"Lorem ipsum dolor sit "
242     "amet, consectetur adipisicing"
243     " elit, \"\n"
244
245     :param string: the string to normalize
246     :param width: the maximum line width; use `None`, 0, or a negative number
247                   to completely disable line wrapping
248     :return: the normalized string
249     :rtype: `unicode`
250     """
251     if width and width > 0:
252         lines = []
253         for idx, line in enumerate(string.splitlines(True)):
254             if len(escape(line)) > width:
255                 chunks = WORD_SEP.split(line)
256                 chunks.reverse()
257                 while chunks:
258                     buf = []
259                     size = 2
260                     while chunks:
261                         l = len(escape(chunks[-1])) - 2
262                         if size + l < width:
263                             buf.append(chunks.pop())
264                             size += l
265                         else:
266                             if not buf:
267                                 # handle long chunks by putting them on a
268                                 # separate line
269                                 buf.append(chunks.pop())
270                             break
271                     lines.append(u''.join(buf))
272             else:
273                 lines.append(line)
274     else:
275         lines = string.splitlines(True)
276
67
7b2fcd6d6d26 Enhance catalog to also manage the MIME headers.
cmlenz
parents: 64
diff changeset
277     if len(lines) <= 1:
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
278         return escape(string)
279
280     # Remove empty trailing line
67
7b2fcd6d6d26 Enhance catalog to also manage the MIME headers.
cmlenz
parents: 64
diff changeset
281     if lines and not lines[-1]:
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
282         del lines[-1]
283         lines[-1] += '\n'
284     return u'""\n' + u'\n'.join([escape(l) for l in lines])
285
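For multi-line input, `normalize` emits an empty `""` first and then one escaped string per source line, which is the usual PO layout for long values. The sketch below is a Python 3 illustration of the non-wrapping path only (the `width=None` case); `normalize_nowrap` is a hypothetical name, and the word-wrapping branch and the trailing-line cleanup are deliberately left out.

```python
def escape(string):
    """Python 3 restatement of escape() from this module."""
    return '"%s"' % (string.replace('\\', '\\\\')
                           .replace('\t', '\\t')
                           .replace('\r', '\\r')
                           .replace('\n', '\\n')
                           .replace('"', '\\"'))


def normalize_nowrap(string):
    """Hypothetical helper mirroring normalize(string, width=None)."""
    # keepends=True keeps each line's newline so it is escaped as \n
    lines = string.splitlines(True)
    if len(lines) <= 1:
        # Single-line values stay on one line
        return escape(string)
    # Multi-line values start with "" followed by one string per line
    return '""\n' + '\n'.join(escape(line) for line in lines)


print(normalize_nowrap('Say:\n "hello, world!"\n'))
print(normalize_nowrap('foo'))
```

A single-line value such as `'foo'` comes back as one quoted string, while the two-line value produces three output lines, the first of which is the empty `""`.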
104
395704fda00b Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction.
cmlenz
parents: 103
diff changeset
286 def write_po(fileobj, catalog, width=76, no_location=False, omit_header=False,
287              sort_output=False, sort_by_file=False):
56
f40fc143439c Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 55
diff changeset
288     r"""Write a ``gettext`` PO (portable object) template file for a given
289     message catalog to the provided file-like object.
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
290
56
f40fc143439c Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 55
diff changeset
291     >>> catalog = Catalog()
292     >>> catalog.add(u'foo %(name)s', locations=[('main.py', 1)],
293     ...             flags=('fuzzy',))
294     >>> catalog.add((u'bar', u'baz'), locations=[('main.py', 3)])
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
295     >>> from StringIO import StringIO
296     >>> buf = StringIO()
104
395704fda00b Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction.
cmlenz
parents: 103
diff changeset
297     >>> write_po(buf, catalog, omit_header=True)
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
298     >>> print buf.getvalue()
299     #: main.py:1
6
c3b1b0b3d129 Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 5
diff changeset
300     #, fuzzy, python-format
301     msgid "foo %(name)s"
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
302     msgstr ""
303     <BLANKLINE>
304     #: main.py:3
305     msgid "bar"
306     msgid_plural "baz"
307     msgstr[0] ""
308     msgstr[1] ""
309     <BLANKLINE>
310     <BLANKLINE>
311
312     :param fileobj: the file-like object to write to
67
7b2fcd6d6d26 Enhance catalog to also manage the MIME headers.
cmlenz
parents: 64
diff changeset
313     :param catalog: the `Catalog` instance
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
314     :param width: the maximum line width for the generated output; use `None`,
315                   0, or a negative number to completely disable line wrapping
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
316     :param no_location: do not emit a location comment for every message
317     :param omit_header: do not include the ``msgid ""`` entry at the top of the
318                         output
319     """
320     def _normalize(key):
102
14a3d766a701 Project name and version, and the charset are available via the `Catalog` object, and do not need to be passed to `write_pot()`.
cmlenz
parents: 97
diff changeset
321         return normalize(key, width=width).encode(catalog.charset,
322                                                   'backslashreplace')
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
323
324     def _write(text):
325         if isinstance(text, unicode):
102
14a3d766a701 Project name and version, and the charset are available via the `Catalog` object, and do not need to be passed to `write_pot()`.
cmlenz
parents: 97
diff changeset
326             text = text.encode(catalog.charset)
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
327         fileobj.write(text)
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
328
104
395704fda00b Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction.
cmlenz
parents: 103
diff changeset
329     messages = list(catalog)
71
27f01e7626ea Implemented message sorting, see #7.
palgarvio
parents: 68
diff changeset
330     if sort_output:
331         messages.sort(lambda x,y: cmp(x.id, y.id))
332     elif sort_by_file:
333         messages.sort(lambda x,y: cmp(x.locations, y.locations))
68
269941aa0e55 Add back POT header broken in previous check-in.
cmlenz
parents: 67
diff changeset
334
71
27f01e7626ea Implemented message sorting, see #7.
palgarvio
parents: 68
diff changeset
335     for message in messages:
67
7b2fcd6d6d26 Enhance catalog to also manage the MIME headers.
cmlenz
parents: 64
diff changeset
336         if not message.id: # This is the header "message"
337             if omit_header:
338                 continue
104
395704fda00b Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction.
cmlenz
parents: 103
diff changeset
339             comment_header = catalog.header_comment
103
dacfbaf0d1e0 Implement wrapping of header comments in PO(T) output. Related to #14.
cmlenz
parents: 102
diff changeset
340             if width and width > 0:
341                 lines = []
104
395704fda00b Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction.
cmlenz
parents: 103
diff changeset
342                 for line in comment_header.splitlines():
103
dacfbaf0d1e0 Implement wrapping of header comments in PO(T) output. Related to #14.
cmlenz
parents: 102
diff changeset
343                     lines += wrap(line, width=width, subsequent_indent='# ',
344                                   break_long_words=False)
104
395704fda00b Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction.
cmlenz
parents: 103
diff changeset
345                 comment_header = u'\n'.join(lines) + u'\n'
346             _write(comment_header)
102
14a3d766a701 Project name and version, and the charset are available via the `Catalog` object, and do not need to be passed to `write_pot()`.
cmlenz
parents: 97
diff changeset
347
105
c62b68a0b65e `Message`, `read_po` and `write_po` now all handle user/auto comments correctly.
palgarvio
parents: 104
diff changeset
348         if message.user_comments:
349             for comment in message.user_comments:
350                 for line in wrap(comment, width, break_long_words=False):
351                     _write('# %s\n' % line.strip())
352
353         if message.auto_comments:
354             for comment in message.auto_comments:
103
dacfbaf0d1e0 Implement wrapping of header comments in PO(T) output. Related to #14.
cmlenz
parents: 102
diff changeset
355                 for line in wrap(comment, width, break_long_words=False):
80
116e34b8cefa Added support for translator comments at the API and frontends levels.(See #12, item 1). Updated docs and tests accordingly.
palgarvio
parents: 79
diff changeset
356                     _write('#. %s\n' % line.strip())
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
357
358         if not no_location:
134
58b729b647f3 More fixes for Windows compatibility:
cmlenz
parents: 120
diff changeset
359 locs = u' '.join([u'%s:%d' % (filename.replace(os.sep, '/'), lineno)
58b729b647f3 More fixes for Windows compatibility:
cmlenz
parents: 120
diff changeset
360 for filename, lineno in message.locations])
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
361 if width and width > 0:
103
dacfbaf0d1e0 Implement wrapping of header comments in PO(T) output. Related to #14.
cmlenz
parents: 102
diff changeset
362 locs = wrap(locs, width, break_long_words=False)
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
363 for line in locs:
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
364 _write('#: %s\n' % line.strip())
56
f40fc143439c Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 55
diff changeset
365 if message.flags:
f40fc143439c Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 55
diff changeset
366 _write('#%s\n' % ', '.join([''] + list(message.flags)))
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
367
56
f40fc143439c Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 55
diff changeset
368 if isinstance(message.id, (list, tuple)):
f40fc143439c Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 55
diff changeset
369 _write('msgid %s\n' % _normalize(message.id[0]))
f40fc143439c Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 55
diff changeset
370 _write('msgid_plural %s\n' % _normalize(message.id[1]))
68
269941aa0e55 Add back POT header broken in previous check-in.
cmlenz
parents: 67
diff changeset
371 for i, string in enumerate(message.string):
269941aa0e55 Add back POT header broken in previous check-in.
cmlenz
parents: 67
diff changeset
372 _write('msgstr[%d] %s\n' % (i, _normalize(message.string[i])))
1
7870274479f5 Import of initial code base.
cmlenz
parents:
diff changeset
373 else:
56
f40fc143439c Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends.
cmlenz
parents: 55
diff changeset
374 _write('msgid %s\n' % _normalize(message.id))
68
269941aa0e55 Add back POT header broken in previous check-in.
cmlenz
parents: 67
diff changeset
375 _write('msgstr %s\n' % _normalize(message.string or ''))
24
b09e90803d1b Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 23
diff changeset
376 _write('\n')
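
The per-message output order in the listing above (translator comments, extracted comments, locations, flags, then `msgid`/`msgstr`) can be illustrated with a minimal standalone sketch. It is not the module's implementation: `write_entry` and `quote` are hypothetical names, `quote` is a much simplified stand-in for `_normalize` (no line wrapping of the string itself), and the standard library's `textwrap.wrap` is assumed for comment wrapping.

```python
import io
import os
import textwrap


def quote(text):
    # Hypothetical, simplified stand-in for `_normalize`: escape
    # backslashes, double quotes and newlines, then add surrounding quotes.
    escaped = (text.replace('\\', '\\\\')
                   .replace('"', '\\"')
                   .replace('\n', '\\n'))
    return '"%s"' % escaped


def write_entry(out, msgid, msgstr='', user_comments=(), auto_comments=(),
                locations=(), flags=(), width=76):
    # Hypothetical helper mirroring the order of the listing above.
    # Translator ("user") comments use the "# " prefix.
    for comment in user_comments:
        for line in textwrap.wrap(comment, width, break_long_words=False):
            out.write('# %s\n' % line.strip())
    # Extracted ("auto") comments use the "#. " prefix.
    for comment in auto_comments:
        for line in textwrap.wrap(comment, width, break_long_words=False):
            out.write('#. %s\n' % line.strip())
    # Source locations use "#: ", with path separators normalised to "/".
    if locations:
        locs = ' '.join('%s:%d' % (name.replace(os.sep, '/'), lineno)
                        for name, lineno in locations)
        for line in textwrap.wrap(locs, width, break_long_words=False):
            out.write('#: %s\n' % line.strip())
    # Flags are written on one line, e.g. "#, fuzzy".
    if flags:
        out.write('#%s\n' % ', '.join([''] + list(flags)))
    # A tuple or list id marks a plural message: one msgstr[n] per form.
    if isinstance(msgid, (list, tuple)):
        out.write('msgid %s\n' % quote(msgid[0]))
        out.write('msgid_plural %s\n' % quote(msgid[1]))
        for i, string in enumerate(msgstr):
            out.write('msgstr[%d] %s\n' % (i, quote(string)))
    else:
        out.write('msgid %s\n' % quote(msgid))
        out.write('msgstr %s\n' % quote(msgstr or ''))
    # A blank line separates entries.
    out.write('\n')


buf = io.StringIO()
write_entry(buf, ('bar', 'bars'), ('Baar', 'Baaren'),
            auto_comments=['Shown in the footer'],
            locations=[('main.py', 3)], flags=['fuzzy'])
print(buf.getvalue(), end='')
```

Unlike the listing, the sketch always wraps comment and location lines and assumes a positive `width`; handling of a zero or missing width is left out.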