annotate babel/catalog/pofile.py @ 26:93eaa2f4a0a2

Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
author cmlenz
date Fri, 01 Jun 2007 15:36:00 +0000
parents ee33990f6e83
children 695884591af6
rev   line source
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
1 # -*- coding: utf-8 -*-
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
2 #
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
3 # Copyright (C) 2007 Edgewall Software
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
4 # All rights reserved.
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
5 #
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
6 # This software is licensed as described in the file COPYING, which
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
7 # you should have received as part of this distribution. The terms
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
8 # are also available at http://babel.edgewall.org/wiki/License.
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
9 #
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
10 # This software consists of voluntary contributions made by many
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
11 # individuals. For the exact contribution history, see the revision
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
12 # history and logs, available at http://babel.edgewall.org/log/.
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
13
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
14 """Reading and writing of files in the ``gettext`` PO (portable object)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
15 format.
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
16
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
17 :see: `The Format of PO Files
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
18 <http://www.gnu.org/software/gettext/manual/gettext.html#PO-Files>`_
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
19 """
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
20
7
8d7b3077e6d1 * The creation-date header in generated PO files now includes the timezone offset.
cmlenz
parents: 3
diff changeset
21 from datetime import date, datetime
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
22 import re
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
23 try:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
24 set
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
25 except NameError:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
26 from sets import Set as set
26
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
27 import textwrap
7
8d7b3077e6d1 * The creation-date header in generated PO files now includes the timezone offset.
cmlenz
parents: 3
diff changeset
28 import time
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
29
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
30 from babel import __version__ as VERSION
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
31
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
32 __all__ = ['escape', 'normalize', 'read_po', 'write_po']
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
33
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
34 def read_po(fileobj):
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
35 """Read messages from a ``gettext`` PO (portable object) file from the given
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
36 file-like object.
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
37
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
38 This function yields tuples of the form:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
39
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
40 ``(message, translation, locations, flags)``
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
41
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
42 where:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
43
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
44 * ``message`` is the original (untranslated) message, or a
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
45 ``(singular, plural)`` tuple for pluralizable messages
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
46 * ``translation`` is the translation of the message, or a tuple of
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
47 translations for pluralizable messages
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
48 * ``locations`` is a sequence of ``(filename, lineno)`` tuples
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
49 * ``flags`` is a set of strings (for exampe, "fuzzy")
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
50
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
51 >>> from StringIO import StringIO
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
52 >>> buf = StringIO('''
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
53 ... #: main.py:1
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
54 ... #, fuzzy, python-format
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
55 ... msgid "foo %(name)s"
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
56 ... msgstr ""
23
f828705c3bce Change pot header's first line, "Translations Template for %%(project)s." instead of "SOME DESCRIPTIVE TITLE.". '''`project`''' and '''`version`''' now default to '''PROJECT''' and '''VERSION''' respectively. Fixed a bug regarding '''Content-Transfer-Encoding''', it shouldn't be the charset, and we're defaulting to `8bit` untill someone complains.
palgarvio
parents: 19
diff changeset
57 ...
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
58 ... #: main.py:3
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
59 ... msgid "bar"
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
60 ... msgid_plural "baz"
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
61 ... msgstr[0] ""
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
62 ... msgstr[1] ""
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
63 ... ''')
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
64 >>> for message, translation, locations, flags in read_po(buf):
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
65 ... print (message, translation)
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
66 ... print ' ', (locations, flags)
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
67 (('foo %(name)s',), ('',))
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
68 ((('main.py', 1),), set(['fuzzy', 'python-format']))
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
69 (('bar', 'baz'), ('', ''))
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
70 ((('main.py', 3),), set([]))
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
71
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
72 :param fileobj: the file-like object to read the PO file from
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
73 :return: an iterator over ``(message, translation, location)`` tuples
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
74 :rtype: ``iterator``
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
75 """
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
76 messages = []
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
77 translations = []
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
78 locations = []
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
79 flags = []
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
80 in_msgid = in_msgstr = False
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
81
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
82 def pack():
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
83 translations.sort()
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
84 retval = (tuple(messages), tuple([t[1] for t in translations]),
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
85 tuple(locations), set(flags))
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
86 del messages[:]
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
87 del translations[:]
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
88 del locations[:]
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
89 del flags[:]
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
90 return retval
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
91
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
92 for line in fileobj.readlines():
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
93 line = line.strip()
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
94 if line.startswith('#'):
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
95 in_msgid = in_msgstr = False
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
96 if messages:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
97 yield pack()
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
98 if line[1:].startswith(':'):
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
99 for location in line[2:].lstrip().split():
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
100 filename, lineno = location.split(':', 1)
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
101 locations.append((filename, int(lineno)))
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
102 elif line[1:].startswith(','):
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
103 for flag in line[2:].lstrip().split(','):
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
104 flags.append(flag.strip())
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
105 elif line:
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
106 if line.startswith('msgid_plural'):
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
107 in_msgid = True
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
108 msg = line[12:].lstrip()
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
109 messages.append(msg[1:-1])
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
110 elif line.startswith('msgid'):
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
111 in_msgid = True
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
112 if messages:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
113 yield pack()
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
114 msg = line[5:].lstrip()
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
115 messages.append(msg[1:-1])
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
116 elif line.startswith('msgstr'):
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
117 in_msgid = False
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
118 in_msgstr = True
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
119 msg = line[6:].lstrip()
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
120 if msg.startswith('['):
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
121 idx, msg = msg[1:].split(']')
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
122 translations.append([int(idx), msg.lstrip()[1:-1]])
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
123 else:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
124 translations.append([0, msg[1:-1]])
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
125 elif line.startswith('"'):
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
126 if in_msgid:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
127 messages[-1] += line.rstrip()[1:-1]
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
128 elif in_msgstr:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
129 translations[-1][1] += line.rstrip()[1:-1]
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
130
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
131 if messages:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
132 yield pack()
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
133
26
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
134 POT_HEADER = """\
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
135 # Translations Template for %%(project)s.
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
136 # Copyright (C) YEAR ORGANIZATION
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
137 # FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
138 #
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
139 msgid ""
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
140 msgstr ""
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
141 "Project-Id-Version: %%(project)s %%(version)s\\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
142 "POT-Creation-Date: %%(creation_date)s\\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
143 "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
144 "Last-Translator: FULL NAME <EMAIL@ADDRESS>\\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
145 "Language-Team: LANGUAGE <LL@li.org>\\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
146 "MIME-Version: 1.0\\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
147 "Content-Type: text/plain; charset=%%(charset)s\\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
148 "Content-Transfer-Encoding: 8bit\\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
149 "Generated-By: Babel %s\\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
150
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
151 """ % VERSION
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
152
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
153 PYTHON_FORMAT = re.compile(r'\%(\([\w]+\))?[diouxXeEfFgGcrs]').search
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
154
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
155 WORD_SEP = re.compile('('
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
156 r'\s+|' # any whitespace
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
157 r'[^\s\w]*\w+[a-zA-Z]-(?=\w+[a-zA-Z])|' # hyphenated words
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
158 r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w)' # em-dash
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
159 ')')
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
160
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
161 def escape(string):
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
162 r"""Escape the given string so that it can be included in double-quoted
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
163 strings in ``PO`` files.
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
164
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
165 >>> escape('''Say:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
166 ... "hello, world!"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
167 ... ''')
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
168 '"Say:\\n \\"hello, world!\\"\\n"'
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
169
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
170 :param string: the string to escape
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
171 :return: the escaped string
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
172 :rtype: `str` or `unicode`
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
173 """
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
174 return '"%s"' % string.replace('\\', '\\\\') \
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
175 .replace('\t', '\\t') \
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
176 .replace('\r', '\\r') \
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
177 .replace('\n', '\\n') \
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
178 .replace('\"', '\\"')
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
179
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
180 def normalize(string, width=76):
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
181 r"""This converts a string into a format that is appropriate for .po files.
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
182
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
183 >>> print normalize('''Say:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
184 ... "hello, world!"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
185 ... ''', width=None)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
186 ""
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
187 "Say:\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
188 " \"hello, world!\"\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
189
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
190 >>> print normalize('''Say:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
191 ... "Lorem ipsum dolor sit amet, consectetur adipisicing elit, "
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
192 ... ''', width=32)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
193 ""
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
194 "Say:\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
195 " \"Lorem ipsum dolor sit "
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
196 "amet, consectetur adipisicing"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
197 " elit, \"\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
198
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
199 :param string: the string to normalize
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
200 :param width: the maximum line width; use `None`, 0, or a negative number
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
201 to completely disable line wrapping
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
202 :param charset: the encoding to use for `unicode` strings
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
203 :return: the normalized string
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
204 :rtype: `unicode`
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
205 """
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
206 if width and width > 0:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
207 lines = []
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
208 for idx, line in enumerate(string.splitlines(True)):
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
209 if len(escape(line)) > width:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
210 chunks = WORD_SEP.split(line)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
211 chunks.reverse()
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
212 while chunks:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
213 buf = []
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
214 size = 2
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
215 while chunks:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
216 l = len(escape(chunks[-1])) - 2
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
217 if size + l < width:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
218 buf.append(chunks.pop())
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
219 size += l
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
220 else:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
221 if not buf:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
222 # handle long chunks by putting them on a
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
223 # separate line
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
224 buf.append(chunks.pop())
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
225 break
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
226 lines.append(u''.join(buf))
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
227 else:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
228 lines.append(line)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
229 else:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
230 lines = string.splitlines(True)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
231
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
232 if len(lines) == 1:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
233 return escape(string)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
234
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
235 # Remove empty trailing line
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
236 if not lines[-1]:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
237 del lines[-1]
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
238 lines[-1] += '\n'
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
239
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
240 return u'""\n' + u'\n'.join([escape(l) for l in lines])
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
241
25
ee33990f6e83 Added line-wrap support for `write_po`.
palgarvio
parents: 23
diff changeset
242 def write_po(fileobj, messages, project='PROJECT', version='VERSION', width=76,
23
f828705c3bce Change pot header's first line, "Translations Template for %%(project)s." instead of "SOME DESCRIPTIVE TITLE.". '''`project`''' and '''`version`''' now default to '''PROJECT''' and '''VERSION''' respectively. Fixed a bug regarding '''Content-Transfer-Encoding''', it shouldn't be the charset, and we're defaulting to `8bit` untill someone complains.
palgarvio
parents: 19
diff changeset
243 charset='utf-8', no_location=False, omit_header=False):
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
244 r"""Write a ``gettext`` PO (portable object) file to the given file-like
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
245 object.
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
246
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
247 The `messages` parameter is expected to be an iterable object producing
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
248 tuples of the form:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
249
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
250 ``(filename, lineno, funcname, message, flags)``
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
251
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
252 >>> from StringIO import StringIO
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
253 >>> buf = StringIO()
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
254 >>> write_po(buf, [
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
255 ... ('main.py', 1, None, u'foo %(name)s', ('fuzzy',)),
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
256 ... ('main.py', 3, 'ngettext', (u'bar', u'baz'), None)
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
257 ... ], omit_header=True)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
258
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
259 >>> print buf.getvalue()
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
260 #: main.py:1
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
261 #, fuzzy, python-format
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
262 msgid "foo %(name)s"
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
263 msgstr ""
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
264 <BLANKLINE>
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
265 #: main.py:3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
266 msgid "bar"
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
267 msgid_plural "baz"
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
268 msgstr[0] ""
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
269 msgstr[1] ""
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
270 <BLANKLINE>
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
271 <BLANKLINE>
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
272
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
273 :param fileobj: the file-like object to write to
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
274 :param messages: an iterable over the messages
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
275 :param project: the project name
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
276 :param version: the project version
26
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
277 :param width: the maximum line width for the generated output; use `None`,
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
278 0, or a negative number to completely disable line wrapping
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
279 :param charset: the encoding
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
280 :param no_location: do not emit a location comment for every message
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
281 :param omit_header: do not include the ``msgid ""`` entry at the top of the
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
282 output
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
283 """
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
284 def _normalize(key):
26
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
285 return normalize(key, width=width).encode(charset, 'backslashreplace')
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
286
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
287 def _write(text):
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
288 if isinstance(text, unicode):
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
289 text = text.encode(charset)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
290 fileobj.write(text)
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
291
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
292 if not omit_header:
26
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
293 _write(POT_HEADER % {
7
8d7b3077e6d1 * The creation-date header in generated PO files now includes the timezone offset.
cmlenz
parents: 3
diff changeset
294 'project': project,
8d7b3077e6d1 * The creation-date header in generated PO files now includes the timezone offset.
cmlenz
parents: 3
diff changeset
295 'version': version,
8d7b3077e6d1 * The creation-date header in generated PO files now includes the timezone offset.
cmlenz
parents: 3
diff changeset
296 'creation_date': time.strftime('%Y-%m-%d %H:%M%z'),
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
297 'charset': charset,
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
298 })
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
299
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
300 locations = {}
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
301 msgflags = {}
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
302 msgids = []
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
303
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
304 for filename, lineno, funcname, key, flags in messages:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
305 flags = set(flags or [])
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
306 if key in msgids:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
307 locations[key].append((filename, lineno))
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
308 msgflags[key] |= flags
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
309 else:
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
310 if (isinstance(key, (list, tuple)) and
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
311 filter(None, [PYTHON_FORMAT(k) for k in key])) or \
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
312 (isinstance(key, basestring) and PYTHON_FORMAT(key)):
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
313 flags.add('python-format')
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
314 else:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
315 flags.discard('python-format')
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
316 locations[key] = [(filename, lineno)]
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
317 msgflags[key] = flags
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
318 msgids.append(key)
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
319
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
320 for msgid in msgids:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
321 if not no_location:
26
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
322 locs = u' '.join([u'%s:%d' % item for item in locations[msgid]])
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
323 if width and width > 0:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
324 locs = textwrap.wrap(locs, width, break_long_words=False)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
325 for line in locs:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
326 _write('#: %s\n' % line.strip())
8
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
327 flags = msgflags[msgid]
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy").
cmlenz
parents: 7
diff changeset
328 if flags:
26
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
329 _write('#%s\n' % ', '.join([''] + list(flags)))
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
330
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
331 if type(msgid) is tuple:
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
332 assert len(msgid) == 2
26
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
333 _write('msgid %s\n' % _normalize(msgid[0]))
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
334 _write('msgid_plural %s\n' % _normalize(msgid[1]))
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
335 _write('msgstr[0] ""\n')
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
336 _write('msgstr[1] ""\n')
3
e9eaddab598e Import of initial code base.
cmlenz
parents:
diff changeset
337 else:
26
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
338 _write('msgid %s\n' % _normalize(msgid))
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
339 _write('msgstr ""\n')
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks).
cmlenz
parents: 25
diff changeset
340 _write('\n')
Copyright (C) 2012-2017 Edgewall Software