annotate babel/messages/mofile.py @ 334:1786dce4b1b0 trunk

Add basic MO file reading in preparation for #54.
author cmlenz
date Tue, 10 Jun 2008 17:05:52 +0000
parents 465a0582d308
children 4db404d0c19b
rev   line source
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
1 # -*- coding: utf-8 -*-
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
2 #
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
3 # Copyright (C) 2007 Edgewall Software
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
4 # All rights reserved.
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
5 #
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
6 # This software is licensed as described in the file COPYING, which
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
7 # you should have received as part of this distribution. The terms
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
8 # are also available at http://babel.edgewall.org/wiki/License.
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
9 #
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
10 # This software consists of voluntary contributions made by many
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
11 # individuals. For the exact contribution history, see the revision
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
12 # history and logs, available at http://babel.edgewall.org/log/.
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
13
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
14 """Reading and writing of files in the ``gettext`` MO (machine object) format.
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
15
234
38053412171b Add more `since` tags to stuff added in trunk.
cmlenz
parents: 174
diff changeset
16 :since: version 0.9
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
17 :see: `The Format of MO Files
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
18 <http://www.gnu.org/software/gettext/manual/gettext.html#MO-Files>`_
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
19 """
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
20
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
21 import array
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
22 import struct
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
23
334
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
24 from babel.messages.catalog import Catalog, Message
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
25
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
26 __all__ = ['read_mo', 'write_mo']
161
212d8469ec8c Slightly simplified CLI-frontend class.
cmlenz
parents: 160
diff changeset
27 __docformat__ = 'restructuredtext en'
212d8469ec8c Slightly simplified CLI-frontend class.
cmlenz
parents: 160
diff changeset
28
334
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
29
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
30 LE_MAGIC = 0x950412deL
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
31 BE_MAGIC = 0xde120495L
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
32
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
33 def read_mo(fileobj):
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
34 """Read a binary MO file from the given file-like object and return a
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
35 corresponding `Catalog` object.
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
36
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
37 :param fileobj: the file-like object to read the MO file from
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
38 :return: a catalog object representing the parsed MO file
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
39 :rtype: `Catalog`
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
40
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
41 :note: The implementation of this function is heavily based on the
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
42 ``GNUTranslations._parse`` method of the ``gettext`` module in the
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
43 standard library.
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
44 """
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
45 catalog = Catalog()
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
46 headers = {}
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
47
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
48 unpack = struct.unpack
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
49 filename = getattr(fileobj, 'name', '')
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
50 charset = None
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
51
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
52 buf = fileobj.read()
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
53 buflen = len(buf)
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
54
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
55 # Parse the .mo file header. The magic number tells us the byte order
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
56 # of the four 32-bit words that follow it.
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
57 magic = unpack('<I', buf[:4])[0] # Are we big endian or little endian?
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
58 if magic == LE_MAGIC:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
59 version, msgcount, masteridx, transidx = unpack('<4I', buf[4:20])
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
60 ii = '<II'
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
61 elif magic == BE_MAGIC:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
62 version, msgcount, masteridx, transidx = unpack('>4I', buf[4:20])
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
63 ii = '>II'
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
64 else:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
65 raise IOError(0, 'Bad magic number', filename)
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
66
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
67 # Now put all messages from the .mo file buffer into the catalog
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
68 # dictionary
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
69 for i in xrange(0, msgcount):
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
70 mlen, moff = unpack(ii, buf[masteridx:masteridx + 8])
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
71 mend = moff + mlen
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
72 tlen, toff = unpack(ii, buf[transidx:transidx + 8])
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
73 tend = toff + tlen
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
74 if mend < buflen and tend < buflen:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
75 msg = buf[moff:mend]
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
76 tmsg = buf[toff:tend]
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
77 else:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
78 raise IOError(0, 'File is corrupt', filename)
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
79
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
80 # See if we're looking at GNU .mo conventions for metadata
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
81 if mlen == 0:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
82 # Catalog description
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
83 lastkey = key = None
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
84 for item in tmsg.splitlines():
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
85 item = item.strip()
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
86 if not item:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
87 continue
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
88 if ':' in item:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
89 key, value = item.split(':', 1)
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
90 lastkey = key = key.strip().lower()
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
91 value = value.strip()
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
92 headers[key] = value
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
93 if key == 'content-type':
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
94 charset = value.split('charset=')[1]
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
95 elif lastkey:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
96 headers[lastkey] += '\n' + item
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
97
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
98 # Note: we unconditionally convert both msgids and msgstrs to
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
99 # Unicode using the character encoding specified in the charset
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
100 # parameter of the Content-Type header. The gettext documentation
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
101 # strongly encourages msgids to be us-ascii, but some applications
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
102 # require alternative encodings (e.g. Zope's ZCML and ZPT). For
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
103 # traditional gettext applications, the msgid conversion will
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
104 # cause no problems since us-ascii should always be a subset of
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
105 # the charset encoding. We may want to fall back to 8-bit msgids
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
106 # if the Unicode conversion fails.
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
107 if '\x00' in msg:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
108 # Plural forms
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
109 msg = msg.split('\x00')
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
110 tmsg = tmsg.split('\x00')
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
111 if charset:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
112 msg = [unicode(x, charset) for x in msg]
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
113 tmsg = [unicode(x, charset) for x in tmsg]
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
114 else:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
115 if charset:
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
116 msg = unicode(msg, charset)
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
117 tmsg = unicode(tmsg, charset)
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
118 catalog[msg] = Message(msg, tmsg)
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
119
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
120 # advance to next entry in the seek tables
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
121 masteridx += 8
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
122 transidx += 8
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
123
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
124 catalog.mime_headers = headers.items()
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
125 return catalog
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
126
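The metadata loop inside `read_mo` can be looked at in isolation. What follows is only a minimal sketch, written for Python 3 and operating on an already-decoded `str`, whereas the module above targets Python 2. The helper name `parse_mo_metadata` is hypothetical and not part of Babel, and the `'charset=' in value` guard is an added assumption that the listing itself does not make.

```python
def parse_mo_metadata(text):
    """Parse the msgstr of the empty msgid into (headers, charset).

    Hypothetical helper for illustration; not part of Babel's API.
    """
    headers = {}
    charset = None
    lastkey = None
    for item in text.splitlines():
        item = item.strip()
        if not item:
            continue
        if ':' in item:
            key, value = item.split(':', 1)
            lastkey = key = key.strip().lower()
            value = value.strip()
            headers[key] = value
            # Assumption: skip charset detection when the parameter is absent.
            if key == 'content-type' and 'charset=' in value:
                charset = value.split('charset=', 1)[1]
        elif lastkey:
            # Continuation line: append it to the previous header's value.
            headers[lastkey] += '\n' + item
    return headers, charset


headers, charset = parse_mo_metadata(
    "Project-Id-Version: demo 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Plural-Forms: nplurals=2;\n"
    " plural=(n != 1)\n"
)
```

In this sketch a line without a colon is treated as a continuation of the most recent header, so the wrapped `Plural-Forms` value ends up as a single two-line entry in the local dictionary.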
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
127 def write_mo(fileobj, catalog, use_fuzzy=False):
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
128 """Write a catalog to the specified file-like object using the GNU MO file
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
129 format.
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
130
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
131 >>> from babel.messages import Catalog
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
132 >>> from gettext import GNUTranslations
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
133 >>> from StringIO import StringIO
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
134
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
135 >>> catalog = Catalog(locale='en_US')
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
136 >>> catalog.add('foo', 'Voh')
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
137 >>> catalog.add((u'bar', u'baz'), (u'Bahr', u'Batz'))
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
138 >>> catalog.add('fuz', 'Futz', flags=['fuzzy'])
172
208c7dfb9041 Fix for #28 with updated doctest.
palgarvio
parents: 161
diff changeset
139 >>> catalog.add('Fizz', '')
174
bd256296086c Extended the doctest to include tests for the fix on [176].
palgarvio
parents: 173
diff changeset
140 >>> catalog.add(('Fuzz', 'Fuzzes'), ('', ''))
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
141 >>> buf = StringIO()
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
142
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
143 >>> write_mo(buf, catalog)
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
144 >>> buf.seek(0)
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
145 >>> translations = GNUTranslations(fp=buf)
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
146 >>> translations.ugettext('foo')
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
147 u'Voh'
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
148 >>> translations.ungettext('bar', 'baz', 1)
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
149 u'Bahr'
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
150 >>> translations.ungettext('bar', 'baz', 2)
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
151 u'Batz'
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
152 >>> translations.ugettext('fuz')
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
153 u'fuz'
172
208c7dfb9041 Fix for #28 with updated doctest.
palgarvio
parents: 161
diff changeset
154 >>> translations.ugettext('Fizz')
208c7dfb9041 Fix for #28 with updated doctest.
palgarvio
parents: 161
diff changeset
155 u'Fizz'
174
bd256296086c Extended the doctest to include tests for the fix on [176].
palgarvio
parents: 173
diff changeset
156 >>> translations.ugettext('Fuzz')
bd256296086c Extended the doctest to include tests for the fix on [176].
palgarvio
parents: 173
diff changeset
157 u'Fuzz'
bd256296086c Extended the doctest to include tests for the fix on [176].
palgarvio
parents: 173
diff changeset
158 >>> translations.ugettext('Fuzzes')
bd256296086c Extended the doctest to include tests for the fix on [176].
palgarvio
parents: 173
diff changeset
159 u'Fuzzes'
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
161 :param fileobj: the file-like object to write to
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
162 :param catalog: the `Catalog` instance
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
163 :param use_fuzzy: whether translations marked as "fuzzy" should be included
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
164 in the output
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
165 """
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
166 messages = list(catalog)
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
167 if not use_fuzzy:
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
168 messages[1:] = [m for m in messages[1:] if not m.fuzzy]
248
f0b1ee94628c add a __cmp__ to Message that correctly sorts by id, taking into account plurals
pjenvey
parents: 234
diff changeset
169 messages.sort()
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
170
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
171 ids = strs = ''
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
172 offsets = []
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
173
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
174 for message in messages:
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
175 # For each string, we need size and file offset. Each string is NUL
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
176 # terminated; the NUL is not counted in the size.
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
177 if message.pluralizable:
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
178 msgid = '\x00'.join([
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
179 msgid.encode(catalog.charset) for msgid in message.id
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
180 ])
173
c100331c727c Forgot to fix the pluralizable messages, regarding #28.
palgarvio
parents: 172
diff changeset
181 msgstrs = []
c100331c727c Forgot to fix the pluralizable messages, regarding #28.
palgarvio
parents: 172
diff changeset
182 for idx, string in enumerate(message.string):
c100331c727c Forgot to fix the pluralizable messages, regarding #28.
palgarvio
parents: 172
diff changeset
183 if not string:
330
465a0582d308 Fix for #97, compilation of message catalogs for locales with more than two plural forms where the translations were empty was failing.
cmlenz
parents: 248
diff changeset
184 msgstrs.append(message.id[min(int(idx), 1)])
173
c100331c727c Forgot to fix the pluralizable messages, regarding #28.
palgarvio
parents: 172
diff changeset
185 else:
c100331c727c Forgot to fix the pluralizable messages, regarding #28.
palgarvio
parents: 172
diff changeset
186 msgstrs.append(string)
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
187 msgstr = '\x00'.join([
173
c100331c727c Forgot to fix the pluralizable messages, regarding #28.
palgarvio
parents: 172
diff changeset
188 msgstr.encode(catalog.charset) for msgstr in msgstrs
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
189 ])
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
190 else:
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
191 msgid = message.id.encode(catalog.charset)
172
208c7dfb9041 Fix for #28 with updated doctest.
palgarvio
parents: 161
diff changeset
192 if not message.string:
208c7dfb9041 Fix for #28 with updated doctest.
palgarvio
parents: 161
diff changeset
193 msgstr = message.id.encode(catalog.charset)
208c7dfb9041 Fix for #28 with updated doctest.
palgarvio
parents: 161
diff changeset
194 else:
208c7dfb9041 Fix for #28 with updated doctest.
palgarvio
parents: 161
diff changeset
195 msgstr = message.string.encode(catalog.charset)
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
196 offsets.append((len(ids), len(msgid), len(strs), len(msgstr)))
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
197 ids += msgid + '\x00'
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
198 strs += msgstr + '\x00'
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
199
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
200 # The header is 7 32-bit unsigned integers. We don't use hash tables, so
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
201 # the keys start right after the index tables.
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
202 keystart = 7 * 4 + 16 * len(messages)
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
203 valuestart = keystart + len(ids)
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
204
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
205 # The string table first has the list of keys, then the list of values.
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
206 # Each entry has first the size of the string, then the file offset.
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
207 koffsets = []
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
208 voffsets = []
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
209 for o1, l1, o2, l2 in offsets:
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
210 koffsets += [l1, o1 + keystart]
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
211 voffsets += [l2, o2 + valuestart]
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
212 offsets = koffsets + voffsets
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
213
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
214 fileobj.write(struct.pack('Iiiiiii',
334
1786dce4b1b0 Add basic MO file reading in preparation for #54.
cmlenz
parents: 330
diff changeset
215 LE_MAGIC, # magic
160
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
216 0, # version
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
217 len(messages), # number of entries
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
218 7 * 4, # start of key index
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
219 7 * 4 + len(messages) * 8, # start of value index
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
220 0, 0 # size and offset of hash table
23005b4efc99 Add MO file generation. Closes #21.
cmlenz
parents:
diff changeset
221 ) + array.array("i", offsets).tostring() + ids + strs)
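The layout that `write_mo` produces (a 7-word header, a key index, a value index, then the NUL-terminated strings) can be sketched on its own. This is a rough Python 3 illustration under stated assumptions, not the module's implementation: `build_mo` and `read_pairs` are hypothetical names, the input is a plain dict of byte strings rather than a `Catalog`, and the byte order is forced to little endian with `'<'`, whereas the listing packs in native order.

```python
import struct

LE_MAGIC = 0x950412de


def build_mo(pairs):
    """Serialize {msgid: msgstr} byte strings into a little-endian MO blob."""
    items = sorted(pairs.items())
    ids = strs = b''
    offsets = []
    for msgid, msgstr in items:
        # Record offset and length before appending; the NUL is not counted.
        offsets.append((len(ids), len(msgid), len(strs), len(msgstr)))
        ids += msgid + b'\x00'
        strs += msgstr + b'\x00'

    # 7-word header, then two index tables of 8 bytes per message each.
    keystart = 7 * 4 + 16 * len(items)
    valuestart = keystart + len(ids)
    koffsets, voffsets = [], []
    for o1, l1, o2, l2 in offsets:
        koffsets += [l1, o1 + keystart]
        voffsets += [l2, o2 + valuestart]

    header = struct.pack('<7I', LE_MAGIC, 0, len(items),
                         7 * 4, 7 * 4 + 8 * len(items), 0, 0)
    index = struct.pack('<%dI' % (4 * len(items)), *(koffsets + voffsets))
    return header + index + ids + strs


def read_pairs(buf):
    """Read the pairs back, assuming a little-endian file."""
    magic, version, count, masteridx, transidx = struct.unpack('<5I', buf[:20])
    if magic != LE_MAGIC:
        raise ValueError('not a little-endian MO file')
    result = {}
    for i in range(count):
        k = masteridx + 8 * i
        v = transidx + 8 * i
        mlen, moff = struct.unpack('<II', buf[k:k + 8])
        tlen, toff = struct.unpack('<II', buf[v:v + 8])
        result[buf[moff:moff + mlen]] = buf[toff:toff + tlen]
    return result


blob = build_mo({b'foo': b'Voh', b'bar': b'Bahr'})
pairs = read_pairs(blob)
```

Fixing the byte order makes the output identical on every platform, at the cost of diverging from the native-order packing used in the listing.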