babel/old/mirror: babel/messages/pofile.py annotate

annotate babel/messages/pofile.py @ 198:982d7e704fdc

Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit.

author	cmlenz
date	Tue, 03 Jul 2007 12:52:44 +0000
parents	b5e58a22ebd2
children	10e8d072e2d1
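
The lines attributed to this changeset near the end of `read_po` in the listing below add a fallback: when no message was ever added but flags or comments were collected, an empty header message is created to carry them. The following is a minimal Python 3 sketch of that idea only, not the module's code. `read_entries` and its tuple layout are hypothetical names introduced for illustration, and the parsing is deliberately simplified (single-line `msgid`, no `msgstr` handling, all non-flag comments lumped together).

```python
def read_entries(lines):
    """Collect (msgid, flags, comments) tuples from PO-style lines.

    Hypothetical, simplified stand-in for `read_po`: single-line msgids only,
    msgstr lines are ignored, and all non-flag comments are lumped together.
    """
    entries = []
    flags, comments = [], []
    for raw in lines:
        line = raw.strip()
        if line.startswith('#,'):
            # Flag line such as "#, fuzzy, python-format"
            flags.extend(flag.strip() for flag in line[2:].split(','))
        elif line.startswith('#'):
            text = line[1:].strip()
            if text:
                comments.append(text)
        elif line.startswith('msgid '):
            msgid = line[6:].strip().strip('"')
            entries.append((msgid, set(flags), list(comments)))
            flags, comments = [], []
    # No msgid seen at all, but comments or flags were collected:
    # synthesize an empty header entry so that information is not lost.
    if not entries and (flags or comments):
        entries.append(('', set(flags), list(comments)))
    return entries


header_only = ['# Translations template for PROJECT.', '#, fuzzy']
print(read_entries(header_only))
```

In the real function the same check is made with a counter of added messages rather than by inspecting a result list, because messages go straight into the `Catalog`.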

rev	line source
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	1 # -*- coding: utf-8 -*-
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	2 #
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	3 # Copyright (C) 2007 Edgewall Software
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	4 # All rights reserved.
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	5 #
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	6 # This software is licensed as described in the file COPYING, which
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	7 # you should have received as part of this distribution. The terms
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	8 # are also available at http://babel.edgewall.org/wiki/License.
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	9 #
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	10 # This software consists of voluntary contributions made by many
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	11 # individuals. For the exact contribution history, see the revision
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	12 # history and logs, available at http://babel.edgewall.org/log/.
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	13
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	14 """Reading and writing of files in the ``gettext`` PO (portable object)
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	15 format.
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	16
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	17 :see: `The Format of PO Files
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	18 <http://www.gnu.org/software/gettext/manual/gettext.html#PO-Files>`_
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	19 """
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	20
7 8d7b3077e6d1 * The creation-date header in generated PO files now includes the timezone offset. cmlenz parents: 3 diff changeset	21 from datetime import date, datetime
136 9e3d2b227ec3 More fixes for Windows compatibility: cmlenz parents: 122 diff changeset	22 import os
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	23 import re
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	24 try:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	25 set
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	26 except NameError:
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	27 from sets import Set as set
105 abd3a594dab4 Implement wrapping of header comments in PO(T) output. Related to #14. cmlenz parents: 104 diff changeset	28 from textwrap import wrap
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	29
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	30 from babel import __version__ as VERSION
58 068952b4d4c0 Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends. cmlenz parents: 57 diff changeset	31 from babel.messages.catalog import Catalog
99 b6b5992daa6c Renamed `LOCAL` to `LOCALTZ`. cmlenz parents: 98 diff changeset	32 from babel.util import LOCALTZ
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	33
180 7e88950ab661 Minor change to what symbols are "exported", primarily for the generated docs. cmlenz parents: 177 diff changeset	34 __all__ = ['read_po', 'write_po']
163 2faa5dc63068 Slightly simplified CLI-frontend class. cmlenz parents: 160 diff changeset	35 __docformat__ = 'restructuredtext en'
160 b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	36
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	37 def unescape(string):
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	38 r"""Reverse the escaping done by `escape` on the given string.
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	39
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	40 >>> print unescape('"Say:\\n \\"hello, world!\\"\\n"')
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	41 Say:
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	42 "hello, world!"
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	43 <BLANKLINE>
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	44
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	45 :param string: the string to unescape
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	46 :return: the unescaped string
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	47 :rtype: `str` or `unicode`
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	48 """
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	49 return string[1:-1].replace('\\\\', '\\') \
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	50 .replace('\\t', '\t') \
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	51 .replace('\\r', '\r') \
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	52 .replace('\\n', '\n') \
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	53 .replace('\\"', '\"')
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	54
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	55 def denormalize(string):
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	56 r"""Reverse the normalization done by the `normalize` function.
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	57
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	58 >>> print denormalize(r'''""
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	59 ... "Say:\n"
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	60 ... " \"hello, world!\"\n"''')
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	61 Say:
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	62 "hello, world!"
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	63 <BLANKLINE>
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	64
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	65 >>> print denormalize(r'''""
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	66 ... "Say:\n"
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	67 ... " \"Lorem ipsum dolor sit "
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	68 ... "amet, consectetur adipisicing"
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	69 ... " elit, \"\n"''')
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	70 Say:
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	71 "Lorem ipsum dolor sit amet, consectetur adipisicing elit, "
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	72 <BLANKLINE>
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	73
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	74 :param string: the string to denormalize
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	75 :return: the denormalized string
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	76 :rtype: `unicode` or `str`
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	77 """
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	78 if string.startswith('""'):
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	79 lines = []
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	80 for line in string.splitlines()[1:]:
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	81 lines.append(unescape(line))
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	82 return ''.join(lines)
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	83 else:
b5659b7779be Minor cleanup in the `pofile` module. cmlenz parents: 151 diff changeset	84 return unescape(string)
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	85
198 982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	86 def read_po(fileobj, locale=None, domain=None):
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	87 """Read messages from a ``gettext`` PO (portable object) file from the given
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	88 file-like object and return a `Catalog`.
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	89
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	90 >>> from StringIO import StringIO
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	91 >>> buf = StringIO('''
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	92 ... #: main.py:1
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	93 ... #, fuzzy, python-format
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	94 ... msgid "foo %(name)s"
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	95 ... msgstr ""
23 f828705c3bce Change pot header's first line, "Translations Template for %%(project)s." instead of "SOME DESCRIPTIVE TITLE.". '''`project`''' and '''`version`''' now default to '''PROJECT''' and '''VERSION''' respectively. Fixed a bug regarding '''Content-Transfer-Encoding''', it shouldn't be the charset, and we're defaulting to `8bit` untill someone complains. palgarvio parents: 19 diff changeset	96 ...
96 6c07c38e23aa Updated `read_po` to add user comments besides just auto comments. palgarvio parents: 86 diff changeset	97 ... # A user comment
6c07c38e23aa Updated `read_po` to add user comments besides just auto comments. palgarvio parents: 86 diff changeset	98 ... #. An auto comment
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	99 ... #: main.py:3
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	100 ... msgid "bar"
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	101 ... msgid_plural "baz"
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	102 ... msgstr[0] ""
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	103 ... msgstr[1] ""
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	104 ... ''')
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	105 >>> catalog = read_po(buf)
106 2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	106 >>> catalog.revision_date = datetime(2007, 04, 01)
2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	107
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	108 >>> for message in catalog:
69 1d8e81bfedf9 Enhance catalog to also manage the MIME headers. cmlenz parents: 66 diff changeset	109 ... if message.id:
1d8e81bfedf9 Enhance catalog to also manage the MIME headers. cmlenz parents: 66 diff changeset	110 ... print (message.id, message.string)
107 4b42e23644e5 `Message`, `read_po` and `write_po` now all handle user/auto comments correctly. palgarvio parents: 106 diff changeset	111 ... print ' ', (message.locations, message.flags)
4b42e23644e5 `Message`, `read_po` and `write_po` now all handle user/auto comments correctly. palgarvio parents: 106 diff changeset	112 ... print ' ', (message.user_comments, message.auto_comments)
151 12e5f21dfcda Respect charset specified in PO headers in `read_po()`. Fixes #17. cmlenz parents: 136 diff changeset	113 (u'foo %(name)s', '')
12e5f21dfcda Respect charset specified in PO headers in `read_po()`. Fixes #17. cmlenz parents: 136 diff changeset	114 ([(u'main.py', 1)], set([u'fuzzy', u'python-format']))
107 4b42e23644e5 `Message`, `read_po` and `write_po` now all handle user/auto comments correctly. palgarvio parents: 106 diff changeset	115 ([], [])
151 12e5f21dfcda Respect charset specified in PO headers in `read_po()`. Fixes #17. cmlenz parents: 136 diff changeset	116 ((u'bar', u'baz'), ('', ''))
12e5f21dfcda Respect charset specified in PO headers in `read_po()`. Fixes #17. cmlenz parents: 136 diff changeset	117 ([(u'main.py', 3)], set([]))
12e5f21dfcda Respect charset specified in PO headers in `read_po()`. Fixes #17. cmlenz parents: 136 diff changeset	118 ([u'A user comment'], [u'An auto comment'])
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	119
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	120 :param fileobj: the file-like object to read the PO file from
198 982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	121 :param locale: the locale identifier or `Locale` object, or `None`
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	122 if the catalog is not bound to a locale (which basically
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	123 means it's a template)
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	124 :param domain: the message domain
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	125 :return: a catalog object representing the parsed PO file
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	126 :rtype: `Catalog`
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	127 """
198 982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	128 catalog = Catalog(locale=locale, domain=domain)
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	129
198 982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	130 counter = [0]
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	131 messages = []
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	132 translations = []
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	133 locations = []
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	134 flags = []
107 4b42e23644e5 `Message`, `read_po` and `write_po` now all handle user/auto comments correctly. palgarvio parents: 106 diff changeset	135 user_comments = []
4b42e23644e5 `Message`, `read_po` and `write_po` now all handle user/auto comments correctly. palgarvio parents: 106 diff changeset	136 auto_comments = []
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	137 in_msgid = in_msgstr = False
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	138
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	139 def _add_message():
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	140 translations.sort()
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	141 if len(messages) > 1:
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	142 msgid = tuple([denormalize(m) for m in messages])
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	143 else:
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	144 msgid = denormalize(messages[0])
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	145 if len(translations) > 1:
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	146 string = tuple([denormalize(t[1]) for t in translations])
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	147 else:
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	148 string = denormalize(translations[0][1])
107 4b42e23644e5 `Message`, `read_po` and `write_po` now all handle user/auto comments correctly. palgarvio parents: 106 diff changeset	149 catalog.add(msgid, string, list(locations), set(flags),
110 ecc04be42086 Fixed a bug introduced in [106]. palgarvio parents: 108 diff changeset	150 list(auto_comments), list(user_comments))
86 8a703ecdba91 Some cosmetic changes for the new translator comments support. cmlenz parents: 82 diff changeset	151 del messages[:]; del translations[:]; del locations[:];
107 4b42e23644e5 `Message`, `read_po` and `write_po` now all handle user/auto comments correctly. palgarvio parents: 106 diff changeset	152 del flags[:]; del auto_comments[:]; del user_comments[:]
198 982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	153 counter[0] += 1
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	154
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	155 for line in fileobj.readlines():
151 12e5f21dfcda Respect charset specified in PO headers in `read_po()`. Fixes #17. cmlenz parents: 136 diff changeset	156 line = line.strip().decode(catalog.charset)
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	157 if line.startswith('#'):
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	158 in_msgid = in_msgstr = False
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	159 if messages:
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	160 _add_message()
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	161 if line[1:].startswith(':'):
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	162 for location in line[2:].lstrip().split():
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	163 filename, lineno = location.split(':', 1)
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	164 locations.append((filename, int(lineno)))
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	165 elif line[1:].startswith(','):
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	166 for flag in line[2:].lstrip().split(','):
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	167 flags.append(flag.strip())
198 982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	168
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	169 elif line[1:].startswith('.'):
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	170 # These are called auto-comments
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	171 comment = line[2:].strip()
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	172 if comment:
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	173 # Just check that we're not adding empty comments
8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	174 auto_comments.append(comment)
122 03f106700f02 Added tests for `new_catalog` distutils command. cmlenz parents: 110 diff changeset	175 else:
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	176 # These are called user comments
122 03f106700f02 Added tests for `new_catalog` distutils command. cmlenz parents: 110 diff changeset	177 user_comments.append(line[1:].strip())
106 2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	178 else:
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	179 if line.startswith('msgid_plural'):
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	180 in_msgid = True
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	181 msg = line[12:].lstrip()
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	182 messages.append(msg)
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	183 elif line.startswith('msgid'):
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	184 in_msgid = True
177 47f6c31e9a24 Changed the `__repr__` output to include the flags(it can be changed back, but it was usefull to implement the fuzzy header parsing). palgarvio parents: 163 diff changeset	185 txt = line[5:].lstrip()
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	186 if messages:
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	187 _add_message()
177 47f6c31e9a24 Changed the `__repr__` output to include the flags(it can be changed back, but it was usefull to implement the fuzzy header parsing). palgarvio parents: 163 diff changeset	188 messages.append(txt)
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	189 elif line.startswith('msgstr'):
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	190 in_msgid = False
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	191 in_msgstr = True
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	192 msg = line[6:].lstrip()
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	193 if msg.startswith('['):
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	194 idx, msg = msg[1:].split(']')
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	195 translations.append([int(idx), msg.lstrip()])
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	196 else:
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	197 translations.append([0, msg])
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	198 elif line.startswith('"'):
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	199 if in_msgid:
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	200 messages[-1] += u'\n' + line.rstrip()
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	201 elif in_msgstr:
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	202 translations[-1][1] += u'\n' + line.rstrip()
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	203
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	204 if messages:
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	205 _add_message()
198 982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	206
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	207 # No actual messages found, but there was some info in comments, from which
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	208 # we'll construct an empty header message
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	209 elif not counter[0] and (flags or user_comments or auto_comments):
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	210 messages.append(u'')
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	211 translations.append([0, u''])
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	212 _add_message()
982d7e704fdc Fix for #35, and a minor improvement to how we parse the catalog fuzzy bit. cmlenz parents: 193 diff changeset	213
66 d1a7425739d3 `read_po` now returns a `Catalog`. cmlenz parents: 58 diff changeset	214 return catalog
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	215
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	216 WORD_SEP = re.compile('('
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	217 r'\s+|' # any whitespace
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	218 r'[^\s\w]*\w+[a-zA-Z]-(?=\w+[a-zA-Z])|' # hyphenated words
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	219 r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w)' # em-dash
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	220 ')')
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	221
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	222 def escape(string):
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	223 r"""Escape the given string so that it can be included in double-quoted
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	224 strings in ``PO`` files.
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	225
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	226 >>> escape('''Say:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	227 ... "hello, world!"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	228 ... ''')
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	229 '"Say:\\n \\"hello, world!\\"\\n"'
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	230
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	231 :param string: the string to escape
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	232 :return: the escaped string
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	233 :rtype: `str` or `unicode`
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	234 """
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	235 return '"%s"' % string.replace('\\', '\\\\') \
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	236 .replace('\t', '\\t') \
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	237 .replace('\r', '\\r') \
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	238 .replace('\n', '\\n') \
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	239 .replace('\"', '\\"')
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	240
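The escaping order in `escape` above matters: backslashes are doubled first, so the backslashes introduced by the later replacements are not escaped a second time. A minimal Python 3 sketch of the same rules follows; the name `escape_po` is hypothetical and is not part of this module.

```python
def escape_po(string):
    """Quote and escape ``string`` for a double-quoted PO string.

    Hypothetical Python 3 restatement of the replacement chain shown
    above; it is a sketch, not the module's own function.
    """
    # Double existing backslashes first. If this ran last, the
    # backslashes added for \t, \r, \n and \" would be doubled too.
    return '"%s"' % (string.replace('\\', '\\\\')
                           .replace('\t', '\\t')
                           .replace('\r', '\\r')
                           .replace('\n', '\\n')
                           .replace('"', '\\"'))


if __name__ == '__main__':
    # A tab, embedded quotes and a trailing newline in one string.
    print(escape_po('a\tb "c"\n'))
    # A literal backslash is doubled, not mistaken for an escape.
    print(escape_po('C:\\dir'))
```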
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	241 def normalize(string, prefix='', width=76):
108 8ea225f33f28 Fix for #16: the header message (`msgid = ""`) is now treated specially by `read_po` and `Catalog`. cmlenz parents: 107 diff changeset	242 r"""Convert a string into a format that is appropriate for .po files.
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	243
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	244 >>> print normalize('''Say:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	245 ... "hello, world!"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	246 ... ''', width=None)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	247 ""
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	248 "Say:\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	249 " \"hello, world!\"\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	250
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	251 >>> print normalize('''Say:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	252 ... "Lorem ipsum dolor sit amet, consectetur adipisicing elit, "
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	253 ... ''', width=32)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	254 ""
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	255 "Say:\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	256 " \"Lorem ipsum dolor sit "
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	257 "amet, consectetur adipisicing"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	258 " elit, \"\n"
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	259
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	260 :param string: the string to normalize
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	261 :param prefix: a string that should be prepended to every line
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	262 :param width: the maximum line width; use `None`, 0, or a negative number
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	263 to completely disable line wrapping
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	264 :return: the normalized string
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	265 :rtype: `unicode`
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	266 """
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	267 if width and width > 0:
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	268 prefixlen = len(prefix)
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	269 lines = []
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	270 for idx, line in enumerate(string.splitlines(True)):
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	271 if len(escape(line)) + prefixlen > width:
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	272 chunks = WORD_SEP.split(line)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	273 chunks.reverse()
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	274 while chunks:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	275 buf = []
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	276 size = 2
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	277 while chunks:
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	278 l = len(escape(chunks[-1])) - 2 + prefixlen
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	279 if size + l < width:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	280 buf.append(chunks.pop())
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	281 size += l
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	282 else:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	283 if not buf:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	284 # handle long chunks by putting them on a
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	285 # separate line
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	286 buf.append(chunks.pop())
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	287 break
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	288 lines.append(u''.join(buf))
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	289 else:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	290 lines.append(line)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	291 else:
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	292 lines = string.splitlines(True)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	293
69 1d8e81bfedf9 Enhance catalog to also manage the MIME headers. cmlenz parents: 66 diff changeset	294 if len(lines) <= 1:
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	295 return escape(string)
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	296
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	297 # Remove empty trailing line
69 1d8e81bfedf9 Enhance catalog to also manage the MIME headers. cmlenz parents: 66 diff changeset	298 if lines and not lines[-1]:
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	299 del lines[-1]
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	300 lines[-1] += '\n'
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	301 return u'""\n' + u'\n'.join([(prefix + escape(l)) for l in lines])
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	302
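The wrapping loop in `normalize` above packs chunks greedily: it splits a line on `WORD_SEP`, keeps the separators, and fills each output line until the next chunk would reach the width limit. The Python 3 sketch below isolates that loop. It is a simplification under stated assumptions: the helper name `wrap_chunks` is hypothetical, and the sketch ignores the `prefix` argument and measures raw chunk length rather than escaped length.

```python
import re

# Same pattern as WORD_SEP above; the capturing group makes
# re.split() keep the separators as chunks of their own.
WORD_SEP = re.compile('('
    r'\s+|'                                  # any whitespace
    r'[^\s\w]*\w+[a-zA-Z]-(?=\w+[a-zA-Z])|'  # hyphenated words
    r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w)'    # double-hyphen dash
')')


def wrap_chunks(line, width):
    """Greedily pack the chunks of ``line`` into pieces below ``width``.

    Hypothetical helper: no escaping and no prefix handling, unlike
    the loop in ``normalize``.
    """
    chunks = WORD_SEP.split(line)
    chunks.reverse()              # so pop() yields chunks in order
    lines = []
    while chunks:
        buf = []
        size = 2                  # room for the two surrounding quotes
        while chunks:
            length = len(chunks[-1])
            if size + length < width:
                buf.append(chunks.pop())
                size += length
            else:
                if not buf:
                    # An overlong chunk gets a line of its own.
                    buf.append(chunks.pop())
                break
        lines.append(''.join(buf))
    return lines


if __name__ == '__main__':
    for piece in wrap_chunks('Lorem ipsum dolor sit amet', 16):
        print(repr(piece))
```

Because separators are kept as chunks, joining the returned pieces reproduces the input exactly; no whitespace is lost, which is the property the changeset message cites as the reason for not using `textwrap`.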
106 2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	303 def write_po(fileobj, catalog, width=76, no_location=False, omit_header=False,
193 b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	304 sort_output=False, sort_by_file=False, ignore_obsolete=False):
58 068952b4d4c0 Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends. cmlenz parents: 57 diff changeset	305 r"""Write a ``gettext`` PO (portable object) template file for a given
068952b4d4c0 Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends. cmlenz parents: 57 diff changeset	306 message catalog to the provided file-like object.
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	307
58 068952b4d4c0 Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends. cmlenz parents: 57 diff changeset	308 >>> catalog = Catalog()
068952b4d4c0 Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends. cmlenz parents: 57 diff changeset	309 >>> catalog.add(u'foo %(name)s', locations=[('main.py', 1)],
068952b4d4c0 Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends. cmlenz parents: 57 diff changeset	310 ... flags=('fuzzy',))
068952b4d4c0 Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends. cmlenz parents: 57 diff changeset	311 >>> catalog.add((u'bar', u'baz'), locations=[('main.py', 3)])
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	312 >>> from StringIO import StringIO
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	313 >>> buf = StringIO()
106 2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	314 >>> write_po(buf, catalog, omit_header=True)
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	315 >>> print buf.getvalue()
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	316 #: main.py:1
8 ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	317 #, fuzzy, python-format
ff5481545bfd Add basic PO file parsing, and change the PO writing procedure to also take flags (such as "python-format" or "fuzzy"). cmlenz parents: 7 diff changeset	318 msgid "foo %(name)s"
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	319 msgstr ""
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	320 <BLANKLINE>
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	321 #: main.py:3
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	322 msgid "bar"
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	323 msgid_plural "baz"
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	324 msgstr[0] ""
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	325 msgstr[1] ""
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	326 <BLANKLINE>
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	327 <BLANKLINE>
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	328
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	329 :param fileobj: the file-like object to write to
69 1d8e81bfedf9 Enhance catalog to also manage the MIME headers. cmlenz parents: 66 diff changeset	330 :param catalog: the `Catalog` instance
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	331 :param width: the maximum line width for the generated output; use `None`,
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	332 0, or a negative number to completely disable line wrapping
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	333 :param no_location: do not emit a location comment for every message
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	334 :param omit_header: do not include the ``msgid ""`` entry at the top of the
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	335 output
193 b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	336 :param sort_output: whether to sort the messages in the output by msgid
b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	337 :param sort_by_file: whether to sort the messages in the output by their locations
b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	338 :param ignore_obsolete: whether to ignore obsolete messages and not include them
b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	339 in the output; by default they are included as comments
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	340 """
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	341 def _normalize(key, prefix=''):
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	342 return normalize(key, prefix=prefix, width=width) \
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	343 .encode(catalog.charset, 'backslashreplace')
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	344
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	345 def _write(text):
93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	346 if isinstance(text, unicode):
104 57d2f21a1fcc Project name and version, and the charset are available via the `Catalog` object, and do not need to be passed to `write_pot()`. cmlenz parents: 99 diff changeset	347 text = text.encode(catalog.charset)
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	348 fileobj.write(text)
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	349
183 e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	350 def _write_comment(comment, prefix=''):
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	351 lines = [comment]
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	352 if width and width > 0:
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	353 lines = wrap(comment, width, break_long_words=False)
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	354 for line in lines:
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	355 _write('#%s %s\n' % (prefix, line.strip()))
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	356
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	357 def _write_message(message, prefix=''):
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	358 if isinstance(message.id, (list, tuple)):
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	359 _write('%smsgid %s\n' % (prefix, _normalize(message.id[0], prefix)))
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	360 _write('%smsgid_plural %s\n' % (
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	361 prefix, _normalize(message.id[1], prefix)
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	362 ))
183 e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	363 for i, string in enumerate(message.string):
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	364 _write('%smsgstr[%d] %s\n' % (
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	365 prefix, i, _normalize(message.string[i], prefix)
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	366 ))
183 e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	367 else:
192 8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	368 _write('%smsgid %s\n' % (prefix, _normalize(message.id, prefix)))
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	369 _write('%smsgstr %s\n' % (
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	370 prefix, _normalize(message.string or '', prefix)
8f5805197198 Correctly write out obsolete messages spanning multiple lines. Fixes #33. cmlenz parents: 183 diff changeset	371 ))
183 e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	372
106 2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	373 messages = list(catalog)
73 5d8e87acdcc7 Implemented message sorting, see #7. palgarvio parents: 70 diff changeset	374 if sort_output:
5d8e87acdcc7 Implemented message sorting, see #7. palgarvio parents: 70 diff changeset	375 messages.sort(lambda x,y: cmp(x.id, y.id))
5d8e87acdcc7 Implemented message sorting, see #7. palgarvio parents: 70 diff changeset	376 elif sort_by_file:
5d8e87acdcc7 Implemented message sorting, see #7. palgarvio parents: 70 diff changeset	377 messages.sort(lambda x,y: cmp(x.locations, y.locations))
70 620fdd25657a Add back POT header broken in previous check-in. cmlenz parents: 69 diff changeset	378
73 5d8e87acdcc7 Implemented message sorting, see #7. palgarvio parents: 70 diff changeset	379 for message in messages:
69 1d8e81bfedf9 Enhance catalog to also manage the MIME headers. cmlenz parents: 66 diff changeset	380 if not message.id: # This is the header "message"
1d8e81bfedf9 Enhance catalog to also manage the MIME headers. cmlenz parents: 66 diff changeset	381 if omit_header:
1d8e81bfedf9 Enhance catalog to also manage the MIME headers. cmlenz parents: 66 diff changeset	382 continue
106 2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	383 comment_header = catalog.header_comment
105 abd3a594dab4 Implement wrapping of header comments in PO(T) output. Related to #14. cmlenz parents: 104 diff changeset	384 if width and width > 0:
abd3a594dab4 Implement wrapping of header comments in PO(T) output. Related to #14. cmlenz parents: 104 diff changeset	385 lines = []
106 2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	386 for line in comment_header.splitlines():
105 abd3a594dab4 Implement wrapping of header comments in PO(T) output. Related to #14. cmlenz parents: 104 diff changeset	387 lines += wrap(line, width=width, subsequent_indent='# ',
abd3a594dab4 Implement wrapping of header comments in PO(T) output. Related to #14. cmlenz parents: 104 diff changeset	388 break_long_words=False)
106 2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	389 comment_header = u'\n'.join(lines) + u'\n'
2a00e352c986 Merged `write_pot` and `write_po` functions by moving more functionality to the `Catalog` class. This is certainly not perfect yet, but moves us in the right direction. cmlenz parents: 105 diff changeset	390 _write(comment_header)
104 57d2f21a1fcc Project name and version, and the charset are available via the `Catalog` object, and do not need to be passed to `write_pot()`. cmlenz parents: 99 diff changeset	391
183 e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	392 for comment in message.user_comments:
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	393 _write_comment(comment)
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	394 for comment in message.auto_comments:
e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	395 _write_comment(comment, prefix='.')
3 e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	396
e9eaddab598e Import of initial code base. cmlenz parents: diff changeset	397 if not no_location:
136 9e3d2b227ec3 More fixes for Windows compatibility: cmlenz parents: 122 diff changeset	398 locs = u' '.join([u'%s:%d' % (filename.replace(os.sep, '/'), lineno)
9e3d2b227ec3 More fixes for Windows compatibility: cmlenz parents: 122 diff changeset	399 for filename, lineno in message.locations])
183 e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	400 _write_comment(locs, prefix=':')
58 068952b4d4c0 Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends. cmlenz parents: 57 diff changeset	401 if message.flags:
068952b4d4c0 Add actual data structures for handling message catalogs, so that more code can be reused here between the frontends. cmlenz parents: 57 diff changeset	402 _write('#%s\n' % ', '.join([''] + list(message.flags)))
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	403
183 e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	404 _write_message(message)
26 93eaa2f4a0a2 Reimplement line wrapping for PO writing (as the `textwrap` module is too destructive with white space) and move it to the `normalize` function (which was already doing some handling of line breaks). cmlenz parents: 25 diff changeset	405 _write('\n')
183 e927dffc9ab4 The frontends now provide ways to update existing translations catalogs from a template. Closes #22. cmlenz parents: 180 diff changeset	406
193 b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	407 if not ignore_obsolete:
b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	408 for message in catalog.obsolete.values():
b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	409 for comment in message.user_comments:
b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	410 _write_comment(comment)
b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	411 _write_message(message, prefix='#~ ')
b5e58a22ebd2 Add an option to the frontend commands for catalog updating that removes completely any obsolete messages, instead of putting them comments. cmlenz parents: 192 diff changeset	412 _write('\n')
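
The `_write_comment` helper in the listing above (source lines 350-355) can be sketched as a standalone Python 3 function. This is only an illustration, not the code of this revision: the names `write_comment` and `out`, the default `width=76`, and the use of `textwrap.wrap` for `wrap` are assumptions made for the example. One difference is deliberate. In the listing, `lines = comment` leaves a plain string when `width` is falsy, so the loop would appear to iterate over single characters; the sketch wraps the comment in a list instead.

```python
import io
from textwrap import wrap


def write_comment(out, comment, prefix='', width=76):
    """Write `comment` to `out` as one or more PO comment lines.

    `prefix` is the character that follows '#': '' for translator
    comments, '.' for extracted comments, ':' for source locations.
    (Hypothetical standalone variant of `_write_comment`.)
    """
    if width and width > 0:
        lines = wrap(comment, width, break_long_words=False)
    else:
        # Keep the comment as a single line; a bare string here would
        # make the loop below iterate character by character.
        lines = [comment]
    for line in lines:
        out.write('#%s %s\n' % (prefix, line.strip()))


buf = io.StringIO()
write_comment(buf, 'main.py:1 utils.py:22', prefix=':')
write_comment(buf, 'Translators: keep this short', width=0)
result = buf.getvalue()

wrapped = io.StringIO()
write_comment(wrapped, 'aaa bbb ccc', prefix='.', width=7)
wrapped_result = wrapped.getvalue()
```

Here `result` holds one `#:` location line followed by one `#` translator comment line, and `wrapped_result` shows a comment split across two `#.` lines once it exceeds the given width.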
