# HG changeset patch
# User cmlenz
# Date 1132317519 0
# Node ID 13b290f5f1eeb9e01ee5e2de03bae2865ac55ca1
# Parent 5979bcb0892e2f7707b54333e1c5a6ff22839eaa
Encode text in parsed XML elements as UTF-8. Closes #75. Based on patch by Matt Hughes. Thanks!

diff --git a/bitten/util/xmlio.py b/bitten/util/xmlio.py
--- a/bitten/util/xmlio.py
+++ b/bitten/util/xmlio.py
@@ -240,13 +240,13 @@
             attr = self._node.getAttributeNode(name)
             if not attr:
                 raise KeyError, name
-            return attr.value.encode()
+            return attr.value.encode('utf-8')
         def __setitem__(self, name, value):
             self._node.setAttribute(name, value)
         def __delitem__(self, name):
             self._node.removeAttribute(name)
         def keys(self):
-            return [name.encode() for name in self._node.attributes.keys()]
+            return [key.encode('utf-8') for key in self._node.attributes.keys()]
 
     def __init__(self, node):
         self._node = node
@@ -274,7 +274,8 @@
         This concatenates the values of all text nodes that are immediate
         children of this element.
         """
-        return ''.join([c.nodeValue or '' for c in self._node.childNodes])
+        return ''.join([c.nodeValue.encode('utf-8')
+                        for c in self._node.childNodes if c.nodeType == 3])
 
     def write(self, out, newlines=False):
         """Serializes the element and writes the XML to the given output