.. -*- mode: rst; encoding: utf-8 -*-

==============
Markup Streams
==============

A stream is the common representation of markup as a *stream of events*.


.. contents:: Contents
   :depth: 1
.. sectnum::


Basics
======

A stream can be obtained in a number of ways. It can be:

* the result of parsing XML or HTML text, or
* the result of selecting a subset of another stream using XPath, or
* programmatically generated.

For example, the functions ``XML()`` and ``HTML()`` can be used to convert
literal XML or HTML text to a markup stream:

.. code-block:: pycon

  >>> from genshi import XML
  >>> stream = XML('<p class="intro">Some text and '
  ...              '<a href="http://example.org/">a link</a>.'
  ...              '<br/></p>')
  >>> stream
  <genshi.core.Stream object at ...>

The stream is the result of parsing the text into events. Each event is a tuple
of the form ``(kind, data, pos)``, where:

* ``kind`` defines what kind of event it is (such as the start of an element,
  text, a comment, etc.).
* ``data`` is the actual data associated with the event. How this looks depends
  on the event kind (see `event kinds`_).
* ``pos`` is a ``(filename, lineno, column)`` tuple that describes where the
  event “comes from”.

Iterating over the stream yields these tuples one by one:

.. code-block:: pycon

  >>> for kind, data, pos in stream:
  ...     print kind, repr(data), pos
  ...
  START (QName(u'p'), Attrs([(QName(u'class'), u'intro')])) (None, 1, 0)
  TEXT u'Some text and ' (None, 1, 17)
  START (QName(u'a'), Attrs([(QName(u'href'), u'http://example.org/')])) (None, 1, 31)
  TEXT u'a link' (None, 1, 61)
  END QName(u'a') (None, 1, 67)
  TEXT u'.' (None, 1, 71)
  START (QName(u'br'), Attrs()) (None, 1, 72)
  END QName(u'br') (None, 1, 77)
  END QName(u'p') (None, 1, 77)


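Because events are plain tuples, ordinary Python tools work on them. The
following is a minimal sketch that does not import Genshi: the string constants
and the plain strings and lists used in place of ``QName`` and ``Attrs`` are
stand-ins assumed for illustration, and ``text_content`` and ``count_kinds``
are hypothetical helper names.

```python
# Stand-ins for Genshi's event kinds (assumed here as plain strings).
START, END, TEXT = 'START', 'END', 'TEXT'

# A hand-written event list loosely mirroring the parsed example.
# Plain strings and lists replace QName and Attrs (an assumption).
events = [
    (START, ('p', [('class', 'intro')]), (None, 1, 0)),
    (TEXT, 'Some text and ', (None, 1, 17)),
    (START, ('a', [('href', 'http://example.org/')]), (None, 1, 31)),
    (TEXT, 'a link', (None, 1, 61)),
    (END, 'a', (None, 1, 67)),
    (TEXT, '.', (None, 1, 71)),
    (END, 'p', (None, 1, 77)),
]


def text_content(stream):
    """Join the data of all TEXT events, ignoring markup events."""
    return ''.join(data for kind, data, pos in stream if kind == TEXT)


def count_kinds(stream):
    """Count how many events of each kind the stream contains."""
    counts = {}
    for kind, data, pos in stream:
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

Both helpers only unpack the ``(kind, data, pos)`` tuple, so the same pattern
should carry over to any iterable of events.
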
Filtering
=========

One important feature of markup streams is that you can apply *filters* to the
stream, either filters that come with Genshi, or your own custom filters.

A filter is simply a callable that accepts the stream as parameter, and returns
the filtered stream:

.. code-block:: python

  def noop(stream):
      """A filter that doesn't actually do anything with the stream."""
      for kind, data, pos in stream:
          yield kind, data, pos

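A filter that actually changes something follows the same shape. The sketch
below is an illustration only: ``upper_text`` is a hypothetical filter, and the
event kinds are written as plain strings rather than imported from Genshi.

```python
TEXT = 'TEXT'  # stand-in for the text event kind (assumed plain string)


def upper_text(stream):
    """A filter that upper-cases text and passes other events through."""
    for kind, data, pos in stream:
        if kind == TEXT:
            # Events are tuples, so emit a new tuple instead of mutating.
            yield kind, data.upper(), pos
        else:
            yield kind, data, pos


events = [
    ('START', ('em', []), (None, 1, 0)),
    (TEXT, 'hello', (None, 1, 4)),
    ('END', 'em', (None, 1, 9)),
]
result = list(upper_text(events))
```

Like ``noop``, this filter is a generator, so it processes events lazily as the
stream is consumed.
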
Filters can be applied in a number of ways. The simplest is to just call the
filter directly:

.. code-block:: python

  stream = noop(stream)

The ``Stream`` class also provides a ``filter()`` method, which takes an
arbitrary number of filter callables and applies them all:

.. code-block:: python

  stream = stream.filter(noop)

Finally, filters can also be applied using the *bitwise or* operator (``|``),
which allows a syntax similar to pipes on Unix shells:

.. code-block:: python

  stream = stream | noop

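How a pipe syntax like this can work is easy to sketch in plain Python. The
``MiniStream`` class below is a hypothetical stand-in, not Genshi's ``Stream``
implementation: it defines ``__or__`` so that ``stream | f`` calls ``f`` with
the stream and wraps the result again.

```python
class MiniStream(object):
    """Hypothetical stand-in for a stream; wraps any iterable of events."""

    def __init__(self, events):
        self.events = events

    def __iter__(self):
        return iter(self.events)

    def __or__(self, function):
        # stream | f  ->  MiniStream(f(stream))
        return MiniStream(function(self))


def noop(stream):
    """Pass every event through unchanged."""
    for kind, data, pos in stream:
        yield kind, data, pos


def drop_text(stream):
    """Remove all TEXT events."""
    for kind, data, pos in stream:
        if kind != 'TEXT':
            yield kind, data, pos


events = [('START', ('p', []), None), ('TEXT', 'hi', None), ('END', 'p', None)]
piped = MiniStream(events) | noop | drop_text
kinds = [kind for kind, data, pos in piped]
```

Because each ``|`` returns another wrapper object, the expression can be
extended with further filters from left to right.
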
One example of a filter included with Genshi is the ``HTMLSanitizer`` in
``genshi.filters``. It processes a stream of HTML markup, and strips out any
potentially dangerous constructs, such as JavaScript event handlers.
``HTMLSanitizer`` is not a function, but rather a class that implements
``__call__``, which means instances of the class are callable:

.. code-block:: python

  from genshi.filters import HTMLSanitizer
  stream = stream | HTMLSanitizer()

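A class with ``__call__`` is convenient when a filter needs configuration. The
sketch below is not how ``HTMLSanitizer`` works internally; ``AttributeStripper``
is a hypothetical filter, and plain strings and lists stand in for Genshi's
event kinds, ``QName`` and ``Attrs``.

```python
class AttributeStripper(object):
    """Hypothetical filter that removes the named attributes from START events."""

    def __init__(self, names):
        # Configuration is stored on the instance, not passed per call.
        self.names = set(names)

    def __call__(self, stream):
        for kind, data, pos in stream:
            if kind == 'START':
                tag, attrs = data
                attrs = [(k, v) for k, v in attrs if k not in self.names]
                yield kind, (tag, attrs), pos
            else:
                yield kind, data, pos


events = [
    ('START', ('a', [('href', '/'), ('onclick', 'evil()')]), None),
    ('TEXT', 'home', None),
    ('END', 'a', None),
]
strip_handlers = AttributeStripper(['onclick'])
cleaned = list(strip_handlers(events))
```

The instance is created once with its settings and can then be used wherever a
plain filter function is accepted.
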
Both the ``filter()`` method and the pipe operator allow easy chaining of
filters:

.. code-block:: python

  from genshi.filters import HTMLSanitizer
  stream = stream.filter(noop, HTMLSanitizer())

That is equivalent to:

.. code-block:: python

  stream = stream | noop | HTMLSanitizer()

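The equivalence of the two forms can be illustrated with a small fold over the
filter list. This is a sketch under assumptions, not Genshi's code:
``apply_filters`` is a hypothetical helper, and plain integers stand in for
events so that the order of application is visible.

```python
from functools import reduce


def apply_filters(stream, *filters):
    """Apply each filter in order, feeding each one the previous result."""
    return reduce(lambda current, f: f(current), filters, stream)


def double(stream):
    for item in stream:
        yield item * 2


def increment(stream):
    for item in stream:
        yield item + 1


# Applying the filters through the helper ...
chained = list(apply_filters([1, 2, 3], double, increment))
# ... gives the same result as nesting the calls by hand.
nested = list(increment(double([1, 2, 3])))
```

Filters are applied left to right, so the first filter listed sees the original
stream and the last one produces the final output.
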
For more information about the built-in filters, see `Stream Filters`_.

.. _`Stream Filters`: filters.html


Serialization
=============

Serialization means producing some kind of textual output from a stream of
events, which you'll need when you want to transmit or store the results of
generating or otherwise processing markup.

The ``Stream`` class provides two methods for serialization: ``serialize()`` and
``render()``. The former is a generator that yields the output in chunks, as
``Markup`` objects (which are basically unicode strings that are considered safe
for output on the web). The latter returns a single string, by default UTF-8
encoded.

Here's the output from ``serialize()``:

.. code-block:: pycon

  >>> for output in stream.serialize():
  ...     print repr(output)
  ...
  <Markup u'<p class="intro">'>
  <Markup u'Some text and '>
  <Markup u'<a href="http://example.org/">'>
  <Markup u'a link'>
  <Markup u'</a>'>
  <Markup u'.'>
  <Markup u'<br/>'>
  <Markup u'</p>'>

And here's the output from ``render()``:

.. code-block:: pycon

  >>> print stream.render()
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br/></p>

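The relationship between the two methods can be sketched with a toy serializer.
The functions below are illustrative stand-ins, not Genshi's implementation:
they operate on plain tuples, handle only three event kinds, and perform no
escaping, so they are not safe for real output.

```python
def serialize(events):
    """Toy serializer: yield one chunk of markup text per event."""
    for kind, data, pos in events:
        if kind == 'START':
            tag, attrs = data
            attr_text = ''.join(' %s="%s"' % (k, v) for k, v in attrs)
            yield '<%s%s>' % (tag, attr_text)
        elif kind == 'END':
            yield '</%s>' % data
        elif kind == 'TEXT':
            yield data


def render(events, encoding='utf-8'):
    """Join the serialized chunks; encode unless encoding is None."""
    text = ''.join(serialize(events))
    if encoding is None:
        return text
    return text.encode(encoding)


events = [
    ('START', ('a', [('href', 'http://example.org/')]), None),
    ('TEXT', 'a link', None),
    ('END', 'a', None),
]
chunks = list(serialize(events))
```

Yielding chunks lets a caller stream output as it is produced, while the joined
form is handier when the whole result is needed at once.
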
Both methods can be passed a ``method`` parameter that determines how exactly
the events are serialized to text. This parameter can be either “xml” (the
default), “xhtml”, “html”, “text”, or a custom serializer class:

.. code-block:: pycon

  >>> print stream.render('html')
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br></p>

Note how the ``<br>`` element isn't closed, which is the right thing to do for
HTML.

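The difference between the output methods for elements without content can be
sketched as follows. This is a simplified illustration rather than Genshi's
serializer logic; ``serialize_empty`` is a hypothetical helper and the set of
void elements is deliberately partial.

```python
# HTML elements that never have content or an end tag (a partial list).
VOID_ELEMENTS = frozenset(['br', 'hr', 'img', 'input', 'link', 'meta'])


def serialize_empty(tag, method='xml'):
    """Serialize an element without content for the given output method."""
    if method == 'xml':
        # XML allows the self-closing short form for any empty element.
        return '<%s/>' % tag
    if method == 'html':
        if tag in VOID_ELEMENTS:
            # Void elements are written as a lone start tag in HTML.
            return '<%s>' % tag
        # Other elements need an explicit end tag even when empty.
        return '<%s></%s>' % (tag, tag)
    raise ValueError('unsupported method: %r' % method)
```

For example, ``serialize_empty('br')`` produces the XML short form, while
``serialize_empty('br', 'html')`` leaves the tag unclosed.
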
In addition, the ``render()`` method takes an ``encoding`` parameter, which
defaults to “UTF-8”. If set to ``None``, the result will be a unicode string.

The different serializer classes in ``genshi.output`` can also be used
directly:

.. code-block:: pycon

  >>> from genshi.filters import HTMLSanitizer
  >>> from genshi.output import TextSerializer
  >>> print ''.join(TextSerializer()(HTMLSanitizer()(stream)))
  Some text and a link.

The pipe operator allows a nicer syntax:

.. code-block:: pycon

  >>> print stream | HTMLSanitizer() | TextSerializer()
  Some text and a link.


Serialization Options
---------------------

Both ``serialize()`` and ``render()`` support additional keyword arguments that
are passed through to the initializer of the serializer class. The following
options are supported by the built-in serializers:

``strip_whitespace``
  Whether the serializer should remove trailing spaces and empty lines. Defaults
  to ``True``.

  (This option is not available for serialization to plain text.)

``doctype``
  A ``(name, pubid, sysid)`` tuple defining the name, public identifier, and
  system identifier of a ``DOCTYPE`` declaration to prepend to the generated
  output. If provided, this declaration will override any ``DOCTYPE``
  declaration in the stream.

  (This option is not available for serialization to plain text.)

``namespace_prefixes``
  The namespace prefixes to use for namespaces that are not bound to a prefix
  in the stream itself.

  (This option is not available for serialization to HTML or plain text.)



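To make the ``(name, pubid, sysid)`` tuple concrete, the sketch below turns such
a tuple into the text of a ``DOCTYPE`` declaration. It is an illustration of the
declaration syntax, not Genshi's serializer code; ``doctype_declaration`` and
``XHTML_STRICT`` are names assumed for this example.

```python
def doctype_declaration(doctype):
    """Build a DOCTYPE declaration from a (name, pubid, sysid) tuple."""
    name, pubid, sysid = doctype
    parts = ['<!DOCTYPE %s' % name]
    if pubid:
        parts.append(' PUBLIC "%s"' % pubid)
    elif sysid:
        # Without a public identifier, the SYSTEM keyword introduces the URI.
        parts.append(' SYSTEM')
    if sysid:
        parts.append(' "%s"' % sysid)
    parts.append('>')
    return ''.join(parts)


# Example tuple in the form the ``doctype`` option expects.
XHTML_STRICT = (
    'html',
    '-//W3C//DTD XHTML 1.0 Strict//EN',
    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd',
)
```

Either identifier may be ``None``, in which case the corresponding part of the
declaration is simply left out.
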
Using XPath
===========

XPath can be used to extract a specific subset of the stream via the
``select()`` method:

.. code-block:: pycon

  >>> substream = stream.select('a')
  >>> substream
  <genshi.core.Stream object at ...>
  >>> print substream
  <a href="http://example.org/">a link</a>

Often, streams cannot be reused: in the above example, the sub-stream is based
on a generator. Once it has been serialized, it will have been fully consumed,
and cannot be rendered again. To work around this, you can wrap such a stream
in a ``list``:

.. code-block:: pycon

  >>> from genshi import Stream
  >>> substream = Stream(list(stream.select('a')))
  >>> substream
  <genshi.core.Stream object at ...>
  >>> print substream
  <a href="http://example.org/">a link</a>
  >>> print substream.select('@href')
  http://example.org/
  >>> print substream.select('text()')
  a link

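The underlying behaviour is that of any Python generator, which the following
sketch demonstrates without Genshi. ``select_text`` is a hypothetical generator
standing in for a lazily produced sub-stream.

```python
def select_text(events):
    """Generator standing in for a sub-stream that is produced lazily."""
    for event in events:
        if event[0] == 'TEXT':
            yield event


events = [('TEXT', 'a link', None)]

lazy = select_text(events)
first_pass = list(lazy)
second_pass = list(lazy)       # the generator is already exhausted

buffered = list(select_text(events))
first_again = list(buffered)
second_again = list(buffered)  # a list can be iterated repeatedly
```

Buffering trades memory for reusability: the whole sub-stream is held in the
list, but it can then be iterated as often as needed.
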
See `Using XPath in Genshi`_ for more information about the XPath support in
Genshi.

.. _`Using XPath in Genshi`: xpath.html


.. _`event kinds`:

Event Kinds
===========

Every event in a stream is of one of several *kinds*, which also determines
what the ``data`` item of the event tuple looks like. The different kinds of
events are documented below.

.. note:: The ``data`` item is generally immutable. If the data is to be
   modified when processing a stream, it must be replaced by a new tuple.
   Effectively, this means the entire event tuple is immutable.

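In practice, "replacing" means building a new tuple from the parts of the old
one. The sketch below uses a plain tuple with a string in place of the event
kind, which is an assumption made so the example runs without Genshi.

```python
event = ('TEXT', 'Hello, world!', (None, 1, 0))

try:
    event[1] = 'changed'   # tuples do not support item assignment
    mutated = True
except TypeError:
    mutated = False

# Unpack the event, then build a replacement with the modified data.
kind, data, pos = event
replacement = (kind, data.replace('world', 'Genshi'), pos)
```

The original event is left untouched, so other consumers holding a reference to
it are not affected.
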
START
-----
The opening tag of an element.

For this kind of event, the ``data`` item is a tuple of the form
``(tagname, attrs)``, where ``tagname`` is a ``QName`` instance describing the
qualified name of the tag, and ``attrs`` is an ``Attrs`` instance containing
the attribute names and values associated with the tag (excluding namespace
declarations):

.. code-block:: python

  START, (QName(u'p'), Attrs([(u'class', u'intro')])), pos

END
---
The closing tag of an element.

The ``data`` item of end events consists of just a ``QName`` instance
describing the qualified name of the tag:

.. code-block:: python

  END, QName(u'p'), pos

TEXT
----
Character data outside of elements and comments.

For text events, the ``data`` item should be a unicode object:

.. code-block:: python

  TEXT, u'Hello, world!', pos

START_NS
--------
The start of a namespace mapping, binding a namespace prefix to a URI.

The ``data`` item of this kind of event is a tuple of the form
``(prefix, uri)``, where ``prefix`` is the namespace prefix and ``uri`` is the
full URI to which the prefix is bound. Both should be unicode objects. If the
namespace is not bound to any prefix, the ``prefix`` item is an empty string:

.. code-block:: python

  START_NS, (u'svg', u'http://www.w3.org/2000/svg'), pos

END_NS
------
The end of a namespace mapping.

The ``data`` item of such events consists of only the namespace prefix (a
unicode object):

.. code-block:: python

  END_NS, u'svg', pos

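A filter can use the namespace events to keep track of which prefixes are in
scope at any point. The following is a minimal sketch under assumptions:
``namespaces_in_scope`` is a hypothetical helper, event kinds are plain
strings, and the prefixed tag name ``svg:rect`` is a simplification rather than
a real ``QName``.

```python
def namespaces_in_scope(stream):
    """Yield (event, bindings) pairs, where bindings maps prefix -> URI."""
    stacks = {}  # prefix -> list of URIs, innermost binding last
    for kind, data, pos in stream:
        if kind == 'START_NS':
            prefix, uri = data
            stacks.setdefault(prefix, []).append(uri)
        elif kind == 'END_NS':
            # END_NS carries only the prefix whose mapping ends.
            stacks[data].pop()
        bindings = dict((p, uris[-1]) for p, uris in stacks.items() if uris)
        yield (kind, data, pos), bindings


events = [
    ('START_NS', ('svg', 'http://www.w3.org/2000/svg'), None),
    ('START', ('svg:rect', []), None),
    ('END', 'svg:rect', None),
    ('END_NS', 'svg', None),
]
scopes = [bindings for event, bindings in namespaces_in_scope(events)]
```

A stack per prefix is used so that a prefix rebound inside a nested element
reverts to its outer binding when the inner mapping ends.
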
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
335 DOCTYPE |
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
336 ------- |
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
337 A document type declaration. |
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
338 |
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
339 For this type of event, the ``data`` item is a tuple of the form |
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
340 ``(name, pubid, sysid)``, where ``name`` is the name of the root element, |
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
341 ``pubid`` is the public identifier of the DTD (or ``None``), and ``sysid`` is |
508
cabd80e75dad
Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents:
438
diff
changeset
|
342 the system identifier of the DTD (or ``None``): |
cabd80e75dad
Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents:
438
diff
changeset
|
343 |
cabd80e75dad
Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents:
438
diff
changeset
|
344 .. code-block:: python |
382
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
345 |
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
346 DOCTYPE, (u'html', u'-//W3C//DTD XHTML 1.0 Transitional//EN', \ |
d7da3fba7faf
* Added documentation for the various stream event kinds.
cmlenz
parents:
230
diff
changeset
|
347 u'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'), pos |
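
To show how the three items of the tuple relate to the declaration text, here is a small hand-written helper that turns ``(name, pubid, sysid)`` back into a ``<!DOCTYPE ...>`` string. It is only a sketch: the function name ``format_doctype`` is hypothetical, and this is not the serializer that Genshi itself uses.

```python
def format_doctype(data):
    """Render a (name, pubid, sysid) tuple as a declaration string."""
    name, pubid, sysid = data
    parts = ['<!DOCTYPE', name]
    if pubid:
        parts += ['PUBLIC', '"%s"' % pubid]
    elif sysid:
        # A system identifier without a public one needs the SYSTEM keyword
        parts.append('SYSTEM')
    if sysid:
        parts.append('"%s"' % sysid)
    return ' '.join(parts) + '>'


xhtml = format_doctype((
    'html',
    '-//W3C//DTD XHTML 1.0 Transitional//EN',
    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd',
))
bare = format_doctype(('html', None, None))
```

Either identifier may be ``None``, so the helper checks each one separately instead of assuming that both are present.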

COMMENT
-------
A comment.

For such events, the ``data`` item is a unicode object containing all character
data between the comment delimiters:

.. code-block:: python

  COMMENT, u'Commented out', pos
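
Since comments arrive as events of their own, removing them from a stream only requires checking the event kind. The generator below is a minimal sketch of that idea. It works on plain ``(kind, data, pos)`` tuples, the kind constants are string stand-ins, and the name ``strip_comments`` is an assumption made for this example.

```python
# Stand-ins for the event kind constants (assumption: plain strings are
# used here so that the sketch runs without Genshi installed).
COMMENT, TEXT = 'COMMENT', 'TEXT'


def strip_comments(stream):
    """Pass every event through unchanged, except for comments."""
    for kind, data, pos in stream:
        if kind != COMMENT:
            yield kind, data, pos


pos = (None, -1, -1)  # placeholder position, not taken from a real parse
events = [
    (TEXT, 'Hello', pos),
    (COMMENT, 'Commented out', pos),
    (TEXT, 'world', pos),
]
kept = list(strip_comments(events))
```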

PI
--
A processing instruction.

The ``data`` item is a tuple of the form ``(target, data)`` for processing
instructions, where ``target`` is the target of the PI (used to identify the
application by which the instruction should be processed), and ``data`` is the
text following the target (excluding the terminating question mark):

.. code-block:: python

  PI, (u'php', u'echo "Yo" '), pos
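
The split into target and data can be illustrated by putting the two items back together. The helper below is a sketch with a hypothetical name, ``format_pi``, and is not Genshi's own serialization code. It shows that the terminating question mark is not part of ``data`` and has to be added again on output.

```python
def format_pi(data):
    """Render a (target, data) tuple as a processing instruction."""
    target, text = data
    if text:
        return '<?%s %s?>' % (target, text)
    # A processing instruction may consist of the target alone
    return '<?%s?>' % target


php = format_pi(('php', 'echo "Yo" '))
empty = format_pi(('page-break', ''))
```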

START_CDATA
-----------
Marks the beginning of a ``CDATA`` section.

The ``data`` item for such events is always ``None``:

.. code-block:: python

  START_CDATA, None, pos

END_CDATA
---------
Marks the end of a ``CDATA`` section.

The ``data`` item for such events is always ``None``:

.. code-block:: python

  END_CDATA, None, pos
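
Because both markers carry no data, their only use is to bracket the events between them. The sketch below tracks that state, assuming that the character data of the section arrives as ordinary ``TEXT`` events between the two markers. The kind constants are string stand-ins and the name ``text_with_cdata_flag`` is hypothetical.

```python
# Stand-ins for the event kind constants (assumption: plain strings are
# used here so that the sketch runs without Genshi installed).
START_CDATA, END_CDATA, TEXT = 'START_CDATA', 'END_CDATA', 'TEXT'


def text_with_cdata_flag(events):
    """Yield (text, in_cdata) for every TEXT event in the stream."""
    in_cdata = False
    for kind, data, pos in events:
        if kind == START_CDATA:
            in_cdata = True
        elif kind == END_CDATA:
            in_cdata = False
        elif kind == TEXT:
            yield data, in_cdata


pos = (None, -1, -1)  # placeholder position, not taken from a real parse
events = [
    (TEXT, 'before', pos),
    (START_CDATA, None, pos),
    (TEXT, 'a < b', pos),
    (END_CDATA, None, pos),
    (TEXT, 'after', pos),
]
flags = list(text_with_cdata_flag(events))
```

A serializer could use such a flag to decide whether text should be escaped or written verbatim inside ``<![CDATA[`` and ``]]>``.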