.. Source: doc/streams.txt @ 938:a5a1c9a11135 (tip), Tue, 31 May 2011 20:05:15 +0000
.. -*- mode: rst; encoding: utf-8 -*-

==============
Markup Streams
==============

A stream is the common representation of markup as a *stream of events*.


.. contents:: Contents
   :depth: 2
.. sectnum::


Basics
======

A stream can be obtained in a number of ways. It can be:

* the result of parsing XML or HTML text, or
* the result of selecting a subset of another stream using XPath, or
* programmatically generated.

For example, the functions ``XML()`` and ``HTML()`` can be used to convert
literal XML or HTML text to a markup stream:

.. code-block:: pycon

  >>> from genshi import XML
  >>> stream = XML('<p class="intro">Some text and '
  ...              '<a href="http://example.org/">a link</a>.'
  ...              '<br/></p>')
  >>> stream
  <genshi.core.Stream object at ...>

The stream is the result of parsing the text into events. Each event is a tuple
of the form ``(kind, data, pos)``, where:

* ``kind`` defines what kind of event it is (such as the start of an element,
  text, a comment, etc.).
* ``data`` is the actual data associated with the event. How this looks depends
  on the event kind (see `event kinds`_).
* ``pos`` is a ``(filename, lineno, column)`` tuple that describes where the
  event “comes from”.

.. code-block:: pycon

  >>> for kind, data, pos in stream:
  ...     print('%s %r %r' % (kind, data, pos))
  ...
  START (QName('p'), Attrs([(QName('class'), u'intro')])) (None, 1, 0)
  TEXT u'Some text and ' (None, 1, 17)
  START (QName('a'), Attrs([(QName('href'), u'http://example.org/')])) (None, 1, 31)
  TEXT u'a link' (None, 1, 61)
  END QName('a') (None, 1, 67)
  TEXT u'.' (None, 1, 71)
  START (QName('br'), Attrs()) (None, 1, 72)
  END QName('br') (None, 1, 77)
  END QName('p') (None, 1, 77)
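
Because every event is a plain ``(kind, data, pos)`` tuple, ordinary Python
code can walk a stream. The following is a minimal sketch, not Genshi code: it
uses a hand-written list of tuples that mirrors the events printed above, with
plain strings standing in for Genshi's kind constants and for its ``QName`` and
``Attrs`` objects (an assumption made so the sketch runs without Genshi). The
helper name ``summarize`` is hypothetical.

```python
from collections import Counter

# Hand-written stand-in for the events printed above. Plain strings replace
# Genshi's kind constants and QName/Attrs objects (assumption for illustration).
events = [
    ("START", ("p", [("class", "intro")]), (None, 1, 0)),
    ("TEXT", "Some text and ", (None, 1, 17)),
    ("START", ("a", [("href", "http://example.org/")]), (None, 1, 31)),
    ("TEXT", "a link", (None, 1, 61)),
    ("END", "a", (None, 1, 67)),
    ("TEXT", ".", (None, 1, 71)),
    ("START", ("br", []), (None, 1, 72)),
    ("END", "br", (None, 1, 77)),
    ("END", "p", (None, 1, 77)),
]


def summarize(stream):
    """Count the events of each kind and collect all text content."""
    counts = Counter()
    text = []
    for kind, data, pos in stream:
        counts[kind] += 1
        if kind == "TEXT":
            text.append(data)
    return dict(counts), "".join(text)


kind_counts, text = summarize(events)
print(kind_counts)
print(text)
```

The same loop shape should apply to a real stream, since iterating over one
yields the same three-element tuples.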


Filtering
=========

One important feature of markup streams is that you can apply *filters* to the
stream, either filters that come with Genshi, or your own custom filters.

A filter is simply a callable that accepts the stream as a parameter and
returns the filtered stream:

.. code-block:: python

  def noop(stream):
      """A filter that doesn't actually do anything with the stream."""
      for kind, data, pos in stream:
          yield kind, data, pos
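
A filter that changes something follows the same pattern. The sketch below is
an illustration under stated assumptions: the filter name ``upper_text`` is
hypothetical, and the events are hand-written tuples whose kinds are plain
strings rather than Genshi's own kind constants.

```python
def upper_text(stream):
    """A filter that upper-cases text events and passes the rest through."""
    for kind, data, pos in stream:
        if kind == "TEXT":  # plain string stands in for Genshi's TEXT kind
            data = data.upper()
        yield kind, data, pos


# Hand-written events (assumption: simplified data instead of QName/Attrs).
events = [
    ("START", ("p", []), (None, 1, 0)),
    ("TEXT", "hello", (None, 1, 3)),
    ("END", "p", (None, 1, 8)),
]

filtered = list(upper_text(events))
print(filtered)
```

Like ``noop``, the filter is a generator, so events are processed one at a
time rather than being collected first.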

Filters can be applied in a number of ways. The simplest is to just call the
filter directly:

.. code-block:: python

  stream = noop(stream)

The ``Stream`` class also provides a ``filter()`` method, which takes an
arbitrary number of filter callables and applies them all:

.. code-block:: python

  stream = stream.filter(noop)

Finally, filters can also be applied using the *bitwise or* operator (``|``),
which allows a syntax similar to pipes on Unix shells:

.. code-block:: python

  stream = stream | noop

One example of a filter included with Genshi is the ``HTMLSanitizer`` in
``genshi.filters``. It processes a stream of HTML markup, and strips out any
potentially dangerous constructs, such as JavaScript event handlers.
``HTMLSanitizer`` is not a function, but rather a class that implements
``__call__``, which means instances of the class are callable:

.. code-block:: python

  stream = stream | HTMLSanitizer()
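
Writing a filter as a class is convenient when the filter needs
configuration. The following sketch shows the general pattern only; the class
``ReplaceText`` is hypothetical and is not part of Genshi, and the events are
hand-written tuples with plain-string kinds.

```python
class ReplaceText:
    """Callable filter object that replaces a substring in text events."""

    def __init__(self, old, new):
        # Configuration is stored on the instance when the filter is created.
        self.old = old
        self.new = new

    def __call__(self, stream):
        # Calling the instance with a stream yields the filtered events.
        for kind, data, pos in stream:
            if kind == "TEXT":
                data = data.replace(self.old, self.new)
            yield kind, data, pos


events = [("TEXT", "Hello world", (None, 1, 0))]
result = list(ReplaceText("world", "Genshi")(events))
print(result)
```

The instance is created once with its settings and can then be used anywhere
a filter function would be accepted.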

Both the ``filter()`` method and the pipe operator allow easy chaining of
filters:

.. code-block:: python

  from genshi.filters import HTMLSanitizer
  stream = stream.filter(noop, HTMLSanitizer())

That is equivalent to:

.. code-block:: python

  stream = stream | noop | HTMLSanitizer()
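
One way to see why the two spellings are equivalent is a toy stream class in
which ``filter()`` is defined in terms of ``|``. This is a sketch of the idea
and should not be read as Genshi's actual implementation; ``MiniStream`` and
``upper_text`` are hypothetical names introduced for illustration.

```python
from functools import reduce
import operator


class MiniStream:
    """Toy stand-in for a stream class (illustration only)."""

    def __init__(self, events):
        self.events = events

    def __iter__(self):
        return iter(self.events)

    def __or__(self, function):
        # "stream | f" wraps the result of f(stream) in a new stream.
        return MiniStream(function(self))

    def filter(self, *filters):
        # "stream.filter(f, g)" folds the filters with "|": stream | f | g.
        return reduce(operator.or_, (self,) + filters)


def noop(stream):
    for event in stream:
        yield event


def upper_text(stream):
    for kind, data, pos in stream:
        yield kind, (data.upper() if kind == "TEXT" else data), pos


events = [
    ("START", ("b", []), (None, 1, 0)),
    ("TEXT", "hi", (None, 1, 3)),
    ("END", "b", (None, 1, 5)),
]

chained = list(MiniStream(events).filter(noop, upper_text))
piped = list(MiniStream(events) | noop | upper_text)
print(chained == piped)
```

In both forms the filters are applied left to right, so the output of one
filter becomes the input of the next.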

For more information about the built-in filters, see `Stream Filters`_.

.. _`Stream Filters`: filters.html


Serialization
=============

Serialization means producing some kind of textual output from a stream of
events, which you'll need when you want to transmit or store the results of
generating or otherwise processing markup.

The ``Stream`` class provides two methods for serialization: ``serialize()``
and ``render()``. The former is a generator that yields chunks of ``Markup``
objects (which are basically unicode strings that are considered safe for
output on the web). The latter returns a single string, by default UTF-8
encoded.

Here's the output from ``serialize()``:

.. code-block:: pycon

  >>> for output in stream.serialize():
  ...     print(repr(output))
  ...
  <Markup u'<p class="intro">'>
  <Markup u'Some text and '>
  <Markup u'<a href="http://example.org/">'>
  <Markup u'a link'>
  <Markup u'</a>'>
  <Markup u'.'>
  <Markup u'<br/>'>
  <Markup u'</p>'>

And here's the output from ``render()``:

.. code-block:: pycon

  >>> print(stream.render())
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br/></p>

Both methods can be passed a ``method`` parameter that determines how exactly
the events are serialized to text. This parameter can be either a string or a
custom serializer class:

.. code-block:: pycon

  >>> print(stream.render('html'))
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br></p>

Note how the ``<br>`` element isn't closed, which is the right thing to do for
HTML. See `serialization methods`_ for more details.

In addition, the ``render()`` method takes an ``encoding`` parameter, which
defaults to “UTF-8”. If set to ``None``, the result will be a unicode string.
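
The relationship between the two methods can be pictured as "render joins
what serialize yields, then optionally encodes it". The sketch below models
that idea with two stand-alone functions over hand-written event tuples. It is
a simplification under stated assumptions, not Genshi's serializer code: the
function names mirror the methods only for readability, kinds are plain
strings, and only ``START``, ``END`` and ``TEXT`` events are handled.

```python
from xml.sax.saxutils import escape, quoteattr


def serialize(events):
    """Yield one chunk of text per event (toy XML-style output)."""
    for kind, data, pos in events:
        if kind == "START":
            tag, attrs = data
            attr_text = "".join(
                " %s=%s" % (name, quoteattr(value)) for name, value in attrs
            )
            yield "<%s%s>" % (tag, attr_text)
        elif kind == "END":
            yield "</%s>" % data
        elif kind == "TEXT":
            yield escape(data)


def render(events, encoding="utf-8"):
    """Join the serialized chunks; encode unless encoding is None."""
    text = "".join(serialize(events))
    return text if encoding is None else text.encode(encoding)


events = [
    ("START", ("p", [("class", "intro")]), (None, 1, 0)),
    ("TEXT", "Fish & chips", (None, 1, 17)),
    ("END", "p", (None, 1, 29)),
]

as_text = render(events, encoding=None)
as_bytes = render(events)
print(as_text)
print(as_bytes)
```

Working with the generator is useful when output should be written chunk by
chunk, while the joined form is convenient for small documents.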

The different serializer classes in ``genshi.output`` can also be used
directly:

.. code-block:: pycon

  >>> from genshi.filters import HTMLSanitizer
  >>> from genshi.output import TextSerializer
  >>> print(''.join(TextSerializer()(HTMLSanitizer()(stream))))
  Some text and a link.

The pipe operator allows a nicer syntax:

.. code-block:: pycon

  >>> print(stream | HTMLSanitizer() | TextSerializer())
  Some text and a link.


.. _`serialization methods`:

Serialization Methods
---------------------

Genshi supports several serialization methods for creating a text
representation of a markup stream.

``xml``
  The ``XMLSerializer`` is the default serialization method and results in
  proper XML output including namespace support, the XML declaration, CDATA
  sections, and so on. It is generally not suitable for serving HTML or
  XHTML web pages (unless you want to use true XHTML 1.1), for which the
  ``xhtml`` and ``html`` serializers described below should be preferred.

``xhtml``
  The ``XHTMLSerializer`` is a specialization of the generic ``XMLSerializer``
  that understands the peculiarities of producing XML-compliant output that can
  also be parsed without problems by the HTML parsers found in modern web
  browsers. Thus, the output by this serializer should be usable whether sent
  as "text/html" or "application/xhtml+xml" (although there are a lot of
  subtle issues to pay attention to when switching between the two, in
  particular with respect to differences in the DOM and CSS).

  For example, instead of rendering a script tag as ``<script/>`` (which
  confuses the HTML parser in many browsers), it will produce
  ``<script></script>``. Also, it will normalize any boolean attribute values
  that are minimized in HTML, so that for example ``<hr noshade="1"/>``
  becomes ``<hr noshade="noshade" />``.

  This serializer supports the use of namespaces for compound documents, for
  example to use inline SVG inside an XHTML document.

``html``
  The ``HTMLSerializer`` produces proper HTML markup. The main differences
  compared to ``xhtml`` serialization are that boolean attributes are
  minimized, empty tags are not self-closing (so it's ``<br>`` instead of
  ``<br />``), and that the contents of ``<script>`` and ``<style>`` elements
  are not escaped.

``text``
  The ``TextSerializer`` produces plain text from markup streams. This is
  useful primarily for `text templates`_, but can also be used to produce
  plain text output from markup templates or other sources.

.. _`text templates`: text-templates.html
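
The differences between the ``xml``, ``xhtml`` and ``html`` methods described
above can be summarized in a small lookup sketch. It is a deliberately
simplified model of the rules stated in this section and does not use or
reproduce Genshi's serializer classes; the function names and the
``VOID_ELEMENTS`` subset are assumptions made for illustration.

```python
# Elements that never have content in HTML (a small subset, for illustration).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


def empty_element(tag, method):
    """Return how an element without content might be written per method."""
    if method == "xml":
        return "<%s/>" % tag
    if method == "xhtml":
        # Void elements self-close; others get an explicit end tag.
        if tag in VOID_ELEMENTS:
            return "<%s />" % tag
        return "<%s></%s>" % (tag, tag)
    if method == "html":
        # Void elements have no end tag and are not self-closing.
        if tag in VOID_ELEMENTS:
            return "<%s>" % tag
        return "<%s></%s>" % (tag, tag)
    raise ValueError("unknown method: %r" % method)


def boolean_attribute(name, method):
    """Return a boolean attribute minimized (html) or spelled out (others)."""
    if method == "html":
        return name
    return '%s="%s"' % (name, name)


for method in ("xml", "xhtml", "html"):
    print(method, empty_element("br", method), empty_element("script", method))
```

A real serializer also has to deal with escaping, namespaces, and doctypes,
which this sketch leaves out.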
f9544a7cc57a Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
243
f9544a7cc57a Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
244
Serialization Options
---------------------

Both ``serialize()`` and ``render()`` support additional keyword arguments that
are passed through to the initializer of the serializer class. The following
options are supported by the built-in serializers:

``strip_whitespace``
   Whether the serializer should remove trailing spaces and empty lines.
   Defaults to ``True``.

   (This option is not available for serialization to plain text.)

``doctype``
   A ``(name, pubid, sysid)`` tuple defining the name, public identifier, and
   system identifier of a ``DOCTYPE`` declaration to prepend to the generated
   output. If provided, this declaration will override any ``DOCTYPE``
   declaration in the stream.

   The parameter can also be specified as a string to refer to commonly used
   doctypes:

   +-----------------------------+-------------------------------------------+
   | Shorthand                   | DOCTYPE                                   |
   +=============================+===========================================+
   | ``html`` or                 | HTML 4.01 Strict                          |
   | ``html-strict``             |                                           |
   +-----------------------------+-------------------------------------------+
   | ``html-transitional``       | HTML 4.01 Transitional                    |
   +-----------------------------+-------------------------------------------+
   | ``html-frameset``           | HTML 4.01 Frameset                        |
   +-----------------------------+-------------------------------------------+
   | ``html5``                   | DOCTYPE proposed for the work-in-progress |
   |                             | HTML5 standard                            |
   +-----------------------------+-------------------------------------------+
   | ``xhtml`` or                | XHTML 1.0 Strict                          |
   | ``xhtml-strict``            |                                           |
   +-----------------------------+-------------------------------------------+
   | ``xhtml-transitional``      | XHTML 1.0 Transitional                    |
   +-----------------------------+-------------------------------------------+
   | ``xhtml-frameset``          | XHTML 1.0 Frameset                        |
   +-----------------------------+-------------------------------------------+
   | ``xhtml11``                 | XHTML 1.1                                 |
   +-----------------------------+-------------------------------------------+
   | ``svg`` or ``svg-full``     | SVG 1.1                                   |
   +-----------------------------+-------------------------------------------+
   | ``svg-basic``               | SVG 1.1 Basic                             |
   +-----------------------------+-------------------------------------------+
   | ``svg-tiny``                | SVG 1.1 Tiny                              |
   +-----------------------------+-------------------------------------------+

   (This option is not available for serialization to plain text.)

``namespace_prefixes``
   The namespace prefixes to use for namespaces that are not bound to a prefix
   in the stream itself.

   (This option is not available for serialization to HTML or plain text.)

``drop_xml_decl``
   Whether to remove the XML declaration (the ``<?xml ?>`` part at the
   beginning of a document) when serializing. This defaults to ``True`` as an
   XML declaration throws some older browsers into "Quirks" rendering mode.

   (This option is only available for serialization to XHTML.)

``strip_markup``
   Whether the text serializer should detect and remove any tags or
   entity-encoded characters in the text.

   (This option is only available for serialization to plain text.)



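As a rough illustration of what a ``doctype`` value stands for, the following
standalone sketch turns a ``(name, pubid, sysid)`` tuple, or a shorthand
string, into a ``DOCTYPE`` declaration. It does not use Genshi: the
``SHORTHANDS`` mapping and the ``format_doctype`` helper are hypothetical names
introduced only for this example, and the exact output of the built-in
serializers may differ in detail.

.. code-block:: python

  # Hypothetical mapping covering three of the shorthand names listed above.
  SHORTHANDS = {
      'html-strict': ('html', '-//W3C//DTD HTML 4.01//EN',
                      'http://www.w3.org/TR/html4/strict.dtd'),
      'xhtml-strict': ('html', '-//W3C//DTD XHTML 1.0 Strict//EN',
                       'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'),
      'html5': ('html', None, None),
  }

  def format_doctype(doctype):
      """Return a DOCTYPE declaration for a shorthand name or a tuple."""
      if isinstance(doctype, str):
          doctype = SHORTHANDS[doctype]
      name, pubid, sysid = doctype
      parts = ['<!DOCTYPE %s' % name]
      if pubid:
          parts.append(' PUBLIC "%s"' % pubid)
      elif sysid:
          parts.append(' SYSTEM')
      if sysid:
          parts.append(' "%s"' % sysid)
      parts.append('>')
      return ''.join(parts)

  print(format_doctype('html5'))
  print(format_doctype('xhtml-strict'))

With Genshi itself, the same shorthand string or tuple would simply be passed
as the ``doctype`` keyword argument to ``render()`` or ``serialize()``.
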
Using XPath
===========

XPath can be used to extract a specific subset of the stream via the
``select()`` method:

.. code-block:: pycon

  >>> substream = stream.select('a')
  >>> substream
  <genshi.core.Stream object at ...>
  >>> print(substream)
  <a href="http://example.org/">a link</a>

Often, streams cannot be reused: in the above example, the sub-stream is based
on a generator. Once it has been serialized, it will have been fully consumed,
and cannot be rendered again. To work around this, you can wrap such a stream
in a ``list``:

.. code-block:: pycon

  >>> from genshi import Stream
  >>> substream = Stream(list(stream.select('a')))
  >>> substream
  <genshi.core.Stream object at ...>
  >>> print(substream)
  <a href="http://example.org/">a link</a>
  >>> print(substream.select('@href'))
  http://example.org/
  >>> print(substream.select('text()'))
  a link

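The single-use behaviour is that of any Python generator, and can be seen
without Genshi. The sketch below uses a hypothetical ``events()`` generator as
a stand-in for a generator-based stream; the ``(None, -1, -1)`` position is
likewise only a placeholder.

.. code-block:: python

  def events():
      # Stand-in for a generator-based stream of (kind, data, pos) tuples.
      yield ('TEXT', 'a link', (None, -1, -1))

  once = events()
  first = list(once)     # consumes the generator
  second = list(once)    # nothing is left to yield

  reusable = list(events())  # a list can be iterated any number of times
  third = list(reusable)
  fourth = list(reusable)

  print(len(first), len(second), len(third), len(fourth))
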
See `Using XPath in Genshi`_ for more information about the XPath support in
Genshi.

.. _`Using XPath in Genshi`: xpath.html


.. _`event kinds`:

Event Kinds
===========

Every event in a stream is of one of several *kinds*, which also determines
what the ``data`` item of the event tuple looks like. The different kinds of
events are documented below.

.. note:: The ``data`` item is generally immutable. If the data is to be
   modified when processing a stream, it must be replaced by a new tuple.
   Effectively, this means the entire event tuple is immutable.

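In practice, a function that changes events yields freshly built tuples rather
than altering the ones it receives. The following is a minimal sketch of that
pattern using plain Python values: the string ``'TEXT'`` stands in for the
corresponding kind constant, and ``upper_text`` is a hypothetical name chosen
for this example.

.. code-block:: python

  TEXT = 'TEXT'  # stand-in for the TEXT kind constant

  def upper_text(stream):
      """Yield the events of `stream`, upper-casing the data of text events."""
      for kind, data, pos in stream:
          if kind == TEXT:
              # Build a new event instead of modifying the existing one.
              yield kind, data.upper(), pos
          else:
              yield kind, data, pos

  events = [
      ('START', ('p', []), None),
      (TEXT, 'hello', None),
      ('END', 'p', None),
  ]
  result = list(upper_text(events))
  print(result[1])

The input list is left untouched, so it can still be processed by other
consumers.
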
START
-----
The opening tag of an element.

For this kind of event, the ``data`` item is a tuple of the form
``(tagname, attrs)``, where ``tagname`` is a ``QName`` instance describing the
qualified name of the tag, and ``attrs`` is an ``Attrs`` instance containing
the attribute names and values associated with the tag (excluding namespace
declarations):

.. code-block:: python

  START, (QName('p'), Attrs([(QName('class'), u'intro')])), pos

END
---
The closing tag of an element.

The ``data`` item of end events consists of just a ``QName`` instance
describing the qualified name of the tag:

.. code-block:: python

  END, QName('p'), pos

TEXT
----
Character data that appears outside of tags and comments.

For text events, the ``data`` item should be a unicode object:

.. code-block:: python

  TEXT, u'Hello, world!', pos

START_NS
--------
The start of a namespace mapping, binding a namespace prefix to a URI.

The ``data`` item of this kind of event is a tuple of the form
``(prefix, uri)``, where ``prefix`` is the namespace prefix and ``uri`` is the
full URI to which the prefix is bound. Both should be unicode objects. If the
namespace is not bound to any prefix, the ``prefix`` item is an empty string:

.. code-block:: python

  START_NS, (u'svg', u'http://www.w3.org/2000/svg'), pos

END_NS
------
The end of a namespace mapping.

The ``data`` item of such events consists of only the namespace prefix (a
unicode object):

.. code-block:: python

  END_NS, u'svg', pos

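Because ``END_NS`` carries only the prefix, code that needs to know which URI
a prefix is currently bound to has to remember the matching ``START_NS``
events. One possible way to do that is sketched below in plain Python; the
kind constants are replaced by strings and ``track_namespaces`` is a
hypothetical helper, not part of Genshi.

.. code-block:: python

  START_NS, END_NS = 'START_NS', 'END_NS'  # stand-ins for the kind constants

  def track_namespaces(stream):
      """Yield (event, bindings) pairs; bindings maps each prefix to its URI."""
      stacks = {}  # prefix -> list of URIs, innermost binding last
      for kind, data, pos in stream:
          if kind == START_NS:
              prefix, uri = data
              stacks.setdefault(prefix, []).append(uri)
          elif kind == END_NS:
              uris = stacks[data]
              uris.pop()
              if not uris:
                  del stacks[data]
          # Snapshot of the bindings in effect after this event.
          yield (kind, data, pos), dict((p, u[-1]) for p, u in stacks.items())

  events = [
      (START_NS, ('svg', 'http://www.w3.org/2000/svg'), None),
      ('TEXT', 'x', None),
      (END_NS, 'svg', None),
  ]
  snapshots = [bindings for _, bindings in track_namespaces(events)]
  print(snapshots)

A stack per prefix is used so that a prefix rebound in a nested scope reverts
to its outer binding once the inner mapping ends.
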
DOCTYPE
-------
A document type declaration.

For this type of event, the ``data`` item is a tuple of the form
``(name, pubid, sysid)``, where ``name`` is the name of the root element,
``pubid`` is the public identifier of the DTD (or ``None``), and ``sysid`` is
the system identifier of the DTD (or ``None``):

.. code-block:: python

  DOCTYPE, (u'html', u'-//W3C//DTD XHTML 1.0 Transitional//EN', \
            u'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'), pos

COMMENT
-------
A comment.

For such events, the ``data`` item is a unicode object containing all character
data between the comment delimiters:

.. code-block:: python

  COMMENT, u'Commented out', pos

PI
--
A processing instruction.

For processing instructions, the ``data`` item is a tuple of the form
``(target, data)``, where ``target`` is the target of the PI (used to identify
the application by which the instruction should be processed), and ``data`` is
the text following the target (excluding the terminating question mark):

.. code-block:: python

  PI, (u'php', u'echo "Yo" '), pos

START_CDATA
-----------
Marks the beginning of a ``CDATA`` section.

The ``data`` item for such events is always ``None``:

.. code-block:: python

  START_CDATA, None, pos

END_CDATA
---------
Marks the end of a ``CDATA`` section.

The ``data`` item for such events is always ``None``:

.. code-block:: python

  END_CDATA, None, pos

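
To show how the ``data`` shapes described above fit together, here is a
deliberately simplified sketch that turns a sequence of event tuples back into
markup. It is not how the built-in serializers work: kinds are plain strings,
tag and attribute names are plain strings rather than ``QName`` instances,
attributes are a plain list of pairs rather than ``Attrs``, and ``to_markup``
is a hypothetical name. Namespace and ``DOCTYPE`` events are ignored.

.. code-block:: python

  from xml.sax.saxutils import escape, quoteattr

  def to_markup(stream):
      """Join a sequence of (kind, data, pos) events into a markup string."""
      out = []
      in_cdata = False
      for kind, data, pos in stream:
          if kind == 'START':
              tag, attrs = data
              attr_text = ''.join(' %s=%s' % (name, quoteattr(value))
                                  for name, value in attrs)
              out.append('<%s%s>' % (tag, attr_text))
          elif kind == 'END':
              out.append('</%s>' % data)
          elif kind == 'TEXT':
              # Text inside a CDATA section is emitted without escaping.
              out.append(data if in_cdata else escape(data))
          elif kind == 'COMMENT':
              out.append('<!--%s-->' % data)
          elif kind == 'PI':
              out.append('<?%s %s?>' % data)
          elif kind == 'START_CDATA':
              in_cdata = True
              out.append('<![CDATA[')
          elif kind == 'END_CDATA':
              in_cdata = False
              out.append(']]>')
      return ''.join(out)

  events = [
      ('START', ('p', [('class', 'intro')]), None),
      ('TEXT', 'Hello, world & all', None),
      ('COMMENT', 'Commented out', None),
      ('END', 'p', None),
  ]
  markup = to_markup(events)
  print(markup)
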
Copyright (C) 2012-2017 Edgewall Software