annotate doc/streams.txt @ 745:74b5c5476ddb trunk

Preparing for [milestone:0.5] release.
author cmlenz
date Mon, 09 Jun 2008 09:50:03 +0000
parents be0b4a7b2fd4
children f459f22f7ad2
.. -*- mode: rst; encoding: utf-8 -*-

==============
Markup Streams
==============

A stream is the common representation of markup as a *stream of events*.


.. contents:: Contents
   :depth: 2
.. sectnum::


Basics
======

A stream can be obtained in a number of ways. It can be:

* the result of parsing XML or HTML text, or
* the result of selecting a subset of another stream using XPath, or
* programmatically generated.

For example, the functions ``XML()`` and ``HTML()`` can be used to convert
literal XML or HTML text to a markup stream:

.. code-block:: pycon

  >>> from genshi import XML
  >>> stream = XML('<p class="intro">Some text and '
  ...              '<a href="http://example.org/">a link</a>.'
  ...              '<br/></p>')
  >>> stream
  <genshi.core.Stream object at ...>

The stream is the result of parsing the text into events. Each event is a tuple
of the form ``(kind, data, pos)``, where:

* ``kind`` defines what kind of event it is (such as the start of an element,
  text, a comment, etc.).
* ``data`` is the actual data associated with the event. How this looks depends
  on the event kind (see `event kinds`_).
* ``pos`` is a ``(filename, lineno, column)`` tuple that describes where the
  event “comes from”.

.. code-block:: pycon

  >>> for kind, data, pos in stream:
  ...     print kind, `data`, pos
  ...
  START (QName(u'p'), Attrs([(QName(u'class'), u'intro')])) (None, 1, 0)
  TEXT u'Some text and ' (None, 1, 17)
  START (QName(u'a'), Attrs([(QName(u'href'), u'http://example.org/')])) (None, 1, 31)
  TEXT u'a link' (None, 1, 61)
  END QName(u'a') (None, 1, 67)
  TEXT u'.' (None, 1, 71)
  START (QName(u'br'), Attrs()) (None, 1, 72)
  END QName(u'br') (None, 1, 77)
  END QName(u'p') (None, 1, 77)


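Because events are ordinary tuples, they can be unpacked and inspected with
plain Python. The sketch below collects the names of opened elements and the
text content from a list of events. The events are hand-written stand-ins
modelled on the output above (plain strings instead of ``QName`` and ``Attrs``
objects), and ``summarize`` is a hypothetical helper, not part of Genshi:

.. code-block:: python

  def summarize(events):
      """Return (names of opened elements, concatenated text)."""
      names = []
      text = []
      for kind, data, pos in events:
          if kind == 'START':
              tag, attrs = data  # START data is a (name, attributes) pair
              names.append(tag)
          elif kind == 'TEXT':
              text.append(data)
      return names, ''.join(text)

  # Stand-in events; a real stream would come from XML() or HTML().
  events = [
      ('START', ('p', [('class', 'intro')]), (None, 1, 0)),
      ('TEXT', 'Some text and ', (None, 1, 17)),
      ('START', ('a', [('href', 'http://example.org/')]), (None, 1, 31)),
      ('TEXT', 'a link', (None, 1, 61)),
      ('END', 'a', (None, 1, 67)),
      ('TEXT', '.', (None, 1, 71)),
      ('END', 'p', (None, 1, 77)),
  ]
  names, text = summarize(events)

With these stand-in events, ``names`` lists the two opened elements in document
order and ``text`` holds the sentence without any markup.

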
Filtering
=========

One important feature of markup streams is that you can apply *filters* to the
stream, either filters that come with Genshi, or your own custom filters.

A filter is simply a callable that accepts the stream as its parameter and
returns the filtered stream:

.. code-block:: python

  def noop(stream):
      """A filter that doesn't actually do anything with the stream."""
      for kind, data, pos in stream:
          yield kind, data, pos

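As a slightly less trivial sketch, a filter can rewrite selected events and
pass the rest through untouched. The example below uppercases the data of every
text event. To keep it self-contained, the events are plain tuples and the kind
is compared against the plain string ``'TEXT'``; both are stand-ins for what
``XML()`` would produce, and ``upper_text`` is a hypothetical name, not a
filter shipped with Genshi:

.. code-block:: python

  def upper_text(stream):
      """Uppercase the data of TEXT events; pass other events through."""
      for kind, data, pos in stream:
          if kind == 'TEXT':
              data = data.upper()
          yield kind, data, pos

  # Stand-in events; a real stream would come from XML() or HTML().
  events = [
      ('START', ('p', []), (None, 1, 0)),
      ('TEXT', 'Some text', (None, 1, 3)),
      ('END', 'p', (None, 1, 12)),
  ]
  filtered = list(upper_text(events))

Like ``noop``, this filter is a generator, so events are processed one at a
time rather than being collected first.
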
Filters can be applied in a number of ways. The simplest is to just call the
filter directly:

.. code-block:: python

  stream = noop(stream)

The ``Stream`` class also provides a ``filter()`` method, which takes an
arbitrary number of filter callables and applies them all:

.. code-block:: python

  stream = stream.filter(noop)

Finally, filters can also be applied using the *bitwise or* operator (``|``),
which allows a syntax similar to pipes on Unix shells:

.. code-block:: python

  stream = stream | noop

One example of a filter included with Genshi is the ``HTMLSanitizer`` in
``genshi.filters``. It processes a stream of HTML markup, and strips out any
potentially dangerous constructs, such as JavaScript event handlers.
``HTMLSanitizer`` is not a function, but rather a class that implements
``__call__``, which means instances of the class are callable:

.. code-block:: python

  from genshi.filters import HTMLSanitizer
  stream = stream | HTMLSanitizer()

Both the ``filter()`` method and the pipe operator allow easy chaining of
filters:

.. code-block:: python

  from genshi.filters import HTMLSanitizer
  stream = stream.filter(noop, HTMLSanitizer())

That is equivalent to:

.. code-block:: python

  stream = stream | noop | HTMLSanitizer()

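The pipe syntax relies on Python's ``__or__`` hook. The following sketch shows
the general idea with a hypothetical ``MiniStream`` class. It is not Genshi's
actual ``Stream`` implementation, only an illustration of why chaining with
``|`` and calling ``filter()`` with several callables apply the same filters in
the same order. The ``drop_text`` filter is likewise made up for the example:

.. code-block:: python

  class MiniStream(object):
      """Hypothetical stand-in for a stream that supports ``|`` chaining."""

      def __init__(self, events):
          self.events = events

      def __iter__(self):
          return iter(self.events)

      def filter(self, *filters):
          # Apply the filters left to right, wrapping each result again.
          stream = self
          for func in filters:
              stream = MiniStream(func(stream))
          return stream

      def __or__(self, func):
          # ``stream | func`` is the same as ``stream.filter(func)``.
          return self.filter(func)


  def noop(stream):
      for event in stream:
          yield event

  def drop_text(stream):
      """Remove all TEXT events from the stream."""
      for kind, data, pos in stream:
          if kind != 'TEXT':
              yield kind, data, pos

  events = [('START', 'p', None), ('TEXT', 'hi', None), ('END', 'p', None)]
  piped = list(MiniStream(events) | noop | drop_text)
  chained = list(MiniStream(events).filter(noop, drop_text))

Both expressions yield the same events, because ``|`` is left-associative and
each step simply wraps the output of the previous filter.
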
For more information about the built-in filters, see `Stream Filters`_.

.. _`Stream Filters`: filters.html


Serialization
=============

Serialization means producing some kind of textual output from a stream of
events, which you'll need when you want to transmit or store the results of
generating or otherwise processing markup.

The ``Stream`` class provides two methods for serialization: ``serialize()``
and ``render()``. The former is a generator that yields chunks of ``Markup``
objects (which are basically unicode strings that are considered safe for
output on the web). The latter returns a single string, by default UTF-8
encoded.

Here's the output from ``serialize()``:

.. code-block:: pycon

  >>> for output in stream.serialize():
  ...     print `output`
  ...
  <Markup u'<p class="intro">'>
  <Markup u'Some text and '>
  <Markup u'<a href="http://example.org/">'>
  <Markup u'a link'>
  <Markup u'</a>'>
  <Markup u'.'>
  <Markup u'<br/>'>
  <Markup u'</p>'>

And here's the output from ``render()``:

.. code-block:: pycon

  >>> print stream.render()
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br/></p>

Both methods can be passed a ``method`` parameter that determines how exactly
the events are serialized to text. This parameter can be either a string or a
custom serializer class:

.. code-block:: pycon

  >>> print stream.render('html')
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br></p>

Note how the ``<br>`` element isn't closed, which is the right thing to do for
HTML. See `serialization methods`_ for more details.

In addition, the ``render()`` method takes an ``encoding`` parameter, which
defaults to “UTF-8”. If set to ``None``, the result will be a unicode string.

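Conceptually, ``render()`` can be thought of as joining the chunks that
``serialize()`` yields and then encoding the result. The sketch below
illustrates that relationship with two hypothetical stand-in functions that
operate on a list of text chunks. It is an assumption about the general shape,
not a copy of Genshi's code:

.. code-block:: python

  def serialize(chunks):
      """Stand-in generator: yield text chunks one at a time."""
      for chunk in chunks:
          yield chunk

  def render(chunks, encoding='utf-8'):
      """Join all chunks; encode to bytes unless ``encoding`` is None."""
      text = u''.join(serialize(chunks))
      if encoding is None:
          return text
      return text.encode(encoding)

  chunks = [u'<p>', u'caf\xe9', u'</p>']
  as_bytes = render(chunks)                 # UTF-8 encoded byte string
  as_text = render(chunks, encoding=None)   # unicode string, not encoded

The generator form is useful when output should be written incrementally, while
the joined form is convenient when the whole document is needed at once.
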
The different serializer classes in ``genshi.output`` can also be used
directly:

.. code-block:: pycon

  >>> from genshi.filters import HTMLSanitizer
  >>> from genshi.output import TextSerializer
  >>> print ''.join(TextSerializer()(HTMLSanitizer()(stream)))
  Some text and a link.

The pipe operator allows a nicer syntax:

.. code-block:: pycon

  >>> print stream | HTMLSanitizer() | TextSerializer()
  Some text and a link.


.. _`serialization methods`:

Serialization Methods
---------------------

Genshi supports several serialization methods for creating a text
representation of a markup stream.

``xml``
  The ``XMLSerializer`` is the default serialization method and results in
  proper XML output including namespace support, the XML declaration, CDATA
  sections, and so on. It is generally not suitable for serving HTML or
  XHTML web pages (unless you want to use true XHTML 1.1), for which the
  ``xhtml`` and ``html`` serializers described below should be preferred.

``xhtml``
  The ``XHTMLSerializer`` is a specialization of the generic ``XMLSerializer``
  that understands the peculiarities of producing XML-compliant output that can
  also be parsed without problems by the HTML parsers found in modern web
  browsers. Thus, the output by this serializer should be usable whether sent
  as "text/html" or "application/xhtml+xml" (although there are a lot of
  subtle issues to pay attention to when switching between the two, in
  particular with respect to differences in the DOM and CSS).

  For example, instead of rendering a script tag as ``<script/>`` (which
  confuses the HTML parser in many browsers), it will produce
  ``<script></script>``. Also, it will normalize any boolean attribute values
  that are minimized in HTML, so that for example ``<hr noshade="1"/>``
  becomes ``<hr noshade="noshade" />``.

  This serializer supports the use of namespaces for compound documents, for
  example to use inline SVG inside an XHTML document.

``html``
  The ``HTMLSerializer`` produces proper HTML markup. The main differences
  compared to ``xhtml`` serialization are that boolean attributes are
  minimized, empty tags are not self-closing (so it's ``<br>`` instead of
  ``<br />``), and that the contents of ``<script>`` and ``<style>`` elements
  are not escaped.

``text``
  The ``TextSerializer`` produces plain text from markup streams. This is
  useful primarily for `text templates`_, but can also be used to produce
  plain text output from markup templates or other sources.

.. _`text templates`: text-templates.html


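The different treatment of boolean attributes by the ``xhtml`` and ``html``
methods can be illustrated with a small sketch. The ``BOOLEAN_ATTRS`` set and
the ``format_attr`` helper are hypothetical and cover only a few attribute
names; the built-in serializers have their own, more complete handling:

.. code-block:: python

  # A few HTML boolean attributes, for illustration only.
  BOOLEAN_ATTRS = frozenset(['checked', 'disabled', 'noshade', 'selected'])

  def format_attr(name, value, method):
      """Format one attribute roughly as the given method would write it.

      Values are not escaped here; this only shows the boolean handling.
      """
      if name in BOOLEAN_ATTRS:
          if method == 'html':
              return name                  # minimized form
          return '%s="%s"' % (name, name)  # normalized form
      return '%s="%s"' % (name, value)

  xhtml_attr = format_attr('noshade', '1', 'xhtml')
  html_attr = format_attr('noshade', '1', 'html')
  plain_attr = format_attr('class', 'intro', 'html')

In this sketch the original value of a boolean attribute is discarded in both
modes, since only the presence of the attribute carries meaning.

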
438
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
245 Serialization Options
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
246 ---------------------
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
247
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
248 Both ``serialize()`` and ``render()`` support additional keyword arguments that
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
249 are passed through to the initializer of the serializer class. The following
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
250 options are supported by the built-in serializers:
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
251
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
252 ``strip_whitespace``
745
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
253 Whether the serializer should remove trailing spaces and empty lines.
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
254 Defaults to ``True``.
438
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
255
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
256 (This option is not available for serialization to plain text.)
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
257
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
258 ``doctype``
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
259 A ``(name, pubid, sysid)`` tuple defining the name, public identifier, and
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
260 system identifier of a ``DOCTYPE`` declaration to prepend to the generated
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
261 output. If provided, this declaration will override any ``DOCTYPE``
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
262 declaration in the stream.
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
263
745
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
264 The parameter can also be specified as a string to refer to commonly used
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
265 doctypes:
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
266
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
267 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
268 | Shorthand | DOCTYPE |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
269 +=============================+===========================================+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
270 | ``html`` or | HTML 4.01 Strict |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
271 | ``html-strict`` | |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
272 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
273 | ``html-transitional`` | HTML 4.01 Transitional |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
274 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
275 | ``html-frameset`` | HTML 4.01 Frameset |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
276 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
277 | ``html5`` | DOCTYPE proposed for the work-in-progress |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
278 | | HTML5 standard |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
279 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
280 | ``xhtml`` or | XHTML 1.0 Strict |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
281 | ``xhtml-strict`` | |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
282 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
283 | ``xhtml-transitional`` | XHTML 1.0 Transitional |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
284 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
285 | ``xhtml-frameset`` | XHTML 1.0 Frameset |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
286 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
287 | ``xhtml11`` | XHTML 1.1 |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
288 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
289 | ``svg`` or ``svg-full`` | SVG 1.1 |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
290 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
291 | ``svg-basic`` | SVG 1.1 Basic |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
292 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
293 | ``svg-tiny`` | SVG 1.1 Tiny |
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
294 +-----------------------------+-------------------------------------------+
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
295
438
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
296 (This option is not available for serialization to plain text.)
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
297
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
298 ``namespace_prefixes``
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
299 The namespace prefixes to use for namespaces that are not bound to a prefix
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
300 in the stream itself.
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
301
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
302 (This option is not available for serialization to HTML or plain text.)
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
303
729
be0b4a7b2fd4 * Add XHTML 1.1 doctype (closes #228).
cmlenz
parents: 510
diff changeset
304 ``drop_xml_decl``
be0b4a7b2fd4 * Add XHTML 1.1 doctype (closes #228).
cmlenz
parents: 510
diff changeset
305 Whether to remove the XML declaration (the ``<?xml ?>`` part at the
be0b4a7b2fd4 * Add XHTML 1.1 doctype (closes #228).
cmlenz
parents: 510
diff changeset
306 beginning of a document) when serializing. This defaults to ``True`` as an
be0b4a7b2fd4 * Add XHTML 1.1 doctype (closes #228).
cmlenz
parents: 510
diff changeset
307 XML declaration throws some older browsers into "Quirks" rendering mode.
be0b4a7b2fd4 * Add XHTML 1.1 doctype (closes #228).
cmlenz
parents: 510
diff changeset
308
be0b4a7b2fd4 * Add XHTML 1.1 doctype (closes #228).
cmlenz
parents: 510
diff changeset
309 (This option is only available for serialization to XHTML.)
be0b4a7b2fd4 * Add XHTML 1.1 doctype (closes #228).
cmlenz
parents: 510
diff changeset
310
745
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
311 ``strip_markup``
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
312 Whether the text serializer should detect and remove any tags or
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
313 entity-encoded characters in the text.
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
314
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
315 (This option is only available for serialization to plain text.)
74b5c5476ddb Preparing for [milestone:0.5] release.
cmlenz
parents: 729
diff changeset
316
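To make the ``doctype`` option more concrete, the following sketch shows how a
``(name, pubid, sysid)`` tuple, or one of the shorthand strings, maps to the
text of a ``DOCTYPE`` declaration. It is an illustration only, not Genshi's
serializer code: the ``DOCTYPES`` table and the ``format_doctype()`` helper are
hypothetical names, and only three of the shorthands are included.

```python
# Illustrative sketch only; not taken from Genshi.
# Three shorthands mapped to (name, pubid, sysid) tuples, using the public and
# system identifiers published by the W3C for these document types.
DOCTYPES = {
    "html-strict": ("html", "-//W3C//DTD HTML 4.01//EN",
                    "http://www.w3.org/TR/html4/strict.dtd"),
    "xhtml-strict": ("html", "-//W3C//DTD XHTML 1.0 Strict//EN",
                     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"),
    "html5": ("html", None, None),
}


def format_doctype(doctype):
    """Return a DOCTYPE declaration for a shorthand string or a 3-tuple."""
    if isinstance(doctype, str):
        doctype = DOCTYPES[doctype]
    name, pubid, sysid = doctype
    parts = ["<!DOCTYPE " + name]
    if pubid:
        parts.append(' PUBLIC "%s"' % pubid)
    elif sysid:
        # A system identifier without a public one needs the SYSTEM keyword.
        parts.append(" SYSTEM")
    if sysid:
        parts.append(' "%s"' % sysid)
    parts.append(">")
    return "".join(parts)


print(format_doctype("html5"))
print(format_doctype("xhtml-strict"))
```

With Genshi itself, the tuple or shorthand is simply passed as the ``doctype``
keyword argument to ``serialize()`` or ``render()``, as described above.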
438
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
317
2c38ec4e2dff Added documentation page on the builtin stream filters.
cmlenz
parents: 394
diff changeset
318
226
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
319 Using XPath
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
320 ===========
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
321
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
322 XPath can be used to extract a specific subset of the stream via the
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
323 ``select()`` method:
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
324
510
1bdccd3bda00 Use syntax highlighting on all the other doc pages, too.
cmlenz
parents: 508
diff changeset
325 .. code-block:: pycon
226
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
326
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
327 >>> substream = stream.select('a')
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
328 >>> substream
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
329 <genshi.core.Stream object at ...>
226
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
330 >>> print substream
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
331 <a href="http://example.org/">a link</a>
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
332
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
333 Often, streams cannot be reused: in the above example, the sub-stream is based
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
334 on a generator. Once it has been serialized, it will have been fully consumed,
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
335 and cannot be rendered again. To work around this, you can wrap such a stream
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
336 in a ``list``:
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
337
510
1bdccd3bda00 Use syntax highlighting on all the other doc pages, too.
cmlenz
parents: 508
diff changeset
338 .. code-block:: pycon
226
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
339
230
84168828b074 Renamed Markup to Genshi in repository.
cmlenz
parents: 226
diff changeset
340 >>> from genshi import Stream
226
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
341 >>> substream = Stream(list(stream.select('a')))
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
342 >>> substream
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
343 <genshi.core.Stream object at ...>
226
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
344 >>> print substream
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
345 <a href="http://example.org/">a link</a>
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
346 >>> print substream.select('@href')
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
347 http://example.org/
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
348 >>> print substream.select('text()')
4d8a9e03b23d Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
349 a link
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
350
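The single-use behaviour is the same as that of any Python generator. The
sketch below does not use Genshi at all; ``select_links()`` is a hypothetical
stand-in for a lazily evaluated sub-stream, and the events are plain strings
for brevity.

```python
def select_links(events):
    """Hypothetical stand-in for a lazily evaluated sub-stream."""
    for event in events:
        if event.startswith("<a"):
            yield event


source = ["<p>", '<a href="http://example.org/">', "</p>"]

lazy = select_links(source)
first_pass = list(lazy)    # consumes the generator
second_pass = list(lazy)   # nothing is left to iterate over

# Buffering the events in a list makes them available any number of times.
buffered = list(select_links(source))
print(first_pass, second_pass, buffered)
```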
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
351 See `Using XPath in Genshi`_ for more information about the XPath support in
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
352 Genshi.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
353
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
354 .. _`Using XPath in Genshi`: xpath.html
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
355
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
356
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
357 .. _`event kinds`:
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
358
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
359 Event Kinds
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
360 ===========
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
361
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
362 Every event in a stream is of one of several *kinds*, which also determines
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
363 what the ``data`` item of the event tuple looks like. The different kinds of
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
364 events are documented below.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
365
394
cab6b0256019 Minor doc fixes.
cmlenz
parents: 382
diff changeset
366 .. note:: The ``data`` item is generally immutable. If the data is to be
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
367 modified when processing a stream, it must be replaced by a new tuple.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
368 Effectively, this means the entire event tuple is immutable.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
369
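As a minimal sketch of what "replaced by a new tuple" means in practice, the
generator below passes events through and emits a fresh tuple for every text
event. To keep it runnable without Genshi, the event kinds are plain strings
and the tag name and attributes are a plain string and list; these are
stand-ins for the constants and the ``QName`` and ``Attrs`` classes in
``genshi.core``. The ``upper_text()`` name is hypothetical.

```python
# Stand-in for the TEXT event kind; with Genshi it would be imported from
# genshi.core instead.
TEXT = "TEXT"


def upper_text(stream):
    """Yield events unchanged, except TEXT events, which get new tuples."""
    for kind, data, pos in stream:
        if kind == TEXT:
            # Never modify the event in place: build a new tuple instead.
            yield kind, data.upper(), pos
        else:
            yield kind, data, pos


events = [
    ("START", ("p", [("class", "intro")]), (None, 1, 0)),
    (TEXT, "Hello, world!", (None, 1, 17)),
    ("END", "p", (None, 1, 30)),
]
result = list(upper_text(events))
print(result[1])
```

The original ``events`` list is left untouched, so it can still be handed to
other consumers.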
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
370 START
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
371 -----
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
372 The opening tag of an element.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
373
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
374 For this kind of event, the ``data`` item is a tuple of the form
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
375 ``(tagname, attrs)``, where ``tagname`` is a ``QName`` instance describing the
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
376 qualified name of the tag, and ``attrs`` is an ``Attrs`` instance containing
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
377 the attribute names and values associated with the tag (excluding namespace
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
378 declarations):
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
379
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
380 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
381
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
382 START, (QName(u'p'), Attrs([(u'class', u'intro')])), pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
383
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
384 END
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
385 ---
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
386 The closing tag of an element.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
387
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
388 The ``data`` item of end events consists of just a ``QName`` instance
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
389 describing the qualified name of the tag:
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
390
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
391 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
392
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
393 END, QName(u'p'), pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
394
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
395 TEXT
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
396 ----
394
cab6b0256019 Minor doc fixes.
cmlenz
parents: 382
diff changeset
397 Character data outside of tags and comments.
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
398
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
399 For text events, the ``data`` item should be a unicode object:
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
400
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
401 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
402
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
403 TEXT, u'Hello, world!', pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
404
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
405 START_NS
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
406 --------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
407 The start of a namespace mapping, binding a namespace prefix to a URI.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
408
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
409 The ``data`` item of this kind of event is a tuple of the form
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
410 ``(prefix, uri)``, where ``prefix`` is the namespace prefix and ``uri`` is the
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
411 full URI to which the prefix is bound. Both should be unicode objects. If the
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
412 namespace is not bound to any prefix, the ``prefix`` item is an empty string:
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
413
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
414 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
415
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
416 START_NS, (u'svg', u'http://www.w3.org/2000/svg'), pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
417
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
418 END_NS
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
419 ------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
420 The end of a namespace mapping.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
421
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
422 The ``data`` item of such events consists of only the namespace prefix (a
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
423 unicode object):
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
424
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
425 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
426
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
427 END_NS, u'svg', pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
428
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
429 DOCTYPE
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
430 -------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
431 A document type declaration.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
432
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
433 For this type of event, the ``data`` item is a tuple of the form
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
434 ``(name, pubid, sysid)``, where ``name`` is the name of the root element,
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
435 ``pubid`` is the public identifier of the DTD (or ``None``), and ``sysid`` is
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
436 the system identifier of the DTD (or ``None``):
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
437
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
438 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
439
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
440 DOCTYPE, (u'html', u'-//W3C//DTD XHTML 1.0 Transitional//EN',
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
441 u'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'), pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
442
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
443 COMMENT
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
444 -------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
445 A comment.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
446
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
447 For such events, the ``data`` item is a unicode object containing all character
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
448 data between the comment delimiters:
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
449
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
450 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
451
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
452 COMMENT, u'Commented out', pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
453
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
454 PI
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
455 --
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
456 A processing instruction.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
457
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
458 The ``data`` item is a tuple of the form ``(target, data)`` for processing
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
459 instructions, where ``target`` is the target of the PI (used to identify the
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
460 application by which the instruction should be processed), and ``data`` is text
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
461 following the target (excluding the terminating question mark):
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
462
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
463 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
464
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
465 PI, (u'php', u'echo "Yo" '), pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
466
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
467 START_CDATA
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
468 -----------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
469 Marks the beginning of a ``CDATA`` section.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
470
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
471 The ``data`` item for such events is always ``None``:
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
472
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
473 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
474
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
475 START_CDATA, None, pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
476
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
477 END_CDATA
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
478 ---------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
479 Marks the end of a ``CDATA`` section.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
480
508
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
481 The ``data`` item for such events is always ``None``:
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
482
5fbc1cde74d6 Enable syntax highlighting (with Pygments) on doc page.
cmlenz
parents: 438
diff changeset
483 .. code-block:: python
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
484
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
485 END_CDATA, None, pos
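
To see how these kinds follow each other for a small document, the sketch
below builds ``(kind, data, pos)`` tuples with the ``xml.parsers.expat`` module
from the standard library. It only imitates the shapes documented above and is
not Genshi's parser: kinds are plain strings, tag names are plain strings
rather than ``QName`` instances, attributes are a sorted list rather than
``Attrs``, and ``iter_events()`` is a hypothetical helper. Namespace events are
left out for brevity.

```python
from xml.parsers import expat


def iter_events(source):
    """Parse ``source`` and return a list of (kind, data, pos) tuples."""
    events = []
    parser = expat.ParserCreate()
    parser.buffer_text = True  # deliver adjacent character data in one piece

    def emit(kind, data):
        # pos is (filename, line, column); the filename is unknown here, and
        # for buffered text the position is only approximate.
        pos = (None, parser.CurrentLineNumber, parser.CurrentColumnNumber)
        events.append((kind, data, pos))

    parser.StartElementHandler = lambda name, attrs: emit(
        "START", (name, sorted(attrs.items())))
    parser.EndElementHandler = lambda name: emit("END", name)
    parser.CharacterDataHandler = lambda text: emit("TEXT", text)
    parser.CommentHandler = lambda text: emit("COMMENT", text)
    parser.ProcessingInstructionHandler = lambda target, data: emit(
        "PI", (target, data))
    parser.StartCdataSectionHandler = lambda: emit("START_CDATA", None)
    parser.EndCdataSectionHandler = lambda: emit("END_CDATA", None)
    parser.Parse(source, True)
    return events


doc = '<p class="intro">Hello<!--note--><?php echo "Yo" ?><![CDATA[a < b]]></p>'
for kind, data, pos in iter_events(doc):
    print(kind, data)
```

Note that the text inside the ``CDATA`` section arrives as an ordinary text
event, bracketed by the ``START_CDATA`` and ``END_CDATA`` markers.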
Copyright (C) 2012-2017 Edgewall Software