annotate doc/streams.txt @ 902:09cc3627654c experimental-inline

Sync `experimental/inline` branch with [source:trunk@1126].
author cmlenz
date Fri, 23 Apr 2010 21:08:26 +0000
parents 1837f39efd6f
children
.. -*- mode: rst; encoding: utf-8 -*-

==============
Markup Streams
==============

A stream is the common representation of markup as a *stream of events*.


.. contents:: Contents
   :depth: 2
.. sectnum::


Basics
======

A stream can be obtained in a number of ways. It can be:

* the result of parsing XML or HTML text, or
* the result of selecting a subset of another stream using XPath, or
* programmatically generated.

For example, the functions ``XML()`` and ``HTML()`` can be used to convert
literal XML or HTML text to a markup stream:

.. code-block:: pycon

  >>> from genshi import XML
  >>> stream = XML('<p class="intro">Some text and '
  ...              '<a href="http://example.org/">a link</a>.'
  ...              '<br/></p>')
  >>> stream
  <genshi.core.Stream object at ...>

The stream is the result of parsing the text into events. Each event is a tuple
of the form ``(kind, data, pos)``, where:

* ``kind`` defines what kind of event it is (such as the start of an element,
  text, a comment, etc.).
* ``data`` is the actual data associated with the event. How this looks depends
  on the event kind (see `event kinds`_).
* ``pos`` is a ``(filename, lineno, column)`` tuple that describes where the
  event “comes from”.

.. code-block:: pycon

  >>> for kind, data, pos in stream:
  ...     print('%s %r %r' % (kind, data, pos))
  ...
  START (QName('p'), Attrs([(QName('class'), u'intro')])) (None, 1, 0)
  TEXT u'Some text and ' (None, 1, 17)
  START (QName('a'), Attrs([(QName('href'), u'http://example.org/')])) (None, 1, 31)
  TEXT u'a link' (None, 1, 61)
  END QName('a') (None, 1, 67)
  TEXT u'.' (None, 1, 71)
  START (QName('br'), Attrs()) (None, 1, 72)
  END QName('br') (None, 1, 77)
  END QName('p') (None, 1, 77)

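Because events are ordinary tuples, a stream can also be produced and consumed
by plain Python code. The following is a minimal sketch that uses plain tuples
and string kind names as stand-ins for Genshi's own ``Stream``, ``QName`` and
``Attrs`` types; ``list_events``, ``outline`` and ``NO_POS`` are hypothetical
names chosen for illustration, not part of Genshi:

```python
# Plain-tuple stand-in for a markup stream: each event is (kind, data, pos).
START, TEXT, END = "START", "TEXT", "END"
NO_POS = (None, -1, -1)  # assumed placeholder position for generated events


def list_events(items):
    """Programmatically generate the events for a <ul> with one <li> per item."""
    yield START, ("ul", []), NO_POS
    for item in items:
        yield START, ("li", []), NO_POS
        yield TEXT, item, NO_POS
        yield END, "li", NO_POS
    yield END, "ul", NO_POS


def outline(events):
    """Return one indented line per START and TEXT event."""
    lines, depth = [], 0
    for kind, data, pos in events:
        if kind == START:
            tag, attrs = data
            lines.append("  " * depth + "<%s>" % tag)
            depth += 1
        elif kind == END:
            depth -= 1
        elif kind == TEXT:
            lines.append("  " * depth + repr(data))
    return lines


print("\n".join(outline(list_events(["one", "two"]))))
```

The nesting depth is never stored in the events themselves; it is recovered
from the order of the ``START`` and ``END`` events.
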

Filtering
=========

One important feature of markup streams is that you can apply *filters* to the
stream, either filters that come with Genshi, or your own custom filters.

A filter is simply a callable that accepts a stream as its only argument and
returns the filtered stream:

.. code-block:: python

  def noop(stream):
      """A filter that doesn't actually do anything with the stream."""
      for kind, data, pos in stream:
          yield kind, data, pos

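A filter that actually changes something follows the same pattern. The sketch
below is an illustration only: it runs on a plain list of tuples with string
kind names instead of a real Genshi stream, and ``upper_text`` is a made-up
filter. With Genshi itself you would more likely compare ``kind`` against the
constants exported by ``genshi.core``:

```python
def upper_text(stream):
    """A filter that upper-cases text and passes all other events through."""
    for kind, data, pos in stream:
        if kind == "TEXT":
            data = data.upper()
        yield kind, data, pos


# Hand-written stand-in events; real streams come from a parser or template.
events = [
    ("START", ("p", []), (None, 1, 0)),
    ("TEXT", "Some text", (None, 1, 3)),
    ("END", "p", (None, 1, 12)),
]
filtered = list(upper_text(events))
print(filtered[1])
```

Since the filter is a generator, no event is processed until something
iterates over the result.
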

Filters can be applied in a number of ways. The simplest is to just call the
filter directly:

.. code-block:: python

  stream = noop(stream)

The ``Stream`` class also provides a ``filter()`` method, which takes an
arbitrary number of filter callables and applies them all:

.. code-block:: python

  stream = stream.filter(noop)

Finally, filters can also be applied using the *bitwise or* operator (``|``),
which allows a syntax similar to pipes on Unix shells:

.. code-block:: python

  stream = stream | noop

One example of a filter included with Genshi is the ``HTMLSanitizer`` in
``genshi.filters``. It processes a stream of HTML markup, and strips out any
potentially dangerous constructs, such as JavaScript event handlers.
``HTMLSanitizer`` is not a function, but rather a class that implements
``__call__``, which means instances of the class are callable:

.. code-block:: python

  from genshi.filters import HTMLSanitizer
  stream = stream | HTMLSanitizer()

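Writing a filter as a class is mostly useful when it needs configuration. The
following sketch shows the shape of such a class; ``StripEventHandlers`` and its
``prefix`` option are hypothetical, the events are plain tuples, and the logic
is far cruder than what ``HTMLSanitizer`` does, so it should not be used for
real sanitizing:

```python
class StripEventHandlers:
    """Callable filter that drops attributes whose names start with a prefix."""

    def __init__(self, prefix="on"):
        # Configuration is stored on the instance, not passed per call.
        self.prefix = prefix

    def __call__(self, stream):
        for kind, data, pos in stream:
            if kind == "START":
                tag, attrs = data
                attrs = [(name, value) for name, value in attrs
                         if not name.startswith(self.prefix)]
                data = (tag, attrs)
            yield kind, data, pos


events = [
    ("START", ("a", [("href", "#"), ("onclick", "evil()")]), (None, 1, 0)),
    ("TEXT", "a link", (None, 1, 30)),
    ("END", "a", (None, 1, 36)),
]
cleaned = list(StripEventHandlers()(events))
print(cleaned[0][1])
```

The instance is created once with its options and can then be used anywhere a
plain filter function is accepted.
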

Both the ``filter()`` method and the pipe operator allow easy chaining of
filters:

.. code-block:: python

  from genshi.filters import HTMLSanitizer
  stream = stream.filter(noop, HTMLSanitizer())

That is equivalent to:

.. code-block:: python

  stream = stream | noop | HTMLSanitizer()

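One way to see why the two spellings are equivalent is to build a toy class
that offers both. This is a sketch under assumptions, not Genshi's actual
``Stream`` implementation: ``MiniStream``, ``exclaim`` and ``double`` are
invented for the example, and events are plain tuples.

```python
from functools import reduce


class MiniStream:
    """Toy stand-in for a stream class; wraps any iterable of events."""

    def __init__(self, events):
        self.events = events

    def __iter__(self):
        return iter(self.events)

    def __or__(self, function):
        # ``stream | f`` calls f(stream) and wraps the result again.
        return MiniStream(function(self))

    def filter(self, *filters):
        # Apply the filters left to right, each one wrapping the previous.
        return reduce(lambda stream, function: stream | function, filters, self)


def exclaim(stream):
    """Append an exclamation mark to every text event."""
    for kind, data, pos in stream:
        if kind == "TEXT":
            data = data + "!"
        yield kind, data, pos


def double(stream):
    """Repeat the content of every text event."""
    for kind, data, pos in stream:
        if kind == "TEXT":
            data = data * 2
        yield kind, data, pos


events = [("TEXT", "hi", (None, 1, 0))]
via_method = list(MiniStream(events).filter(exclaim, double))
via_pipe = list(MiniStream(events) | exclaim | double)
print(via_method == via_pipe, via_pipe[0][1])
```

In both forms the filters run in the order they are written, so swapping
``exclaim`` and ``double`` changes the result.
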

For more information about the built-in filters, see `Stream Filters`_.

.. _`Stream Filters`: filters.html


Serialization
=============

Serialization means producing some kind of textual output from a stream of
events, which you'll need when you want to transmit or store the results of
generating or otherwise processing markup.

The ``Stream`` class provides two methods for serialization: ``serialize()``
and ``render()``. The former is a generator that yields the output as a series
of ``Markup`` objects (which are basically unicode strings that are considered
safe for output on the web). The latter returns a single string, by default
UTF-8 encoded.

Here's the output from ``serialize()``:

.. code-block:: pycon

  >>> for output in stream.serialize():
  ...     print(repr(output))
  ...
  <Markup u'<p class="intro">'>
  <Markup u'Some text and '>
  <Markup u'<a href="http://example.org/">'>
  <Markup u'a link'>
  <Markup u'</a>'>
  <Markup u'.'>
  <Markup u'<br/>'>
  <Markup u'</p>'>

And here's the output from ``render()``:

.. code-block:: pycon

  >>> print(stream.render())
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br/></p>

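The relationship between the two methods can be pictured as "join the chunks,
then optionally encode". The sketch below illustrates that idea with plain
tuples and the standard library; these ``serialize`` and ``render`` functions
are simplified stand-ins written for this example (they do not, for instance,
collapse empty elements to ``<br/>``) and are not Genshi's implementation:

```python
from html import escape


def serialize(events):
    """Yield one chunk of XML-like text per event."""
    for kind, data, pos in events:
        if kind == "START":
            tag, attrs = data
            attr_text = "".join(' %s="%s"' % (name, escape(value))
                                for name, value in attrs)
            yield "<%s%s>" % (tag, attr_text)
        elif kind == "END":
            yield "</%s>" % data
        elif kind == "TEXT":
            yield escape(data, quote=False)


def render(events, encoding="utf-8"):
    """Join the serialized chunks; encode unless ``encoding`` is None."""
    text = "".join(serialize(events))
    return text if encoding is None else text.encode(encoding)


events = [
    ("START", ("p", [("class", "intro")]), (None, 1, 0)),
    ("TEXT", "Tom & Jerry", (None, 1, 17)),
    ("END", "p", (None, 1, 28)),
]
print(list(serialize(events)))
print(render(events))
print(render(events, encoding=None))
```

The chunked form is convenient when output is written incrementally, while the
joined form is convenient when the whole document is needed at once.
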

Both methods can be passed a ``method`` parameter that determines how exactly
the events are serialized to text. This parameter can be either a string or a
custom serializer class:

.. code-block:: pycon

  >>> print(stream.render('html'))
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br></p>

Note how the ``<br>`` element isn't closed, which is the right thing to do for
HTML. See `serialization methods`_ for more details.

In addition, the ``render()`` method takes an ``encoding`` parameter, which
defaults to “UTF-8”. If set to ``None``, the result will be a unicode string.

The different serializer classes in ``genshi.output`` can also be used
directly:

.. code-block:: pycon

  >>> from genshi.filters import HTMLSanitizer
  >>> from genshi.output import TextSerializer
  >>> print(''.join(TextSerializer()(HTMLSanitizer()(stream))))
  Some text and a link.

The pipe operator allows a nicer syntax:

.. code-block:: pycon

  >>> print(stream | HTMLSanitizer() | TextSerializer())
  Some text and a link.

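Seen this way, a serializer has the same shape as a filter: it is a callable
that takes a stream, except that it yields strings instead of events. A minimal
sketch of a text-only serializer, assuming plain-tuple events and a made-up
``PlainText`` class rather than Genshi's ``TextSerializer``:

```python
class PlainText:
    """Toy serializer: keeps only the data of TEXT events."""

    def __call__(self, stream):
        for kind, data, pos in stream:
            if kind == "TEXT":
                yield data


# Stand-in events for the paragraph used throughout this section.
events = [
    ("START", ("p", [("class", "intro")]), (None, 1, 0)),
    ("TEXT", "Some text and ", (None, 1, 17)),
    ("START", ("a", [("href", "http://example.org/")]), (None, 1, 31)),
    ("TEXT", "a link", (None, 1, 61)),
    ("END", "a", (None, 1, 67)),
    ("TEXT", ".", (None, 1, 71)),
    ("START", ("br", []), (None, 1, 72)),
    ("END", "br", (None, 1, 77)),
    ("END", "p", (None, 1, 77)),
]
text = "".join(PlainText()(events))
print(text)
```

Because the output is no longer a stream of events, a serializer has to be the
last step in a chain.
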

.. _`serialization methods`:

Serialization Methods
---------------------

Genshi supports different serialization methods for creating a text
representation of a markup stream.

``xml``
  The ``XMLSerializer`` is the default serialization method and results in
  proper XML output including namespace support, the XML declaration, CDATA
  sections, and so on. It is generally not suitable for serving HTML or
  XHTML web pages (unless you want to use true XHTML 1.1), for which the
  ``xhtml`` and ``html`` serializers described below should be preferred.

``xhtml``
  The ``XHTMLSerializer`` is a specialization of the generic ``XMLSerializer``
  that understands the peculiarities of producing XML-compliant output that can
  also be parsed without problems by the HTML parsers found in modern web
  browsers. Thus, the output of this serializer should be usable whether sent
  as "text/html" or "application/xhtml+xml" (although there are a lot of
  subtle issues to pay attention to when switching between the two, in
  particular with respect to differences in the DOM and CSS).

  For example, instead of rendering a script tag as ``<script/>`` (which
  confuses the HTML parser in many browsers), it will produce
  ``<script></script>``. Also, it will normalize any boolean attribute values
  that are minimized in HTML, so that for example ``<hr noshade="1"/>``
  becomes ``<hr noshade="noshade" />``.

  This serializer supports the use of namespaces for compound documents, for
  example to use inline SVG inside an XHTML document.

``html``
  The ``HTMLSerializer`` produces proper HTML markup. The main differences
  compared to ``xhtml`` serialization are that boolean attributes are
  minimized, empty tags are not self-closing (so it's ``<br>`` instead of
  ``<br />``), and that the contents of ``<script>`` and ``<style>`` elements
  are not escaped.

``text``
  The ``TextSerializer`` produces plain text from markup streams. This is
  useful primarily for `text templates`_, but can also be used to produce
  plain text output from markup templates or other sources.

.. _`text templates`: text-templates.html

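The differences between the ``xml``, ``xhtml`` and ``html`` methods described
above are easiest to see on an element without content. The following sketch
reproduces only the rules stated in this section; ``empty_element`` and the two
name sets are illustrative assumptions, not Genshi's serializer code or its
actual lists of elements and attributes:

```python
# Illustrative subsets only; the real lists are longer.
BOOLEAN_ATTRS = {"noshade", "checked", "disabled", "selected"}
VOID_ELEMENTS = {"br", "hr", "img", "input"}


def empty_element(tag, attrs, method):
    """Write an element with no content using 'xml', 'xhtml' or 'html' rules."""
    parts = []
    for name, value in attrs:
        if name in BOOLEAN_ATTRS and method == "xhtml":
            # Normalized form: the attribute name is repeated as its value.
            parts.append(' %s="%s"' % (name, name))
        elif name in BOOLEAN_ATTRS and method == "html":
            # Minimized form: the bare attribute name.
            parts.append(" " + name)
        else:
            parts.append(' %s="%s"' % (name, value))
    start = "<" + tag + "".join(parts)
    if method == "xml":
        return start + "/>"
    if tag in VOID_ELEMENTS:
        return start + (" />" if method == "xhtml" else ">")
    # Elements such as <script> always get an explicit end tag.
    return start + "></%s>" % tag


for method in ("xml", "xhtml", "html"):
    print(method,
          empty_element("hr", [("noshade", "1")], method),
          empty_element("script", [], method))
```

Only the ``xml`` method treats every empty element the same way; the other two
need to know which elements are empty by definition and which attributes are
boolean.
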
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
244
500
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
245 Serialization Options
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
246 ---------------------
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
247
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
248 Both ``serialize()`` and ``render()`` support additional keyword arguments that
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
249 are passed through to the initializer of the serializer class. The following
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
250 options are supported by the built-in serializers:
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
251
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
252 ``strip_whitespace``
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
253 Whether the serializer should remove trailing spaces and empty lines.
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
254 Defaults to ``True``.
500
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
255
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
256 (This option is not available for serialization to plain text.)
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
257
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
258 ``doctype``
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
259 A ``(name, pubid, sysid)`` tuple defining the name, public identifier, and
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
260 system identifier of a ``DOCTYPE`` declaration to prepend to the generated
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
261 output. If provided, this declaration will override any ``DOCTYPE``
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
262 declaration in the stream.
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
263
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
264 The parameter can also be specified as a string to refer to commonly used
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
265 doctypes:
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
266
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
267 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
268 | Shorthand | DOCTYPE |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
269 +=============================+===========================================+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
270 | ``html`` or | HTML 4.01 Strict |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
271 | ``html-strict`` | |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
272 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
273 | ``html-transitional`` | HTML 4.01 Transitional |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
274 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
275 | ``html-frameset`` | HTML 4.01 Frameset |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
276 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
277 | ``html5`` | DOCTYPE proposed for the work-in-progress |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
278 | | HTML5 standard |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
279 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
280 | ``xhtml`` or | XHTML 1.0 Strict |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
281 | ``xhtml-strict`` | |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
282 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
283 | ``xhtml-transitional`` | XHTML 1.0 Transitional |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
284 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
285 | ``xhtml-frameset`` | XHTML 1.0 Frameset |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
286 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
287 | ``xhtml11`` | XHTML 1.1 |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
288 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
289 | ``svg`` or ``svg-full`` | SVG 1.1 |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
290 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
291 | ``svg-basic`` | SVG 1.1 Basic |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
292 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
293 | ``svg-tiny`` | SVG 1.1 Tiny |
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
294 +-----------------------------+-------------------------------------------+
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
295
500
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
296 (This option is not available for serialization to plain text.)
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
297
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
298 ``namespace_prefixes``
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
299 The namespace prefixes to use for namespaces that are not bound to a prefix
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
300 in the stream itself.
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
301
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
302 (This option is not available for serialization to HTML or plain text.)
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
303
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
304 ``drop_xml_decl``
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
305 Whether to remove the XML declaration (the ``<?xml ?>`` part at the
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
306 beginning of a document) when serializing. This defaults to ``True`` as an
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
307 XML declaration throws some older browsers into "Quirks" rendering mode.
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
308
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
309 (This option is only available for serialization to XHTML.)
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
310
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
311 ``strip_markup``
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
312 Whether the text serializer should detect and remove any tags or entity
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
313 encoded characters in the text. Defaults to ``False``.
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
314
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
315 (This option is only available for serialization to plain text.)
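
As a rough illustration of how such options are passed, the following sketch
renders one stream as HTML and as XHTML, each with a ``doctype`` shorthand. It
assumes Genshi is installed. The sample markup and the variable names are made
up for this example, and ``encoding=None`` is passed explicitly only so that
the result is a text string rather than encoded bytes:

.. code-block:: python

  from genshi.input import XML

  # Hypothetical sample document, parsed into a markup stream.
  stream = XML('<p class="intro">Hello<br/>world</p>')

  # Keyword arguments such as ``doctype`` are passed through to the
  # initializer of the serializer class selected by the method name.
  html = stream.render('html', doctype='html', encoding=None)
  xhtml = stream.render('xhtml', doctype='xhtml-strict', encoding=None)

  print(html)
  print(xhtml)

In the HTML output the empty element should appear as ``<br>``, in the XHTML
output as ``<br />``, each preceded by the requested ``DOCTYPE`` declaration.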
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
316
500
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
317
0742f421caba Merged revisions 487-603 via svnmerge from
cmlenz
parents: 395
diff changeset
318
226
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
319 Using XPath
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
320 ===========
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
321
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
322 XPath can be used to extract a specific subset of the stream via the
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
323 ``select()`` method:
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
324
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
325 .. code-block:: pycon
226
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
326
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
327 >>> substream = stream.select('a')
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
328 >>> substream
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
329 <genshi.core.Stream object at ...>
902
09cc3627654c Sync `experimental/inline` branch with [source:trunk@1126].
cmlenz
parents: 820
diff changeset
330 >>> print(substream)
226
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
331 <a href="http://example.org/">a link</a>
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
332
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
333 Often, streams cannot be reused: in the above example, the sub-stream is based
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
334 on a generator. Once it has been serialized, it will have been fully consumed,
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
335 and cannot be rendered again. To work around this, you can wrap such a stream
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
336 in a ``list``:
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
337
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
338 .. code-block:: pycon
226
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
339
230
24757b771651 Renamed Markup to Genshi in repository.
cmlenz
parents: 226
diff changeset
340 >>> from genshi import Stream
226
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
341 >>> substream = Stream(list(stream.select('a')))
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
342 >>> substream
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
343 <genshi.core.Stream object at ...>
902
09cc3627654c Sync `experimental/inline` branch with [source:trunk@1126].
cmlenz
parents: 820
diff changeset
344 >>> print(substream)
226
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
345 <a href="http://example.org/">a link</a>
902
09cc3627654c Sync `experimental/inline` branch with [source:trunk@1126].
cmlenz
parents: 820
diff changeset
346 >>> print(substream.select('@href'))
226
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
347 http://example.org/
902
09cc3627654c Sync `experimental/inline` branch with [source:trunk@1126].
cmlenz
parents: 820
diff changeset
348 >>> print(substream.select('text()'))
226
09f869a98149 Add reStructuredText documentation files.
cmlenz
parents:
diff changeset
349 a link
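
The reason a generator-based stream can only be serialized once is ordinary
Python iterator behaviour, which the following Genshi-free sketch illustrates.
The ``events()`` generator and its plain-tuple events are stand-ins invented
for this example, not Genshi objects:

.. code-block:: python

  def events():
      # Hypothetical stand-in for a generator-backed stream of
      # (kind, data, pos) events.
      pos = (None, -1, -1)
      yield ('START', 'a', pos)
      yield ('TEXT', 'a link', pos)
      yield ('END', 'a', pos)

  one_shot = events()
  first_pass = list(one_shot)   # consumes the generator
  second_pass = list(one_shot)  # nothing is left to yield

  buffered = list(events())     # a list can be iterated any number of times

  print(len(first_pass), len(second_pass))          # 3 0
  print(len(list(buffered)), len(list(buffered)))   # 3 3

Wrapping the selected events in a ``list`` trades memory for the ability to
iterate over them repeatedly.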
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
350
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
351 See `Using XPath in Genshi`_ for more information about the XPath support in
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
352 Genshi.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
353
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
354 .. _`Using XPath in Genshi`: xpath.html
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
355
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
356
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
357 .. _`event kinds`:
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
358
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
359 Event Kinds
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
360 ===========
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
361
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
362 Every event in a stream is of one of several *kinds*, which also determines
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
363 what the ``data`` item of the event tuple looks like. The different kinds of
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
364 events are documented below.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
365
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
366 .. note:: The ``data`` item is generally immutable. If the data is to be
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
367 modified when processing a stream, it must be replaced by a new tuple.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
368 Effectively, this means the entire event tuple is immutable.
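
A minimal sketch of what this means for code that processes a stream: rather
than changing an event in place, it yields a freshly built tuple. The kind
constants below are plain strings standing in for the constants Genshi
provides, the events are simplified stand-ins, and ``upper_text`` is a
hypothetical name chosen for this example:

.. code-block:: python

  # Stand-ins for Genshi's event kinds (assumption for this sketch).
  START, END, TEXT = 'START', 'END', 'TEXT'

  def upper_text(stream):
      """Yield the events of ``stream`` with TEXT data upper-cased."""
      for kind, data, pos in stream:
          if kind == TEXT:
              # Build a new event tuple instead of mutating the old one.
              yield kind, data.upper(), pos
          else:
              yield kind, data, pos

  pos = (None, -1, -1)  # placeholder position tuple
  events = [
      (START, ('p', ()), pos),
      (TEXT, 'Hello, world!', pos),
      (END, 'p', pos),
  ]
  result = list(upper_text(events))
  print(result[1][1])  # HELLO, WORLD!

The original ``events`` list is left untouched, so it can still be passed to
other consumers.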
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
369
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
370 START
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
371 -----
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
372 The opening tag of an element.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
373
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
374 For this kind of event, the ``data`` item is a tuple of the form
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
375 ``(tagname, attrs)``, where ``tagname`` is a ``QName`` instance describing the
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
376 qualified name of the tag, and ``attrs`` is an ``Attrs`` instance containing
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
377 the attribute names and values associated with the tag (excluding namespace
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
378 declarations):
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
379
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
380 .. code-block:: python
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
381
902
09cc3627654c Sync `experimental/inline` branch with [source:trunk@1126].
cmlenz
parents: 820
diff changeset
382 START, (QName('p'), Attrs([(QName('class'), u'intro')])), pos
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
383
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
384 END
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
385 ---
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
386 The closing tag of an element.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
387
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
388 The ``data`` item of end events consists of just a ``QName`` instance
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
389 describing the qualified name of the tag:
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
390
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
391 .. code-block:: python
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
392
902
09cc3627654c Sync `experimental/inline` branch with [source:trunk@1126].
cmlenz
parents: 820
diff changeset
393 END, QName('p'), pos
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
394
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
395 TEXT
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
396 ----
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
397 Character data outside of tags and comments.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
398
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
399 For text events, the ``data`` item should be a unicode object:
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
400
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
401 .. code-block:: python
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
402
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
403 TEXT, u'Hello, world!', pos
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
404
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
405 START_NS
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
406 --------
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
407 The start of a namespace mapping, binding a namespace prefix to a URI.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
408
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
409 The ``data`` item of this kind of event is a tuple of the form
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
410 ``(prefix, uri)``, where ``prefix`` is the namespace prefix and ``uri`` is the
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
411 full URI to which the prefix is bound. Both should be unicode objects. If the
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
412 namespace is not bound to any prefix, the ``prefix`` item is an empty string:
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
413
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
414 .. code-block:: python
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
415
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
416 START_NS, (u'svg', u'http://www.w3.org/2000/svg'), pos
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
417
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
418 END_NS
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
419 ------
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
420 The end of a namespace mapping.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
421
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
422 The ``data`` item of such events consists of only the namespace prefix (a
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
423 unicode object):
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
424
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
425 .. code-block:: python
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
426
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
427 END_NS, u'svg', pos
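
Because ``START_NS`` and ``END_NS`` events come in pairs, a consumer can keep
track of which prefixes are currently bound. The following is only a sketch
under stated assumptions: the kind constants are plain strings standing in for
Genshi's constants, the element events are simplified, and ``track_prefixes``
is a hypothetical helper that is not part of Genshi:

.. code-block:: python

  # Stand-ins for Genshi's event kinds (assumption for this sketch).
  START_NS, END_NS = 'START_NS', 'END_NS'

  def track_prefixes(stream):
      """Yield ``(event, bindings)`` pairs for every event in ``stream``.

      ``bindings`` maps each prefix to the URI it is bound to once the
      event has been processed.
      """
      scopes = {}  # prefix -> stack of URIs, innermost binding last
      for kind, data, pos in stream:
          if kind == START_NS:
              prefix, uri = data
              scopes.setdefault(prefix, []).append(uri)
          elif kind == END_NS:
              scopes[data].pop()
              if not scopes[data]:
                  del scopes[data]
          yield (kind, data, pos), {p: uris[-1] for p, uris in scopes.items()}

  pos = (None, -1, -1)  # placeholder position tuple
  events = [
      (START_NS, ('svg', 'http://www.w3.org/2000/svg'), pos),
      ('START', ('rect', ()), pos),
      ('END', 'rect', pos),
      (END_NS, 'svg', pos),
  ]
  for event, bindings in track_prefixes(events):
      print(event[0], bindings)

A stack per prefix is used so that a prefix rebound in a nested scope reverts
to its outer binding when the inner mapping ends.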
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
428
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
429 DOCTYPE
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
430 -------
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
431 A document type declaration.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
432
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
433 For this kind of event, the ``data`` item is a tuple of the form
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
434 ``(name, pubid, sysid)``, where ``name`` is the name of the root element,
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
435 ``pubid`` is the public identifier of the DTD (or ``None``), and ``sysid`` is
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
436 the system identifier of the DTD (or ``None``):
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
437
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
438 .. code-block:: python
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
439
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
440 DOCTYPE, (u'html', u'-//W3C//DTD XHTML 1.0 Transitional//EN', \
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
441 u'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'), pos
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
442
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
443 COMMENT
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
444 -------
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
445 A comment.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
446
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
447 For such events, the ``data`` item is a unicode object containing all character
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
448 data between the comment delimiters:
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
449
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
450 .. code-block:: python
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
451
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
452 COMMENT, u'Commented out', pos
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
453
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
454 PI
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
455 --
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
456 A processing instruction.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
457
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
458 The ``data`` item is a tuple of the form ``(target, data)`` for processing
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
459 instructions, where ``target`` is the target of the PI (used to identify the
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
460 application by which the instruction should be processed), and ``data`` is text
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
461 following the target (excluding the terminating question mark):
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
462
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
463 .. code-block:: python
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
464
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
465 PI, (u'php', u'echo "Yo" '), pos
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
466
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
467 START_CDATA
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
468 -----------
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
469 Marks the beginning of a ``CDATA`` section.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
470
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
471 The ``data`` item for such events is always ``None``:
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
472
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
473 .. code-block:: python
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
474
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
475 START_CDATA, None, pos
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
476
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
477 END_CDATA
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
478 ---------
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
479 Marks the end of a ``CDATA`` section.
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
480
820
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
481 The ``data`` item for such events is always ``None``:
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
482
1837f39efd6f Sync (old) experimental inline branch with trunk@1027.
cmlenz
parents: 500
diff changeset
483 .. code-block:: python
395
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
484
55cf81951686 inline branch: Merged [439:479/trunk].
cmlenz
parents: 230
diff changeset
485 END_CDATA, None, pos