annotate doc/streams.txt @ 453:432b00a8357b stable-0.4.x 0.4.0

Branch for 0.4.x releases.
author cmlenz
date Mon, 16 Apr 2007 13:53:07 +0000
parents 2c38ec4e2dff
children 5fbc1cde74d6 26cf27d4f2b3
.. -*- mode: rst; encoding: utf-8 -*-

==============
Markup Streams
==============

A stream is the common representation of markup as a *stream of events*.


.. contents:: Contents
   :depth: 1
.. sectnum::


Basics
======

A stream can be obtained in a number of ways. It can be:

* the result of parsing XML or HTML text, or
* the result of selecting a subset of another stream using XPath, or
* programmatically generated.

For example, the functions ``XML()`` and ``HTML()`` can be used to convert
literal XML or HTML text to a markup stream::

  >>> from genshi import XML
  >>> stream = XML('<p class="intro">Some text and '
  ...              '<a href="http://example.org/">a link</a>.'
  ...              '<br/></p>')
  >>> stream
  <genshi.core.Stream object at ...>

The stream is the result of parsing the text into events. Each event is a tuple
of the form ``(kind, data, pos)``, where:

* ``kind`` defines what kind of event it is (such as the start of an element,
  text, a comment, etc.).
* ``data`` is the actual data associated with the event. How this looks depends
  on the event kind (see `event kinds`_).
* ``pos`` is a ``(filename, lineno, column)`` tuple that describes where the
  event “comes from”.

::

  >>> for kind, data, pos in stream:
  ...     print kind, `data`, pos
  ...
  START (QName(u'p'), Attrs([(QName(u'class'), u'intro')])) (None, 1, 0)
  TEXT u'Some text and ' (None, 1, 17)
  START (QName(u'a'), Attrs([(QName(u'href'), u'http://example.org/')])) (None, 1, 31)
  TEXT u'a link' (None, 1, 61)
  END QName(u'a') (None, 1, 67)
  TEXT u'.' (None, 1, 71)
  START (QName(u'br'), Attrs()) (None, 1, 72)
  END QName(u'br') (None, 1, 77)
  END QName(u'p') (None, 1, 77)
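
As a rough illustration of the event structure described above, the following
sketch builds a few events by hand as plain tuples and unpacks them. The kind
strings and bare tag names are simplified stand-ins chosen for this example
(Genshi itself uses ``QName`` and ``Attrs`` objects, as the output above
shows), and ``describe`` is a hypothetical helper, not part of Genshi.

```python
# Hand-written stand-ins for stream events: (kind, data, pos) tuples.
# The kind strings and plain-string tag names are simplifications for
# illustration; they are not Genshi's own objects.
events = [
    ('START', ('p', [('class', 'intro')]), (None, 1, 0)),
    ('TEXT', 'Some text', (None, 1, 17)),
    ('END', 'p', (None, 1, 26)),
]


def describe(event):
    """Return a short, human-readable summary of a single event."""
    kind, data, pos = event
    filename, lineno, column = pos
    return '%s at line %d, column %d' % (kind, lineno, column)


summaries = [describe(event) for event in events]
for summary in summaries:
    print(summary)
```

The point of the sketch is only that every event unpacks into the same three
items, and that ``pos`` unpacks again into a file name, line, and column.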


Filtering
=========

One important feature of markup streams is that you can apply *filters* to the
stream, either filters that come with Genshi, or your own custom filters.

A filter is simply a callable that accepts the stream as a parameter and returns
the filtered stream::

  def noop(stream):
      """A filter that doesn't actually do anything with the stream."""
      for kind, data, pos in stream:
          yield kind, data, pos
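
A filter that actually changes something follows the same shape. The sketch
below is a minimal example, assuming events are plain ``(kind, data, pos)``
tuples and that text events can be recognised by comparing ``kind`` with the
string ``'TEXT'``; both are simplifications for illustration, and
``upper_text`` is a hypothetical name rather than a filter shipped with
Genshi.

```python
def upper_text(stream):
    """Hypothetical filter: upper-case text events, pass others through."""
    for kind, data, pos in stream:
        if kind == 'TEXT':
            # Events are not modified in place; a new tuple is yielded.
            yield kind, data.upper(), pos
        else:
            yield kind, data, pos


# Plain tuples standing in for a parsed stream (illustrative only).
events = [
    ('START', ('p', []), (None, 1, 0)),
    ('TEXT', 'a link', (None, 1, 3)),
    ('END', 'p', (None, 1, 9)),
]
filtered = list(upper_text(events))
print(filtered[1][1])
```

Because the filter is a generator, nothing is processed until the result is
iterated over, which is why the example wraps the call in ``list()``.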

Filters can be applied in a number of ways. The simplest is to just call the
filter directly::

  stream = noop(stream)

The ``Stream`` class also provides a ``filter()`` method, which takes an
arbitrary number of filter callables and applies them all::

  stream = stream.filter(noop)

Finally, filters can also be applied using the *bitwise or* operator (``|``),
which allows a syntax similar to pipes on Unix shells::

  stream = stream | noop

One example of a filter included with Genshi is the ``HTMLSanitizer`` in
``genshi.filters``. It processes a stream of HTML markup, and strips out any
potentially dangerous constructs, such as JavaScript event handlers.
``HTMLSanitizer`` is not a function, but rather a class that implements
``__call__``, which means instances of the class are callable::

  stream = stream | HTMLSanitizer()
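
The same pattern can be used for custom filters that need configuration. The
following is only a sketch of the idea, not the way ``HTMLSanitizer`` is
implemented: ``DropAttributes`` is a hypothetical class, and the events are
plain tuples with string kinds and a list of ``(name, value)`` pairs standing
in for Genshi's own objects.

```python
class DropAttributes(object):
    """Hypothetical filter class: removes the named attributes from
    START events and passes everything else through unchanged."""

    def __init__(self, names):
        self.names = set(names)

    def __call__(self, stream):
        for kind, data, pos in stream:
            if kind == 'START':
                tag, attrs = data
                attrs = [(name, value) for name, value in attrs
                         if name not in self.names]
                yield kind, (tag, attrs), pos
            else:
                yield kind, data, pos


events = [
    ('START', ('a', [('href', '/'), ('onclick', 'evil()')]), (None, 1, 0)),
    ('TEXT', 'a link', (None, 1, 30)),
    ('END', 'a', (None, 1, 36)),
]
drop_onclick = DropAttributes(['onclick'])   # configure once
cleaned = list(drop_onclick(events))         # then call it like a function
print(cleaned[0][1])
```

Keeping the configuration in ``__init__`` and the processing in ``__call__``
means one configured instance can be applied to any number of streams.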

Both the ``filter()`` method and the pipe operator allow easy chaining of
filters::

  from genshi.filters import HTMLSanitizer
  stream = stream.filter(noop, HTMLSanitizer())

That is equivalent to::

  stream = stream | noop | HTMLSanitizer()
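
To see why the two spellings can be equivalent, here is a toy model of a
stream class that supports both. ``MiniStream`` and ``drop_text`` are
hypothetical names invented for this sketch; it assumes nothing about how
Genshi's ``Stream`` is written beyond the behaviour described above.

```python
class MiniStream(object):
    """Toy stand-in for a stream class; not Genshi's implementation."""

    def __init__(self, events):
        self.events = events

    def __iter__(self):
        return iter(self.events)

    def filter(self, *filters):
        """Apply the given filters from left to right."""
        stream = self
        for func in filters:
            stream = MiniStream(func(stream))
        return stream

    def __or__(self, func):
        """Make ``stream | func`` mean "apply func to this stream"."""
        return MiniStream(func(self))


def noop(stream):
    for event in stream:
        yield event


def drop_text(stream):
    """Hypothetical filter that removes TEXT events."""
    for kind, data, pos in stream:
        if kind != 'TEXT':
            yield kind, data, pos


events = [('START', ('p', []), None), ('TEXT', 'hi', None), ('END', 'p', None)]
piped = list(MiniStream(events) | noop | drop_text)
chained = list(MiniStream(events).filter(noop, drop_text))
print(piped == chained)
```

In this model each ``|`` wraps the filter's result in a new stream object, so
the next ``|`` in the chain has something to operate on.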

For more information about the built-in filters, see `Stream Filters`_.

.. _`Stream Filters`: filters.html


Serialization
=============

Serialization means producing some kind of textual output from a stream of
events, which you'll need when you want to transmit or store the results of
generating or otherwise processing markup.

The ``Stream`` class provides two methods for serialization: ``serialize()`` and
``render()``. The former is a generator that yields the output in chunks, as
``Markup`` objects (which are basically unicode strings that are considered
safe for output on the web). The latter returns a single string, by default
UTF-8 encoded.

Here's the output from ``serialize()``::

  >>> for output in stream.serialize():
  ...     print `output`
  ...
  <Markup u'<p class="intro">'>
  <Markup u'Some text and '>
  <Markup u'<a href="http://example.org/">'>
  <Markup u'a link'>
  <Markup u'</a>'>
  <Markup u'.'>
  <Markup u'<br/>'>
  <Markup u'</p>'>

And here's the output from ``render()``::

  >>> print stream.render()
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br/></p>
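
The relationship between the two methods can be sketched with a toy
serializer. The functions below are illustrative stand-ins written for this
example, not Genshi's code: they work on plain tuples, ignore escaping and
empty elements, and only aim to show a generator of chunks next to a function
that joins and optionally encodes them.

```python
def serialize(events):
    """Toy serializer: lazily yield one unicode chunk per event.

    Escaping of special characters is deliberately left out.
    """
    for kind, data, pos in events:
        if kind == 'START':
            tag, attrs = data
            attr_text = ''.join(' %s="%s"' % item for item in attrs)
            yield u'<%s%s>' % (tag, attr_text)
        elif kind == 'END':
            yield u'</%s>' % data
        elif kind == 'TEXT':
            yield data


def render(events, encoding='utf-8'):
    """Join all chunks; encode the result unless ``encoding`` is None."""
    output = u''.join(serialize(events))
    if encoding is not None:
        output = output.encode(encoding)
    return output


events = [
    ('START', ('a', [('href', '/')]), None),
    ('TEXT', u'a link', None),
    ('END', 'a', None),
]
chunks = list(serialize(events))
encoded = render(events)
decoded = render(events, encoding=None)
print(decoded)
```

Yielding chunks lets a caller write output as it is produced, while the
joining variant is more convenient when the whole document is needed at once.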

Both methods can be passed a ``method`` parameter that determines how exactly
the events are serialized to text. This parameter can be one of “xml” (the
default), “xhtml”, “html”, “text”, or a custom serializer class::

  >>> print stream.render('html')
  <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br></p>

Note how the ``<br>`` element isn't closed, which is the right thing to do for
HTML.

In addition, the ``render()`` method takes an ``encoding`` parameter, which
defaults to “UTF-8”. If set to ``None``, the result will be a unicode string.

The different serializer classes in ``genshi.output`` can also be used
directly::

  >>> from genshi.filters import HTMLSanitizer
  >>> from genshi.output import TextSerializer
  >>> print ''.join(TextSerializer()(HTMLSanitizer()(stream)))
  Some text and a link.

The pipe operator allows a nicer syntax::

  >>> print stream | HTMLSanitizer() | TextSerializer()
  Some text and a link.


Serialization Options
---------------------

Both ``serialize()`` and ``render()`` support additional keyword arguments that
are passed through to the initializer of the serializer class. The following
options are supported by the built-in serializers:

``strip_whitespace``
  Whether the serializer should remove trailing spaces and empty lines. Defaults
  to ``True``.

  (This option is not available for serialization to plain text.)

``doctype``
  A ``(name, pubid, sysid)`` tuple defining the name, public identifier, and
  system identifier of a ``DOCTYPE`` declaration to prepend to the generated
  output. If provided, this declaration will override any ``DOCTYPE``
  declaration in the stream.

  (This option is not available for serialization to plain text.)

``namespace_prefixes``
  The namespace prefixes to use for namespaces that are not bound to a prefix
  in the stream itself.

  (This option is not available for serialization to HTML or plain text.)
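
To make the ``doctype`` tuple more concrete, the sketch below shows how its
three items map onto the parts of a ``DOCTYPE`` declaration in XML syntax.
``doctype_declaration`` is a hypothetical helper written for this example; it
is not part of Genshi and does not claim to match the exact output of the
built-in serializers.

```python
def doctype_declaration(doctype):
    """Format a (name, pubid, sysid) tuple as a DOCTYPE declaration.

    Illustrative helper only; not how Genshi's serializers are written.
    """
    name, pubid, sysid = doctype
    parts = ['<!DOCTYPE %s' % name]
    if pubid:
        parts.append(' PUBLIC "%s"' % pubid)
    elif sysid:
        parts.append(' SYSTEM')
    if sysid:
        parts.append(' "%s"' % sysid)
    parts.append('>')
    return ''.join(parts)


html_strict = ('html', '-//W3C//DTD HTML 4.01//EN',
               'http://www.w3.org/TR/html4/strict.dtd')
declaration = doctype_declaration(html_strict)
print(declaration)
```

A tuple with only a name, such as ``('html', None, None)``, yields the short
form ``<!DOCTYPE html>`` in this sketch.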



Using XPath
===========

XPath can be used to extract a specific subset of the stream via the
``select()`` method::

  >>> substream = stream.select('a')
  >>> substream
  <genshi.core.Stream object at ...>
  >>> print substream
  <a href="http://example.org/">a link</a>

Often, streams cannot be reused: in the above example, the sub-stream is based
on a generator. Once it has been serialized, it will have been fully consumed,
and cannot be rendered again. To work around this, you can wrap such a stream
in a ``list``::

  >>> from genshi import Stream
  >>> substream = Stream(list(stream.select('a')))
  >>> substream
  <genshi.core.Stream object at ...>
  >>> print substream
  <a href="http://example.org/">a link</a>
  >>> print substream.select('@href')
  http://example.org/
  >>> print substream.select('text()')
  a link
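
The underlying behaviour is that of any Python generator and can be shown
without Genshi at all. In the sketch below, ``select_text`` is a hypothetical
stand-in for a lazy selection, and the events are plain tuples used only for
illustration.

```python
def select_text(events):
    """Hypothetical selection: lazily yield only the TEXT events."""
    for kind, data, pos in events:
        if kind == 'TEXT':
            yield kind, data, pos


events = [('START', ('a', []), None), ('TEXT', 'a link', None), ('END', 'a', None)]

lazy = select_text(events)
first_pass = list(lazy)    # consumes the generator
second_pass = list(lazy)   # nothing is left to iterate over

buffered = list(select_text(events))   # materialize the events once
reuse_a = list(buffered)
reuse_b = list(buffered)
print(len(first_pass), len(second_pass), len(reuse_a), len(reuse_b))
```

Buffering trades memory for reusability, so it is best reserved for
sub-streams that really are processed more than once.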

See `Using XPath in Genshi`_ for more information about the XPath support in
Genshi.

.. _`Using XPath in Genshi`: xpath.html


.. _`event kinds`:

Event Kinds
===========

Every event in a stream is of one of several *kinds*, which also determines
what the ``data`` item of the event tuple looks like. The different kinds of
events are documented below.

.. note:: The ``data`` item is generally immutable. If the data is to be
   modified when processing a stream, it must be replaced by a new tuple.
   Effectively, this means the entire event tuple is immutable.
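
In practice this means building a replacement event rather than assigning
into an existing one. The following sketch uses a hand-written tuple with a
plain-string kind as a stand-in for a real event; it only demonstrates
ordinary Python tuple behaviour.

```python
event = ('TEXT', 'a link', (None, 1, 61))

try:
    event[1] = 'another link'      # tuples do not support item assignment
    mutated = True
except TypeError:
    mutated = False

# Build a replacement event instead, reusing the kind and position.
kind, data, pos = event
replacement = (kind, 'another link', pos)
print(mutated, replacement)
```

The original event is left untouched, so other consumers holding a reference
to it still see the unmodified data.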

START
-----
The opening tag of an element.

For this kind of event, the ``data`` item is a tuple of the form
``(tagname, attrs)``, where ``tagname`` is a ``QName`` instance describing the
qualified name of the tag, and ``attrs`` is an ``Attrs`` instance containing
the attribute names and values associated with the tag (excluding namespace
declarations)::

  START, (QName(u'p'), Attrs([(u'class', u'intro')])), pos

END
---
The closing tag of an element.

The ``data`` item of end events consists of just a ``QName`` instance
describing the qualified name of the tag::

265 END, QName(u'p'), pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
266
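Because ``START`` events carry a ``(tagname, attrs)`` tuple while ``END`` events carry only the tag name, a consumer can pair them up with a stack. The sketch below assumes plain strings for the kinds and for tag names (instead of ``QName`` instances); ``is_balanced`` is a hypothetical helper, not part of the library.

```python
# Plain strings stand in for the real kind constants (assumption for this sketch).
START, END, TEXT = 'START', 'END', 'TEXT'


def is_balanced(stream):
    """Return True if every START event is closed by a matching END event."""
    open_tags = []
    for kind, data, pos in stream:
        if kind == START:
            tagname, attrs = data      # START data is a (tagname, attrs) tuple
            open_tags.append(tagname)
        elif kind == END:
            # END data is just the tag name; it must match the innermost open tag.
            if not open_tags or open_tags.pop() != data:
                return False
    return not open_tags


pos = (None, -1, -1)  # placeholder position
good = [
    (START, ('p', [('class', 'intro')]), pos),
    (TEXT, 'Hello', pos),
    (END, 'p', pos),
]
bad = [(START, ('p', []), pos), (END, 'div', pos)]
```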
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
267 TEXT
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
268 ----
394
cab6b0256019 Minor doc fixes.
cmlenz
parents: 382
diff changeset
269 Character data outside of markup tags and comments, such as the text content of an element.
382
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
270
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
271 For text events, the ``data`` item should be a unicode object::
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
272
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
273 TEXT, u'Hello, world!', pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
274
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
275 START_NS
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
276 --------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
277 The start of a namespace mapping, binding a namespace prefix to a URI.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
278
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
279 The ``data`` item of this kind of event is a tuple of the form
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
280 ``(prefix, uri)``, where ``prefix`` is the namespace prefix and ``uri`` is the
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
281 full URI to which the prefix is bound. Both should be unicode objects. If the
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
282 namespace is not bound to any prefix, the ``prefix`` item is an empty string::
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
283
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
284 START_NS, (u'svg', u'http://www.w3.org/2000/svg'), pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
285
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
286 END_NS
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
287 ------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
288 The end of a namespace mapping.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
289
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
290 The ``data`` item of such events consists of only the namespace prefix (a
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
291 unicode object)::
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
292
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
293 END_NS, u'svg', pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
294
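One way to use the ``START_NS``/``END_NS`` pair is to track which prefixes are in scope at each point in the stream. The following is a sketch under the assumption that kinds are plain strings; ``track_namespaces`` is a hypothetical helper name.

```python
# Plain strings stand in for the real kind constants (assumption for this sketch).
START_NS, END_NS = 'START_NS', 'END_NS'


def track_namespaces(stream):
    """Yield (event, bindings) pairs; bindings maps each prefix in scope to its URI."""
    scopes = {}  # prefix -> stack of URIs, innermost binding last
    for event in stream:
        kind, data, pos = event
        if kind == START_NS:
            prefix, uri = data         # START_NS data is a (prefix, uri) tuple
            scopes.setdefault(prefix, []).append(uri)
        elif kind == END_NS:
            scopes[data].pop()         # END_NS data is only the prefix
        yield event, {p: uris[-1] for p, uris in scopes.items() if uris}


pos = (None, -1, -1)  # placeholder position
ns_events = [
    (START_NS, ('svg', 'http://www.w3.org/2000/svg'), pos),
    (END_NS, 'svg', pos),
]
snapshots = [bindings for _, bindings in track_namespaces(ns_events)]
```

A stack per prefix is used so that an inner rebinding of the same prefix does not lose the outer binding once the inner mapping ends.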
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
295 DOCTYPE
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
296 -------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
297 A document type declaration.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
298
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
299 For this kind of event, the ``data`` item is a tuple of the form
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
300 ``(name, pubid, sysid)``, where ``name`` is the name of the root element,
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
301 ``pubid`` is the public identifier of the DTD (or ``None``), and ``sysid`` is
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
302 the system identifier of the DTD (or ``None``)::
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
303
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
304 DOCTYPE, (u'html', u'-//W3C//DTD XHTML 1.0 Transitional//EN', \
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
305 u'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'), pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
306
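To illustrate how the ``(name, pubid, sysid)`` tuple relates to the declaration text, here is a small sketch that turns the tuple back into markup. ``format_doctype`` is a hypothetical helper written for this example; it is not the serializer the library itself uses.

```python
def format_doctype(data):
    """Build a <!DOCTYPE ...> declaration from a (name, pubid, sysid) tuple."""
    name, pubid, sysid = data
    parts = ['<!DOCTYPE', name]
    if pubid:
        parts.append('PUBLIC "%s"' % pubid)
        if sysid:
            parts.append('"%s"' % sysid)
    elif sysid:
        # Without a public identifier, the system identifier uses SYSTEM.
        parts.append('SYSTEM "%s"' % sysid)
    return ' '.join(parts) + '>'


xhtml = format_doctype((
    'html',
    '-//W3C//DTD XHTML 1.0 Transitional//EN',
    'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd',
))
```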
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
307 COMMENT
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
308 -------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
309 A comment.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
310
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
311 For such events, the ``data`` item is a unicode object containing all character
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
312 data between the comment delimiters::
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
313
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
314 COMMENT, u'Commented out', pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
315
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
316 PI
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
317 --
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
318 A processing instruction.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
319
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
320 For processing instructions, the ``data`` item is a tuple of the form
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
321 ``(target, data)``, where ``target`` is the target of the PI (used to identify
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
322 the application by which the instruction should be processed), and ``data`` is
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
323 the text following the target (excluding the terminating question mark)::
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
324
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
325 PI, (u'php', u'echo "Yo" '), pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
326
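The relationship between the tuple and the original markup can be sketched by writing the instruction back out. ``format_pi`` is a hypothetical helper for illustration only.

```python
def format_pi(data):
    """Build a processing instruction string from a (target, text) tuple."""
    target, text = data
    if text:
        # The terminating question mark is not part of the data, so add it back.
        return '<?%s %s?>' % (target, text)
    return '<?%s?>' % target


php = format_pi(('php', 'echo "Yo" '))
```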
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
327 START_CDATA
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
328 -----------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
329 Marks the beginning of a ``CDATA`` section.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
330
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
331 The ``data`` item for such events is always ``None``::
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
332
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
333 START_CDATA, None, pos
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
334
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
335 END_CDATA
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
336 ---------
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
337 Marks the end of a ``CDATA`` section.
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
338
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
339 The ``data`` item for such events is always ``None``::
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
340
2682dabbcd04 * Added documentation for the various stream event kinds.
cmlenz
parents: 230
diff changeset
341 END_CDATA, None, pos
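
Since neither CDATA event carries data, the section's content arrives as ordinary ``TEXT`` events in between. A consumer therefore has to remember whether it is currently inside a section. The sketch below assumes plain strings for the kinds and uses the standard library's ``escape`` function; ``render_text`` is a hypothetical name.

```python
from xml.sax.saxutils import escape

# Plain strings stand in for the real kind constants (assumption for this sketch).
TEXT, START_CDATA, END_CDATA = 'TEXT', 'START_CDATA', 'END_CDATA'


def render_text(stream):
    """Serialize TEXT events, escaping them unless they are inside a CDATA section."""
    out = []
    in_cdata = False
    for kind, data, pos in stream:
        if kind == START_CDATA:
            in_cdata = True
            out.append('<![CDATA[')
        elif kind == END_CDATA:
            in_cdata = False
            out.append(']]>')
        elif kind == TEXT:
            # Text inside a CDATA section is emitted verbatim.
            out.append(data if in_cdata else escape(data))
    return ''.join(out)


pos = (None, -1, -1)  # placeholder position
cdata_events = [
    (TEXT, 'a < b', pos),
    (START_CDATA, None, pos),
    (TEXT, 'x < y', pos),
    (END_CDATA, None, pos),
]
rendered = render_text(cdata_events)
```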
Copyright (C) 2012-2017 Edgewall Software