comparison doc/streams.txt @ 230:84168828b074 trunk
Renamed Markup to Genshi in repository.
author | cmlenz |
---|---|
date | Mon, 11 Sep 2006 15:07:07 +0000 |
parents | 4d8a9e03b23d |
children | 2682dabbcd04 a81675590258 |
229:58d974683419 | 230:84168828b074 |
---|---|
23 expression. | 23 expression. |
24 | 24 |
25 For example, the functions ``XML()`` and ``HTML()`` can be used to convert | 25 For example, the functions ``XML()`` and ``HTML()`` can be used to convert |
26 literal XML or HTML text to a markup stream:: | 26 literal XML or HTML text to a markup stream:: |
27 | 27 |
28 >>> from markup import XML | 28 >>> from genshi import XML |
29 >>> stream = XML('<p class="intro">Some text and ' | 29 >>> stream = XML('<p class="intro">Some text and ' |
30 ... '<a href="http://example.org/">a link</a>.' | 30 ... '<a href="http://example.org/">a link</a>.' |
31 ... '<br/></p>') | 31 ... '<br/></p>') |
32 >>> stream | 32 >>> stream |
33 <markup.core.Stream object at 0x6bef0> | 33 <genshi.core.Stream object at 0x6bef0> |
34 | 34 |
35 The stream is the result of parsing the text into events. Each event is a tuple | 35 The stream is the result of parsing the text into events. Each event is a tuple |
36 of the form ``(kind, data, pos)``, where: | 36 of the form ``(kind, data, pos)``, where: |
37 | 37 |
38 * ``kind`` defines what kind of event it is (such as the start of an element, | 38 * ``kind`` defines what kind of event it is (such as the start of an element, |
60 | 60 |
61 Filtering | 61 Filtering |
62 ========= | 62 ========= |
63 | 63 |
64 One important feature of markup streams is that you can apply *filters* to the | 64 One important feature of markup streams is that you can apply *filters* to the |
65 stream, either filters that come with Markup, or your own custom filters. | 65 stream, either filters that come with Genshi, or your own custom filters. |
66 | 66 |
67 A filter is simply a callable that accepts the stream as parameter, and returns | 67 A filter is simply a callable that accepts the stream as parameter, and returns |
68 the filtered stream:: | 68 the filtered stream:: |
69 | 69 |
70 def noop(stream): | 70 def noop(stream): |
85 Finally, filters can also be applied using the *bitwise or* operator (``|``), | 85 Finally, filters can also be applied using the *bitwise or* operator (``|``), |
86 which allows a syntax similar to pipes on Unix shells:: | 86 which allows a syntax similar to pipes on Unix shells:: |
87 | 87 |
88 stream = stream | noop | 88 stream = stream | noop |
89 | 89 |
90 One example of a filter included with Markup is the ``HTMLSanitizer`` in | 90 One example of a filter included with Genshi is the ``HTMLSanitizer`` in |
91 ``markup.filters``. It processes a stream of HTML markup, and strips out any | 91 ``genshi.filters``. It processes a stream of HTML markup, and strips out any |
92 potentially dangerous constructs, such as Javascript event handlers. | 92 potentially dangerous constructs, such as Javascript event handlers. |
93 ``HTMLSanitizer`` is not a function, but rather a class that implements | 93 ``HTMLSanitizer`` is not a function, but rather a class that implements |
94 ``__call__``, which means instances of the class are callable. | 94 ``__call__``, which means instances of the class are callable. |
95 | 95 |
96 Both the ``filter()`` method and the pipe operator allow easy chaining of | 96 Both the ``filter()`` method and the pipe operator allow easy chaining of |
97 filters:: | 97 filters:: |
98 | 98 |
99 from markup.filters import HTMLSanitizer | 99 from genshi.filters import HTMLSanitizer |
100 stream = stream.filter(noop, HTMLSanitizer()) | 100 stream = stream.filter(noop, HTMLSanitizer()) |
101 | 101 |
102 That is equivalent to:: | 102 That is equivalent to:: |
103 | 103 |
104 stream = stream | noop | HTMLSanitizer() | 104 stream = stream | noop | HTMLSanitizer() |
107 Serialization | 107 Serialization |
108 ============= | 108 ============= |
109 | 109 |
110 The ``Stream`` class provides two methods for serializing this list of events: | 110 The ``Stream`` class provides two methods for serializing this list of events: |
111 ``serialize()`` and ``render()``. The former is a generator that yields chunks | 111 ``serialize()`` and ``render()``. The former is a generator that yields chunks |
112 of ``Markup`` objects (which are basically unicode strings). The latter returns | 112 of ``Markup`` objects (which are basically unicode strings that are considered |
113 a single string, by default UTF-8 encoded. | 113 safe for output on the web). The latter returns a single string, by default |
114 UTF-8 encoded. | |
114 | 115 |
115 Here's the output from ``serialize()``:: | 116 Here's the output from ``serialize()``:: |
116 | 117 |
117 >>> for output in stream.serialize(): | 118 >>> for output in stream.serialize(): |
118 ... print `output` | 119 ... print `output` |
142 HTML. | 143 HTML. |
143 | 144 |
144 In addition, the ``render()`` method takes an ``encoding`` parameter, which | 145 In addition, the ``render()`` method takes an ``encoding`` parameter, which |
145 defaults to “UTF-8”. If set to ``None``, the result will be a unicode string. | 146 defaults to “UTF-8”. If set to ``None``, the result will be a unicode string. |
146 | 147 |
147 The different serializer classes in ``markup.output`` can also be used | 148 The different serializer classes in ``genshi.output`` can also be used |
148 directly:: | 149 directly:: |
149 | 150 |
150 >>> from markup.filters import HTMLSanitizer | 151 >>> from genshi.filters import HTMLSanitizer |
151 >>> from markup.output import TextSerializer | 152 >>> from genshi.output import TextSerializer |
152 >>> print TextSerializer()(HTMLSanitizer()(stream)) | 153 >>> print TextSerializer()(HTMLSanitizer()(stream)) |
153 Some text and a link. | 154 Some text and a link. |
154 | 155 |
155 The pipe operator allows a nicer syntax:: | 156 The pipe operator allows a nicer syntax:: |
156 | 157 |
163 XPath can be used to extract a specific subset of the stream via the | 164 XPath can be used to extract a specific subset of the stream via the |
164 ``select()`` method:: | 165 ``select()`` method:: |
165 | 166 |
166 >>> substream = stream.select('a') | 167 >>> substream = stream.select('a') |
167 >>> substream | 168 >>> substream |
168 <markup.core.Stream object at 0x7118b0> | 169 <genshi.core.Stream object at 0x7118b0> |
169 >>> print substream | 170 >>> print substream |
170 <a href="http://example.org/">a link</a> | 171 <a href="http://example.org/">a link</a> |
171 | 172 |
172 Often, streams cannot be reused: in the above example, the sub-stream is based | 173 Often, streams cannot be reused: in the above example, the sub-stream is based |
173 on a generator. Once it has been serialized, it will have been fully consumed, | 174 on a generator. Once it has been serialized, it will have been fully consumed, |
174 and cannot be rendered again. To work around this, you can wrap such a stream | 175 and cannot be rendered again. To work around this, you can wrap such a stream |
175 in a ``list``:: | 176 in a ``list``:: |
176 | 177 |
177 >>> from markup import Stream | 178 >>> from genshi import Stream |
178 >>> substream = Stream(list(stream.select('a'))) | 179 >>> substream = Stream(list(stream.select('a'))) |
179 >>> substream | 180 >>> substream |
180 <markup.core.Stream object at 0x7118b0> | 181 <genshi.core.Stream object at 0x7118b0> |
181 >>> print substream | 182 >>> print substream |
182 <a href="http://example.org/">a link</a> | 183 <a href="http://example.org/">a link</a> |
183 >>> print substream.select('@href') | 184 >>> print substream.select('@href') |
184 http://example.org/ | 185 http://example.org/ |
185 >>> print substream.select('text()') | 186 >>> print substream.select('text()') |