comparison doc/streams.txt @ 745:74b5c5476ddb trunk

Preparing for [milestone:0.5] release.
author cmlenz
date Mon, 09 Jun 2008 09:50:03 +0000
parents be0b4a7b2fd4
children f459f22f7ad2
comparison
744:cd6624cf2f7c 745:74b5c5476ddb
6 6
7 A stream is the common representation of markup as a *stream of events*. 7 A stream is the common representation of markup as a *stream of events*.
8 8
9 9
10 .. contents:: Contents 10 .. contents:: Contents
11 :depth: 1 11 :depth: 2
12 .. sectnum:: 12 .. sectnum::
13 13
14 14
15 Basics 15 Basics
16 ====== 16 ======
130 130
131 Serialization means producing some kind of textual output from a stream of 131 Serialization means producing some kind of textual output from a stream of
132 events, which you'll need when you want to transmit or store the results of 132 events, which you'll need when you want to transmit or store the results of
133 generating or otherwise processing markup. 133 generating or otherwise processing markup.
134 134
135 The ``Stream`` class provides two methods for serialization: ``serialize()`` and 135 The ``Stream`` class provides two methods for serialization: ``serialize()``
136 ``render()``. The former is a generator that yields chunks of ``Markup`` objects 136 and ``render()``. The former is a generator that yields chunks of ``Markup``
137 (which are basically unicode strings that are considered safe for output on the 137 objects (which are basically unicode strings that are considered safe for
138 web). The latter returns a single string, by default UTF-8 encoded. 138 output on the web). The latter returns a single string, by default UTF-8
139 encoded.
139 140
140 Here's the output from ``serialize()``: 141 Here's the output from ``serialize()``:
141 142
142 .. code-block:: pycon 143 .. code-block:: pycon
143 144
159 160
160 >>> print stream.render() 161 >>> print stream.render()
161 <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br/></p> 162 <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br/></p>
162 163
163 Both methods can be passed a ``method`` parameter that determines how exactly 164 Both methods can be passed a ``method`` parameter that determines how exactly
164 the events are serialzed to text. This parameter can be either “xml” (the 165 the events are serialized to text. This parameter can be either a string or a
165 default), “xhtml”, “html”, “text”, or a custom serializer class: 166 custom serializer class:
166 167
167 .. code-block:: pycon 168 .. code-block:: pycon
168 169
169 >>> print stream.render('html') 170 >>> print stream.render('html')
170 <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br></p> 171 <p class="intro">Some text and <a href="http://example.org/">a link</a>.<br></p>
171 172
172 Note how the `<br>` element isn't closed, which is the right thing to do for 173 Note how the `<br>` element isn't closed, which is the right thing to do for
173 HTML. 174 HTML. See `serialization methods`_ for more details.
174 175
175 In addition, the ``render()`` method takes an ``encoding`` parameter, which 176 In addition, the ``render()`` method takes an ``encoding`` parameter, which
176 defaults to “UTF-8”. If set to ``None``, the result will be a unicode string. 177 defaults to “UTF-8”. If set to ``None``, the result will be a unicode string.
177 178
178 The different serializer classes in ``genshi.output`` can also be used 179 The different serializer classes in ``genshi.output`` can also be used
191 192
192 >>> print stream | HTMLSanitizer() | TextSerializer() 193 >>> print stream | HTMLSanitizer() | TextSerializer()
193 Some text and a link. 194 Some text and a link.
194 195
195 196
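The split between ``serialize()`` (a generator of chunks) and ``render()`` (one joined, optionally encoded string) can be sketched with a toy model. This is not Genshi's implementation: the event tuples, the ``START``/``END``/``TEXT`` constants and both functions below are hypothetical stand-ins that assume only what the text above states.

```python
from html import escape

# Hypothetical event kinds for this toy model (not Genshi's own constants).
START, END, TEXT = "START", "END", "TEXT"


def serialize(events):
    """Yield one chunk of XML text per event, as a generator."""
    for kind, data in events:
        if kind == START:
            tag, attrs = data
            attr_text = "".join(
                ' %s="%s"' % (name, escape(value, quote=True))
                for name, value in attrs
            )
            yield "<%s%s>" % (tag, attr_text)
        elif kind == END:
            yield "</%s>" % data
        elif kind == TEXT:
            yield escape(data, quote=False)


def render(events, encoding="utf-8"):
    """Join all chunks; return bytes, or a text string if encoding is None."""
    text = "".join(serialize(events))
    return text if encoding is None else text.encode(encoding)


events = [
    (START, ("p", [("class", "intro")])),
    (TEXT, "Caf\u00e9 & more"),
    (END, "p"),
]
```

Here ``list(serialize(events))`` produces three separate chunks, ``render(events)`` returns a UTF-8 encoded byte string, and ``render(events, encoding=None)`` returns the same markup as an unencoded string.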
197 .. _`serialization methods`:
198
199 Serialization Methods
200 ---------------------
201
202 Genshi supports several serialization methods for creating a text
203 representation of a markup stream.
204
205 ``xml``
206 The ``XMLSerializer`` is the default serialization method and results in
207 proper XML output including namespace support, the XML declaration, CDATA
208 sections, and so on. It is generally not suitable for serving HTML or
209 XHTML web pages (unless you want to use true XHTML 1.1), for which the
210 ``xhtml`` and ``html`` serializers described below should be preferred.
211
212 ``xhtml``
213 The ``XHTMLSerializer`` is a specialization of the generic ``XMLSerializer``
214 that understands the peculiarities of producing XML-compliant output that can
215 also be parsed without problems by the HTML parsers found in modern web
216 browsers. Thus, the output by this serializer should be usable whether sent
217 as "text/html" or "application/xhtml+xml" (although there are a lot of
218 subtle issues to pay attention to when switching between the two, in
219 particular with respect to differences in the DOM and CSS).
220
221 For example, instead of rendering a script tag as ``<script/>`` (which
222 confuses the HTML parser in many browsers), it will produce
223 ``<script></script>``. Also, it will normalize any boolean attribute values
224 that are minimized in HTML, so that for example ``<hr noshade="1"/>``
225 becomes ``<hr noshade="noshade" />``.
226
227 This serializer supports the use of namespaces for compound documents, for
228 example to use inline SVG inside an XHTML document.
229
230 ``html``
231 The ``HTMLSerializer`` produces proper HTML markup. The main differences
232 compared to ``xhtml`` serialization are that boolean attributes are
233 minimized, empty tags are not self-closing (so it's ``<br>`` instead of
234 ``<br />``), and that the contents of ``<script>`` and ``<style>`` elements
235 are not escaped.
236
237 ``text``
238 The ``TextSerializer`` produces plain text from markup streams. This is
239 useful primarily for `text templates`_, but can also be used to produce
240 plain text output from markup templates or other sources.
241
242 .. _`text templates`: text-templates.html
243
244
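The differences between the ``xml``, ``xhtml`` and ``html`` methods described above can be sketched for a single childless element. The helper ``render_element`` and the two small element and attribute sets are hypothetical and cover only the cases named in the text; Genshi's serializers handle far more than this.

```python
from html import escape

# Small illustrative subsets; real HTML defines more of both.
EMPTY_ELEMS = {"br", "hr", "img", "input"}
BOOLEAN_ATTRS = {"noshade", "checked", "selected", "disabled"}


def render_element(tag, attrs=(), method="xhtml"):
    """Render an element without children under the given method."""
    parts = []
    for name, value in attrs:
        if method != "xml" and name in BOOLEAN_ATTRS:
            if method == "html":
                # HTML: boolean attributes are minimized.
                parts.append(" " + name)
            else:
                # XHTML: the value is normalized to the attribute name.
                parts.append(' %s="%s"' % (name, name))
        else:
            parts.append(' %s="%s"' % (name, escape(value, quote=True)))
    attr_text = "".join(parts)

    if method == "xml":
        # Generic XML: any childless element is self-closed.
        return "<%s%s/>" % (tag, attr_text)
    if tag in EMPTY_ELEMS:
        if method == "html":
            return "<%s%s>" % (tag, attr_text)
        return "<%s%s />" % (tag, attr_text)
    # Non-empty element types always get an explicit end tag.
    return "<%s%s></%s>" % (tag, attr_text, tag)
```

For example, ``render_element("script", method="xml")`` gives ``<script/>``, while the ``xhtml`` method gives ``<script></script>``; ``render_element("hr", [("noshade", "1")], "html")`` gives ``<hr noshade>``.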
196 Serialization Options 245 Serialization Options
197 --------------------- 246 ---------------------
198 247
199 Both ``serialize()`` and ``render()`` support additional keyword arguments that 248 Both ``serialize()`` and ``render()`` support additional keyword arguments that
200 are passed through to the initializer of the serializer class. The following 249 are passed through to the initializer of the serializer class. The following
201 options are supported by the built-in serializers: 250 options are supported by the built-in serializers:
202 251
203 ``strip_whitespace`` 252 ``strip_whitespace``
204 Whether the serializer should remove trailing spaces and empty lines. Defaults 253 Whether the serializer should remove trailing spaces and empty lines.
205 to ``True``. 254 Defaults to ``True``.
206 255
207 (This option is not available for serialization to plain text.) 256 (This option is not available for serialization to plain text.)
208 257
209 ``doctype`` 258 ``doctype``
210 A ``(name, pubid, sysid)`` tuple defining the name, public identifier, and 259 A ``(name, pubid, sysid)`` tuple defining the name, public identifier, and
211 system identifier of a ``DOCTYPE`` declaration to prepend to the generated 260 system identifier of a ``DOCTYPE`` declaration to prepend to the generated
212 output. If provided, this declaration will override any ``DOCTYPE`` 261 output. If provided, this declaration will override any ``DOCTYPE``
213 declaration in the stream. 262 declaration in the stream.
214 263
264 The parameter can also be specified as a string to refer to commonly used
265 doctypes:
266
267 +-----------------------------+-------------------------------------------+
268 | Shorthand | DOCTYPE |
269 +=============================+===========================================+
270 | ``html`` or | HTML 4.01 Strict |
271 | ``html-strict`` | |
272 +-----------------------------+-------------------------------------------+
273 | ``html-transitional`` | HTML 4.01 Transitional |
274 +-----------------------------+-------------------------------------------+
275 | ``html-frameset`` | HTML 4.01 Frameset |
276 +-----------------------------+-------------------------------------------+
277 | ``html5`` | DOCTYPE proposed for the work-in-progress |
278 | | HTML5 standard |
279 +-----------------------------+-------------------------------------------+
280 | ``xhtml`` or | XHTML 1.0 Strict |
281 | ``xhtml-strict`` | |
282 +-----------------------------+-------------------------------------------+
283 | ``xhtml-transitional`` | XHTML 1.0 Transitional |
284 +-----------------------------+-------------------------------------------+
285 | ``xhtml-frameset`` | XHTML 1.0 Frameset |
286 +-----------------------------+-------------------------------------------+
287 | ``xhtml11`` | XHTML 1.1 |
288 +-----------------------------+-------------------------------------------+
289 | ``svg`` or ``svg-full`` | SVG 1.1 |
290 +-----------------------------+-------------------------------------------+
291 | ``svg-basic`` | SVG 1.1 Basic |
292 +-----------------------------+-------------------------------------------+
293 | ``svg-tiny`` | SVG 1.1 Tiny |
294 +-----------------------------+-------------------------------------------+
295
215 (This option is not available for serialization to plain text.) 296 (This option is not available for serialization to plain text.)
216 297
217 ``namespace_prefixes`` 298 ``namespace_prefixes``
218 The namespace prefixes to use for namespaces that are not bound to a prefix 299 The namespace prefixes to use for namespaces that are not bound to a prefix
219 in the stream itself. 300 in the stream itself.
224 Whether to remove the XML declaration (the ``<?xml ?>`` part at the 305 Whether to remove the XML declaration (the ``<?xml ?>`` part at the
225 beginning of a document) when serializing. This defaults to ``True`` as an 306 beginning of a document) when serializing. This defaults to ``True`` as an
226 XML declaration throws some older browsers into "Quirks" rendering mode. 307 XML declaration throws some older browsers into "Quirks" rendering mode.
227 308
228 (This option is only available for serialization to XHTML.) 309 (This option is only available for serialization to XHTML.)
310
311 ``strip_markup``
312 Whether the text serializer should detect and remove any tags or entity
313 encoded characters in the text.
314
315 (This option is only available for serialization to plain text.)
229 316
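How a ``doctype`` value given either as a ``(name, pubid, sysid)`` tuple or as a shorthand string could be turned into a declaration is sketched below. The ``DOCTYPES`` table and ``doctype_declaration`` function are hypothetical and list only two of the shorthands from the table above; the identifiers themselves are the standard W3C ones for HTML 4.01 Strict and XHTML 1.0 Strict.

```python
# Hypothetical lookup table: shorthand -> (name, pubid, sysid).
DOCTYPES = {
    "html": (
        "html",
        "-//W3C//DTD HTML 4.01//EN",
        "http://www.w3.org/TR/html4/strict.dtd",
    ),
    "xhtml": (
        "html",
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    ),
}


def doctype_declaration(doctype):
    """Build a DOCTYPE declaration from a shorthand string or a tuple."""
    if isinstance(doctype, str):
        doctype = DOCTYPES[doctype]
    name, pubid, sysid = doctype

    parts = ["<!DOCTYPE " + name]
    if pubid:
        parts.append(' PUBLIC "%s"' % pubid)
    elif sysid:
        parts.append(" SYSTEM")
    if sysid:
        parts.append(' "%s"' % sysid)
    parts.append(">")
    return "".join(parts)
```

Passing ``"xhtml"`` and passing the corresponding tuple directly give the same declaration; a tuple with no public or system identifier yields a bare ``<!DOCTYPE html>``.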
230 317
231 318
232 Using XPath 319 Using XPath
233 =========== 320 ===========
Copyright (C) 2012-2017 Edgewall Software