diff doc/streams.txt @ 500:0742f421caba experimental-inline
Merged revisions 487-603 via svnmerge from
http://svn.edgewall.org/repos/genshi/trunk
author | cmlenz |
---|---|
date | Fri, 01 Jun 2007 17:21:47 +0000 |
parents | 55cf81951686 |
children | 1837f39efd6f |
--- a/doc/streams.txt
+++ b/doc/streams.txt
@@ -18,9 +18,8 @@
 A stream can be attained in a number of ways. It can be:
 
 * the result of parsing XML or HTML text, or
-* programmatically generated, or
-* the result of selecting a subset of another stream filtered by an XPath
-  expression.
+* the result of selecting a subset of another stream using XPath, or
+* programmatically generated.
 
 For example, the functions ``XML()`` and ``HTML()`` can be used to convert
 literal XML or HTML text to a markup stream::
@@ -91,7 +90,9 @@
 ``genshi.filters``. It processes a stream of HTML markup, and strips out any
 potentially dangerous constructs, such as Javascript event handlers.
 ``HTMLSanitizer`` is not a function, but rather a class that implements
-``__call__``, which means instances of the class are callable.
+``__call__``, which means instances of the class are callable::
+
+  stream = stream | HTMLSanitizer()
 
 Both the ``filter()`` method and the pipe operator allow easy chaining of
 filters::
@@ -103,15 +104,22 @@
 
   stream = stream | noop | HTMLSanitizer()
 
+For more information about the built-in filters, see `Stream Filters`_.
+
+.. _`Stream Filters`: filters.html
+
 
 Serialization
 =============
 
-The ``Stream`` class provides two methods for serializing this list of events:
-``serialize()`` and ``render()``. The former is a generator that yields chunks
-of ``Markup`` objects (which are basically unicode strings that are considered
-safe for output on the web). The latter returns a single string, by default
-UTF-8 encoded.
+Serialization means producing some kind of textual output from a stream of
+events, which you'll need when you want to transmit or store the results of
+generating or otherwise processing markup.
+
+The ``Stream`` class provides two methods for serialization: ``serialize()`` and
+``render()``. The former is a generator that yields chunks of ``Markup`` objects
+(which are basically unicode strings that are considered safe for output on the
+web). The latter returns a single string, by default UTF-8 encoded.
 
 Here's the output from ``serialize()``::
 
@@ -159,6 +167,35 @@
   Some text and a link.
 
 
+Serialization Options
+---------------------
+
+Both ``serialize()`` and ``render()`` support additional keyword arguments that
+are passed through to the initializer of the serializer class. The following
+options are supported by the built-in serializers:
+
+``strip_whitespace``
+  Whether the serializer should remove trailing spaces and empty lines. Defaults
+  to ``True``.
+
+  (This option is not available for serialization to plain text.)
+
+``doctype``
+  A ``(name, pubid, sysid)`` tuple defining the name, publid identifier, and
+  system identifier of a ``DOCTYPE`` declaration to prepend to the generated
+  output. If provided, this declaration will override any ``DOCTYPE``
+  declaration in the stream.
+
+  (This option is not available for serialization to plain text.)
+
+``namespace_prefixes``
+  The namespace prefixes to use for namespace that are not bound to a prefix
+  in the stream itself.
+
+  (This option is not available for serialization to HTML or plain text.)
+
+
+
 Using XPath
 ===========
 
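
The second and third hunks touch the text that describes filters as callables (either plain functions or instances of a class that implements ``__call__``) chained through ``filter()`` or the pipe operator. The following is a minimal, self-contained sketch of that idea only. It does not use Genshi: ``Stream`` here is a toy stand-in, and ``DropEventHandlers`` and the ``(kind, data)`` event tuples are hypothetical names invented for illustration, not Genshi's ``HTMLSanitizer`` or its real event format.

```python
class Stream:
    """Toy stand-in for a markup stream: an iterable of (kind, data) events."""

    def __init__(self, events):
        self.events = list(events)

    def __iter__(self):
        return iter(self.events)

    def filter(self, *filters):
        """Apply each filter in order and wrap the result in a new Stream."""
        result = self
        for func in filters:
            result = Stream(func(result))
        return result

    def __or__(self, func):
        """Make ``stream | f`` behave like ``stream.filter(f)``."""
        return self.filter(func)


def noop(stream):
    """A plain-function filter that passes every event through unchanged."""
    for event in stream:
        yield event


class DropEventHandlers:
    """Hypothetical callable-class filter (an assumption for illustration).

    Instances are callable because the class defines ``__call__``, so an
    instance can be used anywhere a filter function is expected.
    """

    def __init__(self, prefix="on"):
        self.prefix = prefix

    def __call__(self, stream):
        for kind, data in stream:
            # Skip attributes such as onclick; keep everything else.
            if kind == "ATTR" and data[0].startswith(self.prefix):
                continue
            yield kind, data


stream = Stream([
    ("START", "a"),
    ("ATTR", ("href", "http://example.org/")),
    ("ATTR", ("onclick", "evil()")),
    ("TEXT", "a link"),
    ("END", "a"),
])

# Both spellings run the same filters in the same order.
piped = stream | noop | DropEventHandlers()
chained = stream.filter(noop, DropEventHandlers())
```

In this sketch the filter is configured once in its initializer and then applied like a function, which is the practical reason to prefer a callable class over a bare function when a filter needs options.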