comparison doc/filters.txt @ 500:0742f421caba experimental-inline

Merged revisions 487-603 via svnmerge from http://svn.edgewall.org/repos/genshi/trunk
author cmlenz
date Fri, 01 Jun 2007 17:21:47 +0000
parents
children a332cb9c70d5 1a29617a5d87 1837f39efd6f
499:869b7885a516 500:0742f421caba
1 .. -*- mode: rst; encoding: utf-8 -*-
2
3 ==============
4 Stream Filters
5 ==============
6
7 `Markup Streams`_ showed how to write filters and how they are applied to
8 markup streams. This page describes the features of the various filters that
9 come with Genshi itself.
10
11 .. _`Markup Streams`: streams.html
12
13 .. contents:: Contents
14 :depth: 1
15 .. sectnum::
16
17
18 HTML Form Filler
19 ================
20
21 The ``genshi.filters.HTMLFormFiller`` filter can automatically populate an HTML
22 form from values provided as a simple dictionary. When using this filter, you can
23 omit any ``value``, ``selected``, or ``checked`` attributes from form
24 controls in your templates, and let the filter do that work for you.
25
26 ``HTMLFormFiller`` takes a dictionary of data to populate the form with, where
27 the keys should match the names of form elements, and the values determine the
28 values of those controls. For example::
29
30 >>> from genshi.filters import HTMLFormFiller
31 >>> from genshi.template import MarkupTemplate
32 >>> template = MarkupTemplate("""<form>
33 ... <p>
34 ... <label>User name:
35 ... <input type="text" name="username" />
36 ... </label><br />
37 ... <label>Password:
38 ... <input type="password" name="password" />
39 ... </label><br />
40 ... <label>
41 ... <input type="checkbox" name="remember" /> Remember me
42 ... </label>
43 ... </p>
44 ... </form>""")
45 >>> filler = HTMLFormFiller(data=dict(username='john', remember=True))
46 >>> print(template.generate() | filler)
47 <form>
48 <p>
49 <label>User name:
50 <input type="text" name="username" value="john"/>
51 </label><br/>
52 <label>Password:
53 <input type="password" name="password"/>
54 </label><br/>
55 <label>
56 <input type="checkbox" name="remember" checked="checked"/> Remember me
57 </label>
58 </p>
59 </form>
60
61 .. note:: This processing is done without reparsing the template output. Like
62 any stream filter, it operates after the template output is generated but
63 *before* that output is actually serialized.
64
65 The filter also handles radio buttons as well as ``<select>`` and
66 ``<textarea>`` elements. For radio buttons to be marked as checked, the value in
67 the data dictionary needs to match the ``value`` attribute of the ``<input>``
68 element, or evaluate to a truth value if the element has no such attribute. For
69 options in a ``<select>`` box to be marked as selected, the value in the data
70 dictionary needs to match the ``value`` attribute of the ``<option>`` element,
71 or the text content of the option if it has no ``value`` attribute. Password and
72 file input fields are not populated, as most browsers would ignore such values
73 anyway for security reasons.
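
The matching rules above can be restated in plain Python. This is only a sketch
of the rules as this page describes them, not Genshi's own code; the function
names ``radio_is_checked`` and ``option_is_selected`` are hypothetical.

```python
def radio_is_checked(data_value, value_attr=None):
    """Rule for radio buttons as described above (illustrative only)."""
    if value_attr is not None:
        # The data value must match the declared ``value`` attribute.
        return value_attr == data_value
    # No ``value`` attribute: fall back to the truth value of the data.
    return bool(data_value)


def option_is_selected(data_value, value_attr=None, text=''):
    """Rule for ``<option>`` elements as described above (illustrative only)."""
    # Without a ``value`` attribute, the text content of the option is compared.
    candidate = value_attr if value_attr is not None else text
    return candidate == data_value
```

For instance, ``option_is_selected('Blue', text='Blue')`` is true, while
``option_is_selected('Blue', value_attr='b', text='Blue')`` is not, because the
``value`` attribute takes precedence over the text content.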
74
75 You'll want to make sure that the values in the data dictionary have already
76 been converted to strings. While the filter may be able to deal with non-string
77 data in some cases (such as check boxes), in most cases it will either not
78 attempt any conversion or not produce the desired results.
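
One way to prepare the data is a small conversion step before the filter is
created. The helper below is a hypothetical sketch, not part of Genshi. It
leaves booleans and ``None`` untouched so that check boxes can still be driven
by a flag, and it assumes that a list or tuple stands for multiple values of
one control.

```python
def stringify_form_data(data):
    """Return a copy of ``data`` with values converted to strings.

    Hypothetical helper for illustration; not part of Genshi.
    """
    result = {}
    for name, value in data.items():
        if value is None or isinstance(value, bool):
            # Keep flags (e.g. for check boxes) and missing values as they are.
            result[name] = value
        elif isinstance(value, (list, tuple)):
            # Assumption: a sequence holds several values for one control.
            result[name] = [str(item) for item in value]
        else:
            result[name] = str(value)
    return result


form_data = stringify_form_data({'age': 42, 'remember': True, 'tags': (1, 2)})
```

The resulting dictionary can then be passed as the ``data`` argument of
``HTMLFormFiller``.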
79
80 You can restrict the form filler to operate only on a specific ``<form>`` by
81 passing either the ``id`` or the ``name`` keyword argument to the initializer.
82 If either of those is specified, the filter will only apply to form tags with
83 an attribute matching the specified value.
84
85
86 HTML Sanitizer
87 ==============
88
89 The ``genshi.filters.HTMLSanitizer`` filter can be used to clean up
90 user-submitted HTML markup, removing potentially dangerous constructs that could
91 be used for various kinds of abuse, such as cross-site scripting (XSS) attacks::
92
93 >>> from genshi.filters import HTMLSanitizer
94 >>> from genshi.input import HTML
95 >>> html = HTML("""<div>
96 ... <p>Innocent looking text.</p>
97 ... <script>alert("Danger: " + document.cookie)</script>
98 ... </div>""")
99 >>> sanitize = HTMLSanitizer()
100 >>> print(html | sanitize)
101 <div>
102 <p>Innocent looking text.</p>
103 </div>
104
105 In this example, the ``<script>`` tag was removed from the output.
106
107 You can determine which tags and attributes should be allowed by initializing
108 the filter with corresponding sets. See the API documentation for more
109 information.
110
111 Inline ``style`` attributes are forbidden by default. If you allow them, the
112 filter will still sanitize the contents of any inline styles it encounters:
113 the proprietary ``expression()`` function (supported only by Internet
114 Explorer) is removed, and any property using a ``url()`` with a potentially
115 dangerous URL scheme (such as ``javascript:``) is also stripped out::
116
117 >>> from genshi.filters import HTMLSanitizer
118 >>> from genshi.input import HTML
119 >>> html = HTML("""<div>
120 ... <br style="background: url(javascript:alert(document.cookie); color: #000" />
121 ... </div>""")
122 >>> sanitize = HTMLSanitizer(safe_attrs=HTMLSanitizer.SAFE_ATTRS | set(['style']))
123 >>> print(html | sanitize)
124 <div>
125 <br style="color: #000"/>
126 </div>
127
128 .. warning:: You should probably not rely on the ``style`` filtering, as
129 sanitizing mixed HTML, CSS, and JavaScript is very complicated and
130 subject to various browser bugs. If you can get away with
131 not allowing inline styles in user-submitted content, that is
132 the safer route to follow.
Copyright (C) 2012-2017 Edgewall Software