annotate doc/filters.txt @ 442:ff7c72b52fb2

Back out [510] and instead implement configurable error handling modes. The default is the old 0.3.x behaviour, but more strict error handling is available as an option.
author cmlenz
date Thu, 12 Apr 2007 22:40:49 +0000
parents 6fd7e4dc0318
children a332cb9c70d5 1a29617a5d87 1837f39efd6f
rev   line source
438
6fd7e4dc0318 Added documentation page on the builtin stream filters.
cmlenz
parents:
diff changeset
.. -*- mode: rst; encoding: utf-8 -*-

==============
Stream Filters
==============

`Markup Streams`_ showed how to write filters and how they are applied to
markup streams. This page describes the features of the various filters that
come with Genshi itself.

.. _`Markup Streams`: streams.html

.. contents:: Contents
   :depth: 1
.. sectnum::


HTML Form Filler
================

The filter ``genshi.filters.HTMLFormFiller`` can automatically populate an HTML
form from values provided as a simple dictionary. When using this filter, you can
basically omit any ``value``, ``selected``, or ``checked`` attributes from form
controls in your templates, and let the filter do all that work for you.

``HTMLFormFiller`` takes a dictionary of data to populate the form with, where
the keys should match the names of form elements, and the values determine the
values of those controls. For example::

  >>> from genshi.filters import HTMLFormFiller
  >>> from genshi.template import MarkupTemplate
  >>> template = MarkupTemplate("""<form>
  ...   <p>
  ...     <label>User name:
  ...       <input type="text" name="username" />
  ...     </label><br />
  ...     <label>Password:
  ...       <input type="password" name="password" />
  ...     </label><br />
  ...     <label>
  ...       <input type="checkbox" name="remember" /> Remember me
  ...     </label>
  ...   </p>
  ... </form>""")
  >>> filler = HTMLFormFiller(data=dict(username='john', remember=True))
  >>> print template.generate() | filler
  <form>
    <p>
      <label>User name:
        <input type="text" name="username" value="john"/>
      </label><br/>
      <label>Password:
        <input type="password" name="password"/>
      </label><br/>
      <label>
        <input type="checkbox" name="remember" checked="checked"/> Remember me
      </label>
    </p>
  </form>

.. note:: This processing is done without in any way reparsing the template
          output. Like any stream filter, it operates after the template output
          is generated but *before* that output is actually serialized.

The filter will of course also handle radio buttons as well as ``<select>`` and
``<textarea>`` elements. For radio buttons to be marked as checked, the value in
the data dictionary needs to match the ``value`` attribute of the ``<input>``
element, or evaluate to a truth value if the element has no such attribute. For
options in a ``<select>`` box to be marked as selected, the value in the data
dictionary needs to match the ``value`` attribute of the ``<option>`` element,
or the text content of the option if it has no ``value`` attribute. Password and
file input fields are not populated, as most browsers would ignore that anyway
for security reasons.

You'll want to make sure that the values in the data dictionary have already
been converted to strings. While the filter may be able to deal with non-string
data in some cases (such as check boxes), in most cases it will either not
attempt any conversion or not produce the desired results.
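One way to follow this advice is to normalize the dictionary before handing it to the filter. The sketch below uses only the standard library; ``stringify_form_data`` is a hypothetical helper, not part of Genshi, and the decision to leave booleans and ``None`` untouched is an assumption based on the check box remark above.

```python
def stringify_form_data(data):
    """Return a copy of ``data`` with values converted to strings.

    Hypothetical helper, not part of Genshi.
    """
    result = {}
    for name, value in data.items():
        if value is None or isinstance(value, bool):
            # Keep booleans (check boxes) and missing values as they are.
            result[name] = value
        elif isinstance(value, (list, tuple)):
            # Multi-valued controls: convert each item individually.
            result[name] = [str(item) for item in value]
        else:
            result[name] = str(value)
    return result


form_data = stringify_form_data({'age': 42, 'remember': True, 'tags': [1, 2]})
print(form_data)
```

The resulting dictionary can then be passed as the ``data`` argument of ``HTMLFormFiller``.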

You can restrict the form filler to operate only on a specific ``<form>`` by
passing either the ``id`` or the ``name`` keyword argument to the initializer.
If either of those is specified, the filter will only apply to ``<form>``
elements whose ``id`` or ``name`` attribute matches the specified value.


HTML Sanitizer
==============

The filter ``genshi.filters.HTMLSanitizer`` can be used to clean up
user-submitted HTML markup, removing potentially dangerous constructs that could
be used for various kinds of abuse, such as cross-site scripting (XSS) attacks::

  >>> from genshi.filters import HTMLSanitizer
  >>> from genshi.input import HTML
  >>> html = HTML("""<div>
  ...   <p>Innocent looking text.</p>
  ...   <script>alert("Danger: " + document.cookie)</script>
  ... </div>""")
  >>> sanitize = HTMLSanitizer()
  >>> print html | sanitize
  <div>
    <p>Innocent looking text.</p>
  </div>

In this example, the ``<script>`` tag was removed from the output.

You can determine which tags and attributes should be allowed by initializing
the filter with corresponding sets. See the API documentation for more
information.

Inline ``style`` attributes are forbidden by default. If you allow them, the
filter will still perform sanitization on the contents of any encountered inline
styles: the proprietary ``expression()`` function (supported only by Internet
Explorer) is removed, and any property using a ``url()`` with a potentially
dangerous URL scheme (such as ``javascript:``) is also stripped out::

  >>> from genshi.filters import HTMLSanitizer
  >>> from genshi.input import HTML
  >>> html = HTML("""<div>
  ...   <br style="background: url(javascript:alert(document.cookie); color: #000" />
  ... </div>""")
  >>> sanitize = HTMLSanitizer(safe_attrs=HTMLSanitizer.SAFE_ATTRS | set(['style']))
  >>> print html | sanitize
  <div>
    <br style="color: #000"/>
  </div>

.. warning:: You should probably not rely on the ``style`` filtering, as
             sanitizing mixed HTML, CSS, and JavaScript is very complicated and
             subject to various browser bugs. If you can somehow get away with
             not allowing inline styles in user-submitted content, that would
             definitely be the safer route to follow.
Copyright (C) 2012-2017 Edgewall Software