annotate genshi/template/match.py @ 704:422d0607ba85 experimental-match-fastpaths

further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet. Also fixes a bug where we could get duplicate match indexes when a new match is created while evaluating an existing match Also, try to be lazy about creating the first MatchSet so that we don't hurt bigtable performance (i.e. when there are no py:matches in play at all)
author aflett
date Fri, 04 Apr 2008 16:57:27 +0000
parents af57b12e3dd2
children b33c14a1f579
rev   line source
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
1 from genshi.core import START
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
2 from genshi.path import CHILD, LocalNameTest
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
3
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
4 from copy import copy
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
5 from itertools import ifilter
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
6
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
7 def is_simple_path(path):
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
8 """
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
9 Is the path merely a tag match like "foo"?
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
10 """
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
11 if len(path.paths) == 1 and len(path.paths[0]) == 1:
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
12 axis, nodetest, predicates = path.paths[0][0]
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
13 if (axis is CHILD and
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
14 not predicates and
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
15 isinstance(nodetest, LocalNameTest)):
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
16 return True
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
17
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
18 return False
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
19
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
20 class MatchState(object):
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
21 """ This is a container for all py:match's found during parsing. It
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
22 does not provide a way to slice and dice which templates
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
23 match. The main interfaces are ``add``, ``remove``, and
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
24 ``find_raw_matches``
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
25
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
26 this class maintains a hash, ``self.match_order`` which maps a
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
27 template to its order, to make sure that events can be returned in
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
28 order
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
29
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
30 """
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
31 def __init__(self):
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
32 # merely for indexing. Note that this is shared between
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
33 # all MatchSets that share the same root parent. We don't
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
34 # have to worry about exclusions here
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
35 self.match_order = {}
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
36
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
37 # tag_templates are match templates whose path are simply
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
38 # a tag, like "body" or "img"
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
39 self.tag_templates = {}
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
40
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
41 # other_templates include all other match templates, such
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
42 # as ones with complex paths like "[class=container]"
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
43 self.other_templates = []
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
44
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
45 self.current_index = 0
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
46
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
47 def add(self, match_template):
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
48 """
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
49 Add a template to the match state
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
50 """
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
51 # match_templates are currently tuples that contain unhashable
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
52 # objects. So we'll use id() for now.
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
53 self.match_order[id(match_template)] = self.current_index
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
54 self.current_index += 1
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
55
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
56 path = match_template[1]
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
57
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
58 if is_simple_path(path):
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
59 # special cache of tag
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
60 tag_name = path.paths[0][0][1].name
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
61 # setdefault is wasteful
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
62 if tag_name not in self.tag_templates:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
63 self.tag_templates[tag_name] = [match_template]
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
64 else:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
65 self.tag_templates[tag_name].append(match_template)
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
66
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
67 else:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
68 self.other_templates.append(match_template)
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
69
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
70 def remove(self, match_template):
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
71 """
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
72 Remove the template permanently
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
73 """
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
74 path = match_template[1]
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
75
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
76 if is_simple_path(path):
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
77 tag_name = path.paths[0][0][1].name
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
78 if tag_name in self.tag_templates:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
79 template_list = self.tag_templates[tag_name]
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
80 template_list.remove(match_template)
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
81 if not template_list:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
82 del self.tag_templates[tag_name]
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
83
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
84 else:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
85 self.other_templates.remove(match_template)
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
86
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
87 # clean up match_order
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
88 del self.match_order[id(match_template)]
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
89
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
90 def find_raw_matches(self, event):
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
91 """ Return a list of all valid templates that can be used for the
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
92 given event. Ordering is funky because we first check
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
93 self.tag_templates, then check self.other_templates.
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
94 """
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
95 kind, data, pos = event[:3]
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
96
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
97 # todo: get the order right
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
98 if kind is START:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
99 tag, attrs = data
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
100 if tag.localname in self.tag_templates:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
101 for template in self.tag_templates[tag.localname]:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
102 yield template
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
103
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
104 for template in self.other_templates:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
105 yield template
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
106
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
107 class MatchSet(object):
690
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
108 """ A MatchSet is a set of matches discovered by the parser. This
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
109 class encapsulates the matching of a particular event to a set of
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
110 matches. It is optimized for basic tag matches, since that is by
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
111 far the most common use of py:match.
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
112
690
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
113 The two primary entry points into MatchSet are ``add``, which adds
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
114 a new py:match, and ``find_matches``, which returns all
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
115 /candidate/ match templates. The consumer of ``find_matches``
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
116 still must call each candidates' match() to ensure the event
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
117 really matches, and to maintain state within the match.
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
118
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
119 If a given py:match's path is simply a node name match,
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
120 (LocalNameTest) like "xyz", then MatchSet indexes that in a
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
121 dictionary that maps tag names to matches.
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
122
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
123 If the path is more complex like "xyz[k=z]" then then that match
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
124 will always be returned by ``find_matches``. """
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
125 def __init__(self, parent=None,
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
126 min_index=None,
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
127 max_index=None):
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
128 """
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
129 If a parent is given, it means this is a wrapper around another
690
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
130 set.
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
131
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
132 """
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
133 if parent is None:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
134 self.state = MatchState()
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
135 else:
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
136 self.state = parent.state
690
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
137
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
138 if parent is None:
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
139 self.min_index = None
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
140 self.max_index = None
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
141 else:
690
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
142 # We have a parent: Just copy references to member
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
143 # variables in parent so that there's no performance loss
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
144 self.max_index = parent.max_index
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
145 self.min_index = parent.min_index
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
146
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
147 # sub-MatchSets can only be further constrained, not expanded
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
148 if max_index is not None:
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
149 assert self.max_index is None or max_index <= self.max_index
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
150 self.max_index = max_index
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
151
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
152 if min_index is not None:
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
153 assert self.min_index is None or min_index > self.min_index
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
154 self.min_index = min_index
690
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
155
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
156 def add(self, match_template):
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
157 """
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
158 match_template is a tuple the form
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
159 test, path, template, hints, namespace, directives
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
160 """
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
161 self.state.add(match_template)
690
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
162
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
163
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
164 def remove(self, match_template):
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
165 """
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
166 Permanently remove a match_template - mainly for match_once
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
167 """
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
168 self.state.remove(match_template)
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
169
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
170 def before_template(self, match_template, inclusive):
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
171 """ Return a new MatchSet where only the templates that were declared
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
172 before ``match_template`` are available. If ``inclusive`` is
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
173 true, then it will also include match_template itself.
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
174
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
175 """
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
176 cls = type(self)
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
177 max_index = self.state.match_order[id(match_template)]
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
178 if not inclusive:
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
179 max_index -= 1
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
180 return cls(parent=self, max_index=max_index)
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
181
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
182 def after_template(self, match_template):
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
183 """
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
184 Factory for creating a MatchSet that only matches templates after
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
185 the given match
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
186 """
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
187 cls = type(self)
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
188 min_index = self.state.match_order[id(match_template)] + 1
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
189 return cls(parent=self, min_index=min_index)
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
190
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
191
690
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
192 def find_matches(self, event):
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
193 """ Return a list of all valid templates that can be used for the
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
194 given event.
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
195
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
196 The basic work here is sorting the result of find_raw_matches
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
197 """
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
198 # remove exclusions
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
199 def can_match(template):
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
200 # make sure that
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
201 if (self.min_index is not None and
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
202 self.state.match_order[id(template)] < self.min_index):
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
203 return False
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
204
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
205 if (self.max_index is not None and
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
206 self.state.match_order[id(template)] > self.max_index):
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
207 return False
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
208
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
209 return True
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
210
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
211 matches = ifilter(can_match,
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
212 self.state.find_raw_matches(event))
690
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
213
1240ada13334 more code/comment clean up - make sure to retain match order
aflett
parents: 687
diff changeset
214 # sort the results according to the order they were added
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
215 return sorted(matches, key=lambda v: self.state.match_order[id(v)])
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
216
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
217 def __nonzero__(self):
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
218 """
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
219 allow this to behave as a list - and at least try to act empty if
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
220 the list is likely to be empty
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
221 """
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
222 # dirt-simple case: no py:match templates at all
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
223 if not (self.state.tag_templates or self.state.other_templates):
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
224 return False
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
225 # this is easy - before the first element there is nothing
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
226 if self.max_index == -1:
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
227 return False
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
228
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
229 # this isn't always right because match_order may shrink, but
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
230 # you'll never get a false-negative
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
231 if self.min_index == self.state.current_index:
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
232 return False
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
233
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
234 # check for a range that is completely constrained
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
235 if self.min_index is not None and self.max_index is not None:
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
236 if self.min_index >= self.max_index:
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
237 return False
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
238
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
239 return True
687
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
240
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
241 def __str__(self):
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
242 parent = ""
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
243 if self.parent:
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
244 parent = ": child of 0x%x" % id(self.parent)
3d7288f373bd land first cut at fast-path matching - needs some cleanup
aflett
parents:
diff changeset
245
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
246 return "<MatchSet 0x%x %d tag templates, %d other templates, range=[%s:%s]%s>" % (
704
422d0607ba85 further performance improvements to MatchSet functionality - factor out MatchSet's State so that we dont' have to keep copying over the state every time we create a new child MatchSet.
aflett
parents: 703
diff changeset
247 id(self), len(self.state.tag_templates), len(self.other_templates),
703
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
248 self.min_index, self.max_index,
af57b12e3dd2 merge in trunk up through r818 - fundamentally changed the way MatchSet works, but actually is more consistent now
aflett
parents: 701
diff changeset
249 parent)
Copyright (C) 2012-2017 Edgewall Software