How to match a tree against a large set of patterns?
      A
     / \
    ?   B
       / \
      A   C
     /|\
    A C Q
         \
          ?
(P)1. This paper describes a variant of the Aho-Corasick algorithm that, instead of the finite state machine the standard Aho-Corasick algorithm uses for string matching, uses a pushdown automaton for subtree matching. Like the Aho-Corasick string-matching algorithm, their variant requires only one pass through the input tree to match against the entire dictionary of S.(p)(P)The paper is quite complex; it may be worth contacting the author to see if he has any source code available.(p)
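(P)As a rough illustration of the one-pass idea, and not the paper's pushdown-automaton construction, the sketch below runs the standard Aho-Corasick string automaton over a tree flattened into preorder tokens of the form label/arity. It only finds exact subtrees (a node together with all of its descendants) and does not handle the ? placeholder. The (label, children) tuple representation and the names prefix, build_automaton and find_subtrees are assumptions made for this example.(p)

```python
from collections import deque


def prefix(tree):
    """Flatten a (label, children) tree into preorder 'label/arity' tokens."""
    label, children = tree
    tokens = [f"{label}/{len(children)}"]
    for child in children:
        tokens.extend(prefix(child))
    return tokens


def build_automaton(patterns):
    """Standard Aho-Corasick construction over token sequences."""
    goto, fail, out = [{}], [0], [set()]
    # Build the trie of all dictionary entries.
    for idx, pattern in enumerate(patterns):
        state = 0
        for token in prefix(pattern):
            if token not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][token] = len(goto) - 1
            state = goto[state][token]
        out[state].add(idx)
    # Breadth-first pass that fills in failure links and merged outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for token, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and token not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(token, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def find_subtrees(tree, patterns):
    """Return sorted (preorder index of matched root, pattern index) pairs."""
    goto, fail, out = build_automaton(patterns)
    lengths = [len(prefix(p)) for p in patterns]
    state, hits = 0, []
    for pos, token in enumerate(prefix(tree)):  # single pass over the tree
        while state and token not in goto[state]:
            state = fail[state]
        state = goto[state].get(token, 0)
        for idx in out[state]:
            hits.append((pos - lengths[idx] + 1, idx))
    return sorted(hits)


# Hypothetical example: the tree A(B(A, C), C) and a dictionary {B(A, C), C}.
tree = ("A", [("B", [("A", []), ("C", [])]), ("C", [])])
patterns = [("B", [("A", []), ("C", [])]), ("C", [])]
print(find_subtrees(tree, patterns))
```

(P)Because each token carries its arity, a dictionary entry can only line up with a complete subtree, so every hit identifies one node of the input tree and the dictionary entry rooted there.(p)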
(P)What you need is a finite state machine that tracks the set of potential matches you might have.(p)(P)In essence, such a machine is the result of matching the patterns against each other and determining which parts of the individual matches they share. This is analogous to how lexers take the regular expressions for a set of tokens and compose them into one large FSA that can match any of those regular expressions by processing characters one at a time.(p)(P)You can find references to methods for doing this under term rewriting systems.(p)
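(P)A minimal sketch of that idea, assuming trees are (label, children) tuples and that a ? leaf in a pattern matches any whole subtree: every distinct subpattern gets an id, so parts shared between patterns are stored once, and each tree node is mapped bottom-up to a state, namely the set of subpattern ids that match at that node. Transitions are computed on demand and cached here; they could equally be precomputed into tables. The names Matcher, intern_pattern and WILDCARD are hypothetical.(p)

```python
WILDCARD = "?"  # assumption: a "?" leaf in a pattern matches any whole subtree


def intern_pattern(pattern, table):
    """Give every distinct subpattern an id; shared subpatterns are stored once."""
    label, kids = pattern
    key = (label, tuple(intern_pattern(k, table) for k in kids))
    return table.setdefault(key, len(table))


class Matcher:
    def __init__(self, patterns):
        self.table = {}  # (label, child subpattern ids) -> subpattern id
        self.roots = {}  # subpattern id -> indices of the whole patterns
        for idx, pattern in enumerate(patterns):
            pid = intern_pattern(pattern, self.table)
            self.roots.setdefault(pid, []).append(idx)
        self.cache = {}  # (label, child states) -> state, filled on demand

    def step(self, label, child_states):
        """Transition: combine the children's states into this node's state."""
        key = (label, child_states)
        if key not in self.cache:
            state = set()
            for (plabel, pkids), pid in self.table.items():
                if plabel == WILDCARD and not pkids:
                    state.add(pid)  # wildcard leaf matches everything
                elif (plabel == label
                      and len(pkids) == len(child_states)
                      and all(k in s for k, s in zip(pkids, child_states))):
                    state.add(pid)
            self.cache[key] = frozenset(state)
        return self.cache[key]

    def match(self, tree):
        """One bottom-up pass; returns sorted (path to node, pattern index)."""
        hits = []

        def visit(node, path):
            label, kids = node
            states = tuple(visit(k, path + (i,)) for i, k in enumerate(kids))
            state = self.step(label, states)
            for pid in state:
                hits.extend((path, idx) for idx in self.roots.get(pid, []))
            return state

        visit(tree, ())
        return sorted(hits)


# Hypothetical patterns A(?, B), B and A(B, ?), matched against A(A(C, B), B).
patterns = [
    ("A", [("?", []), ("B", [])]),
    ("B", []),
    ("A", [("B", []), ("?", [])]),
]
tree = ("A", [("A", [("C", []), ("B", [])]), ("B", [])])
print(Matcher(patterns).match(tree))
```

(P)Each hit is a pair of the path from the root (as child indices) and the index of the matching pattern. Once the cache has seen a given combination of label and child states, handling another node with that combination is roughly a single dictionary lookup, which is what lets one pass over the tree serve the whole pattern set.(p)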