Find the indexes of all regex matches?
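I'm parsing strings that could contain any number of quoted strings (I'm parsing code, and trying to avoid PLY). I want to find out whether a substring is quoted, and I have the substring's index. My initial thought was to use re to find all the matches and then figure out the range of indexes they represent.
It seems like I should be using re for something like
My substring might be as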
This is what you want: (source)
re.finditer(pattern, string[, flags])
Return an iterator yielding MatchObject instances over all
non-overlapping matches for the RE pattern in string. The string is
scanned left-to-right, and matches are returned in the order found. Empty
matches are included in the result unless they touch the beginning of
another match.
You can then get the start and end positions from the MatchObjects.
For example:
[(m.start(0), m.end(0)) for m in re.finditer(pattern, string)]
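To tie this back to the question, here is a minimal sketch of how those spans could be used to check whether a given index falls inside a quoted string. The pattern, the helper names `quoted_spans` and `is_quoted`, and the sample line are all assumptions made for illustration; the pattern only handles simple single- or double-quoted strings, with no escapes and no triple quotes.

```python
import re

# Assumed pattern (not from the question): simple '...' or "..." strings,
# no escape sequences, no triple-quoted strings.
QUOTED = re.compile(r'"[^"]*"|\'[^\']*\'')

def quoted_spans(text):
    """Return a (start, end) pair for every quoted string found in text."""
    return [(m.start(), m.end()) for m in QUOTED.finditer(text)]

def is_quoted(text, index):
    """Return True if the character at index lies inside a quoted span."""
    return any(start <= index < end for start, end in quoted_spans(text))

line = 'x = "abc" + c + \'d\''
spans = quoted_spans(line)
print(spans)                # [(4, 9), (16, 19)]
print(is_quoted(line, 7))   # True: the c inside "abc"
print(is_quoted(line, 12))  # False: the bare c
```

The end index is exclusive, as with slicing, so `line[start:end]` gives back the matched text including its quote characters. If you run many lookups against the same string, it is probably worth computing the spans once instead of on every call.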