Python: how to match files to templates

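I want to match many files against a few common templates and extract the differences. I'd like advice on the best way to do this. For example:

Template A: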

<1000 text lines that have to match>
a=?
b=2
c=3
d=?
e=5
f=6
<more text>

Template B:

<1000 different text lines that have to match>
h=20
i=21
j=?
<more text>
k=22
l=?
m=24
<more text>

If I pass in file C:

<1000 text lines that match A>
a=500
b=2
c=3
d=600
e=5
f=6
<more text>

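I want a simple way to say that this matches template A, and to extract "a=500" and "d=600".

I could match them with a regex, but the files are fairly large, and building that regex would be painful.

I have also tried difflib, but parsing the opcodes and extracting the differences does not seem ideal.

Does anyone have a better suggestion?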

You may need to tweak this a little to handle the extra text, since I don't know the exact format, but that shouldn't be too hard.

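with open('templ.txt') as templ, open('in.txt') as f:
    # Keys whose values the template leaves open, e.g. 'a' from 'a=?'
    items = [line.strip().split('=')[0] for line in templ if '=?' in line]
    # key -> value for every 'key=value' line; lines without '=' are skipped
    d = dict(line.strip().split('=', 1) for line in f if '=' in line)
    print([(k, d[k]) for k in items if k in d])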

Output:

[('a', '500'), ('d', '600')]  # With template A
[]  # With template B
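
To run the same lookup against several templates at once, the snippet above can be wrapped in a function. This is only a sketch: `extract_unknowns` is a hypothetical name, and the short in-memory lists stand in for the real template and input files.

```python
def extract_unknowns(template_lines, input_lines):
    """Return (key, value) pairs for every key the template marks with '=?'."""
    # Keys left open in the template, in template order
    keys = [line.strip().split('=')[0] for line in template_lines if '=?' in line]
    # key -> value for each 'key=value' line of the input; other lines are skipped
    values = dict(line.strip().split('=', 1) for line in input_lines if '=' in line)
    return [(k, values[k]) for k in keys if k in values]


# Shortened stand-ins for templates A and B and file C from the question
templates = {
    'A': ['header', 'a=?', 'b=2', 'c=3', 'd=?', 'e=5', 'f=6'],
    'B': ['other header', 'h=20', 'i=21', 'j=?', 'k=22', 'l=?', 'm=24'],
}
file_c = ['header', 'a=500', 'b=2', 'c=3', 'd=600', 'e=5', 'f=6']

results = {name: extract_unknowns(t, file_c) for name, t in templates.items()}
print(results)
```

Note that an empty result only says that none of the template's open keys appear in the file. It does not check that the fixed lines agree, so on its own it is a weak test of which template a file belongs to.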

Or, if the file and the template line up line for line:

from itertools import compress
with open('templ.txt') as templ, open('in.txt') as f:
    # Keep each input line whose template counterpart holds a '=?' placeholder
    print([line.strip() for line in compress(f, ('=?' in t for t in templ))])

Output:

['a=500', 'd=600']
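
If the lines really are aligned, the same idea can also tell you whether the file matches the template at all, which is the other half of the question. The following is a rough sketch under that assumption: `match_template` is a hypothetical helper, it compares lines after stripping whitespace, and it treats any template line ending in `=?` as a placeholder.

```python
def match_template(template_lines, file_lines):
    """Return the lines that fill '=?' slots, or None if the file does not fit.

    Assumes the file and the template line up one-to-one.
    """
    if len(template_lines) != len(file_lines):
        return None
    extracted = []
    for t, line in zip(template_lines, file_lines):
        t, line = t.strip(), line.strip()
        if t.endswith('=?'):
            # Placeholder line: the key must agree, the value is free
            prefix = t[:-1]  # 'a=?' -> 'a='
            if not line.startswith(prefix):
                return None
            extracted.append(line)
        elif t != line:
            # A fixed line differs, so this template does not match
            return None
    return extracted


# Shortened stand-ins for the data in the question
template_a = ['header', 'a=?', 'b=2', 'c=3', 'd=?', 'e=5', 'f=6']
template_b = ['other header', 'h=20', 'i=21', 'j=?', 'k=22', 'l=?', 'm=24']
file_c = ['header', 'a=500', 'b=2', 'c=3', 'd=600', 'e=5', 'f=6']

print(match_template(template_a, file_c))
print(match_template(template_b, file_c))
```

With these stand-ins the first call returns `['a=500', 'd=600']` and the second returns `None`, so looping over the templates and taking the first non-`None` result identifies the template. For real files you would pass the open file objects' lines (for example via `readlines()`) instead of the lists.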
