Count occurrences of search strings in a list
I have the following list:
```python
data_items = ['abc','123data','dataxyz','456','344','666','777','888','888', 'abc', 'xyz']
```
I also have a list of search terms:
```python
search = ['abc','123','xyz','456']
```
I want to iterate over the data items using the search list and build a basic structure that gives a count for each match. For example:
```
counts = ['abc':'2', '123':'1', 'xyz':'2'.........]
```
What is the best way to do this?
You can use a regex with collections.Counter:
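```python
import re
from collections import Counter

data_items = ['abc','123data','dataxyz','456','344','666','777','888','888', 'abc', 'xyz']
search = ['abc','123','xyz','456']

# Build one alternation pattern, longest term first so that a longer term
# is tried before any shorter term that is a prefix of it.
# re.escape keeps regex metacharacters in a term from being interpreted.
to_search = re.compile('|'.join(map(re.escape, sorted(search, key=len, reverse=True))))
# search() returns the first match in each item, or None if nothing matches
matches = (to_search.search(el) for el in data_items)
counts = Counter(match.group() for match in matches if match)
# Counter({'abc': 2, 'xyz': 2, '123': 1, '456': 1})
```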
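One thing to be aware of with the regex answer: `search()` stops at the first match, so an item that contains two different search terms (say `'abc123'`) only contributes to one of them. If you want every term found in an item to be counted, something like the sketch below might do. The function name `count_terms` is made up for illustration, and note that `findall` only returns non-overlapping matches, so overlapping terms could still be missed.

```python
import re
from collections import Counter

def count_terms(items, terms):
    """For each term, count how many items contain it at least once."""
    # Longest first so a longer term wins over a shorter prefix of itself;
    # re.escape so the terms are matched literally.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(t) for t in ordered))
    counts = Counter()
    for item in items:
        # set() so a term repeated inside one item counts once for that item
        counts.update(set(pattern.findall(item)))
    return counts

print(sorted(count_terms(['abc123', 'abc', 'xyzxyz'], ['abc', '123', 'xyz']).items()))
# [('123', 1), ('abc', 2), ('xyz', 1)]
```

On the `data_items` and `search` lists from the question this gives the same counts as the answer above, since no item there contains more than one term.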
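It looks like you need partial matches too. The code below is straightforward, but it may not be efficient. It also assumes you are fine with a dict as the result.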
```python
>>> data_items = ['abc','123data','dataxyz','456','344','666','777','888','888', 'abc', 'xyz']
>>> search = ['abc','123','xyz','456']
>>> result = {k:0 for k in search}
>>> for item in data_items:
...     for search_item in search:
...         if search_item in item:
...             result[search_item]+=1
...
>>> result
{'123': 1, 'abc': 2, 'xyz': 2, '456': 1}
```
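For what it's worth, the same substring counting can probably be written as a single dict comprehension. This is just a compact sketch of the nested loop above, not a faster algorithm: it still checks every term against every item.

```python
data_items = ['abc','123data','dataxyz','456','344','666','777','888','888', 'abc', 'xyz']
search = ['abc','123','xyz','456']

# `term in item` is a substring test; sum() adds up the booleans (True == 1)
result = {term: sum(term in item for item in data_items) for term in search}
print(result)  # {'abc': 2, '123': 1, 'xyz': 2, '456': 1}
```

Terms that never match stay in the result with a count of 0, which may or may not be what you want.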
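```python
counts={}
for s in search:
    lower_s=s.lower()
    # list.count() compares whole elements, so this counts exact matches only;
    # the count is stored as a string, as in the question's example
    counts[lower_s]=str(data_items.count(lower_s))
```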
That is, if you can use a dictionary (which is the better choice, since you said "structure").
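Note that `list.count` only matches whole elements, so with the data from the question `'123'` is not found inside `'123data'`. If exact matching is really what you need, a sketch along these lines avoids rescanning the list for every term by counting everything once with `collections.Counter` (counts are kept as ints here rather than strings, which is an assumption about what you want):

```python
from collections import Counter

data_items = ['abc','123data','dataxyz','456','344','666','777','888','888', 'abc', 'xyz']
search = ['abc','123','xyz','456']

# Count every distinct element in a single pass over the list
exact = Counter(data_items)
# A Counter returns 0 for keys it has never seen, so missing terms are safe
counts = {s: exact[s] for s in search}
print(counts)  # {'abc': 2, '123': 0, 'xyz': 1, '456': 1}
```

Compare this with the substring-based answers, which report 1 for `'123'` and 2 for `'xyz'`.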