Python: What is itertools.groupby() used for?

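While reading the Python documentation, I came across the itertools.groupby() function. It was not very straightforward, so I decided to look up some information here on Stack Overflow. I found something in "How do I use Python's itertools.groupby()?".

There seems to be little information about it here and in the documentation, so I decided to post my observations for comment.

Thanks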

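To start with, you can read the documentation here.

I will put what I consider the most important point first. I hope the reason will become clear after the examples.

Always sort the items with the same key that is used for grouping, to avoid unexpected results.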

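itertools.groupby(iterable, key=None or some func) takes an iterable and groups its items according to the specified key. The key is a function applied to each individual item, and its result is used as the heading of the group the item falls into; consecutive items that end up with the same key value land in the same group.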
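
A minimal sketch of how the key drives the grouping, using made-up sample data rather than anything from the question. Note that a new group starts every time the key value changes, so the same key can appear more than once when the input is not sorted:

```python
from itertools import groupby

# Illustrative data: runs of repeated letters.
letters = "aaabbaac"

# With no key function, each item is its own key.
runs = [(key, "".join(group)) for key, group in groupby(letters)]
print(runs)
# [('a', 'aaa'), ('b', 'bb'), ('a', 'aa'), ('c', 'c')]

# With a key function, the key is the function's result for each item.
numbers = [1, 3, 2, 4, 5]
by_parity = [(key, list(group)) for key, group in groupby(numbers, key=lambda n: n % 2)]
print(by_parity)
# [(1, [1, 3]), (0, [2, 4]), (1, [5])]
```

The key 'a' shows up twice in the first result because the two runs of 'a' are separated by 'b'.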

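The return value is an iterator of (key, group) pairs, which resembles a dictionary in that it has the form {key : value}; each group is itself an iterator over the items that share that key.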
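
To make the shape of the return value concrete, here is a small sketch (the `words` list is invented for illustration). Each element produced by groupby is a (key, group) tuple, and each group is a lazy iterator that shares its underlying data with groupby, so it is safest to consume it, for example with list(), before advancing to the next pair:

```python
from itertools import groupby

words = ["apple", "avocado", "banana", "cherry"]  # illustrative data
grouped = groupby(words, key=lambda w: w[0])

# groupby yields (key, group) tuples; it is not a dict.
first_key, first_group = next(grouped)
first_items = list(first_group)  # consume the group right away
print(first_key)    # a
print(first_items)  # ['apple', 'avocado']

# Materialise the remaining groups while iterating, then build a dict.
rest = {key: list(group) for key, group in grouped}
print(rest)         # {'b': ['banana'], 'c': ['cherry']}
```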

Example 1


from itertools import groupby

# Note that the tuple counts as one item in this list. I did not
# specify any key, so each item in the list is a key on its own.
c = groupby(['goat', 'dog', 'cow', 1, 1, 2, 3, 11, 10, ('persons', 'man', 'woman')])
dic = {}
for k, v in c:
    dic[k] = list(v)  # materialise each group before moving on
dic

Result


{1: [1, 1],
 'goat': ['goat'],
 3: [3],
 'cow': ['cow'],
 ('persons', 'man', 'woman'): [('persons', 'man', 'woman')],
 10: [10],
 11: [11],
 2: [2],
 'dog': ['dog']}


Example 2


from itertools import groupby

# Notice that 'mulato', 'cow' and 'cat' don't show up. The list is not sorted,
# so one key can produce several groups, and each later group replaces the
# earlier one in the dict: the last group for 'c' (['camel']) wipes out the
# earlier group ['cow', 'cat'].
list_things = ['goat', 'dog', 'donkey', 'mulato', 'cow', 'cat', ('persons', 'man', 'woman'),
               'wombat', 'mongoose', 'malloo', 'camel']
c = groupby(list_things, key=lambda x: x[0])
dic = {}
for k, v in c:
    dic[k] = list(v)
dic

Result


{'c': ['camel'],
 'd': ['dog', 'donkey'],
 'g': ['goat'],
 'm': ['mongoose', 'malloo'],
 'persons': [('persons', 'man', 'woman')],
 'w': ['wombat']}
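
To see where the missing items went, the sketch below collects the groups into a list of pairs instead of a dict, so nothing gets overwritten. It reuses the list from this example; the name `pairs` is just for illustration:

```python
from itertools import groupby

list_things = ['goat', 'dog', 'donkey', 'mulato', 'cow', 'cat',
               ('persons', 'man', 'woman'), 'wombat', 'mongoose',
               'malloo', 'camel']

# Keep every (key, items) pair in the order groupby produces them.
pairs = [(k, list(v)) for k, v in groupby(list_things, key=lambda x: x[0])]
for k, v in pairs:
    print(k, v)
# g ['goat']
# d ['dog', 'donkey']
# m ['mulato']
# c ['cow', 'cat']
# persons [('persons', 'man', 'woman')]
# w ['wombat']
# m ['mongoose', 'malloo']
# c ['camel']
```

The keys 'm' and 'c' each appear twice, and assigning into a dict keeps only the last group for each of them.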


Now, for the sorted version:


from itertools import groupby

# But observe the sorted version, where the data is first sorted on the same key used for grouping.
list_things = ['goat', 'dog', 'donkey', 'mulato', 'cow', 'cat', ('persons', 'man', 'woman'),
               'wombat', 'mongoose', 'malloo', 'camel']
sorted_list = sorted(list_things, key=lambda x: x[0])
print(sorted_list)
print()
c = groupby(sorted_list, key=lambda x: x[0])
dic = {}
for k, v in c:
    dic[k] = list(v)
dic

Result


['cow', 'cat', 'camel', 'dog', 'donkey', 'goat', 'mulato', 'mongoose', 'malloo', ('persons', 'man', 'woman'), 'wombat']
{'c': ['cow', 'cat', 'camel'],
 'd': ['dog', 'donkey'],
 'g': ['goat'],
 'm': ['mulato', 'mongoose', 'malloo'],
 'persons': [('persons', 'man', 'woman')],
 'w': ['wombat']}
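
Since sorting and grouping have to use the same key, one way to avoid mismatches is to wrap both steps in a small helper. This is only a sketch: `group_by_key` is a hypothetical name, not part of itertools, and the shortened `animals` list is for illustration:

```python
from itertools import groupby

def group_by_key(items, key):
    """Hypothetical helper: sort by `key`, then group by the same `key`."""
    ordered = sorted(items, key=key)
    return {k: list(g) for k, g in groupby(ordered, key=key)}

animals = ['goat', 'dog', 'donkey', 'mulato', 'cow', 'cat', 'camel']
result = group_by_key(animals, key=lambda x: x[0])
print(result)
# {'c': ['cow', 'cat', 'camel'], 'd': ['dog', 'donkey'], 'g': ['goat'], 'm': ['mulato']}
```

Because sorted() is stable, items within each group keep their original relative order.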


Example 3


from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"), ("vehicle", "harley"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]
dic = {}
f = lambda x: x[0]  # group on the first element of each pair
# Sort and group with the same key function.
for key, group in groupby(sorted(things, key=f), f):
    dic[key] = list(group)
dic


Result


{'animal': [('animal', 'bear'), ('animal', 'duck')],
 'plant': [('plant', 'cactus')],
 'vehicle': [('vehicle', 'harley'),
             ('vehicle', 'speed boat'),
             ('vehicle', 'school bus')]}

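Now for the sorted version. Here the input list is out of order, and it is sorted on the grouping key before being grouped. I also changed the tuples to lists. Either way, the result is the same.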


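from itertools import groupby

things = [["animal", "bear"], ["animal", "duck"], ["vehicle", "harley"], ["plant", "cactus"],
          ["vehicle", "speed boat"], ["vehicle", "school bus"]]
dic = {}
f = lambda x: x[0]  # group on the first element of each list
# Sorting first brings the separated "vehicle" items back together.
for key, group in groupby(sorted(things, key=f), f):
    dic[key] = list(group)
dic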


Result


{'animal': [['animal', 'bear'], ['animal', 'duck']],
 'plant': [['plant', 'cactus']],
 'vehicle': [['vehicle', 'harley'],
             ['vehicle', 'speed boat'],
             ['vehicle', 'school bus']]}