Python: Group list by values

Suppose I have a list like this:

list = [["A",0], ["B",1], ["C",0], ["D",2], ["E",2]]

What is the most elegant way to group it in Python so that I get this list as output:

list = [["A","C"], ["B"], ["D","E"]]

So the items are grouped by the second value, but the order is preserved...


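# Collect the distinct keys (set iteration order is not guaranteed)
values = set(map(lambda x:x[1], list))
# For each key, scan the whole list for the matching items
newlist = [[y[0] for y in list if y[1]==x] for x in values]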
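The snippet above iterates over a set, whose order is not guaranteed to match the order in which keys first appear. As a minimal sketch, assuming Python 3.7+ (where dicts keep insertion order), `dict.fromkeys` can stand in for `set`. The names `pairs`, `keys` and `grouped` are hypothetical and not part of the original answer.

```python
# Hypothetical name `pairs`; the question calls this variable `list`.
pairs = [["A", 0], ["B", 1], ["C", 0], ["D", 2], ["E", 2]]

# dict.fromkeys keeps keys in first-seen order (guaranteed since Python 3.7),
# unlike set(), whose iteration order is unspecified.
keys = list(dict.fromkeys(key for _, key in pairs))

# Same quadratic scan as the answer: one pass over pairs per distinct key.
grouped = [[item for item, key in pairs if key == k] for k in keys]

print(grouped)  # [['A', 'C'], ['B'], ['D', 'E']]
```

This keeps the one-liner flavour of the answer; it is still quadratic when there are many distinct keys.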


from operator import itemgetter
from itertools import groupby

lki = [["A",0], ["B",1], ["C",0], ["D",2], ["E",2]]
lki.sort(key=itemgetter(1))

glo = [[x for x,y in g]
       for k,g in  groupby(lki,key=itemgetter(1))]

print(glo)


Edit

Another solution, which needs no imports, is more readable, and preserves order, at the cost of being 22% longer than the previous one:

oldlist = [["A",0], ["B",1], ["C",0], ["D",2], ["E",2]]

newlist, dicpos = [],{}
for val,k in oldlist:
    if k in dicpos:
        newlist[dicpos[k]].append(val)  # add val to the existing group for this key
    else:
        newlist.append([val])
        dicpos[k] = len(dicpos)

print(newlist)


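Howard's answer is concise and elegant, but it is also O(n^2) in the worst case. For large lists with many distinct grouping keys, you will want to sort the list first and then use itertools.groupby: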

>>> from itertools import groupby
>>> from operator import itemgetter
>>> seq = [["A",0], ["B",1], ["C",0], ["D",2], ["E",2]]
>>> seq.sort(key = itemgetter(1))
>>> groups = groupby(seq, itemgetter(1))
>>> [[item[0] for item in data] for (key, data) in groups]
[['A', 'C'], ['B'], ['D', 'E']]
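The sort step matters because `itertools.groupby` only merges consecutive items that share a key. The following is a small sketch of that behaviour on the same data; the names `unsorted_groups` and `sorted_groups` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

seq = [["A", 0], ["B", 1], ["C", 0], ["D", 2], ["E", 2]]
key = itemgetter(1)

# Without sorting, the two key-0 items are not adjacent,
# so groupby puts them in separate groups.
unsorted_groups = [[item for item, _ in grp] for _, grp in groupby(seq, key)]
print(unsorted_groups)  # [['A'], ['B'], ['C'], ['D', 'E']]

# sorted() is stable, so items keep their relative order within each key.
sorted_groups = [[item for item, _ in grp]
                 for _, grp in groupby(sorted(seq, key=key), key)]
print(sorted_groups)  # [['A', 'C'], ['B'], ['D', 'E']]
```

Note that the groups come out in sorted key order, which matches first-occurrence order for this sample but not necessarily in general.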

Edit:

I changed this after seeing Eyequem's answer: itemgetter(1) is nicer than lambda x: x[1].
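For readers unfamiliar with `operator.itemgetter`, here is a brief sketch showing that the two key functions are interchangeable for this data. The names `by_itemgetter` and `by_lambda` are hypothetical.

```python
from operator import itemgetter

seq = [["A", 0], ["B", 1], ["C", 0], ["D", 2], ["E", 2]]

# itemgetter(1) builds a callable that returns element 1 of its argument.
print(itemgetter(1)(["A", 0]))  # 0

# Both key functions produce the same stable ordering.
by_itemgetter = sorted(seq, key=itemgetter(1))
by_lambda = sorted(seq, key=lambda x: x[1])
print(by_itemgetter == by_lambda)  # True
```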


>>> import collections
>>> L1 = [["A",0], ["B",1], ["C",0], ["D",2], ["E",2]]
>>> D1 = collections.defaultdict(list)
>>> for element in L1:
...     D1[element[1]].append(element[0])
...
>>> L2 = list(D1.values())
>>> print(L2)
[['A', 'C'], ['B'], ['D', 'E']]

I don't know about elegant, but this certainly works:

oldlist = [["A",0], ["B",1], ["C",0], ["D",2], ["E",2]]
# change into: list = [["A","C"], ["B"], ["D","E"]]

order=[]
dic=dict()
for value,key in oldlist:
  try:
    dic[key].append(value)
  except KeyError:
    order.append(key)
    dic[key]=[value]
newlist=list(map(dic.get, order))

print(newlist)

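This preserves the order in which each key first occurs, as well as the order of the items within each key. It requires the keys to be hashable, but otherwise attaches no meaning to them.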
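To illustrate the first-occurrence guarantee described above, here is a minimal sketch on made-up data whose keys are strings that appear in non-sorted order. It assumes Python 3.7+ (insertion-ordered dicts) and swaps in `dict.setdefault` for the try/except bookkeeping; `records` and `groups` are hypothetical names.

```python
# Hypothetical sample data: string keys that first appear in non-sorted order.
records = [["A", "pear"], ["B", "apple"], ["C", "pear"], ["D", "fig"], ["E", "apple"]]

groups = {}
for value, key in records:
    # setdefault creates the list the first time a key is seen, then returns it.
    groups.setdefault(key, []).append(value)

# Dicts iterate in insertion order (Python 3.7+), i.e. first occurrence of each key.
newlist = list(groups.values())
print(newlist)  # [['A', 'C'], ['B', 'E'], ['D']]
```

A sort-then-groupby approach would instead return the groups in key order (apple, fig, pear), which is why this style is preferable when the original order matters.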


# Assumes the keys are non-negative integers, since they are used as list indices
max_key = max(key for (item, key) in list)
newlist = [[] for i in range(max_key+1)]
for item,key in list:
  newlist[key].append(item)

You can also do it in a single list comprehension, which is perhaps more elegant, but O(n**2):

[[item for (item,key) in list if key==i] for i in range(max(key for (item,key) in list)+1)]