Python: a faster and more 'pythonic' list of dictionaries

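For simplicity, I have shown two lists inside a list, but I am actually dealing with a hundred lists inside a list, each containing a sizable number of dictionaries. I only want to get the value of the 'status' key from the first dictionary, without checking any other dictionary in that list (since I know they all hold the same value for that key). Then I will perform some sort of clustering within each big dictionary. I need to concatenate all the 'title' values efficiently. Is there a way to make my code more elegant and faster?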

I have:

nested = [
    [
        {'id': 287, 'title': 'hungry badger', 'status': 'High'},
        {'id': 437, 'title': 'roadtrip to Kansas', 'status': 'High'},
    ],
    [
        {'id': 456, 'title': 'happy title here', 'status': 'Medium'},
        {'id': 342, 'title': 'soft big bear', 'status': 'Medium'},
    ],
]

I want:

result = [
    {
        'High': [
            {'id': 287, 'title': 'hungry badger'},
            {'id': 437, 'title': 'roadtrip to Kansas'},
        ]
    },
    {
        'Medium': [
            {'id': 456, 'title': 'happy title here'},
            {'id': 342, 'title': 'soft big bear'},
        ]
    },
]

My attempt:

for oneList in nested:
    result = {}
    for i in oneList:
        a = list(i.keys())
        m = [i[key] for key in a if key not in ['id', 'title']]
        result[m[0]] = oneList
        for key in a:
            if key not in ['id', 'title']:
                del i[key]

You can build a defaultdict for each nested list:

import collections

nested = [
    [{'id': 287, 'title': 'hungry badger', 'status': 'High'},
     {'id': 437, 'title': 'roadtrip to Kansas', 'status': 'High'}],
    [{'id': 456, 'title': 'happy title here', 'status': 'Medium'},
     {'id': 342, 'title': 'soft big bear', 'status': 'Medium'}],
]

result = []
for l in nested:
    # One mapping per inner list: status -> list of dictionaries
    r = collections.defaultdict(list)
    for d in l:
        # pop() returns the status and removes the key from d in place
        name = d.pop('status')
        r[name].append(d)
    result.append(r)

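This gives the following result (each element is a defaultdict, so the real output also shows a defaultdict(...) wrapper around each dictionary; it is omitted here for readability):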

>>> import pprint
>>> pprint.pprint(result)
[{'High': [{'id': 287, 'title': 'hungry badger'},
           {'id': 437, 'title': 'roadtrip to Kansas'}]},
 {'Medium': [{'id': 456, 'title': 'happy title here'},
             {'id': 342, 'title': 'soft big bear'}]}]
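Note that d.pop('status') removes the key from the original dictionaries. If the input must stay intact, a non-mutating variant could look like the sketch below. It reads 'status' only from the first dictionary of each inner list, as the question allows, and assumes every inner list is non-empty. The function name group_by_first_status is hypothetical.

```python
def group_by_first_status(nested):
    """Return one {status: [dicts without 'status']} mapping per inner list.

    Assumes each inner list is non-empty and shares a single 'status' value.
    """
    result = []
    for group in nested:
        status = group[0]['status']  # read the key once per inner list
        # Build new dictionaries so the input is left unmodified.
        stripped = [{k: v for k, v in d.items() if k != 'status'} for d in group]
        result.append({status: stripped})
    return result


nested = [
    [{'id': 287, 'title': 'hungry badger', 'status': 'High'},
     {'id': 437, 'title': 'roadtrip to Kansas', 'status': 'High'}],
    [{'id': 456, 'title': 'happy title here', 'status': 'Medium'},
     {'id': 342, 'title': 'soft big bear', 'status': 'Medium'}],
]
result = group_by_first_status(nested)
```

Here result is a list of plain dicts in the shape the question asks for, and nested still carries its 'status' keys afterwards.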

Related discussion

from itertools import groupby
result = groupby(sum(nested, []), lambda x: x['status'])

How it works:

sum(nested, []) concatenates all the inner lists into one big list of dictionaries.

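groupby(..., lambda x: x['status']) groups the objects by their 'status' value. It only merges consecutive items with equal keys, which works here because each inner list holds a single status.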
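The two steps above can be tried on a small sample, as in the sketch below. The dictionaries are trimmed to 'id' and 'status' for brevity, and itertools.chain.from_iterable is shown as an alternative to sum(nested, []) that avoids building a new intermediate list for every inner list; that swap is a suggestion, not part of the original answer.

```python
from itertools import chain, groupby

# Trimmed sample data (illustrative only).
nested = [
    [{'id': 287, 'status': 'High'}, {'id': 437, 'status': 'High'}],
    [{'id': 456, 'status': 'Medium'}, {'id': 342, 'status': 'Medium'}],
]

# Both expressions flatten one level and keep the original order.
flat_sum = sum(nested, [])
flat_chain = list(chain.from_iterable(nested))

# groupby yields (key, group) pairs for runs of consecutive equal keys;
# each group is consumed here before moving on to the next pair.
counts = [
    (key, len(list(group)))
    for key, group in groupby(flat_chain, key=lambda x: x['status'])
]
```

With this input, counts holds one entry per status because equal statuses are already adjacent.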

Note that itertools.groupby returns a lazy iterator (not a list), so if you want to materialize it you need to do something like the following.

from itertools import groupby
result = groupby(sum(nested, []), lambda x: x['status'])
result = {key: list(val) for key, val in result}
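Because groupby only groups adjacent items, the dict comprehension above keeps just the last run of a status that appears in non-adjacent inner lists. A minimal sketch that sorts by the key first, and also joins the 'title' values per status as the question mentions, is shown below. The function name titles_by_status, the space separator, and the rearranged sample data are assumptions made for illustration.

```python
from itertools import chain, groupby
from operator import itemgetter


def titles_by_status(nested, sep=' '):
    """Map each status to its 'title' values joined by sep."""
    by_status = itemgetter('status')
    # Sort by the grouping key so equal statuses become adjacent for groupby.
    flat = sorted(chain.from_iterable(nested), key=by_status)
    return {
        status: sep.join(d['title'] for d in group)
        for status, group in groupby(flat, key=by_status)
    }


# Sample data with a 'High' list that is not adjacent to the first one.
nested = [
    [{'id': 287, 'title': 'hungry badger', 'status': 'High'}],
    [{'id': 456, 'title': 'happy title here', 'status': 'Medium'}],
    [{'id': 437, 'title': 'roadtrip to Kansas', 'status': 'High'}],
]
titles = titles_by_status(nested)
```

Python's sort is stable, so titles within one status keep their original relative order. Sorting costs O(n log n); if the statuses are known to be adjacent already, the sort can be skipped.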
