List comprehension: finding and grouping anagrams in Python

input: ['abc', 'cab', 'cafe', 'face', 'goo']
output: [['abc', 'cab'], ['cafe', 'face'], ['goo']]

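The problem is simple: group the words by anagram. The order doesn't matter.

Of course, I could do this in C++ (that's my mother tongue). But I wonder whether this can be done in Python in a single line. Edit: if that's not possible, maybe 2 or 3 lines. I'm new to Python.

To check whether two strings are anagrams, I use sorting.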

>>> input = ['abc', 'cab', 'cafe', 'face', 'goo']
>>> input2 = [''.join(sorted(x)) for x in input]
>>> input2
['abc', 'abc', 'acef', 'acef', 'goo']
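
The sorting check can also be wrapped in a small helper. This is only a minimal sketch; the name `is_anagram` is an assumption made for illustration and does not come from the question.

```python
def is_anagram(a, b):
    # Two strings are anagrams when their sorted letters match.
    return sorted(a) == sorted(b)

print(is_anagram('cafe', 'face'))  # True
print(is_anagram('abc', 'goo'))    # False
```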

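I think it might be doable by combining map and the like. But I need to use a dict as a hash table, and I don't know yet whether that can be done in a single line. Any hints would be appreciated!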

A readable one-line solution:

output = [list(group) for key, group in groupby(sorted(words, key=sorted), sorted)]

For example:

>>> words = ['abc', 'cab', 'cafe', 'goo', 'face']
>>> from itertools import groupby
>>> [list(group) for key,group in groupby(sorted(words,key=sorted),sorted)]
[['abc', 'cab'], ['cafe', 'face'], ['goo']]

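The key here is itertools.groupby from the itertools module, which groups adjacent items of a list together.

The list we hand to groupby must be sorted beforehand, so we pass it sorted(words, key=sorted). The trick is that sorted accepts a key function and orders the items by that function's output, so we pass sorted again as the key function. Each word is then ordered by its letters in sorted order. There is no need to define our own function or write a lambda.

groupby also takes a key function, which it uses to decide whether items belong in the same group, and again we can simply pass it the built-in sorted function.

The last thing to note is that groupby yields pairs of a key and a group object, so we take only the group objects and turn each one into a list with the list function.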
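
To make the intermediate steps described above visible, the following sketch prints the sorted list and then each key/group pair that groupby yields. The variable names `ordered` and `pairs` are assumptions introduced for illustration only.

```python
from itertools import groupby

words = ['abc', 'cab', 'cafe', 'goo', 'face']

# Step 1: sort so that anagrams end up next to each other.
ordered = sorted(words, key=sorted)
print(ordered)  # ['abc', 'cab', 'cafe', 'face', 'goo']

# Step 2: groupby yields (key, group) pairs; each key is the sorted letter list.
pairs = [(key, list(group)) for key, group in groupby(ordered, sorted)]
for key, group in pairs:
    print(''.join(key), group)
```

The group objects are lazy iterators, which is why the sketch converts each one to a list before moving on to the next pair.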

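(By the way, I wouldn't call your variable input, because that shadows the built-in input function, although it's probably not a function you should be using anyway.)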

An unreadable one-line solution:

>>> import itertools
>>> input = ['abc', 'face', 'goo', 'cab', 'cafe']
>>> [list(group) for key,group in itertools.groupby(sorted(input, key=sorted), sorted)]
[['abc', 'cab'], ['cafe', 'face'], ['goo']]

(OK, it's actually 2 lines if you count the import...)

Readable version:

from itertools import groupby
from operator import itemgetter

def norm(w):
    # Canonical form of a word: its letters in sorted order.
    return "".join(sorted(w))

words = ['abc', 'cba', 'gaff', 'ffag', 'aaaa']

# Pair each word with its canonical form; sorting puts anagrams side by side.
words_aug = sorted((norm(word), word) for word in words)

grouped = groupby(words_aug, itemgetter(0))

for _, group in grouped:
    print(list(map(itemgetter(1), group)))

As a one-liner:

print(list(list(anagrams for _, anagrams in group) for _, group in groupby(sorted(("".join(sorted(word)), word) for word in words), itemgetter(0))))

Prints:

[['aaaa'], ['abc', 'cba'], ['ffag', 'gaff']]

Not a one-liner, but a solution...

d = {}
for item in input:
    # Use the sorted letters as the dictionary key.
    s = "".join(sorted(item))
    if s not in d:
        d[s] = []
    d[s].append(item)
input2 = list(d.values())


from itertools import groupby

words = ['oog', 'abc', 'cab', 'cafe', 'face', 'goo', 'foo']

print([list(g) for k, g in groupby(sorted(words, key=sorted), sorted)])

Result:

[['abc', 'cab'], ['cafe', 'face'], ['foo'], ['oog', 'goo']]

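You can't use groupby on its own, because it only groups together consecutive elements for which the key function produces the same result.

The simple fix is to first sort the words with the same function you group by.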
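
A short sketch of why the sort matters: on the unsorted list, words with equal keys that are not adjacent end up in separate groups. The names `unsorted_groups` and `sorted_groups` are assumptions introduced for illustration.

```python
from itertools import groupby

words = ['oog', 'abc', 'cab', 'cafe', 'face', 'goo', 'foo']

# Without sorting, 'oog' and 'goo' are not adjacent, so they are split apart.
unsorted_groups = [list(g) for k, g in groupby(words, sorted)]
print(unsorted_groups)
# [['oog'], ['abc', 'cab'], ['cafe', 'face'], ['goo'], ['foo']]

# Sorting by the same key first brings all anagrams together.
sorted_groups = [list(g) for k, g in groupby(sorted(words, key=sorted), sorted)]
print(sorted_groups)
# [['abc', 'cab'], ['cafe', 'face'], ['foo'], ['oog', 'goo']]
```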

Dave's answer is concise, but the sort that groupby requires is an O(n log(n)) operation. A faster solution is:

from collections import defaultdict

def group_anagrams(strings):
    m = defaultdict(list)

    for s in strings:
        # The sorted letters, as a hashable tuple, identify the anagram class.
        m[tuple(sorted(s))].append(s)

    return list(m.values())
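
A possible usage example for the dictionary approach, repeated here so the snippet runs on its own. It assumes Python 3.7 or later, where dictionaries keep insertion order, so the groups come out in order of first appearance.

```python
from collections import defaultdict

def group_anagrams(strings):
    # Same approach as above: bucket each word under its sorted letters.
    m = defaultdict(list)
    for s in strings:
        m[tuple(sorted(s))].append(s)
    return list(m.values())

words = ['abc', 'cab', 'cafe', 'face', 'goo']
print(group_anagrams(words))
# [['abc', 'cab'], ['cafe', 'face'], ['goo']]
```

Each word's letters are still sorted, so the saving is in not sorting the list of words itself.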