How do you make a list contain only distinct elements in Python?
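I have a list in Python. How can I make its values unique?
The simplest way is to convert it to a set and then back to a list:

    my_list = list(set(my_list))

One drawback is that this does not preserve the original order. You may also want to consider whether a set would be a better data structure than a list in the first place.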
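As a small illustration of that trade-off (the sample data and variable names are invented for this sketch): a round-trip through a set keeps exactly the distinct values but promises nothing about their order, and if membership testing is all you need, you can keep the set and skip the list entirely.

```python
# Hypothetical sample data, not from the original question.
my_list = ["pear", "apple", "pear", "fig", "apple"]

unique_values = list(set(my_list))

# The distinct values are all there, but their order is arbitrary,
# so compare after sorting rather than relying on position.
print(sorted(unique_values))

# If membership tests are the main use, keep the set itself.
seen_fruit = set(my_list)
print("fig" in seen_fruit)
print(len(seen_fruit))
```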
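Modified versions of the functions at http://www.peterbe.com/plog/uniqifiers-benchmark
To preserve order:

    def f(seq):  # Order preserving
        ''' Modified version of Dave Kirby solution '''
        seen = set()
        # seen.add(x) returns None, so "not seen.add(x)" is always True;
        # it is only there to record x the first time it is seen.
        return [x for x in seq if x not in seen and not seen.add(x)]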
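Now, how does it work? It is a little tricky:

    In [1]: 0 not in [1,2,3] and not print('add')
    add
    Out[1]: True

Why does it return True? print (like set.add) returns None, and "not None" is True:

    In [3]: type(seen.add(10))
    Out[3]: <type 'NoneType'>

and

    In [2]: 1 not in [1,2,3] and not print('add')
    Out[2]: False

Why does it print "add" in [1] but not in [2]? Because the "and" operator short-circuits: in [2] the left operand is already False, so the right operand is never evaluated.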
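The difference between [1] and [2] comes down to short-circuit evaluation of the "and" operator. A small sketch that makes it visible without relying on printed output; record is a hypothetical helper written for this example, standing in for print or set.add because it also returns None:

```python
calls = []

def record(label):
    """Hypothetical helper: note that it was called, then return None like set.add."""
    calls.append(label)
    return None

# Left operand is True, so `and` must evaluate the right operand too.
first = (0 not in [1, 2, 3]) and not record("first")

# Left operand is False, so `and` stops and the right operand never runs.
second = (1 not in [1, 2, 3]) and not record("second")

print(first, second, calls)
```

Only "first" ends up in calls, which is why the list comprehension adds an element to seen only when it was not already there.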
A more general and more readable generator-based version adds the ability to transform the values with a function:

    def f(seq, idfun=None):  # Order preserving
        return list(_f(seq, idfun))

    def _f(seq, idfun=None):
        ''' Originally proposed by Andrew Dalke '''
        seen = set()
        if idfun is None:
            for x in seq:
                if x not in seen:
                    seen.add(x)
                    yield x
        else:
            for x in seq:
                x = idfun(x)
                if x not in seen:
                    seen.add(x)
                    yield x
Without preserving order (faster):

    def f(seq):  # Not order preserving
        return list(set(seq))
A one-liner that preserves order:

    list(OrderedDict.fromkeys([2,1,1,3]))

although you do need

    from collections import OrderedDict
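On Python 3.7 and later, where the language guarantees that a plain dict keeps insertion order, the same one-liner should work without the import. A minimal sketch under that version assumption:

```python
# Assumes Python 3.7+, where dict preserves insertion order.
items = [2, 1, 1, 3]
unique_in_order = list(dict.fromkeys(items))
print(unique_in_order)  # [2, 1, 3]
```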
Let me explain with an example.
Suppose you have a Python list

    >>> randomList = ["a","f","b","c","d","a","c","e","d","f","e"]

and you want to remove the duplicates from it:

    >>> uniqueList = []
    >>> for letter in randomList:
    ...     if letter not in uniqueList:
    ...         uniqueList.append(letter)
    ...
    >>> uniqueList
    ['a', 'f', 'b', 'c', 'd', 'e']

That is how you remove duplicates from a list.
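One thing this loop has going for it: it relies only on equality, so it also works when the elements are not hashable and cannot go into a set. The price is a linear scan of the result list for every element, which gets slow on long inputs. A sketch with made-up data (rows and unique_rows are hypothetical names):

```python
# Lists are unhashable, so set(rows) would raise TypeError.
rows = [[1, 2], [3, 4], [1, 2], [5], [3, 4]]

unique_rows = []
for row in rows:
    # `in` on a list compares with ==, no hashing required.
    if row not in unique_rows:
        unique_rows.append(row)

print(unique_rows)  # [[1, 2], [3, 4], [5]]
```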
To preserve order:

    l = [1, 1, 2, 2, 3]
    result = list()
    # map() is lazy in Python 3, so wrap it in list() to force the appends.
    list(map(lambda x: x not in result and result.append(x), l))
    result
    # [1, 2, 3]
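How about a dictionary comprehension?

    >>> mylist = [3, 2, 1, 3, 4, 4, 4, 5, 5, 3]
    >>> {x:1 for x in mylist}.keys()
    [1, 2, 3, 4, 5]

Edit, in response to @danny's comment: my original suggestion does not guarantee the order of the keys. (The output shown is from Python 2; in Python 3.7+ a plain dict keeps insertion order and keys() returns a view rather than a list.) If you need the keys in order, try:

    >>> from collections import OrderedDict
    >>> OrderedDict( (x,1) for x in mylist ).keys()
    [3, 2, 1, 4, 5]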
This orders the elements by their first occurrence (not extensively tested).
From http://www.peterbe.com/plog/uniqifiers-benchmark:

    def f5(seq, idfun=None):
        # order preserving
        if idfun is None:
            def idfun(x): return x
        seen = {}
        result = []
        for item in seq:
            marker = idfun(item)
            # in old Python versions:
            # if seen.has_key(marker)
            # but in new ones:
            if marker in seen: continue
            seen[marker] = 1
            result.append(item)
        return result
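To see what the idfun argument buys you, here is a compact sketch in the same spirit as f5: duplicates are detected on a derived key, but the original items are what gets returned. The name unique_by and the sample words are invented for this example; it is not the benchmark's code.

```python
def unique_by(seq, key=None):
    """Keep the first item for each distinct key, preserving order."""
    seen = set()
    result = []
    for item in seq:
        marker = item if key is None else key(item)
        if marker in seen:
            continue
        seen.add(marker)
        result.append(item)
    return result

words = ["Apple", "apple", "Banana", "APPLE", "banana"]
print(unique_by(words))                 # all five differ by case, so all are kept
print(unique_by(words, key=str.lower))  # ['Apple', 'Banana']
```

Using a set for seen instead of a dict of dummy values is a stylistic choice here; both give constant-time membership tests for hashable keys.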
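The simplest way to remove duplicates while preserving order is to use collections.OrderedDict (Python 2.7+).

    from collections import OrderedDict

    mylist = [3, 2, 1, 3, 4, 4, 4, 5, 5, 3]  # sample input
    d = OrderedDict()
    for x in mylist:
        d[x] = True
    # Iterating an OrderedDict yields its keys in insertion order.
    print(list(d))  # [3, 2, 1, 4, 5]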
If all elements of the list can be used as dictionary keys (that is, they are all hashable), this is usually faster. (From the Python Programming FAQ.)

    d = {}
    for x in mylist:
        d[x] = 1
    mylist = list(d.keys())
A set in Python is an unordered collection that does not allow duplicates. If you try to add an item to a set that already contains it, Python ignores the addition.

    >>> l = ['a', 'a', 'bb', 'b', 'c', 'c', '10', '10', '8','8', 10, 10, 6, 10, 11.2, 11.2, 11, 11]
    >>> distinct_l = set(l)
    >>> print(distinct_l)
    set(['a', '10', 'c', 'b', 6, 'bb', 10, 11, 11.2, '8'])
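A short sketch of what "duplicate" means to a set, using made-up values: membership is decided by equality and hash rather than by type, which is why the string '10' and the integer 10 both survive in the output above.

```python
s = set()
s.add("a")
s.add("a")          # already present: silently ignored
print(len(s))       # 1

# 10 == 10.0 and they hash alike, so only one of them is stored,
# while the string '10' is a different value altogether.
mixed = {10, 10.0, "10"}
print(len(mixed))   # 2
```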