In Python, how do you remove duplicates from one or multiple lists?
For example, if I have:

    a = ["apples", "bananas", "cucumbers", "bananas"]

how can I remove the duplicate "bananas" so that I get:

    a = ["apples", "bananas", "cucumbers"]

Also, if I have:

    a = ["apples", "bananas", "cucumbers"]
    b = ["pears", "apples", "watermelons"]

how can I remove the "apples" that appears in both lists, so that I get:

    a = ["bananas", "cucumbers"]
    b = ["pears", "watermelons"]
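Set-based solutions do not preserve the order of the items. The following keeps the order and removes every occurrence after the first, using a helper set to track the items already seen.

    seen = set()
    # seen.add() returns None, so "seen.add(item) or item" records the item and evaluates to it
    a = [seen.add(item) or item for item in a if item not in seen]

If you want to reuse the same list object, you can do this:

    seen = set()
    # Slice assignment replaces the contents in place, so other references to the list see the change
    a[:] = (seen.add(item) or item for item in a if item not in seen)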
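As an alternative to the helper set above, here is a minimal sketch that relies on dictionaries preserving insertion order. It assumes Python 3.7 or later and hashable items; it is not part of the original answer.

```python
a = ["apples", "bananas", "cucumbers", "bananas"]

# dict.fromkeys() keeps one key per distinct item, in first-seen order
# (assumes Python 3.7+, where dict insertion order is guaranteed)
a = list(dict.fromkeys(a))
print(a)
```

Like the comprehension, this keeps the first occurrence of each item and drops the later ones.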
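Use the built-in set() function:

    a = ["apples", "bananas", "cucumbers", "bananas"]
    a = list(set(a))  # duplicates are removed, but the order is not guaranteed
    print(a)

For the second case, use list comprehensions:

    a = ["apples", "bananas", "cucumbers"]
    b = ["pears", "apples", "watermelons"]
    # Keep the items of each list that do not appear in the other one
    r = [i for i in a if i not in b] + [i for i in b if i not in a]
    print(r)  # one combined list: ['bananas', 'cucumbers', 'pears', 'watermelons']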
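The comprehension above produces a single combined list. If the two lists should stay separate, as in the question, one possible sketch is to compute the shared items once and filter each list against them. The name `common` is only illustrative.

```python
a = ["apples", "bananas", "cucumbers"]
b = ["pears", "apples", "watermelons"]

# Items that occur in both lists (illustrative helper name)
common = set(a) & set(b)

# Filter each list separately; the original order is preserved
a = [item for item in a if item not in common]
b = [item for item in b if item not in common]
print(a)
print(b)
```

Looking items up in a set also avoids rescanning the other list for every element, which matters once the lists get long.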
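The key to doing this is to use Python's sets.
- In Python, a set is a data structure in which every item is unique.
- If you call set(list) with a list as the argument, you get a set containing all the elements of the list, with the duplicates removed.
- You can then convert it back to a list by calling list().
So, in the first example, you can write:

    a = list(set(a))

Sets also have some other useful methods.
So, in the second example, you can write:

    set1 = set(a).intersection(set(b))  # elements that are in both lists
    set2 = set(a).difference(set1)      # elements that are in a but not in b
    a = list(set2)                      # convert back to a list
    b = list(set(b).difference(set1))   # do the same for b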
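The set methods used above also have operator forms. The following is a small sketch of the ones most relevant here; `sorted()` is used only to make the printed order predictable, since sets are unordered.

```python
a = ["apples", "bananas", "cucumbers"]
b = ["pears", "apples", "watermelons"]

both = set(a) & set(b)    # intersection: items in both lists
only_a = set(a) - set(b)  # difference: items in a but not in b
either = set(a) ^ set(b)  # symmetric difference: items in exactly one list

print(sorted(both))
print(sorted(only_a))
print(sorted(either))
```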
You can simply use:

    a = ["apples", "bananas", "cucumbers", "bananas"]
    print(list(set(a)))
You can use a set object to keep track of the elements already seen, like this:

    def handle_dumplicate(*lsts):
        s = set()  # every element seen so far, across all lists
        result = []
        for lst in lsts:
            no_dump_lst = []
            for ele in lst:
                if ele in s:
                    continue  # seen before, in this list or an earlier one
                s.add(ele)
                no_dump_lst.append(ele)
            result.append(no_dump_lst)
        return result

    a = ["apples", "bananas", "cucumbers"]
    b = ["pears", "apples", "watermelons"]
    a, b = handle_dumplicate(a, b)
    print(a)  # ['apples', 'bananas', 'cucumbers']
    print(b)  # ['pears', 'watermelons']
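The function above keeps the first occurrence of each element across the lists. If shared elements should instead be removed from every list, as the second example in the question asks, a possible variant is sketched below. The name `drop_shared` is hypothetical, and repeats inside a single list are deliberately left alone.

```python
from collections import Counter

def drop_shared(*lsts):
    # Count in how many lists each element appears; set(lst) makes
    # repeats within one list count only once
    counts = Counter(ele for lst in lsts for ele in set(lst))
    # Keep only the elements that belong to exactly one list
    return [[ele for ele in lst if counts[ele] == 1] for lst in lsts]

a = ["apples", "bananas", "cucumbers"]
b = ["pears", "apples", "watermelons"]
a, b = drop_shared(a, b)
print(a)
print(b)
```

It accepts any number of lists and preserves the order of the remaining items.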