Remove duplicates from a List<T> in C#

Does anyone have a quick method for de-duplicating a generic List in C#?

If you're using .NET 3.5 or later, you can use LINQ.

List<T> withDupes = LoadSomeData();
List<T> noDupes = withDupes.Distinct().ToList();

Perhaps you should consider using a HashSet.

From the MSDN link:

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        HashSet<int> evenNumbers = new HashSet<int>();
        HashSet<int> oddNumbers = new HashSet<int>();

        for (int i = 0; i < 5; i++)
        {
            // Populate evenNumbers with just even numbers.
            evenNumbers.Add(i * 2);

            // Populate oddNumbers with just odd numbers.
            oddNumbers.Add((i * 2) + 1);
        }

        Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
        DisplaySet(evenNumbers);

        Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
        DisplaySet(oddNumbers);

        // Create a new HashSet populated with even numbers.
        HashSet<int> numbers = new HashSet<int>(evenNumbers);
        Console.WriteLine("numbers UnionWith oddNumbers...");
        numbers.UnionWith(oddNumbers);

        Console.Write("numbers contains {0} elements: ", numbers.Count);
        DisplaySet(numbers);
    }

    private static void DisplaySet(HashSet<int> set)
    {
        Console.Write("{");
        foreach (int i in set)
        {
            Console.Write(" {0}", i);
        }
        Console.WriteLine(" }");
    }
}

/* This example produces output similar to the following:
 * evenNumbers contains 5 elements: { 0 2 4 6 8 }
 * oddNumbers contains 5 elements: { 1 3 5 7 9 }
 * numbers UnionWith oddNumbers...
 * numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
 */

How about:

var noDupes = list.Distinct().ToList();

in .NET 3.5?

Simply initialize a HashSet with a List of the same type:

var noDupes = new HashSet<T>(withDupes);

Or, if you want a List returned:

var noDupsList = new HashSet<T>(withDupes).ToList();

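Sort the list, then compare each element with its neighbor, since duplicates will clump together.

Something like this:

list.Sort();
int index = 0;
while (index < list.Count - 1)
{
    // After sorting, equal items are adjacent; drop one of each equal pair.
    // Note: == compiles for types such as int or string; for a generic T, use Equals or a comparer.
    if (list[index] == list[index + 1])
        list.RemoveAt(index);
    else
        index++;
}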

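This worked for me. Simply use:

// liIDs is an existing List<Type>; redeclaring it on the same line would not compile.
liIDs = liIDs.Distinct().ToList<Type>();

Replace "Type" with your desired type, e.g. int.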

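I like to use this command:

List<Store> myStoreList = Service.GetStoreListbyProvince(provinceId)
    .GroupBy(s => s.City)
    .Select(grp => grp.FirstOrDefault())
    .OrderBy(s => s.City)
    .ToList();

My list has these fields: Id, StoreName, City, PostalCode. I wanted to show the list of cities in a dropdown, but the list contained duplicate values. Solution: group by city, then pick the first item of each group for the list.

I hope it helps :)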

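As Kronoz said, in .NET 3.5 you can use Distinct().

In .NET 2 you could mimic it:

public IEnumerable<T> DedupCollection<T>(IEnumerable<T> input)
{
    // Note: HashSet<T> itself shipped with .NET 3.5;
    // on .NET 2.0 a Dictionary<T, bool> could track the seen values instead.
    var passedValues = new HashSet<T>();

    // Relatively simple dupe check algorithm used as an example
    foreach (T item in input)
        if (passedValues.Add(item)) // True if item is new
            yield return item;
}

This could be used to dedupe any collection and will return the values in the original order.

It's normally much quicker to filter a collection (as both Distinct() and this sample do) than it would be to remove items from it.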

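An extension method might be a decent way to go, something like this:

// Extension methods must be declared in a static class; this one needs using System.Linq.
public static List<T> Deduplicate<T>(this List<T> listToDeduplicate)
{
    return listToDeduplicate.Distinct().ToList();
}

And then call it like this, for example:

List<int> myFilteredList = unfilteredList.Deduplicate();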

In Java (I assume C# is more or less identical):

list = new ArrayList<T>(new HashSet<T>(list));

If you really wanted to mutate the original list:

List<T> noDupes = new ArrayList<T>(new HashSet<T>(list));
list.clear();
list.addAll(noDupes);

To preserve order, simply replace HashSet with LinkedHashSet.

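Use Linq's Union method.

Note: This solution requires no knowledge of LINQ, aside from the fact that it exists.

Code

Begin by adding the following to the top of your class file:

using System.Linq;

Now, you can use the following to remove duplicates from an object called obj1:

obj1 = obj1.Union(obj1).ToList();

Note: Rename obj1 to the name of your object.

How it works

The Union command returns one of each entry found in two source objects. Since obj1 is both source objects, this reduces obj1 to one of each entry.

ToList() returns a new List. This is necessary because LINQ commands like Union return their result as an IEnumerable instead of modifying the original List or returning a new List.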
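
The behaviour described above can be checked with a small console program. This is only a sketch: the UnionDemo class, its Dedupe helper, and the sample values are made up for illustration; Enumerable.Union and ToList are the only library calls involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionDemo
{
    // Hypothetical helper: unions a list with itself, which leaves one copy of each entry.
    public static List<T> Dedupe<T>(List<T> source)
    {
        return source.Union(source).ToList();
    }

    public static void Main()
    {
        var obj1 = new List<int> { 3, 1, 3, 2, 1 };
        List<int> result = Dedupe(obj1);

        // Union yields elements in the order they are first encountered.
        Console.WriteLine(string.Join(", ", result)); // 3, 1, 2

        // The source list is untouched until the result is assigned back to it.
        Console.WriteLine(obj1.Count); // 5
    }
}
```

Assigning the result back (obj1 = Dedupe(obj1)) gives the same effect as the one-liner in the answer.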

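If you don't care about the order, you can just shove the items into a HashSet. If you do want to maintain the order, you can do something like this:

var unique = new List<T>();
var hs = new HashSet<T>();
foreach (T t in list)
    if (hs.Add(t)) // Add returns false when t is already in the set
        unique.Add(t);

Or the LINQ way:

var hs = new HashSet<T>();
// Where visits every element; All would stop at the first duplicate because Add returns false there.
var unique = list.Where(x => hs.Add(x)).ToList();

Edit: The HashSet method takes O(N) time and O(N) space, while sorting and then making unique (as suggested by @lassevk and others) takes O(N*lgN) time and O(1) space, so it's not as clear to me as it was at first glance that the sorting way is inferior (my apologies for the temporary down vote...).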
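
To make the trade-off in that edit concrete, here is a rough sketch of both approaches side by side. The names DedupeStrategies, ViaHashSet and ViaSort are invented for this example. ViaHashSet keeps the original order at the cost of an extra set; ViaSort works in place but leaves the list sorted, and it assumes the element type has a default ordering.

```csharp
using System;
using System.Collections.Generic;

public static class DedupeStrategies
{
    // O(N) time, O(N) extra space; first occurrences keep their original order.
    public static List<T> ViaHashSet<T>(IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        var unique = new List<T>();
        foreach (T item in source)
        {
            if (seen.Add(item)) // Add returns false for an item already in the set
                unique.Add(item);
        }
        return unique;
    }

    // O(N log N) time, done in place; the list ends up sorted.
    public static void ViaSort<T>(List<T> list)
    {
        list.Sort();
        Comparer<T> comparer = Comparer<T>.Default;
        int write = 0;
        for (int read = 0; read < list.Count; read++)
        {
            // Keep an element only if it differs from the last one kept.
            if (read == 0 || comparer.Compare(list[write - 1], list[read]) != 0)
                list[write++] = list[read];
        }
        list.RemoveRange(write, list.Count - write);
    }

    public static void Main()
    {
        var data = new List<int> { 4, 2, 4, 1, 2 };
        Console.WriteLine(string.Join(", ", ViaHashSet(data))); // 4, 2, 1

        ViaSort(data);
        Console.WriteLine(string.Join(", ", data)); // 1, 2, 4
    }
}
```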

As a helper method (without LINQ):

public static List<T> Distinct<T>(this List<T> list)
{
    // The List<T> constructor copies the set, so no LINQ ToList() call is needed.
    return new List<T>(new HashSet<T>(list));
}

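Here's an extension method for removing adjacent duplicates in place. Call Sort() first and pass in the same IComparer. This should be more efficient than Lasse V. Karlsen's version, which calls RemoveAt repeatedly (resulting in multiple block memory moves).

public static void RemoveAdjacentDuplicates<T>(this List<T> list, IComparer<T> comparer)
{
    int numUnique = 0;
    for (int i = 0; i < list.Count; i++)
        // Keep list[i] only if it differs from the last element kept.
        if ((i == 0) || (comparer.Compare(list[numUnique - 1], list[i]) != 0))
            list[numUnique++] = list[i];
    // Trim the leftover tail in a single operation.
    list.RemoveRange(numUnique, list.Count - numUnique);
}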

By installing the MoreLINQ package via NuGet, you can easily de-duplicate a list of objects by a property:

IEnumerable<Catalogue> distinctCatalogues = catalogues.DistinctBy(c => c.CatalogueCode);
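
For readers who would rather not add a package, the idea behind de-duplicating by a key can be sketched with a HashSet of keys. This is not MoreLINQ's implementation: the extension name DistinctByKey is made up here, and the Catalogue class shape is an assumption (only CatalogueCode appears in the answer; Title is illustrative).

```csharp
using System;
using System.Collections.Generic;

public class Catalogue
{
    // Assumed shape: CatalogueCode comes from the answer, Title is illustrative only.
    public string CatalogueCode { get; set; }
    public string Title { get; set; }
}

public static class KeyedDistinct
{
    // Hypothetical extension: yields the first item seen for each distinct key.
    public static IEnumerable<T> DistinctByKey<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (T item in source)
        {
            // Add returns false when the key was already seen, so repeats are skipped.
            if (seenKeys.Add(keySelector(item)))
                yield return item;
        }
    }

    public static void Main()
    {
        var catalogues = new List<Catalogue>
        {
            new Catalogue { CatalogueCode = "A1", Title = "First" },
            new Catalogue { CatalogueCode = "B2", Title = "Second" },
            new Catalogue { CatalogueCode = "A1", Title = "Duplicate code" },
        };

        foreach (Catalogue catalogue in catalogues.DistinctByKey(x => x.CatalogueCode))
            Console.WriteLine(catalogue.CatalogueCode + " " + catalogue.Title);
        // Prints "A1 First" then "B2 Second".
    }
}
```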

It might be easier to simply make sure that duplicates are not added to the list in the first place.

if (items.IndexOf(new_item) < 0)
    items.Add(new_item);

Another way in .NET 2.0:

static void Main(string[] args)
{
    List<string> alpha = new List<string>();

    for (char a = 'a'; a <= 'd'; a++)
    {
        alpha.Add(a.ToString());
        alpha.Add(a.ToString());
    }

    Console.WriteLine("Data :");
    alpha.ForEach(delegate(string t) { Console.WriteLine(t); });

    // Walk backwards so RemoveAt never shifts an unvisited element.
    // (Removing items inside ForEach throws InvalidOperationException on newer runtimes.)
    for (int i = alpha.Count - 1; i >= 0; i--)
    {
        // IndexOf finds the first occurrence; a smaller index means alpha[i] is a repeat.
        if (alpha.IndexOf(alpha[i]) < i)
            alpha.RemoveAt(i);
    }

    Console.WriteLine("Unique Result :");
    alpha.ForEach(delegate(string t) { Console.WriteLine(t); });
    Console.ReadKey();
}

Here's a simple solution that doesn't require any hard-to-read LINQ or any prior sorting of the list.

private static void CheckForDuplicateItems(List<string> items)
{
    if (items == null ||
        items.Count == 0)
        return;

    for (int outerIndex = 0; outerIndex < items.Count; outerIndex++)
    {
        for (int innerIndex = 0; innerIndex < items.Count; innerIndex++)
        {
            if (innerIndex == outerIndex) continue;
            if (items[outerIndex].Equals(items[innerIndex]))
            {
                // Duplicate Found
            }
        }
    }
}

This takes the distinct elements (dropping the repeats) and converts the result back into a list:

List<type> myNoneDuplicateValue = listValueWithDuplicate.Distinct().ToList();

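David J.'s answer is a good method, with no need for extra objects, sorting, etc. It can be improved on, however:

for (int innerIndex = items.Count - 1; innerIndex > outerIndex; innerIndex--)

So the outer loop goes from top to bottom over the entire list, while the inner loop goes from the bottom up "until the outer loop position is reached".

The outer loop makes sure the entire list is processed, and the inner loop finds the actual duplicates, which can only occur in the part that the outer loop hasn't processed yet.

Or, if you don't want to go bottom-up for the inner loop, you could have the inner loop start at outerIndex + 1.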
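
Putting that suggestion together, a removal routine along these lines might look like the following. This is a sketch rather than David J.'s code: the names InPlaceDedupe and RemoveDuplicatesInPlace are made up here, and the original method only detected duplicates, whereas this one deletes them. Running the inner loop from the end toward outerIndex means RemoveAt never shifts an element that has not been checked yet.

```csharp
using System;
using System.Collections.Generic;

public static class InPlaceDedupe
{
    // Hypothetical helper: removes later duplicates, keeping the first occurrence of each item.
    public static void RemoveDuplicatesInPlace(List<string> items)
    {
        if (items == null || items.Count == 0)
            return;

        for (int outerIndex = 0; outerIndex < items.Count; outerIndex++)
        {
            // Only the part after outerIndex can still hold duplicates of items[outerIndex].
            for (int innerIndex = items.Count - 1; innerIndex > outerIndex; innerIndex--)
            {
                // The static string.Equals(a, b) is null-safe, unlike a.Equals(b).
                if (string.Equals(items[outerIndex], items[innerIndex]))
                    items.RemoveAt(innerIndex);
            }
        }
    }

    public static void Main()
    {
        var items = new List<string> { "a", "b", "a", "c", "b", "a" };
        RemoveDuplicatesInPlace(items);
        Console.WriteLine(string.Join(", ", items)); // a, b, c
    }
}
```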

public static void RemoveDuplicates<T>(IList<T> list)
{
    if (list == null)
    {
        return;
    }
    int i = 1;
    while (i < list.Count)
    {
        // Look for list[i] among the elements already kept (indexes 0 to i - 1).
        int j = 0;
        bool remove = false;
        while (j < i && !remove)
        {
            // EqualityComparer<T>.Default copes with null elements.
            if (EqualityComparer<T>.Default.Equals(list[i], list[j]))
            {
                remove = true;
            }
            j++;
        }
        if (remove)
        {
            list.RemoveAt(i);
        }
        else
        {
            i++;
        }
    }
}

You can use Union:

obj2 = obj1.Union(obj1).ToList();

A simple, intuitive implementation:

// PointF comes from System.Drawing.
public static List<PointF> RemoveDuplicates(List<PointF> listPoints)
{
    List<PointF> result = new List<PointF>();

    for (int i = 0; i < listPoints.Count; i++)
    {
        // Contains is a linear scan, so this is O(n^2) overall.
        if (!result.Contains(listPoints[i]))
            result.Add(listPoints[i]);
    }

    return result;
}

Suppose you have two classes, Product and Customer, and want to remove duplicate items from lists of them:

public class Product
{
    public int Id { get; set; }
    public string ProductName { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string CustomerName { get; set; }
}

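You need to define a generic class of the following form:

// Requires: using System; using System.Collections.Generic; using System.Reflection;
public class ItemEqualityComparer<T> : IEqualityComparer<T> where T : class
{
    private readonly PropertyInfo _propertyInfo;

    public ItemEqualityComparer(string keyItem)
    {
        _propertyInfo = typeof(T).GetProperty(keyItem, BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
        // Fail early instead of hitting a null reference later in GetHashCode.
        if (_propertyInfo == null)
            throw new ArgumentException("Property not found: " + keyItem, nameof(keyItem));
    }

    public bool Equals(T x, T y)
    {
        // Two items are equal when the key property has the same non-null value on both.
        var xValue = _propertyInfo.GetValue(x, null);
        var yValue = _propertyInfo.GetValue(y, null);
        return xValue != null && yValue != null && xValue.Equals(yValue);
    }

    public int GetHashCode(T obj)
    {
        // Hash on the key property so items with equal keys land in the same bucket.
        var propertyValue = _propertyInfo.GetValue(obj, null);
        return propertyValue == null ? 0 : propertyValue.GetHashCode();
    }
}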

Then you can remove the duplicate items from your lists:

var products = new List<Product>
{
    new Product { ProductName = "product 1", Id = 1 },
    new Product { ProductName = "product 2", Id = 2 },
    new Product { ProductName = "product 2", Id = 4 },
    new Product { ProductName = "product 2", Id = 4 },
};
var productList = products.Distinct(new ItemEqualityComparer<Product>(nameof(Product.Id))).ToList();

var customers = new List<Customer>
{
    new Customer { CustomerName = "Customer 1", Id = 5 },
    new Customer { CustomerName = "Customer 2", Id = 5 },
    new Customer { CustomerName = "Customer 2", Id = 5 },
    new Customer { CustomerName = "Customer 2", Id = 5 },
};
var customerList = customers.Distinct(new ItemEqualityComparer<Customer>(nameof(Customer.Id))).ToList();

This code removes duplicate items by Id. If you want to remove duplicates by another property, pass nameof(YourClass.DuplicateProperty) instead, for example nameof(Customer.CustomerName), which removes duplicates by the CustomerName property.

There are lots of ways to solve the duplicates issue in a List; below is one of them:

List<Container> containerList = LoadContainer(); // Assume it has duplicates
List<Container> filteredList = new List<Container>();
foreach (var container in containerList)
{
    // Assume 'UniqueId' is the property of the Container class that identifies duplicates.
    // Search the filtered list (not the source list) for an item with the same UniqueId.
    // Container is assumed to be a class, so Find returns null when nothing matches.
    Container duplicateContainer = filteredList.Find(delegate(Container checkContainer)
        { return (checkContainer.UniqueId == container.UniqueId); });

    if (duplicateContainer == null) // Add the object only when it is not already in the new list
    {
        filteredList.Add(container);
    }
}

Cheers, Ravi Ganesan