LINQ - Full Outer Join
I have a list of people's IDs and their first names, and a list of people's IDs and their surnames. Some people have no first name and some have no surname; I'd like to do a full outer join on the two lists.

So the following lists:

```
ID  FirstName
--  ---------
 1  John
 2  Sue

ID  LastName
--  --------
 1  Doe
 3  Smith
```

Should produce:

```
ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue
 3             Smith
```

I'm new to LINQ (so forgive me if I'm being lame) and have found quite a few solutions for "LINQ outer joins" that all look quite similar, but really seem to be left outer joins.

My attempts so far go something like this:

```csharp
private void OuterJoinTest()
{
    List<FirstName> firstNames = new List<FirstName>();
    firstNames.Add(new FirstName { ID = 1, Name = "John" });
    firstNames.Add(new FirstName { ID = 2, Name = "Sue" });

    List<LastName> lastNames = new List<LastName>();
    lastNames.Add(new LastName { ID = 1, Name = "Doe" });
    lastNames.Add(new LastName { ID = 3, Name = "Smith" });

    var outerJoin = from first in firstNames
        join last in lastNames
        on first.ID equals last.ID
        into temp
        from last in temp.DefaultIfEmpty()
        select new
        {
            id = first != null ? first.ID : last.ID,
            firstname = first != null ? first.Name : string.Empty,
            surname = last != null ? last.Name : string.Empty
        };
    }
}

public class FirstName
{
    public int ID;

    public string Name;
}

public class LastName
{
    public int ID;

    public string Name;
}
```

But this returns:

```
ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue
```

What am I doing wrong?
Update 1: providing a truly generalized extension method

Update 2: optionally accepting a custom comparer for the key type

Update 3: this implementation has recently become …

Edit: added …

See it live on http://ideone.com/o36nwc

```csharp
static void Main(string[] args)
{
    var ax = new[] {
        new { id = 1, name = "John" },
        new { id = 2, name = "Sue" } };
    var bx = new[] {
        new { id = 1, surname = "Doe" },
        new { id = 3, surname = "Smith" } };

    ax.FullOuterJoin(bx, a => a.id, b => b.id, (a, b, id) => new { a, b })
        .ToList().ForEach(Console.WriteLine);
}
```

Prints the output:

```
{ a = { id = 1, name = John }, b = { id = 1, surname = Doe } }
{ a = { id = 2, name = Sue }, b =  }
{ a = , b = { id = 3, surname = Smith } }
```
You can also supply defaults: http://ideone.com/kg4kqo

Printing:

```
{ name = John, surname = Doe }
{ name = Sue, surname = (no surname) }
{ name = (no firstname), surname = Smith }
```
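The call that supplies the defaults is not reproduced in the text here. The following is only a sketch of what such a call could look like, assuming the FullOuterJoin extension from this answer (repeated so the example compiles on its own). The class name FullOuterJoinSketch, the Demo helper, and the placeholder ids -1 and -2 are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FullOuterJoinSketch
{
    // Same shape as the FullOuterJoin extension shown in this answer.
    public static IEnumerable<TResult> FullOuterJoin<TA, TB, TKey, TResult>(
        this IEnumerable<TA> a,
        IEnumerable<TB> b,
        Func<TA, TKey> selectKeyA,
        Func<TB, TKey> selectKeyB,
        Func<TA, TB, TKey, TResult> projection,
        TA defaultA = default(TA),
        TB defaultB = default(TB),
        IEqualityComparer<TKey> cmp = null)
    {
        cmp = cmp ?? EqualityComparer<TKey>.Default;
        var alookup = a.ToLookup(selectKeyA, cmp);
        var blookup = b.ToLookup(selectKeyB, cmp);

        var keys = new HashSet<TKey>(alookup.Select(p => p.Key), cmp);
        keys.UnionWith(blookup.Select(p => p.Key));

        return from key in keys
               from xa in alookup[key].DefaultIfEmpty(defaultA)
               from xb in blookup[key].DefaultIfEmpty(defaultB)
               select projection(xa, xb, key);
    }

    // Hypothetical helper: runs the join with explicit defaults and returns the rows as text.
    public static List<string> Demo()
    {
        var ax = new[] { new { id = 1, name = "John" }, new { id = 2, name = "Sue" } };
        var bx = new[] { new { id = 1, surname = "Doe" }, new { id = 3, surname = "Smith" } };

        return ax.FullOuterJoin(
                bx, a => a.id, b => b.id,
                (a, b, id) => new { a.name, b.surname },
                // Stand-in rows used when one side has no match (the ids are arbitrary).
                new { id = -1, name = "(no firstname)" },
                new { id = -2, surname = "(no surname)" })
            .Select(row => row.ToString())
            .ToList();
    }

    public static void Main()
    {
        Demo().ForEach(Console.WriteLine);
    }
}
```

Because the defaults are ordinary instances of the same anonymous types, the projection can read a.name and b.surname without null checks.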
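Explanation of terms used:

Joining is a term borrowed from relational database design:

- A join repeats elements from a as many times as there are elements in b with a corresponding key (i.e. nothing if b is empty). Database lingo calls this an inner (equi)join.
- An outer join also includes elements from a for which no corresponding element exists in b (i.e. there are results even if b is empty). This is usually referred to as a left join.
- A full outer join includes records from a as well as from b when no corresponding element exists in the other (i.e. there are results even if a is empty).

Something not usually seen in an RDBMS is a group join[1]:

- A group join does the same as described above, but instead of repeating elements from a for multiple corresponding b, it groups the records with corresponding keys. This is often more convenient when you wish to enumerate through "joined" records based on a common key.

See also GroupJoin, which contains some general background explanations as well.

[1] (I believe Oracle and MSSQL have proprietary extensions for this)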
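To make the terms above concrete, here is a small sketch using only the built-in LINQ operators. The sample data (including the second surname "Dough" for id 1) and the class name JoinTermsSketch are made up for illustration.

```csharp
using System;
using System.Linq;

public static class JoinTermsSketch
{
    static readonly (int Id, string Name)[] A = { (1, "John"), (2, "Sue") };
    static readonly (int Id, string Surname)[] B = { (1, "Doe"), (1, "Dough"), (3, "Smith") };

    // Inner join: John repeats once per matching b row; Sue and Smith drop out.
    public static string[] Inner() =>
        A.Join(B, a => a.Id, b => b.Id, (a, b) => a.Name + " " + b.Surname).ToArray();

    // Left outer join: the inner join rows plus Sue with an empty right side.
    public static string[] Left() =>
        (from a in A
         join b in B on a.Id equals b.Id into g
         from b in g.DefaultIfEmpty()
         select a.Name + " " + (b.Surname ?? "-")).ToArray();

    // Group join: one result per a element, carrying all of its matching b rows.
    public static string[] Group() =>
        A.GroupJoin(B, a => a.Id, b => b.Id,
            (a, bs) => a.Name + ": [" + string.Join(", ", bs.Select(b => b.Surname)) + "]")
         .ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", Inner()));
        Console.WriteLine(string.Join(" | ", Left()));
        Console.WriteLine(string.Join(" | ", Group()));
    }
}
```

None of the three produces a row for Smith, which is the gap a full outer join fills.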
Full code: a generalized "drop-in" extension class for this

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class MyExtensions
{
    internal static IEnumerable<TResult> FullOuterGroupJoin<TA, TB, TKey, TResult>(
        this IEnumerable<TA> a,
        IEnumerable<TB> b,
        Func<TA, TKey> selectKeyA,
        Func<TB, TKey> selectKeyB,
        Func<IEnumerable<TA>, IEnumerable<TB>, TKey, TResult> projection,
        IEqualityComparer<TKey> cmp = null)
    {
        cmp = cmp ?? EqualityComparer<TKey>.Default;
        var alookup = a.ToLookup(selectKeyA, cmp);
        var blookup = b.ToLookup(selectKeyB, cmp);

        // Every key that occurs on either side.
        var keys = new HashSet<TKey>(alookup.Select(p => p.Key), cmp);
        keys.UnionWith(blookup.Select(p => p.Key));

        var join = from key in keys
                   let xa = alookup[key]
                   let xb = blookup[key]
                   select projection(xa, xb, key);

        return join;
    }

    internal static IEnumerable<TResult> FullOuterJoin<TA, TB, TKey, TResult>(
        this IEnumerable<TA> a,
        IEnumerable<TB> b,
        Func<TA, TKey> selectKeyA,
        Func<TB, TKey> selectKeyB,
        Func<TA, TB, TKey, TResult> projection,
        TA defaultA = default(TA),
        TB defaultB = default(TB),
        IEqualityComparer<TKey> cmp = null)
    {
        cmp = cmp ?? EqualityComparer<TKey>.Default;
        var alookup = a.ToLookup(selectKeyA, cmp);
        var blookup = b.ToLookup(selectKeyB, cmp);

        // Every key that occurs on either side.
        var keys = new HashSet<TKey>(alookup.Select(p => p.Key), cmp);
        keys.UnionWith(blookup.Select(p => p.Key));

        var join = from key in keys
                   from xa in alookup[key].DefaultIfEmpty(defaultA)
                   from xb in blookup[key].DefaultIfEmpty(defaultB)
                   select projection(xa, xb, key);

        return join;
    }
}
```
I don't know if this covers all cases, but logically it seems correct. The idea is to take a left outer join and a right outer join, then take the union of the results.

```csharp
var firstNames = new[]
{
    new { ID = 1, Name = "John" },
    new { ID = 2, Name = "Sue" },
};
var lastNames = new[]
{
    new { ID = 1, Name = "Doe" },
    new { ID = 3, Name = "Smith" },
};
var leftOuterJoin =
    from first in firstNames
    join last in lastNames on first.ID equals last.ID into temp
    from last in temp.DefaultIfEmpty()
    select new
    {
        first.ID,
        FirstName = first.Name,
        LastName = last?.Name,
    };
var rightOuterJoin =
    from last in lastNames
    join first in firstNames on last.ID equals first.ID into temp
    from first in temp.DefaultIfEmpty()
    select new
    {
        last.ID,
        FirstName = first?.Name,
        LastName = last.Name,
    };
var fullOuterJoin = leftOuterJoin.Union(rightOuterJoin);
```

This works as written because it is LINQ to Objects. With LINQ to SQL or another provider, the query processor might not support safe navigation or other operations. You would have to use the conditional operator to get the values conditionally.

i.e.,

```csharp
var leftOuterJoin =
    from first in firstNames
    join last in lastNames on first.ID equals last.ID into temp
    from last in temp.DefaultIfEmpty()
    select new
    {
        first.ID,
        FirstName = first.Name,
        LastName = last != null ? last.Name : default,
    };
```
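I think there are problems with most of these, including the accepted answer, because they don't work well with LINQ over IQueryable: they either make too many server round trips and return too much data, or they do too much of the work on the client.

For IEnumerable I don't like sehe's answer or similar ones because of their excessive memory use (a simple test with two lists of 10,000,000 items ran LINQPad out of memory on a 32GB machine).

Also, most of the others don't actually implement a proper full outer join, because they use a Union with a right join instead of a Concat with a right anti semi join. The Union not only eliminates the duplicate inner join rows from the result, but also any genuine duplicates that existed originally in the left or right data.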
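The duplicate-elimination point can be checked with a small LINQ to Objects sketch. The data is made up: the left side deliberately contains the same row twice. The names UnionVersusConcatSketch, UnionCount, and ConcatCount are illustrative only, and the joins are written inline rather than through any of the extension classes in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionVersusConcatSketch
{
    // "Sue" appears twice on purpose: a genuine duplicate in the source data.
    static readonly (int Id, string Name)[] Left = { (2, "Sue"), (2, "Sue"), (1, "John") };
    static readonly (int Id, string Surname)[] Right = { (1, "Doe"), (3, "Smith") };

    static IEnumerable<(string Name, string Surname)> LeftJoin() =>
        from l in Left
        join r in Right on l.Id equals r.Id into g
        from r in g.DefaultIfEmpty()
        select (l.Name, r.Surname);

    static IEnumerable<(string Name, string Surname)> RightJoin() =>
        from r in Right
        join l in Left on r.Id equals l.Id into g
        from l in g.DefaultIfEmpty()
        select (l.Name, r.Surname);

    // Right rows whose key has no match on the left.
    static IEnumerable<(string Name, string Surname)> RightAntiSemiJoin()
    {
        var leftKeys = new HashSet<int>(Left.Select(l => l.Id));
        return Right.Where(r => !leftKeys.Contains(r.Id))
                    .Select(r => ((string)null, r.Surname));
    }

    // Union applies set semantics, so the second "Sue" row is lost.
    public static int UnionCount() => LeftJoin().Union(RightJoin()).Count();

    // Concat keeps every left-join row and appends only the unmatched right rows.
    public static int ConcatCount() => LeftJoin().Concat(RightAntiSemiJoin()).Count();

    public static void Main()
    {
        Console.WriteLine("Union:  " + UnionCount());
        Console.WriteLine("Concat: " + ConcatCount());
    }
}
```

A SQL FULL JOIN over the same data would return four rows (Sue twice, John Doe, and Smith), which matches the Concat variant.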
So here are my extensions, which handle all of these issues: they generate SQL that is as good as implementing the join directly in LINQ, they execute on the server, and for enumerables they are faster and use less memory than the others:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class Ext {
    public static IEnumerable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        return from left in leftItems
               join right in rightItems on leftKeySelector(left) equals rightKeySelector(right) into temp
               from right in temp.DefaultIfEmpty()
               select resultSelector(left, right);
    }

    public static IEnumerable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        return from right in rightItems
               join left in leftItems on rightKeySelector(right) equals leftKeySelector(left) into temp
               from left in temp.DefaultIfEmpty()
               select resultSelector(left, right);
    }

    public static IEnumerable<TResult> FullOuterJoinDistinct<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Union(leftItems.RightOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    public static IEnumerable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) where TLeft : class {

        // Right rows whose key does not occur on the left.
        var hashLK = new HashSet<TKey>(from l in leftItems select leftKeySelector(l));
        return rightItems.Where(r => !hashLK.Contains(rightKeySelector(r))).Select(r => resultSelector((TLeft)null, r));
    }

    public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) where TLeft : class {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    private static Expression<Func<TP, TC, TResult>> CastSMBody<TP, TC, TResult>(LambdaExpression ex, TP unusedP, TC unusedC, TResult unusedRes) => (Expression<Func<TP, TC, TResult>>)ex;

    public static IQueryable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        var sampleAnonLR = new { left = (TLeft)null, rightg = (IEnumerable<TRight>)null };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "p");
        var parmC = Expression.Parameter(typeof(TRight), "c");
        var argLeft = Expression.PropertyOrField(parmP, "left");
        var newleftrs = CastSMBody(Expression.Lambda(Expression.Invoke(resultSelector, argLeft, parmC), parmP, parmC), sampleAnonLR, (TRight)null, (TResult)null);

        return leftItems.AsQueryable().GroupJoin(rightItems, leftKeySelector, rightKeySelector, (left, rightg) => new { left, rightg }).SelectMany(r => r.rightg.DefaultIfEmpty(), newleftrs);
    }

    public static IQueryable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        var sampleAnonLR = new { leftg = (IEnumerable<TLeft>)null, right = (TRight)null };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "p");
        var parmC = Expression.Parameter(typeof(TLeft), "c");
        var argRight = Expression.PropertyOrField(parmP, "right");
        var newrightrs = CastSMBody(Expression.Lambda(Expression.Invoke(resultSelector, parmC, argRight), parmP, parmC), sampleAnonLR, (TLeft)null, (TResult)null);

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).SelectMany(l => l.leftg.DefaultIfEmpty(), newrightrs);
    }

    public static IQueryable<TResult> FullOuterJoinDistinct<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Union(leftItems.RightOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    private static Expression<Func<TP, TResult>> CastSBody<TP, TResult>(LambdaExpression ex, TP unusedP, TResult unusedRes) => (Expression<Func<TP, TResult>>)ex;

    public static IQueryable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        var sampleAnonLgR = new { leftg = (IEnumerable<TLeft>)null, right = (TRight)null };
        var parmLgR = Expression.Parameter(sampleAnonLgR.GetType(), "lgr");
        var argLeft = Expression.Constant(null, typeof(TLeft));
        var argRight = Expression.PropertyOrField(parmLgR, "right");
        var newrightrs = CastSBody(Expression.Lambda(Expression.Invoke(resultSelector, argLeft, argRight), parmLgR), sampleAnonLgR, (TResult)null);

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).Where(lgr => !lgr.leftg.Any()).Select(newrightrs);
    }

    public static IQueryable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }
}
```
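The difference made by the right anti semi join is mostly moot in LINQ to Objects or in the source, but on the server (SQL) side it matters for the final answer, removing an unnecessary …

The handling of … could be improved with LINQKit.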
Here is an extension method doing that:

```csharp
public static IEnumerable<KeyValuePair<TLeft, TRight>> FullOuterJoin<TLeft, TRight>(this IEnumerable<TLeft> leftItems, Func<TLeft, object> leftIdSelector, IEnumerable<TRight> rightItems, Func<TRight, object> rightIdSelector)
{
    var leftOuterJoin = from left in leftItems
        join right in rightItems on leftIdSelector(left) equals rightIdSelector(right) into temp
        from right in temp.DefaultIfEmpty()
        select new { left, right };

    var rightOuterJoin = from right in rightItems
        join left in leftItems on rightIdSelector(right) equals leftIdSelector(left) into temp
        from left in temp.DefaultIfEmpty()
        select new { left, right };

    var fullOuterJoin = leftOuterJoin.Union(rightOuterJoin);

    return fullOuterJoin.Select(x => new KeyValuePair<TLeft, TRight>(x.left, x.right));
}
```
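As you've found, LINQ doesn't have an "outer join" construct. The closest you can get is a left outer join using the query you stated. To this, you can add any elements of the surname list that aren't represented in the join:

```csharp
outerJoin = outerJoin.Concat(lastNames.Select(l => new
                            {
                                id = l.ID,
                                firstname = String.Empty,
                                surname = l.Name
                            }).Where(l => !outerJoin.Any(o => o.id == l.id)));
```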
I'm guessing @sehe's approach is stronger, but until I understand it better, I find myself leapfrogging off @michaelsander's extension. I modified it to match the syntax and return type of the built-in Enumerable.Join() method. I appended the "distinct" suffix in response to @cadrell0's comment under @jeffmercado's solution.

```csharp
public static class MyExtensions {

    public static IEnumerable<TResult> FullJoinDistinct<TLeft, TRight, TKey, TResult> (
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector
    ) {

        var leftJoin =
            from left in leftItems
            join right in rightItems
              on leftKeySelector(left) equals rightKeySelector(right) into temp
            from right in temp.DefaultIfEmpty()
            select resultSelector(left, right);

        var rightJoin =
            from right in rightItems
            join left in leftItems
              on rightKeySelector(right) equals leftKeySelector(left) into temp
            from left in temp.DefaultIfEmpty()
            select resultSelector(left, right);

        return leftJoin.Union(rightJoin);
    }

}
```

In the example, you would use it like this:

```csharp
var test =
    firstNames
    .FullJoinDistinct(
        lastNames,
        f => f.ID,
        j => j.ID,
        (f, j) => new {
            ID = f == null ? j.ID : f.ID,
            leftName = f == null ? null : f.Name,
            rightName = j == null ? null : j.Name
        }
    );
```
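In the future, as I learn more, I have a feeling I'll be migrating to @sehe's logic given its popularity. Even then I'll have to be careful, because I feel it is important to have at least one overload that matches the syntax of the existing ".Join()" method if feasible, for two reasons.

I'm still new to generics, extensions, Func statements, and other features, so feedback is welcome.

EDIT: It didn't take me long to realize there was a problem with my code. I was doing a .Dump() in LINQPad and looking at the return type. It was just IEnumerable, so I tried to match it. But when I actually did a .Where() or .Select() on my extension I got an error: "'System.Collections.IEnumerable' does not contain a definition for 'Select' and ...". So in the end I was able to match the input syntax of .Join(), but not the return behavior.

EDIT: Added "TResult" to the return type of the function. I missed that when reading the Microsoft article, and of course it makes sense. With this fix, the return behavior now seems to be in line with my goals after all.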
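The error in the first edit comes from the difference between the non-generic IEnumerable and IEnumerable<TResult>: the standard query operators such as Select and Where are defined for the generic interface only. A minimal sketch of that difference follows; the names ReturnTypeSketch, Untyped, and Typed are invented for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class ReturnTypeSketch
{
    // Non-generic return type: callers cannot chain Select or Where directly.
    public static IEnumerable Untyped() => new[] { 1, 2, 3 };

    // Generic return type: the usual LINQ operators are available.
    public static IEnumerable<int> Typed() => new[] { 1, 2, 3 };

    public static void Main()
    {
        // Untyped().Select(x => x * 2);   // does not compile: no Select on IEnumerable

        // Cast<T>() is one of the few operators defined on the non-generic interface.
        var viaCast = Untyped().Cast<int>().Select(x => x * 2).ToArray();
        var direct = Typed().Select(x => x * 2).ToArray();

        Console.WriteLine(string.Join(",", viaCast));
        Console.WriteLine(string.Join(",", direct));
    }
}
```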
My solution for the situation where the key is unique in both enumerables:

```csharp
private static IEnumerable<TResult> FullOuterJoin<Ta, Tb, TKey, TResult>(
    IEnumerable<Ta> a, IEnumerable<Tb> b,
    Func<Ta, TKey> key_a, Func<Tb, TKey> key_b,
    Func<Ta, Tb, TResult> selector)
{
    var alookup = a.ToLookup(key_a);
    var blookup = b.ToLookup(key_b);
    var keys = new HashSet<TKey>(alookup.Select(p => p.Key));
    keys.UnionWith(blookup.Select(p => p.Key));
    // Keys are assumed unique, so only the first element per key is used.
    return keys.Select(key => selector(alookup[key].FirstOrDefault(), blookup[key].FirstOrDefault()));
}
```

so

```csharp
var ax = new[] {
    new { id = 1, first_name = "ali" },
    new { id = 2, first_name = "mohammad" } };
var bx = new[] {
    new { id = 1, last_name = "rezaei" },
    new { id = 3, last_name = "kazemi" } };

var list = FullOuterJoin(ax, bx, a => a.id, b => b.id, (a, b) => "f: " + a?.first_name + " l: " + b?.last_name).ToArray();
```

outputs:

```
f: ali l: rezaei
f: mohammad l:
f:  l: kazemi
```
I decided to add this as a separate answer because I am not sure it has been tested enough. This is a rework of …

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public static class Ext {
    private static Expression<Func<TP, TC, TResult>> CastSMBody<TP, TC, TResult>(LambdaExpression ex, TP unusedP, TC unusedC, TResult unusedRes) => (Expression<Func<TP, TC, TResult>>)ex;

    public static IQueryable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        // (lrg,r) => resultSelector(lrg.left, r)
        var sampleAnonLR = new { left = (TLeft)null, rightg = (IEnumerable<TRight>)null };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "lrg");
        var parmC = Expression.Parameter(typeof(TRight), "r");
        var argLeft = Expression.PropertyOrField(parmP, "left");
        var newleftrs = CastSMBody(Expression.Lambda(resultSelector.Apply(argLeft, parmC), parmP, parmC), sampleAnonLR, (TRight)null, (TResult)null);

        return leftItems.GroupJoin(rightItems, leftKeySelector, rightKeySelector, (left, rightg) => new { left, rightg }).SelectMany(r => r.rightg.DefaultIfEmpty(), newleftrs);
    }

    public static IQueryable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        // (lgr,l) => resultSelector(l, lgr.right)
        var sampleAnonLR = new { leftg = (IEnumerable<TLeft>)null, right = (TRight)null };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "lgr");
        var parmC = Expression.Parameter(typeof(TLeft), "l");
        var argRight = Expression.PropertyOrField(parmP, "right");
        var newrightrs = CastSMBody(Expression.Lambda(resultSelector.Apply(parmC, argRight), parmP, parmC), sampleAnonLR, (TLeft)null, (TResult)null);

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right })
                         .SelectMany(l => l.leftg.DefaultIfEmpty(), newrightrs);
    }

    private static Expression<Func<TParm, TResult>> CastSBody<TParm, TResult>(LambdaExpression ex, TParm unusedP, TResult unusedRes) => (Expression<Func<TParm, TResult>>)ex;

    public static IQueryable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        // newrightrs = lgr => resultSelector((TLeft)null, lgr.right)
        var sampleAnonLgR = new { leftg = (IEnumerable<TLeft>)null, right = (TRight)null };
        var parmLgR = Expression.Parameter(sampleAnonLgR.GetType(), "lgr");
        var argLeft = Expression.Constant(null, typeof(TLeft));
        var argRight = Expression.PropertyOrField(parmLgR, "right");
        var newrightrs = CastSBody(Expression.Lambda(resultSelector.Apply(argLeft, argRight), parmLgR), sampleAnonLgR, (TResult)null);

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).Where(lgr => !lgr.leftg.Any()).Select(newrightrs);
    }

    public static IQueryable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) where TLeft : class where TRight : class where TResult : class {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    // Substitutes args for the lambda's parameters in its body.
    public static Expression Apply(this LambdaExpression e, params Expression[] args) {
        var b = e.Body;

        foreach (var pa in e.Parameters.Cast<ParameterExpression>().Zip(args, (p, a) => (p, a))) {
            b = b.Replace(pa.p, pa.a);
        }

        return b.PropagateNull();
    }

    public static Expression Replace(this Expression orig, Expression from, Expression to) => new ReplaceVisitor(from, to).Visit(orig);
    public class ReplaceVisitor : System.Linq.Expressions.ExpressionVisitor {
        public readonly Expression from;
        public readonly Expression to;

        public ReplaceVisitor(Expression _from, Expression _to) {
            from = _from;
            to = _to;
        }

        public override Expression Visit(Expression node) => node == from ? to : base.Visit(node);
    }

    // Replaces member access on a null constant with a null constant of the member's type.
    public static Expression PropagateNull(this Expression orig) => new NullVisitor().Visit(orig);
    public class NullVisitor : System.Linq.Expressions.ExpressionVisitor {
        public override Expression Visit(Expression node) {
            if (node is MemberExpression nme && nme.Expression is ConstantExpression nce && nce.Value == null)
                return Expression.Constant(null, nce.Type.GetMember(nme.Member.Name).Single().GetMemberType());
            else
                return base.Visit(node);
        }
    }

    public static Type GetMemberType(this MemberInfo member) {
        switch (member) {
            case FieldInfo mfi:
                return mfi.FieldType;
            case PropertyInfo mpi:
                return mpi.PropertyType;
            case EventInfo mei:
                return mei.EventHandlerType;
            default:
                throw new ArgumentException("MemberInfo must be of type FieldInfo, PropertyInfo or EventInfo", nameof(member));
        }
    }
}
```
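Performs an in-memory streaming enumeration over both inputs and invokes the selector for each row. If there is no correlation at the current iteration, one of the selector arguments will be null.

Example:

```csharp
var result = left.FullOuterJoin(
    right,
    x => x.Key,
    x => x.Key,
    (l, r) => new { LeftKey = l?.Key, RightKey = r?.Key });
```

Requires an IComparer for the correlation type; the default comparer is used if none is provided.

Requires that "OrderBy" is applied to the input enumerables.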
```csharp
/// <summary>
/// Performs a full outer join on two <see cref="IEnumerable{T}" />.
/// </summary>
/// <typeparam name="TLeft"></typeparam>
/// <typeparam name="TRight"></typeparam>
/// <typeparam name="TValue"></typeparam>
/// <typeparam name="TResult"></typeparam>
/// <param name="left"></param>
/// <param name="right"></param>
/// <param name="leftKeySelector"></param>
/// <param name="rightKeySelector"></param>
/// <param name="selector">Expression defining result type</param>
/// <param name="keyComparer">A comparer if there is no default for the type</param>
/// <returns></returns>
[System.Diagnostics.DebuggerStepThrough]
public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TValue, TResult>(
    this IEnumerable<TLeft> left,
    IEnumerable<TRight> right,
    Func<TLeft, TValue> leftKeySelector,
    Func<TRight, TValue> rightKeySelector,
    Func<TLeft, TRight, TResult> selector,
    IComparer<TValue> keyComparer = null)
    where TLeft: class
    where TRight: class
    where TValue : IComparable
{
    keyComparer = keyComparer ?? Comparer<TValue>.Default;

    // Sort both sides with the same comparer that is used for matching below.
    using (var enumLeft = left.OrderBy(leftKeySelector, keyComparer).GetEnumerator())
    using (var enumRight = right.OrderBy(rightKeySelector, keyComparer).GetEnumerator())
    {
        var hasLeft = enumLeft.MoveNext();
        var hasRight = enumRight.MoveNext();
        while (hasLeft || hasRight)
        {
            var currentLeft = enumLeft.Current;
            var valueLeft = hasLeft ? leftKeySelector(currentLeft) : default(TValue);

            var currentRight = enumRight.Current;
            var valueRight = hasRight ? rightKeySelector(currentRight) : default(TValue);

            // Normalize to -1, 0 or 1: a comparer may return any negative or positive value.
            int compare =
                !hasLeft ? 1
                : !hasRight ? -1
                : Math.Sign(keyComparer.Compare(valueLeft, valueRight));

            switch (compare)
            {
                case 0:
                    // The selector matches. An inner join is achieved
                    yield return selector(currentLeft, currentRight);
                    hasLeft = enumLeft.MoveNext();
                    hasRight = enumRight.MoveNext();
                    break;
                case -1:
                    yield return selector(currentLeft, default(TRight));
                    hasLeft = enumLeft.MoveNext();
                    break;
                case 1:
                    yield return selector(default(TLeft), currentRight);
                    hasRight = enumRight.MoveNext();
                    break;
            }
        }
    }
}
```
I like sehe's answer, but it does not use deferred execution (the input sequences are eagerly enumerated by the calls to ToLookup). So after looking at the .NET sources for LINQ to Objects, I came up with this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqExtensions
{
    public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TKey, TResult> resultSelector,
        IEqualityComparer<TKey> comparator = null,
        TLeft defaultLeft = default(TLeft),
        TRight defaultRight = default(TRight))
    {
        if (left == null) throw new ArgumentNullException("left");
        if (right == null) throw new ArgumentNullException("right");
        if (leftKeySelector == null) throw new ArgumentNullException("leftKeySelector");
        if (rightKeySelector == null) throw new ArgumentNullException("rightKeySelector");
        if (resultSelector == null) throw new ArgumentNullException("resultSelector");

        comparator = comparator ?? EqualityComparer<TKey>.Default;
        return FullOuterJoinIterator(left, right, leftKeySelector, rightKeySelector, resultSelector, comparator, defaultLeft, defaultRight);
    }

    internal static IEnumerable<TResult> FullOuterJoinIterator<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TKey, TResult> resultSelector,
        IEqualityComparer<TKey> comparator,
        TLeft defaultLeft,
        TRight defaultRight)
    {
        // Iterator method: nothing below runs until the result is enumerated.
        var leftLookup = left.ToLookup(leftKeySelector, comparator);
        var rightLookup = right.ToLookup(rightKeySelector, comparator);
        var keys = leftLookup.Select(g => g.Key).Union(rightLookup.Select(g => g.Key), comparator);

        foreach (var key in keys)
            foreach (var leftValue in leftLookup[key].DefaultIfEmpty(defaultLeft))
                foreach (var rightValue in rightLookup[key].DefaultIfEmpty(defaultRight))
                    yield return resultSelector(leftValue, rightValue, key);
    }
}
```
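This implementation has the following important properties:

- Deferred execution: the input sequences are not enumerated before the output sequence is enumerated.
- Each input sequence is enumerated only once.
- The order of the input sequences is preserved, in the sense that tuples are produced in the order of the left sequence, followed by the right sequence (for the keys not present in the left sequence).

These properties matter because they are what someone new to FullOuterJoin but experienced with LINQ will expect.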
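The deferred-execution property can be observed with a source that counts how often it is enumerated. This is only a sketch under simplifying assumptions: EagerJoin and DeferredJoin are cut-down stand-ins written for this example (integer keys, string output), not the extension methods from the answers above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Number of times enumeration of the tracked source has started.
    public static int Enumerations;

    static IEnumerable<int> Tracked()
    {
        Enumerations++;
        yield return 1;
        yield return 2;
    }

    // Eager: the ToLookup calls run as soon as the method is called.
    public static IEnumerable<string> EagerJoin(IEnumerable<int> left, IEnumerable<int> right)
    {
        var l = left.ToLookup(x => x);
        var r = right.ToLookup(x => x);
        return l.Select(g => g.Key).Union(r.Select(g => g.Key))
                .Select(k => k + ":" + l.Contains(k) + ":" + r.Contains(k));
    }

    // Deferred: an iterator body does not start until the first MoveNext.
    public static IEnumerable<string> DeferredJoin(IEnumerable<int> left, IEnumerable<int> right)
    {
        var l = left.ToLookup(x => x);
        var r = right.ToLookup(x => x);
        foreach (var k in l.Select(g => g.Key).Union(r.Select(g => g.Key)))
            yield return k + ":" + l.Contains(k) + ":" + r.Contains(k);
    }

    // Returns the counter right after building the query and again after consuming it.
    public static (int afterCall, int afterEnumeration) Measure(bool deferred)
    {
        Enumerations = 0;
        var right = new[] { 2, 3 };
        var query = deferred ? DeferredJoin(Tracked(), right) : EagerJoin(Tracked(), right);
        int afterCall = Enumerations;
        query.ToList();
        return (afterCall, Enumerations);
    }

    public static void Main()
    {
        Console.WriteLine("eager:    " + Measure(false));
        Console.WriteLine("deferred: " + Measure(true));
    }
}
```

In both variants the tracked source is enumerated exactly once; the only difference is when that happens.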
Full outer join for two or more tables: first extract the column that you want to join on.

```csharp
var DatesA = from A in db.T1 select A.Date;
var DatesB = from B in db.T2 select B.Date;
var DatesC = from C in db.T3 select C.Date;

var Dates = DatesA.Union(DatesB).Union(DatesC);
```

Then use a left outer join between the extracted column and the main tables.

```csharp
var Full_Outer_Join =

(from A in Dates
 join B in db.T1
 on A equals B.Date into AB

 from ab in AB.DefaultIfEmpty()
 join C in db.T2
 on A equals C.Date into ABC

 from abc in ABC.DefaultIfEmpty()
 join D in db.T3
 on A equals D.Date into ABCD

 from abcd in ABCD.DefaultIfEmpty()
 select new { A, ab, abc, abcd })
 .AsEnumerable();
```
I wrote this extension class for an app about 6 years ago and have been using it ever since in many solutions without issues. Hope it helps.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinExtensions
{
    public static IEnumerable<TResult> FullOuterJoin<TOuter, TInner, TKey, TResult>(
        this IEnumerable<TOuter> outer,
        IEnumerable<TInner> inner,
        Func<TOuter, TKey> outerKeySelector,
        Func<TInner, TKey> innerKeySelector,
        Func<TOuter, TInner, TResult> resultSelector)
        where TInner : class
        where TOuter : class
    {
        var innerLookup = inner.ToLookup(innerKeySelector);
        var outerLookup = outer.ToLookup(outerKeySelector);

        // Inner items whose key has no match in outer.
        var innerJoinItems = inner
            .Where(innerItem => !outerLookup.Contains(innerKeySelector(innerItem)))
            .Select(innerItem => resultSelector(null, innerItem));

        return outer
            .SelectMany(outerItem =>
            {
                var innerItems = innerLookup[outerKeySelector(outerItem)];

                return innerItems.Any() ? innerItems : new TInner[] { null };
            }, resultSelector)
            .Concat(innerJoinItems);
    }

    public static IEnumerable<TResult> LeftJoin<TOuter, TInner, TKey, TResult>(
        this IEnumerable<TOuter> outer,
        IEnumerable<TInner> inner,
        Func<TOuter, TKey> outerKeySelector,
        Func<TInner, TKey> innerKeySelector,
        Func<TOuter, TInner, TResult> resultSelector)
    {
        return outer.GroupJoin(
            inner,
            outerKeySelector,
            innerKeySelector,
            (o, i) =>
                new { o = o, i = i.DefaultIfEmpty() })
                .SelectMany(m => m.i.Select(inn =>
                    resultSelector(m.o, inn)
                    ));
    }

    public static IEnumerable<TResult> RightJoin<TOuter, TInner, TKey, TResult>(
        this IEnumerable<TOuter> outer,
        IEnumerable<TInner> inner,
        Func<TOuter, TKey> outerKeySelector,
        Func<TInner, TKey> innerKeySelector,
        Func<TOuter, TInner, TResult> resultSelector)
    {
        return inner.GroupJoin(
            outer,
            innerKeySelector,
            outerKeySelector,
            (i, o) =>
                new { i = i, o = o.DefaultIfEmpty() })
                .SelectMany(m => m.o.Select(outt =>
                    resultSelector(outt, m.i)
                    ));
    }
}
```
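I really hate these LINQ expressions; this is why SQL exists:

```sql
select isnull(fn.id, ln.id) as id, fn.firstname, ln.lastname
   from firstnames fn
   full join lastnames ln on ln.id=fn.id
```

Create this as a SQL view in the database and import it as an entity.

Of course, a (distinct) union of the left and right joins will also do the job, but it is stupid.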