Is there any pythonic way to combine two dicts (adding values for keys that appear in both)?
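For example, I have two dicts:

    Dict A: {'a': 1, 'b': 2, 'c': 3}
    Dict B: {'b': 3, 'c': 4, 'd': 5}

I need a Pythonic way of "combining" the two dicts so that the result is:

    {'a': 1, 'b': 5, 'c': 7, 'd': 5}

That is: if a key appears in both dicts, add its values; if it appears in only one dict, keep its value.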
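Use `collections.Counter`:

    >>> from collections import Counter
    >>> A = Counter({'a':1, 'b':2, 'c':3})
    >>> B = Counter({'b':3, 'c':4, 'd':5})
    >>> A + B
    Counter({'c': 7, 'b': 5, 'd': 5, 'a': 1})

Counters are basically a subclass of `dict`, so everything you normally do with a dict still works on them.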
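Because `Counter` subclasses `dict`, the summed result can be used or converted like any other dictionary. A minimal sketch, reusing the question's data (the names `merged` and `plain` are only illustrative):

```python
from collections import Counter

A = Counter({'a': 1, 'b': 2, 'c': 3})
B = Counter({'b': 3, 'c': 4, 'd': 5})

merged = A + B

# A Counter is a dict subclass, so ordinary dict operations work on it.
print(isinstance(merged, dict))   # True
print(merged['c'])                # 7
print(sorted(merged.items()))     # [('a', 1), ('b', 5), ('c', 7), ('d', 5)]

# Convert back to a plain dict if a Counter is not wanted downstream.
plain = dict(merged)
```

Converting with `dict(...)` is optional; it only matters when later code should not see `Counter` behavior such as a default of 0 for missing keys.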
A more generic solution, which also works for non-numeric values:

    a = {'a': 'foo', 'b': 'bar', 'c': 'baz'}
    b = {'a': 'spam', 'c': 'ham', 'x': 'blah'}

    # In Python 3, items() returns a view, so wrap it in list() before concatenating.
    r = dict(list(a.items()) + list(b.items()) +
             [(k, a[k] + b[k]) for k in set(b) & set(a)])

Or, even more generically:

    import operator

    def combine_dicts(a, b, op=operator.add):
        # Entries for shared keys come last, so they override the plain copies.
        return dict(list(a.items()) + list(b.items()) +
                    [(k, op(a[k], b[k])) for k in set(b) & set(a)])

For example:

    >>> a = {'a': 2, 'b':3, 'c':4}
    >>> b = {'a': 5, 'c':6, 'x':7}
    >>> import operator
    >>> print(combine_dicts(a, b, operator.mul))
    {'a': 10, 'b': 3, 'c': 24, 'x': 7}

    >>> A = {'a':1, 'b':2, 'c':3}
    >>> B = {'b':3, 'c':4, 'd':5}
    >>> c = {x: A.get(x, 0) + B.get(x, 0) for x in set(A).union(B)}
    >>> print(c)
    {'a': 1, 'c': 7, 'b': 5, 'd': 5}
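Intro: There are the (probably) best solutions. But you have to know them and remember them, and sometimes you have to hope that your Python version isn't too old, or whatever else the issue might be.

Then there are the most "hacky" solutions. They are great and short, but sometimes hard to understand, read, and remember.

There is, though, an alternative: reinventing the wheel. Why reinvent the wheel? Generally because it is a really good way to learn (and sometimes simply because the existing tool doesn't do exactly what you want, or the way you want it), and it is the easiest route if you don't know or can't remember the perfect tool for your problem.

So I suggest starting from a `dict` subclass that overloads `+`:

    class MyDict(dict):
        def __add__(self, oth):
            r = self.copy()

            try:
                for key, val in oth.items():
                    if key in r:
                        r[key] += val  # You can customize it here
                    else:
                        r[key] = val
            except AttributeError:  # In case oth isn't a dict
                return NotImplemented  # The convention when a case isn't handled

            return r

    a = MyDict({'a':1, 'b':2, 'c':3})
    b = MyDict({'b':3, 'c':4, 'd':5})

    print(a+b)  # Output {'a':1, 'b': 5, 'c': 7, 'd': 5}

There are probably other ways to implement this, and there are already tools that do it, but it is always nice to see how things basically work.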
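To make the `NotImplemented` convention concrete, here is a small sketch of what happens when the right-hand operand is not dict-like. It restates the `MyDict` idea in compact form; the compact `__add__` body is my own rewording, not the answer's exact code:

```python
class MyDict(dict):
    def __add__(self, oth):
        r = self.copy()
        try:
            for key, val in oth.items():
                r[key] = r[key] + val if key in r else val
        except AttributeError:
            # Signal "I can't handle this operand" so Python can try
            # the other operand's __radd__ or raise TypeError itself.
            return NotImplemented
        return r

a = MyDict({'a': 1, 'b': 2})

# Works with any object that has .items(), including a plain dict.
print(a + {'b': 3, 'd': 5})   # {'a': 1, 'b': 5, 'd': 5}

# An int has no .items(), so Python ends up raising TypeError.
try:
    a + 5
except TypeError as exc:
    print("TypeError:", exc)
```

Returning `NotImplemented` rather than raising directly leaves the decision to the interpreter, which is why the caller sees an ordinary `TypeError`.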

    import itertools

    # A and B are the two dicts from the question.
    myDict = {}
    for k in itertools.chain(A.keys(), B.keys()):
        myDict[k] = A.get(k, 0) + B.get(k, 0)
The one with no extra imports!

There is a Pythonic convention called EAFP (Easier to Ask for Forgiveness than Permission). The code below is based on that convention.

    # The A and B dictionaries
    A = {'a': 1, 'b': 2, 'c': 3}
    B = {'b': 3, 'c': 4, 'd': 5}

    # The final dictionary. Will contain the final outputs.
    newdict = {}

    # Make sure every key of A and B get into the final dictionary 'newdict'.
    newdict.update(A)
    newdict.update(B)

    # Iterate through each key of A.
    for i in A.keys():

        # If same key exist on B, its values from A and B will add together and
        # get included in the final dictionary 'newdict'.
        try:
            addition = A[i] + B[i]
            newdict[i] = addition

        # If current key does not exist in dictionary B, it will give a KeyError,
        # catch it and continue looping.
        except KeyError:
            continue

EDIT: thanks to Jerzyk for the improvement suggestions.
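For comparison, the EAFP style can be set against the "look before you leap" (LBYL) style on the same data. This is only an illustrative sketch; the variable names `eafp` and `lbyl` are mine:

```python
A = {'a': 1, 'b': 2, 'c': 3}
B = {'b': 3, 'c': 4, 'd': 5}

# EAFP: attempt the lookup and handle the failure afterwards.
eafp = dict(B)
for key, value in A.items():
    try:
        eafp[key] = value + B[key]
    except KeyError:
        eafp[key] = value

# LBYL: check that the key exists before using it.
lbyl = dict(B)
for key, value in A.items():
    if key in B:
        lbyl[key] = value + B[key]
    else:
        lbyl[key] = value

print(eafp == lbyl)  # True
```

Both loops produce the same dictionary; the difference is purely in how the missing-key case is expressed.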
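In this case, summing `Counter`s only works as expected when the resulting values are positive. In the example below, `c` is missing from the result because its summed value is -1:

    In [1]: from collections import Counter

    In [2]: A = Counter({'a':1, 'b':2, 'c':3})

    In [3]: B = Counter({'b':3, 'c':-4, 'd':5})

    In [4]: A + B
    Out[4]: Counter({'d': 5, 'b': 5, 'a': 1})

This is because, as the `Counter` documentation puts it: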
- The Counter class itself is a dictionary subclass with no restrictions on its keys and values. The values are intended to be numbers representing counts, but you could store anything in the value field.
- The most_common() method requires only that the values be orderable.
- For in-place operations such as c[key] += 1, the value type need only support addition and subtraction. So fractions, floats, and decimals would work and negative values are supported. The same is also true for update() and subtract() which allow negative and zero values for both inputs and outputs.
- The multiset methods are designed only for use cases with positive values. The inputs may be negative or zero, but only outputs with positive values are created. There are no type restrictions, but the value type needs to support addition, subtraction, and comparison.
- The elements() method requires integer counts. It ignores zero and negative counts.
So, to get around this when summing counters, you can use `Counter.update`, which keeps zero and negative values:

    In [24]: A.update(B)

    In [25]: A
    Out[25]: Counter({'d': 5, 'b': 5, 'a': 1, 'c': -1})
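Since `update()` modifies the counter it is called on, one way to keep the inputs intact is to update a copy. A hedged sketch (the name `total` is illustrative):

```python
from collections import Counter

A = Counter({'a': 1, 'b': 2, 'c': 3})
B = Counter({'b': 3, 'c': -4, 'd': 5})

total = A.copy()     # Counter.copy() returns a new Counter
total.update(B)      # adds B's counts in place, on the copy only

print(total['c'])    # -1, kept by update()
print((A + B)['c'])  # 0, because + dropped the non-positive entry
print(A['c'])        # 3, the original counter is unchanged
```

Looking up a missing key on a `Counter` returns 0 rather than raising `KeyError`, which is why `(A + B)['c']` prints 0.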

    import itertools
    import collections

    dictA = {'a':1, 'b':2, 'c':3}
    dictB = {'b':3, 'c':4, 'd':5}

    new_dict = collections.defaultdict(int)
    # dict.items() is the Python 3 spelling; use dict.iteritems() on Python 2
    for k, v in itertools.chain(dictA.items(), dictB.items()):
        new_dict[k] += v

    print(dict(new_dict))

    # OUTPUT
    {'a': 1, 'b': 5, 'c': 7, 'd': 5}

Alternatively, you can use `Counter`, as @martijn mentioned above.
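The `defaultdict` approach extends naturally to any number of dictionaries. The helper below is a hypothetical sketch (the name `sum_dicts` does not come from the answer):

```python
from collections import defaultdict
from itertools import chain

def sum_dicts(*dicts):
    """Sum values key by key across any number of dicts (hypothetical helper)."""
    totals = defaultdict(int)  # missing keys start at 0
    for key, value in chain.from_iterable(d.items() for d in dicts):
        totals[key] += value
    return dict(totals)

print(sum_dicts({'a': 1, 'b': 2, 'c': 3},
                {'b': 3, 'c': 4, 'd': 5},
                {'a': 10}))
# {'a': 11, 'b': 5, 'c': 7, 'd': 5}
```

Using `defaultdict(int)` assumes numeric values; other value types would need a different default factory.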
For a more generic and extensible approach, check out mergedict. It dispatches on the type of each value to decide how to merge it.

Example:

    from mergedict import MergeDict

    class SumDict(MergeDict):
        @MergeDict.dispatch(int)
        def merge_int(this, other):
            return this + other

    d2 = SumDict({'a': 1, 'b': 'one'})
    d2.merge({'a':2, 'b': 'two'})

    assert d2 == {'a': 3, 'b': 'two'}
Also note that `a.update(b)` is roughly 2x faster than `a + b`:

    from collections import Counter
    a = Counter({'menu': 20, 'good': 15, 'happy': 10, 'bar': 5})
    b = Counter({'menu': 1, 'good': 1, 'bar': 3})

    %timeit a + b;
    ## 100000 loops, best of 3: 8.62 μs per loop
    ## The slowest run took 4.04 times longer than the fastest. This could mean that an intermediate result is being cached.

    %timeit a.update(b)
    ## 100000 loops, best of 3: 4.51 μs per loop
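Outside IPython, a similar comparison can be run with the standard `timeit` module. This is a rough sketch, not a rigorous benchmark: the helper `update_copy` is my own addition, and it copies `a` on every call (so `a` is not mutated between runs), which adds overhead that the measurement above does not include.

```python
import timeit
from collections import Counter

a = Counter({'menu': 20, 'good': 15, 'happy': 10, 'bar': 5})
b = Counter({'menu': 1, 'good': 1, 'bar': 3})

def update_copy():
    # Hypothetical helper: update a fresh copy so `a` stays unchanged.
    c = a.copy()
    c.update(b)
    return c

# Time 10,000 calls of each approach; the results vary by machine.
t_add = timeit.timeit(lambda: a + b, number=10000)
t_update = timeit.timeit(update_copy, number=10000)

print(f"a + b:            {t_add:.4f} s")
print(f"copy + update(b): {t_update:.4f} s")
```

Timings depend on the Python version and hardware, so treat any ratio as indicative only.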
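From Python 3.5: merging and summing

Thanks to @tokeinizer fsj, who pointed out in a comment that I had not fully understood the question (I thought "add" meant just adding the keys that differ between the two dictionaries, whereas it meant that the values of common keys should be summed). So I added a loop before the merge, so that the second dictionary holds the sums for the common keys. In a merge, the values of the last dictionary are the ones that survive in the result, so I think the problem is solved. The solution is valid for Python 3.5 and later.

    a = {
        "a": 1,
        "b": 2,
        "c": 3
    }

    b = {
        "a": 2,
        "b": 3,
        "d": 5
    }

    # Python 3.5
    # Note: this loop updates b in place.
    for key in b:
        if key in a:
            b[key] = b[key] + a[key]

    c = {**a, **b}
    print(c)

    >>> c
    {'a': 3, 'b': 5, 'c': 3, 'd': 5}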
Reusable code

    a = {'a': 1, 'b': 2, 'c': 3}
    b = {'b': 3, 'c': 4, 'd': 5}

    def mergsum(a, b):
        b = dict(b)  # work on a copy so the caller's dict is not modified
        for k in b:
            if k in a:
                b[k] = b[k] + a[k]
        c = {**a, **b}
        return c

    print(mergsum(a, b))
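A variant that builds the result without reassigning anything in either input can rely on the set-like behavior of dictionary key views. This is a sketch under that assumption; the name `merge_sum` is hypothetical:

```python
def merge_sum(a, b):
    """Hypothetical non-mutating variant: neither input is modified."""
    merged = {**a, **b}                # Python 3.5+ unpacking; b wins for now
    for key in a.keys() & b.keys():    # keys present in both dicts
        merged[key] = a[key] + b[key]  # replace with the sum
    return merged

a = {'a': 1, 'b': 2, 'c': 3}
b = {'b': 3, 'c': 4, 'd': 5}

print(merge_sum(a, b))  # {'a': 1, 'b': 5, 'c': 7, 'd': 5}
print(b)                # {'b': 3, 'c': 4, 'd': 5}
```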
Here is a simple solution for merging two dictionaries whose values support `+=`:

    a = {'a':1, 'b':2, 'c':3}

    dicts = [{'b':3, 'c':4, 'd':5},
             {'c':9, 'a':9, 'd':9}]

    def merge_dicts(merged, mergedfrom):
        # Adds mergedfrom's values into merged, modifying merged in place.
        for k, v in mergedfrom.items():
            if k in merged:
                merged[k] += v
            else:
                merged[k] = v
        return merged

    for dct in dicts:
        a = merge_dicts(a, dct)
    print (a)
    #{'c': 16, 'b': 5, 'd': 14, 'a': 10}
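The explicit loop over `dicts` can also be written as a fold with `functools.reduce`. A small sketch, assuming it is acceptable to start from a copy of `a` so that the original stays untouched:

```python
from functools import reduce

def merge_dicts(merged, mergedfrom):
    # Same in-place merge as above: add values for shared keys.
    for k, v in mergedfrom.items():
        if k in merged:
            merged[k] += v
        else:
            merged[k] = v
    return merged

a = {'a': 1, 'b': 2, 'c': 3}
dicts = [{'b': 3, 'c': 4, 'd': 5}, {'c': 9, 'a': 9, 'd': 9}]

# dict(a) is the accumulator, so `a` itself is never modified.
total = reduce(merge_dicts, dicts, dict(a))

print(total)  # {'a': 10, 'b': 5, 'c': 16, 'd': 14}
print(a)      # {'a': 1, 'b': 2, 'c': 3}
```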
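Merge three dicts a, b, c in a single line without any other modules or libs

If we have the three dicts

    a = {"a": 9}
    b = {"b": 7}
    c = {'b': 2, 'd': 90}

merge them all in a single line and get back a dict object using

    # Python 3: items() returns a view, so wrap each one in list() before concatenating
    c = dict(list(a.items()) + list(b.items()) + list(c.items()))

This returns

    {'a': 9, 'b': 2, 'd': 90}

Note that the value for a repeated key is overwritten by the later dict (here `b` ends up as 2), not summed.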
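In Python 3.5 and later, the same overwrite-style merge can be written with dict unpacking. The sketch below contrasts it with a summing merge through `Counter`, which is what the question actually asks for (the names `overwritten` and `summed` are illustrative):

```python
from collections import Counter

a = {"a": 9}
b = {"b": 7}
c = {"b": 2, "d": 90}

# Dict unpacking: later dicts overwrite earlier ones for repeated keys.
overwritten = {**a, **b, **c}
print(overwritten)   # {'a': 9, 'b': 2, 'd': 90}

# Summing repeated keys instead (valid here because all values are positive).
summed = dict(Counter(a) + Counter(b) + Counter(c))
print(summed['b'])   # 9
```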

    def merge_with(f, xs, ys):
        xs = dict(xs)  # shallow copy so the caller's dict is left untouched
        for y, v in ys.items():
            xs[y] = v if y not in xs else f(xs[y], v)
        return xs

    merge_with(lambda x, y: x + y, A, B)

You could easily generalize this:

    def merge_dicts(f, *dicts):
        result = {}
        for d in dicts:
            for k, v in d.items():
                result[k] = v if k not in result else f(result[k], v)
        return result

Then it can take any number of dicts.
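A short usage sketch of the generalized function with different combining functions. The function is restated in Python 3 form so the example runs on its own; the third dict `C` is made up for illustration:

```python
import operator

def merge_dicts(f, *dicts):
    result = {}
    for d in dicts:
        for k, v in d.items():
            # First occurrence is stored as-is; later ones are combined with f.
            result[k] = v if k not in result else f(result[k], v)
    return result

A = {'a': 1, 'b': 2, 'c': 3}
B = {'b': 3, 'c': 4, 'd': 5}
C = {'a': 10, 'd': 1}

print(merge_dicts(operator.add, A, B, C))  # {'a': 11, 'b': 5, 'c': 7, 'd': 6}
print(merge_dicts(max, A, B, C))           # {'a': 10, 'b': 3, 'c': 4, 'd': 5}

# Any two-argument function works, e.g. list concatenation.
print(merge_dicts(operator.add, {'k': [1]}, {'k': [2, 3]}))  # {'k': [1, 2, 3]}
```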
This solution is easy to use: it behaves like a normal dictionary, but you can also add two of them with `+`.

    class SumDict(dict):
        def __add__(self, y):
            return {x: self.get(x, 0) + y.get(x, 0) for x in set(self).union(y)}

    A = SumDict({'a': 1, 'c': 2})
    B = SumDict({'b': 3, 'c': 4})  # Also works: B = {'b': 3, 'c': 4}
    print(A + B)  # OUTPUT {'a': 1, 'b': 3, 'c': 6}
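To make such a class work with the built-in `sum()`, two things would be needed that the class above does not have: `__add__` must return a `SumDict` so additions can be chained, and `__radd__` must accept the integer 0 that `sum()` starts from. The following is a sketch of that extension, not the answer's own code:

```python
class SumDict(dict):
    def __add__(self, other):
        keys = set(self).union(other)
        # Return a SumDict (an assumption of this sketch) so results chain.
        return SumDict({k: self.get(k, 0) + other.get(k, 0) for k in keys})

    def __radd__(self, other):
        # sum() starts from the integer 0, so treat 0 as an empty dict.
        if other == 0:
            return SumDict(self)
        return NotImplemented

dicts = [SumDict({'a': 1, 'c': 2}), SumDict({'b': 3, 'c': 4}), SumDict({'c': 10})]
total = sum(dicts)

print(total == {'a': 1, 'b': 3, 'c': 16})  # True
```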
How about:

    def dict_merge_and_sum(d1, d2):
        ret = dict(d1)  # copy, so that d1 itself is not modified
        ret.update({k: v + d2[k] for k, v in d1.items() if k in d2})
        ret.update({k: v for k, v in d2.items() if k not in d1})
        return ret

    A = {'a': 1, 'b': 2, 'c': 3}
    B = {'b': 3, 'c': 4, 'd': 5}

    print(dict_merge_and_sum(A, B))

Output:

    {'d': 5, 'a': 1, 'c': 7, 'b': 5}
The solutions above are great when you have a small number of `Counter`s. If you have a long list of them, though, something like this is much nicer:

    from collections import Counter

    A = Counter({'a':1, 'b':2, 'c':3})
    B = Counter({'b':3, 'c':4, 'd':5})
    C = Counter({'a': 5, 'e':3})
    list_of_counts = [A, B, C]

    total = sum(list_of_counts, Counter())

    print(total)
    # Counter({'c': 7, 'a': 6, 'b': 5, 'd': 5, 'e': 3})

The solution above essentially sums the `Counter`s like this:

    total = Counter()
    for count in list_of_counts:
        total += count
    print(total)
    # Counter({'c': 7, 'a': 6, 'b': 5, 'd': 5, 'e': 3})

This does the same thing, but I think it always helps to see what it is effectively doing underneath.
Best to use dict():

    A = {'a':1, 'b':2, 'c':3}
    B = {'b':3, 'c':4, 'd':5}
    Merged = dict(A, **B)
    # Values from B overwrite those from A for shared keys; they are not summed.
    Merged == {'a':1, 'b':3, 'c':4, 'd':5}
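Two properties of `dict(A, **B)` are worth checking before relying on it: shared keys take the value from `B` rather than a sum, and in Python 3 every key of `B` must be a string because the keys are passed as keyword arguments. A minimal sketch of both:

```python
A = {'a': 1, 'b': 2, 'c': 3}
B = {'b': 3, 'c': 4, 'd': 5}

merged = dict(A, **B)
print(merged)  # {'a': 1, 'b': 3, 'c': 4, 'd': 5}

# **B passes keys as keyword arguments, so non-string keys are rejected.
try:
    dict({1: 'x'}, **{2: 'y'})
except TypeError as exc:
    print("TypeError:", exc)
```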