How to calculate sum of Euclidean distances from one datapoint to all other datapoints from pandas dataframe?
I have the following pandas dataframe:
```python
import pandas as pd
import math

df = pd.DataFrame()
df['x'] = [2, 1, 3]
df['y'] = [2, 5, 6]
df['weight'] = [11, 12, 13]
print(df)
```

```
   x  y  weight
0  2  2      11
1  1  5      12
2  3  6      13
```
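Suppose these 3 nodes are called {a, b, c}. I want to calculate the total Euclidean distance from each node to all other nodes, multiplied by that node's weight, as follows: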
```
Sum = 11(d(a,b)+d(a,c)) + 12(d(b,a)+d(b,c)) + 13(d(c,a)+d(c,b))
```
Use SciPy's `cdist`:
```python
In [72]: from scipy.spatial.distance import cdist

In [73]: a = df[['x','y']].values

In [74]: w = df.weight.values

In [100]: cdist(a,a).sum(1) * w
Out[100]: array([ 80.13921614, 64.78014765, 82.66925684])
```
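The `cdist` expression returns one weighted term per node, while the question asks for a single `Sum`. A minimal sketch of the remaining step, assuming the same dataframe as in the question, is below. The names `coords`, `weights`, `per_node`, and `total` are illustrative and not from the original answer, and `to_numpy()` is used here in place of `.values`.

```python
import pandas as pd
from scipy.spatial.distance import cdist

# Same data as in the question.
df = pd.DataFrame({'x': [2, 1, 3], 'y': [2, 5, 6], 'weight': [11, 12, 13]})

coords = df[['x', 'y']].to_numpy()
weights = df['weight'].to_numpy()

# Row i of cdist(coords, coords) holds the distances from node i to every node.
# The diagonal is 0, so a node's distance to itself contributes nothing.
per_node = cdist(coords, coords).sum(axis=1) * weights

# Adding the weighted per-node terms gives the single Sum from the question.
total = per_node.sum()

print(per_node)
print(total)
```

For this data the total comes to roughly 227.59, which is the sum of the three per-node values. Note that `cdist` builds the full n x n distance matrix, so memory grows quadratically with the number of rows.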
We can also verify these values using `euclidean` from the same SciPy module:
```python
In [76]: from scipy.spatial.distance import euclidean

In [77]: euclidean([2,2],[1,5])*11 + euclidean([2,2],[3,6])*11
Out[77]: 80.139216143646451

In [78]: euclidean([1,5],[2,2])*12 + euclidean([1,5],[3,6])*12
Out[78]: 64.78014765201803

In [80]: euclidean([3,6],[2,2])*13 + euclidean([3,6],[1,5])*13
Out[80]: 82.669256840526856
```
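The same check can be written as a loop instead of one line per node. The sketch below is one possible way to do it using only the standard library (`math.dist`, available from Python 3.8). The helper `weighted_distance_sums` is a hypothetical name introduced for illustration, not part of SciPy or the original answer.

```python
import math

# Coordinates and weights of nodes a, b, c from the question.
points = [(2, 2), (1, 5), (3, 6)]
weights = [11, 12, 13]


def weighted_distance_sums(points, weights):
    """Return weight_i * sum_j d(point_i, point_j) for each point (hypothetical helper)."""
    return [
        # math.dist(p, p) is 0, so including the point itself does not change the sum.
        w * sum(math.dist(p, q) for q in points)
        for p, w in zip(points, weights)
    ]


per_node = weighted_distance_sums(points, weights)
print(per_node)       # one weighted term per node
print(sum(per_node))  # the single Sum from the question
```

This is only a cross-check for small inputs; the pure-Python double loop will be much slower than `cdist` on large dataframes.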