How to calculate the sum of Euclidean distances from one data point to all other data points in a pandas DataFrame?

I have the following pandas DataFrame:

import pandas as pd
import math

df = pd.DataFrame()
df['x'] = [2, 1, 3]
df['y'] = [2, 5, 6]
df['weight'] = [11, 12, 13]
print(df)

     x    y   weight  
 0   2    2       11      
 1   1    5       12      
 2   3    6       13

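Suppose these 3 nodes are called {a, b, c}. For each node, I want to take the total Euclidean distance to all other nodes, multiply it by that node's weight, and add up the results, as follows: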

Sum = 11(d(a,b)+d(a,c)) + 12(d(b,a)+d(b,c)) + 13(d(c,a)+d(c,b))


Using SciPy's cdist -

In [72]: from scipy.spatial.distance import cdist

In [73]: a = df[['x','y']].values

In [74]: w = df.weight.values

In [100]: cdist(a,a).sum(1) * w
Out[100]: array([ 80.13921614,  64.78014765,  82.66925684])
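
The array above holds one weighted total per node. To get the single Sum from the question, those three entries still need to be added together. A minimal sketch as a standalone script, assuming the same df as in the question (the names per_node and total are only illustrative):

```python
import pandas as pd
from scipy.spatial.distance import cdist

# Same data as in the question
df = pd.DataFrame({'x': [2, 1, 3], 'y': [2, 5, 6], 'weight': [11, 12, 13]})

a = df[['x', 'y']].values
w = df.weight.values

# Row sums of the pairwise distance matrix give each node's total distance
# to the other nodes (the diagonal is zero, so self-distances add nothing)
per_node = cdist(a, a).sum(axis=1) * w

# Scalar Sum from the question: add up the weighted per-node totals
total = per_node.sum()
print(total)  # approximately 227.5886
```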

We could also use a combination of pdist and squareform, from the same SciPy module, as a replacement for cdist.
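
A sketch of what that pdist and squareform variant might look like, assuming the same df as in the question (the names dists and weighted are only illustrative). pdist computes each pair's distance once, in condensed form, and squareform expands that into the full symmetric matrix that cdist(a, a) would return:

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist, pdist, squareform

# Same data as in the question
df = pd.DataFrame({'x': [2, 1, 3], 'y': [2, 5, 6], 'weight': [11, 12, 13]})

a = df[['x', 'y']].values
w = df.weight.values

# Condensed pairwise distances (one entry per unordered pair),
# expanded into a full n x n symmetric matrix with a zero diagonal
dists = squareform(pdist(a))

# Per-node total distance to the other nodes, scaled by the node's weight
weighted = dists.sum(axis=1) * w
print(weighted)

# Should agree with the cdist-based result
print(np.allclose(weighted, cdist(a, a).sum(axis=1) * w))
```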

Verifying against the actual values -

In [76]: from scipy.spatial.distance import euclidean

In [77]: euclidean([2,2],[1,5])*11 + euclidean([2,2],[3,6])*11
Out[77]: 80.139216143646451

In [78]: euclidean([1,5],[2,2])*12 + euclidean([1,5],[3,6])*12
Out[78]: 64.78014765201803

In [80]: euclidean([3,6],[2,2])*13 + euclidean([3,6],[1,5])*13
Out[80]: 82.669256840526856