Interpolating a series with float index

I have the following DataFrame:

   density  A2     B2
0       20   1  0.525
1       30   1  0.577
2       40   1  0.789
3       50   1  1.000
4       75   1  1.000
5      100   1  1.000

I am trying to interpolate the value of column result_column at the point value, using index_column as the index.

For example, value = 35, result_column = 'B2', index_column = 'density'.

result = pd.Series(df[result_column])
try:
    result.index = df[index_column].astype(float)
except ValueError:
    evaluation_error(_("cannot perform interpolation on non numeric index"))

Then I append a row whose index is value:

result = result.append(pd.Series(None, index=[value]))

and interpolate:

result = result.interpolate(method="values")
result = result.loc[value][:1,]

This fails with:

TypeError: "Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'"

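The error message itself is not cryptic. I am using pandas 0.12, and I know it has problems with float indexes.

With a little debugging I can also see that the index is created with dtype object rather than float, which is what prevents the interpolation.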

(Pdb) result.index
Index([20.0, 30.0, 40.0, 50.0, 75.0, 100.0, 0.8], dtype=object)

I have not managed to force the Series index to float, or to perform the interpolation on the original DataFrame.

I have also tried:

(Pdb) pd.Series(df[result_column], index=df[index_column])
(Pdb) pd.Series(df[result_column], index=df[index_column].astype(float))
(Pdb) pd.Series(df[result_column], index=pd.Series(df[index_column],dtype=float))

All of them return:

density
20 NaN
30 NaN
40 NaN
50 NaN
75 NaN
100 NaN
Name: A2, dtype: float64
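
A likely explanation for the all-NaN output above: when a Series is passed as data together with an explicit index, pandas aligns the data on its existing labels (0 to 5 here) instead of relabeling it, and none of the new labels match. The sketch below rebuilds the sample frame from the table in the question and contrasts that with two ways of relabeling. The variable names are illustrative, and the behavior shown is that of recent pandas versions, which is assumed to match 0.12 on this point.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "density": [20, 30, 40, 50, 75, 100],
    "A2": [1, 1, 1, 1, 1, 1],
    "B2": [0.525, 0.577, 0.789, 1.0, 1.0, 1.0],
})

# Series data + explicit index: pandas aligns on the old labels 0..5,
# finds none of the new labels 20..100, and fills every row with NaN.
aligned = pd.Series(df["B2"], index=df["density"])

# Passing a plain array drops the old labels, so the values survive.
relabeled = pd.Series(df["B2"].to_numpy(), index=df["density"].astype(float))

# set_index performs the same relabeling in one step.
via_set_index = df.set_index("density")["B2"]

print(aligned.isna().all())
print(relabeled.index.dtype)
```

Either of the last two forms gives a Series whose index holds the density values and whose data is intact.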

My question is: what is the best way to perform this interpolation?

Edit, following up on @tomaugspurger's answer:
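
One possible approach, which is not the one attempted in the question, is to skip the index manipulation entirely and call numpy.interp on the two columns. This is a minimal sketch: interpolate_column is a hypothetical helper name, the data is copied from the table above, and it assumes plain linear interpolation is what is wanted.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "density": [20, 30, 40, 50, 75, 100],
    "A2": [1, 1, 1, 1, 1, 1],
    "B2": [0.525, 0.577, 0.789, 1.0, 1.0, 1.0],
})

def interpolate_column(df, index_column, result_column, value):
    """Hypothetical helper: linear interpolation of one column against another."""
    x = df[index_column].to_numpy(dtype=float)
    y = df[result_column].to_numpy(dtype=float)
    # np.interp expects increasing x values, so sort both arrays together.
    order = np.argsort(x)
    return float(np.interp(float(value), x[order], y[order]))

result = interpolate_column(df, "density", "B2", 35)
print(result)
```

Note that numpy.interp clamps to the first or last value when value lies outside the range of the index column, which may or may not be the desired behavior.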

(Pdb) l
249 pdb.set_trace()
250 result = df.set_index(index_column)[result_column]
251 result = result.reindex(result.index + pd.Index([value]))
252
253 -> result = result.interpolate(method='values')[value][:1,]
254 return result
(Pdb) result
20 0.630
30 0.692
35 NaN
40 0.947
50 1.200
75 1.200
100 1.200
Name: B2, dtype: float64
(Pdb) result.index
Index([20, 30, 35, 40, 50, 75, 100], dtype=object)
(Pdb) result.interpolate(method='values')
*** TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

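I don't understand: when I run this code in IPython I get the expected result, but at runtime it always fails with this TypeError.

Edit 2: The index becomes object because value is of type Decimal. I am not quite sure why the value would affect the index, though... I will just do a conversion.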
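
A plausible reason the value affects the index: the appended (or reindexed) value becomes one of the index labels, so its type takes part in dtype inference, and a Decimal mixed with floats leaves pandas no numeric dtype to choose. The sketch below illustrates this and shows the conversion. It assumes a recent pandas version rather than 0.12, interpolate_at is a hypothetical helper name, and reindex plus Index.union stands in for Series.append, which newer releases no longer provide.

```python
from decimal import Decimal

import pandas as pd

# A Decimal label among floats forces the whole index to dtype object.
float_index = pd.Index([20.0, 30.0, 40.0])
mixed_index = pd.Index([20.0, 30.0, Decimal("35"), 40.0])
print(float_index.dtype, mixed_index.dtype)

def interpolate_at(series, value):
    """Hypothetical helper: interpolate `series` at index position `value`."""
    value = float(value)  # the conversion: Decimal -> float
    s = series.copy()
    s.index = s.index.astype(float)
    # Add the new label (sorted into place); its data starts out as NaN.
    s = s.reindex(s.index.union([value]))
    # method="values" interpolates against the numeric index values.
    return s.interpolate(method="values").loc[value]

# B2 values keyed by density, copied from the table in the question.
series = pd.Series(
    [0.525, 0.577, 0.789, 1.0, 1.0, 1.0],
    index=[20, 30, 40, 50, 75, 100],
    name="B2",
)
result = interpolate_at(series, Decimal("35"))
print(result)
```

Converting value once, before it touches the index, keeps the index numeric so the interpolation can proceed.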