How to select rows in a DataFrame between two values, in Python Pandas?

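I am trying to modify the DataFrame df so that it only contains rows where the value in the column closing_price is between 99 and 101, and I tried to do this with the code below.

However, I get the error

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I would like to know whether there is a way to do this without using a loop.

df = df[(99 <= df['closing_price'] <= 101)]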

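Related discussion

You should use () to group your boolean vectors to remove the ambiguity, and combine them with & rather than a chained comparison.

df = df[(df['closing_price'] >= 99) & (df['closing_price'] <= 101)]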
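To see why the original line fails: Python expands `99 <= s <= 101` into `(99 <= s) and (s <= 101)`, and `and` needs a single True/False from the first Series, which pandas refuses to guess. Here is a minimal sketch; the sample prices are made up for illustration and are not data from the question.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
df = pd.DataFrame({"closing_price": [104, 99, 98, 95, 103, 101, 101, 99, 95, 96]})

# The chained comparison expands to `(99 <= s) and (s <= 101)`;
# `and` calls bool() on a Series, which raises ValueError.
try:
    df[99 <= df["closing_price"] <= 101]
    chained_failed = False
except ValueError:
    chained_failed = True

# Element-wise `&` combines the two boolean Series row by row.
mask = (df["closing_price"] >= 99) & (df["closing_price"] <= 101)
filtered = df[mask]
print(filtered)
```

Both bounds are inclusive here because the mask uses `>=` and `<=`.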

There is also a nicer alternative: use the query() method:

In [58]: df = pd.DataFrame({'closing_price': np.random.randint(95, 105, 10)})

In [59]: df
Out[59]:
closing_price
0 104
1 99
2 98
3 95
4 103
5 101
6 101
7 99
8 95
9 96

In [60]: df.query('99 <= closing_price <= 101')
Out[60]:
closing_price
1 99
5 101
6 101
7 99

UPDATE: answering the comment:

I like the syntax here but fell down when trying to combine with
expression; df.query('(mean + 2 *sd) <= closing_price <=(mean + 2 *sd)')

In [161]: qry ="(closing_price.mean() - 2*closing_price.std())" +\
...: " <= closing_price <=" + \
...: "(closing_price.mean() + 2*closing_price.std())"
...:

In [162]: df.query(qry)
Out[162]:
closing_price
0 97
1 101
2 97
3 95
4 100
5 99
6 100
7 101
8 99
9 95

Related discussion

newdf = df.query('closing_price.mean() <= closing_price <= closing_price.std()')

or

mean = df['closing_price'].mean()
std = df['closing_price'].std()

newdf = df.query('@mean <= closing_price <= @std')
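The `@` prefix makes query() look up a local Python variable instead of a column. As a rough sketch of that idea, the bounds can be computed up front and referenced by name. The variable names `lower` and `upper`, the two-standard-deviation band, and the sample prices are my own assumptions for illustration.

```python
import pandas as pd

# Hypothetical prices with one obvious outlier (130).
df = pd.DataFrame(
    {"closing_price": [97, 101, 97, 95, 100, 99, 100, 101, 99, 95, 130]}
)

center = df["closing_price"].mean()
spread = df["closing_price"].std()

# Keep rows within two standard deviations of the mean.
lower = center - 2 * spread
upper = center + 2 * spread

# `@lower` and `@upper` refer to the local variables defined above.
within = df.query("@lower <= closing_price <= @upper")
print(within)
```

Computing the bounds once outside the query string also keeps the expression short and easy to read.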

You can also use the .between() method

emp = pd.read_csv("C:\\py\\programs\\pandas_2\\pandas\\employees.csv")

emp[emp["Salary"].between(60000, 61000)]
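Since the CSV file is not available here, this is a small self-contained sketch of the same call. The `Name` column and the salary figures are invented stand-ins. With its default settings, `between()` includes both endpoints, so it should match the explicit `>=` / `<=` mask.

```python
import pandas as pd

# Hypothetical stand-in for the employees.csv data.
emp = pd.DataFrame(
    {
        "Name": ["Ann", "Bob", "Cy", "Dee"],
        "Salary": [59999, 60000, 60500, 61000],
    }
)

# between() is inclusive on both ends by default.
in_range = emp[emp["Salary"].between(60000, 61000)]

# Equivalent explicit boolean mask, for comparison.
same = emp[(emp["Salary"] >= 60000) & (emp["Salary"] <= 61000)]
print(in_range)
```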

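If you are dealing with multiple values and multiple inputs, you can also set up an apply function like this. In this case, it filters a DataFrame down to the rows whose GPS location falls within certain ranges.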

def filter_values(lat, lon):
    if abs(lat - 33.77) < .01 and abs(lon - -118.16) < .01:
        return True
    elif abs(lat - 37.79) < .01 and abs(lon - -122.39) < .01:
        return True
    else:
        return False

df = df[df.apply(lambda x: filter_values(x['lat'],x['lon']),axis=1)]
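A row-wise apply works, but the same filter can probably be written with vectorized masks, which tends to be faster on large frames. This is only a sketch: the helper `near`, its `tol` parameter, and the sample coordinates are hypothetical names and values of mine, not part of the answer above.

```python
import pandas as pd

# Hypothetical coordinates for illustration.
df = pd.DataFrame(
    {
        "lat": [33.771, 37.795, 40.000, 33.775],
        "lon": [-118.165, -122.391, -100.000, -122.390],
    }
)


def near(frame, lat, lon, tol=0.01):
    """Boolean Series: rows within `tol` degrees of (lat, lon) on both axes."""
    return ((frame["lat"] - lat).abs() < tol) & ((frame["lon"] - lon).abs() < tol)


# A row is kept if it is close to either reference point.
mask = near(df, 33.77, -118.16) | near(df, 37.79, -122.39)
nearby = df[mask]
print(nearby)
```

The last sample row matches the latitude of the first point and the longitude of the second, so it is dropped, just as it would be by the apply version.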

Instead of this

df = df[(99 <= df['closing_price'] <= 101)]

you should use this

df = df[(df['closing_price'] >= 99) & (df['closing_price'] <= 101)]

We have to use NumPy's bitwise logic operators |, &, ~, ^ for compound queries.
Also, the parentheses are important because of operator precedence.
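A short sketch of those operators and of the precedence pitfall, using made-up prices. In Python `&` binds more tightly than comparisons such as `>=`, so dropping the parentheses changes what the expression means.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"closing_price": [95, 99, 100, 101, 104]})
price = df["closing_price"]

inside = (price >= 99) & (price <= 101)      # AND: within the range
outside = ~inside                            # NOT: everything else
also_outside = (price < 99) | (price > 101)  # OR: same rows as ~inside

# Without parentheses this parses as `price >= (99 & price) <= 101`,
# a chained comparison, which raises the same ValueError as before.
try:
    price >= 99 & price <= 101
    unparenthesized_failed = False
except ValueError:
    unparenthesized_failed = True

print(df[inside])
print(df[outside])
```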

For more details, see the link: Comparisons, Masks, and Boolean Logic