Scaling the target variable is giving error in Python using StandardScaler of Sklearn library
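Scaling the target variable with the usual StandardScaler procedure raises an error. However, adding the line y = y.reshape(-1, 1) before calling fit_transform on the target makes the error go away.
X is the independent variable with a single feature, and y is the numeric target variable "Salary". I am trying to apply Support Vector Regression to this problem, which requires explicit feature scaling. I tried the following code: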
    from sklearn.preprocessing import StandardScaler

    sc_X = StandardScaler()
    sc_y = StandardScaler()

    X = sc_X.fit_transform(X)
    y = sc_y.fit_transform(y)
It gave me this error:
    ValueError: Expected 2D array, got 1D array instead
    Reshape your data either using array.reshape(-1, 1) if your data has a single feature
    or array.reshape(1, -1) if it contains a single sample.
After making the following change:
    X = sc_X.fit_transform(X)
    y = y.reshape(-1,1)
    y = sc_y.fit_transform(y)
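the standardization worked fine. I would like to understand why adding this reshape line makes the difference.
Thanks.
In short, yes, you need to reshape it. This is because, according to the sklearn documentation, the scaler expects a 2D array of shape [n_samples, n_features], so a single feature still has to be passed as a column: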
    In [1]: x_arr
    Out[1]: array([1, 2, 3, 4, 5])  # 1D: could be read as 1 sample of 5 features

    In [2]: x_arr.reshape(-1,1)
    Out[2]:
    array([[1],   # 1st sample
           [2],   # 2nd sample
           [3],   # 3rd sample
           [4],   # 4th sample
           [5]])  # 5th sample
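To see the difference between the two shapes directly, here is a minimal sketch. The salary values are made up for illustration, and the behavior shown assumes a reasonably recent scikit-learn version, where a 1D input is rejected with the ValueError quoted in the question.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative salary values (made up for this sketch)
y = np.array([45000.0, 50000.0, 60000.0, 80000.0, 110000.0])

scaler = StandardScaler()

# A 1D array of shape (5,) is rejected: the scaler cannot tell
# whether it holds 5 samples or 5 features.
try:
    scaler.fit_transform(y)
    failed = False
except ValueError:
    failed = True

# Reshaped to (5, 1): five samples with one feature each.
y_2d = y.reshape(-1, 1)
y_scaled = scaler.fit_transform(y_2d)

print(failed)          # True
print(y_scaled.shape)  # (5, 1)
```

After scaling, the single column has mean 0 and standard deviation 1, which is what SVR needs to see.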
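Anyway, regarding how to use StandardScaler:
First, you want to store the mean and standard deviation of the training data so they can be reused later when scaling the test data.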
    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()

    # Here the scaler learns the mean and std of the train data
    # (StandardScaler ignores any y argument, so only x_train is passed)
    x_train_scaled = scaler.fit_transform(x_train)

    # Reuse the fitted scaler to transform the test data
    # This ensures both the train and test data are on the same scale
    x_test_scaled = scaler.transform(x_test)
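As a rough illustration of what "storing the mean and standard deviation" means in practice, the sketch below uses small made-up arrays for x_train and x_test (they are assumptions, not data from the question). It shows that the fitted scaler keeps the training statistics in its mean_ and scale_ attributes and that transform simply reapplies them.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up single-feature data, already 2D: shape (n_samples, 1)
x_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
x_test = np.array([[6.0], [7.0]])

scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)  # learns mean and std from train only
x_test_scaled = scaler.transform(x_test)        # reuses the stored train statistics

# The learned statistics are stored on the fitted scaler
print(scaler.mean_)   # per-feature mean of the training data
print(scaler.scale_)  # per-feature standard deviation of the training data

# Same result as applying the training statistics by hand
manual = (x_test - scaler.mean_) / scaler.scale_
print(np.allclose(manual, x_test_scaled))  # True

# inverse_transform maps scaled values back to the original units
x_test_back = scaler.inverse_transform(x_test_scaled)
```

The same inverse_transform call on a scaler fitted to the target would presumably be how predictions made on a scaled y get converted back to salaries.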
Hope this helps!
This is a useful thing to know in sklearn.
From the scaler's documentation:
    Perform standardization by centering and scaling
    Parameters: X : array-like, shape [n_samples, n_features]
        The data used to scale along the features axis.
So the last dimension must be set to 1 explicitly; it cannot be missing. Reshaping the data does exactly that:
    import numpy as np

    a = np.array([0,0,0])
    print(a)        # [0 0 0]
    print(a.shape)  # (3,)
    b = a.reshape(-1,1)
    print(b)        # [[0] [0] [0]]
    print(b.shape)  # (3, 1)
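The reshape method changes the shape of an array: for example, if a is an array with 6 elements (and any shape), then a.reshape(3, 2) turns it into a 2D array with 3 rows and 2 columns.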
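A short sketch of reshape on a 6-element array may help here. It also shows the -1 placeholder used throughout this thread, which asks NumPy to infer that dimension from the number of elements. The variable names are only illustrative.

```python
import numpy as np

a = np.arange(6)          # 6 elements, shape (6,)

col = a.reshape(-1, 1)    # -1 is inferred as 6 -> shape (6, 1)
row = a.reshape(1, -1)    # -1 is inferred as 6 -> shape (1, 6)
grid = a.reshape(3, -1)   # -1 is inferred as 2 -> shape (3, 2)

print(col.shape, row.shape, grid.shape)  # (6, 1) (1, 6) (3, 2)

# ravel() flattens a column back to 1D
flat = col.ravel()
print(flat.shape)  # (6,)
```

For a single-feature target like Salary, the (6, 1) form is the one StandardScaler expects, and ravel() gives back the 1D form afterwards if it is needed.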