Python: What is the difference between 'transform' and 'fit_transform' in sklearn?

In the sklearn Python toolbox, sklearn.decomposition.RandomizedPCA has two methods, transform and fit_transform. Their descriptions are as follows:

[Image: documentation descriptions of transform and fit_transform]

But what is the difference between them?

The difference is that .transform can only be used once the PCA has already been computed (fitted) on a matrix.

In [12]: pc2 = RandomizedPCA(n_components=3)

In [13]: pc2.transform(X) # can't transform because it does not know how to do it.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-13-e3b6b8ea2aff> in <module>()
----> 1 pc2.transform(X)

/usr/local/lib/python3.4/dist-packages/sklearn/decomposition/pca.py in transform(self, X, y)
    714         # XXX remove scipy.sparse support here in 0.16
    715         X = atleast2d_or_csr(X)
--> 716         if self.mean_ is not None:
    717             X = X - self.mean_
    718

AttributeError: 'RandomizedPCA' object has no attribute 'mean_'

In [14]: pc2.fit_transform(X)
Out[14]:
array([[-1.38340578, -0.2935787 ],
       [-2.22189802,  0.25133484],
       [-3.6053038 , -0.04224385],
       [ 1.38340578,  0.2935787 ],
       [ 2.22189802, -0.25133484],
       [ 3.6053038 ,  0.04224385]])
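
For readers on a recent scikit-learn, here is a minimal sketch of the same experiment. RandomizedPCA was deprecated in scikit-learn 0.18 and removed in 0.20, so this sketch assumes PCA with svd_solver="randomized" as its replacement. The toy matrix X is an illustrative assumption, not necessarily the X used in the session above. In current versions, calling transform on an unfitted estimator raises NotFittedError (a subclass of AttributeError) rather than the bare AttributeError shown in the traceback.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.exceptions import NotFittedError

# Hypothetical toy data: 6 samples, 2 features (an assumption for illustration).
X = np.array([[-1.0, -1.0], [-2.0, -1.0], [-3.0, -2.0],
              [1.0, 1.0], [2.0, 1.0], [3.0, 2.0]])

# Assumed modern stand-in for the removed RandomizedPCA class.
pca = PCA(n_components=2, svd_solver="randomized", random_state=0)

# transform before any fit: the estimator has no mean_ or components_ yet.
try:
    pca.transform(X)
    transform_worked_unfitted = True
except NotFittedError as exc:
    transform_worked_unfitted = False
    print("transform failed:", exc)

# fit_transform learns the projection from X and applies it in one call.
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)
```

After fit_transform has run, the same pca object can also be used with transform on other data, because the learned mean and components are now stored on it.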

If you want to use .transform, you first need to teach the PCA the transformation rule by fitting it:

In [20]: pca = RandomizedPCA(n_components=3)

In [21]: pca.fit(X)
Out[21]:
RandomizedPCA(copy=True, iterated_power=3, n_components=3, random_state=None,
whiten=False)

In [22]: pca.transform(z)
Out[22]:
array([[ 2.76681156,  0.58715739],
       [ 1.92831932,  1.13207093],
       [ 0.54491354,  0.83849224],
       [ 5.53362311,  1.17431479],
       [ 6.37211535,  0.62940125],
       [ 7.75552113,  0.92297994]])

Specifically, the PCA transform applies the change of basis obtained from the PCA decomposition of the matrix X to the matrix z.
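
The change of basis can be written out by hand to make this concrete. The sketch below assumes a current scikit-learn with the plain PCA class and whiten=False. The matrices X and Z are hypothetical toy data, not the ones from the session. It checks two things: that transform(Z) equals centering Z with the mean learned from X and projecting onto the learned components, and that fit_transform(X) gives the same result as fit(X) followed by transform(X).

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical toy data (assumptions for illustration only).
X = np.array([[-1.0, -1.0], [-2.0, -1.0], [-3.0, -2.0],
              [1.0, 1.0], [2.0, 1.0], [3.0, 2.0]])
Z = X + 4.0  # a second matrix that the PCA never saw during fit

# fit learns the mean and the principal axes from X only.
pca = PCA(n_components=2).fit(X)

# transform applies the basis learned from X to Z.
Z_proj = pca.transform(Z)

# The same projection by hand: center with X's mean, project on X's axes.
Z_manual = (Z - pca.mean_) @ pca.components_.T
matches_manual = np.allclose(Z_proj, Z_manual)

# fit_transform(X) agrees with fit(X).transform(X) up to floating-point error.
X_one_step = PCA(n_components=2).fit_transform(X)
X_two_step = PCA(n_components=2).fit(X).transform(X)
matches_two_step = np.allclose(X_one_step, X_two_step)

print(matches_manual, matches_two_step)
```

In practice this is why fit_transform is typically called on training data, while plain transform is called on new data that should be expressed in the basis learned from the training data.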