Output of the .fit_transform method

Question

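I would like a deeper understanding of what the .fit_transform() method of scikit-learn's PolynomialFeatures class is outputting.

As I understand it, the method does two things: 1) it generates a model of the data by fitting the data to a regression algorithm, and 2) it creates new data based on the model found in 1.

What I don't understand is the output. Here is my code: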

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Noisy samples of sin(x) + x/6 on [0, 10]
np.random.seed(0)
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10

X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)
# scikit-learn expects 2-D input: one row per sample, one column per feature.
# reshape(-1, 1) infers the row count (11 here) instead of hard-coding it.
X_train1 = X_train.reshape(-1, 1)
y_train1 = y_train.reshape(-1, 1)

def answer_one():
    poly1 = PolynomialFeatures(degree=1)

    # Expand the single input column into polynomial feature columns
    X_poly1 = poly1.fit_transform(X_train1)

    return X_poly1

answer_one()

The output I get is:

array([[  1.        ,  10.08877265],
       [  1.        ,   3.23065446],
       [  1.        ,   1.62431903],
       [  1.        ,   9.31004929],
       [  1.        ,   7.17166586],
       [  1.        ,   4.96972856],
       [  1.        ,   8.14799756],
       [  1.        ,   2.59103578],
       [  1.        ,   0.35281047],
       [  1.        ,   3.375973  ],
       [  1.        ,   8.72363612]])

I assume the second number in each inner array is the value computed by the model, but I don't understand what the 1 in each row is.

Answer 1

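From the PolynomialFeatures documentation:

Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].

In your case, the output is every combination of the column x with degree less than or equal to 1: [1, x]. The first column holds x**0 and the second column holds x**1.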
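To make the column ordering concrete, here is a small sketch. The sample values (a=2, b=3, and the two x values) are made up for illustration and are not taken from the question's data.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features, a=2 and b=3 (illustrative values).
X = np.array([[2.0, 3.0]])

poly2 = PolynomialFeatures(degree=2)
X2 = poly2.fit_transform(X)
# Columns come out in the order [1, a, b, a^2, a*b, b^2].
print(X2)  # [[1. 2. 3. 4. 6. 9.]]

# With a single column and degree=1 the result is just [x**0, x**1],
# which is the shape of output shown in the question.
x = np.array([[10.0], [3.0]])
X1 = PolynomialFeatures(degree=1).fit_transform(x)
print(X1)
```

The second print shows a column of ones next to the unchanged input values, matching the pattern in the question's array.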

Answer 2

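You have slightly misunderstood PolynomialFeatures. It does not fit a model at all; it only creates new features by multiplying the existing features together. If an input sample is two-dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].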
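One way to check that nothing is being regressed is to rebuild the same columns by hand and compare. This is only a sketch with made-up input values; note that no target y is passed anywhere.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Three samples of a single feature (illustrative values).
x = np.array([[0.5], [2.0], [4.0]])

poly = PolynomialFeatures(degree=3)
X_poly = poly.fit_transform(x)  # no y is involved in this call

# Build the same columns manually: x**0, x**1, x**2, x**3.
manual = np.hstack([x**0, x**1, x**2, x**3])

print(np.allclose(X_poly, manual))  # True
```

Since the output is just powers of the input, an actual model only appears once these features are passed to an estimator such as LinearRegression.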

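So what you see in your example is just the bias column and your input. If you set include_bias=False when constructing PolynomialFeatures, the column of ones disappears.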
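A quick sketch of the effect of include_bias, again with made-up input values:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Three samples of a single feature (illustrative values).
x = np.array([[10.0], [3.0], [1.5]])

with_bias = PolynomialFeatures(degree=1).fit_transform(x)
no_bias = PolynomialFeatures(degree=1, include_bias=False).fit_transform(x)

print(with_bias.shape)  # (3, 2): a column of ones plus x
print(no_bias.shape)    # (3, 1): only x remains
```

Dropping the bias column is usually harmless when the features are fed to LinearRegression, because that estimator fits its own intercept by default (fit_intercept=True).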

2 answers

Answer 1
2, accepted, 2018-06-08 14:56:58

Answer 2
2, 2018-06-08 14:59:02
