Random shuffling of a DataFrame in pandas (Python) gives different regression results?

Question

I am trying to randomise the rows of my DataFrame, data, before applying linear regression, but I noticed that the regression results differ after the rows are randomised, which shouldn't be the case. Code I have tried:

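Without row randomisation:
from sklearn.linear_model import LinearRegression

data  # the original DataFrame, rows in their original order
X = data[feature_col]     # feature columns
y = data['median_price']  # target column
lr = LinearRegression()
lr.fit(X, y)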

With row randomisation: 
Method 1: 
data = data.sample(frac=1)

Method 2:
data = data.sample(frac=1, axis=1)

Method 3: 
from sklearn.utils import shuffle
data = shuffle(data)

Method 4: 
data = data.sample(frac=1, axis=1).reset_index(drop=True)

Of the four row randomisation methods I have tried, only Method 4 gives the same results as the version with no randomisation. I thought row randomisation would not affect the regression results in any case?

Answer 1

Methods 2 and 4 are identical?

Regression results should not differ if you apply the same type of regression to the same data (randomised or not). Use axis=0 to randomise the rows of a DataFrame; axis=1 randomises the columns.
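
The point about axis can be checked with a small experiment. The following is only a sketch on synthetic data: the feature names rooms and area, the coefficients used to generate median_price, and the helper fit_coefs are all made up for illustration and are not from the question. It fits the same model on the original frame, on a row-shuffled copy (axis=0) and on a column-shuffled copy (axis=1), always taking X and y from the same frame so each row keeps its own target.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the asker's data; feature names are hypothetical.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "rooms": rng.normal(size=200),
    "area": rng.normal(size=200),
})
data["median_price"] = (
    3.0 * data["rooms"] - 2.0 * data["area"] + rng.normal(scale=0.1, size=200)
)
feature_col = ["rooms", "area"]


def fit_coefs(frame):
    """Fit on X and y taken from the same frame, so rows stay paired."""
    lr = LinearRegression()
    lr.fit(frame[feature_col], frame["median_price"])
    return np.append(lr.coef_, lr.intercept_)


baseline = fit_coefs(data)

rows_shuffled = data.sample(frac=1, axis=0, random_state=1)  # permutes rows
cols_shuffled = data.sample(frac=1, axis=1, random_state=1)  # permutes columns only

# axis=1 leaves the row order untouched.
rows_unchanged_by_axis1 = cols_shuffled.index.equals(data.index)
# axis=0 keeps the same set of rows, just in a different order.
same_rows_after_axis0 = sorted(rows_shuffled.index) == list(data.index)

# Coefficients agree up to floating-point tolerance in both cases.
rows_match = np.allclose(fit_coefs(rows_shuffled), baseline)
cols_match = np.allclose(fit_coefs(cols_shuffled), baseline)
```

On well-conditioned data like this, a row shuffle should change the coefficients by no more than rounding error. A difference much larger than that would suggest something other than the shuffle itself, for example X and y being taken from different versions of the frame, which is probably worth checking first.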

Answer 1 (3, accepted), 2018-07-01 05:25:58
