Error when trying to split data in Jupyter notebook

Question

Code:

import glob
import pandas as pd
from sklearn.model_selection import train_test_split 

files = glob.glob("filepath/*.csv",)
df = [pd.read_csv(f, header=None, sep=";") for f in files]

data = pd.concat(df,ignore_index=True)

X_train, X_test, y_train, y_test = train_test_split(data, test_size=0.33, random_state=42)

Error: ValueError: not enough values to unpack (expected 4, got 2)

Answer 1

According to the scikit-learn documentation, train_test_split returns a list of length 2 * len(arrays). Since you pass it only a single "array", data, it splits your DataFrame into just two parts, so only two values can be unpacked:

X_train, X_test = train_test_split(data, test_size=0.33, random_state=42)
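
To see the 2 * len(arrays) rule in action, here is a minimal sketch on a small made-up DataFrame. The column names a, b, and label are hypothetical and only stand in for the concatenated CSV data from the question.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the concatenated CSV files.
data = pd.DataFrame({"a": range(10), "b": range(10, 20), "label": [0, 1] * 5})

# One input array: the result holds two pieces (train part, test part).
one_input = train_test_split(data, test_size=0.33, random_state=42)

# Two input arrays: the result holds four pieces
# (train/test for the first array, then train/test for the second).
two_inputs = train_test_split(
    data[["a", "b"]], data["label"], test_size=0.33, random_state=42
)

print(len(one_input), len(two_inputs))
```

Unpacking one_input into four names would raise the same ValueError as in the question, because it contains only two elements.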

If your DataFrame contains both the X and Y data, pass them as two separate arrays (here 'X' and 'Y' stand for your own feature and target columns):

X_train, X_test, y_train, y_test = train_test_split(data['X'], data['Y'], test_size=0.33, random_state=42)
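
One caveat: the question reads the files with header=None, so pandas labels the columns with the integers 0, 1, 2, ... and data['X'] would not exist as written. The following is a rough sketch of how the split could look in that case, assuming (this is not stated in the question) that the last column is the target and all others are features. The inline CSV text is made up for illustration.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical semicolon-separated data without a header row.
csv_text = "\n".join(f"{i};{i * 2};{i % 2}" for i in range(12))
data = pd.read_csv(io.StringIO(csv_text), header=None, sep=";")

# Assumption: the last column is the target, the rest are features.
X = data.iloc[:, :-1]
y = data.iloc[:, -1]

# Two input arrays, so four values come back and the unpacking succeeds.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

print(X_train.shape, X_test.shape)
```

Selecting by position with iloc avoids depending on column names; if your files do have a header row, dropping header=None and selecting columns by name works as well.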

Answer 1, 2020-10-12 15:01:20
