Error when trying to split the data in Jupyter Notebook
Code:
import glob
import pandas as pd
from sklearn.model_selection import train_test_split
files = glob.glob("filepath/*.csv")
df = [pd.read_csv(f, header=None, sep=";") for f in files]
data = pd.concat(df,ignore_index=True)
X_train, X_test, y_train, y_test = train_test_split(data, test_size=0.33, random_state=42)
Error: ValueError: not enough values to unpack (expected 4, got 2)
As per the scikit-learn documentation, the returned list has length 2 * len(arrays). Since you are only passing it a single "array", which is data, train_test_split splits your dataframe into two parts, X_train and X_test:
X_train, X_test = train_test_split(data, test_size=0.33, random_state=42)
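To illustrate the 2 * len(arrays) rule, here is a minimal sketch. The toy arrays features and target are made up for the demonstration and are not from the question:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with 3 feature columns, plus 1 target value each.
features = np.arange(30).reshape(10, 3)
target = np.arange(10)

# One input array -> two outputs (train part, test part).
one_input = train_test_split(features, test_size=0.33, random_state=42)

# Two input arrays -> four outputs, in the order
# features_train, features_test, target_train, target_test.
two_inputs = train_test_split(features, target, test_size=0.33, random_state=42)

n_outputs_one = len(one_input)
n_outputs_two = len(two_inputs)
print(n_outputs_one, n_outputs_two)
```

Unpacking the single-array result into four names is what raises the ValueError from the question.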
If your dataframe contains both the X and Y data, you can pass them as two separate arrays:
X_train, X_test, y_train, y_test = train_test_split(data['X'], data['Y'], test_size=0.33, random_state=42)
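Note that the files in the question are read with header=None, so pandas labels the columns 0, 1, 2, ... and there may be no columns literally named 'X' and 'Y'. A hedged sketch of one way to handle that, assuming (this is not stated in the question) that the last column holds the target and the rest are features:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the concatenated CSV data; with header=None,
# pandas labels the columns with integers instead of names.
data = pd.DataFrame([[i, i * 2, i % 2] for i in range(12)])

# Assumption: the last column is the target, all other columns are features.
X = data.iloc[:, :-1]
y = data.iloc[:, -1]

# Two input arrays, so four values come back and the unpacking succeeds.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
```

If the target lives in a different column, adjust the iloc selections accordingly.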