
How can I find X_train indexes in the main dataset?

We can split a dataset into training and test sets (X_train, X_test, y_train, y_test) with scikit-learn's train_test_split function in Python.

X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True, test_size=0.3)

My question is: how can we find the indexes of X_train or y_train in the original dataset?

Suppose we obtain the predictions with:

prediction = model.predict(X_test)

Also, how can we find the indexes for prediction?

I am asking because I would like to inspect the values of each row when I get inaccurate results.

In other words, data is the main dataset and subset is a subset of data:

data = array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
subset = array([2, 4, 5, 6])

How can I find the indexes of subset's elements in data?
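
For the toy arrays in this question, one possible way to get those positions is a membership mask. This is only a sketch: it assumes a 1-D array whose values are unique, and the names mask and subset_idx are made up for illustration.

```python
import numpy as np

data = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
subset = np.array([2, 4, 5, 6])

# Boolean mask over data: True where the value also occurs in subset.
mask = np.isin(data, subset)

# Positions in data of the values that belong to subset.
subset_idx = np.flatnonzero(mask)
print(subset_idx)  # [2 4 5 6]
```

With duplicated values or multi-column rows this value-based lookup becomes ambiguous, so it is usually safer to record the row positions at the moment the data is split.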

As documented for sklearn.model_selection.train_test_split, it is a quick utility that wraps sklearn.model_selection.ShuffleSplit:

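import numpy as np
from sklearn.model_selection import ShuffleSplit, train_test_split

# Assumed toy data (not given in the original post); its rows match the output below.
X, y = np.arange(10).reshape(5, 2), np.arange(5)

# test_size=1 (an int) holds out exactly one sample.
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=1, test_size=1)
x_train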
array([[2, 3],
       [8, 9],
       [0, 1],
       [6, 7]])

These rows come from the index sets that ShuffleSplit yields:

train_ind, test_ind = next(ShuffleSplit(random_state=1).split(X, y))
X[train_ind]
array([[2, 3],
       [8, 9],
       [0, 1],
       [6, 7]])
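
The equivalence can be checked directly. The sketch below assumes the toy X and y are np.arange(10).reshape(5, 2) and np.arange(5) (the post does not define them) and passes test_size=1 to both calls so the split sizes match explicitly.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit, train_test_split

# Assumed toy data: 5 rows, 2 columns.
X = np.arange(10).reshape(5, 2)
y = np.arange(5)

# Direct split: returns the rows, but not their positions.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, random_state=1, test_size=1
)

# Index split: same seed and same test size, but returns positions.
splitter = ShuffleSplit(test_size=1, random_state=1)
train_ind, test_ind = next(splitter.split(X, y))

# Indexing X with the positions should reproduce the direct split.
same_train = np.array_equal(X[train_ind], x_train)
same_test = np.array_equal(X[test_ind], x_test)
print(same_train, same_test)
```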

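So you can use the train_ind and/or test_ind produced by ShuffleSplit, and the result will be the same as using train_test_split, provided both use the same random_state and the same split sizes (here ShuffleSplit's default test_size of 0.1 also rounds up to one sample out of five).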
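
Another common option, shown here only as a sketch, is to pass an array of row positions through train_test_split together with X and y, so each prediction can be traced back to its original row. The toy data, the KNeighborsClassifier model, and the names indices, idx_train, idx_test, and wrong_rows are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: 20 rows, 2 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = (X[:, 0] > 0).astype(int)

# Row positions in the original dataset, split alongside X and y.
indices = np.arange(len(X))

X_train, X_test, y_train, y_test, idx_train, idx_test = train_test_split(
    X, y, indices, shuffle=True, test_size=0.3, random_state=0
)

model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
prediction = model.predict(X_test)

# prediction[i] belongs to original row idx_test[i], so the mask below
# selects the original positions of the misclassified test rows.
wrong_rows = idx_test[prediction != y_test]
print(X[wrong_rows], y[wrong_rows])
```

If X is a pandas DataFrame, train_test_split keeps the original index on X_train and X_test, so X_test.index already gives the row labels without an extra array.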


