
How can I use 2 numpy arrays as a dataset for a denoising autoencoder, and further split them into train and test sets?

I have 2 numpy arrays, one with clean data (4000 arrays of shape 1000x25) and one with noisy data (same size as the clean one), to be used for a denoising autoencoder problem.

I want to be able to map them and then store them in a TensorFlow dataset, or use any other approach that allows me to do this:

clean[i] -> Denoising Autoencoder -> noisy[i]

I also want to implement a train and test split in a way that the mapping between the two arrays remains intact.

I'm sorry if this is too vague; I'm new to ML and Python.

Assume you have your clean data in an array clean_data and your noisy data in an array noisy_data. Then use train_test_split from sklearn to split the data into a training set and a test set as follows:

from sklearn.model_selection import train_test_split

train_size = .7  # fraction of the data to use for training
# Passing both arrays in a single call keeps clean[i] and noisy[i] paired
# after the split, because the same shuffled indices are applied to both.
clean_train, clean_test, noisy_train, noisy_test = train_test_split(
    clean_data, noisy_data, train_size=train_size, random_state=123)
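
If you then want to feed the pairs to a TensorFlow/Keras model through tf.data, a minimal sketch could look like the following. It assumes the noisy array is the model input and the clean array is the target (the usual setup for a denoising autoencoder), and `autoencoder` stands in for whatever Keras model you build:

import tensorflow as tf

# Each dataset element is an (input, target) pair, so the
# noisy[i] <-> clean[i] mapping is preserved automatically.
train_ds = tf.data.Dataset.from_tensor_slices((noisy_train, clean_train))
test_ds = tf.data.Dataset.from_tensor_slices((noisy_test, clean_test))

# Shuffle only the training set; batch both before passing them to fit().
train_ds = train_ds.shuffle(buffer_size=len(noisy_train)).batch(32)
test_ds = test_ds.batch(32)

# autoencoder.fit(train_ds, validation_data=test_ds, epochs=10)

Because the two arrays are zipped into one dataset, any later shuffling or batching is applied to the pairs as a whole, so the mapping you asked about survives both the split and the input pipeline.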

