How can I use 2 numpy arrays as a dataset for a denoising autoencoder, and further split them into train and test sets?
I have 2 numpy arrays, one with clean data (4000 arrays of shape 1000x25) and one with noisy data (the same size as the clean one), to be used for a denoising autoencoder problem.
I want to be able to either map them and then store them in a TensorFlow dataset, or use any other approach that lets me do this:
clean[i] -> denoising autoencoder -> noisy[i]
I also want to implement a train/test split in a way that preserves this mapping.
I'm sorry if this is too vague; I'm new to ML and Python.
Assume you have your clean data in an array clean_data and your noisy data in an array noisy_data. Then use train_test_split from sklearn to split the data into a training set and a test set, as follows:
from sklearn.model_selection import train_test_split

train_size = 0.7  # set this to the fraction you want for training
clean_train, clean_test, noisy_train, noisy_test = train_test_split(
    clean_data, noisy_data, train_size=train_size, random_state=123)
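To see that the mapping survives the split, here is a runnable sketch of the same idea. The array shapes are scaled down from 4000 x 1000 x 25 so it runs instantly, and the noisy array is simulated as clean data plus small Gaussian noise, purely for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for the real arrays (shapes scaled down for speed).
n_samples = 40
clean_data = np.random.rand(n_samples, 100, 25).astype("float32")
# Simulated noisy version: clean data plus small Gaussian noise.
noisy_data = clean_data + 0.05 * np.random.randn(n_samples, 100, 25).astype("float32")

# Passing both arrays to ONE train_test_split call shuffles them with
# the same permutation, so clean_train[i] still pairs with noisy_train[i].
clean_train, clean_test, noisy_train, noisy_test = train_test_split(
    clean_data, noisy_data, train_size=0.7, random_state=123)

print(clean_train.shape)  # (28, 100, 25)
print(clean_test.shape)   # (12, 100, 25)

# The pairing survives the split: the residual is just the small added
# noise, which would not hold if the rows had been shuffled independently.
assert np.abs(noisy_train - clean_train).max() < 0.5
```

If you then want a tf.data pipeline, `tf.data.Dataset.from_tensor_slices((noisy_train, clean_train))` keeps the pairing, because it slices both arrays along the first axis in lockstep, yielding (noisy, clean) pairs you can batch and feed to the autoencoder.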