
Add Gaussian noise to a TensorFlow Dataset

I have a CSVDataset with around 6 million rows. For the purposes of this question, I am building a TensorSliceDataset as follows:

import tensorflow as tf
import numpy as np

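# 10 rows of 5 features, plus a dummy label column
datasetz = tf.data.Dataset.from_tensor_slices((np.random.randn(10, 5), np.random.randn(10, 1)))
# Autoencoder setup: the features serve as both input and target
datasetz = datasetz.map(lambda x, y: (x, x))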
datasetz

# <MapDataset element_spec=(TensorSpec(shape=(5,), dtype=tf.float64, name=None), TensorSpec(shape=(5,), dtype=tf.float64, name=None))>

I am trying to build a denoising autoencoder, so I need to add some noise to my dataset. If the dataset were a numpy.ndarray, I could have added the noise like this:

corruption_level = 0.3
datasetz = datasetz + (np.random.randn(10, 5) * corruption_level)

But I don't know how to do it with a CSVDataset object.

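This adds Gaussian noise to each row:

corruption_level = 0.3
datasetz = tf.data.Dataset.from_tensor_slices((np.random.randn(10, 5), np.random.randn(10, 1)))
# tf.random.normal draws zero-mean Gaussian noise; tf.random.uniform would add a biased offset in [0, 1) instead
datasetz = datasetz.map(lambda x, y: (x + corruption_level * tf.random.normal(shape=(5,), dtype=tf.float64), y))
datasetz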
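
For the denoising autoencoder described in the question, the noisy row would typically be the input and the clean row the reconstruction target. The following is a minimal sketch of that pairing, not a definitive implementation. It assumes zero-mean Gaussian noise drawn with `tf.random.normal`, with `corruption_level` used as the standard deviation; `add_gaussian_noise`, `features`, `labels`, and `noisy_ds` are hypothetical names introduced here for illustration.

```python
import numpy as np
import tensorflow as tf

corruption_level = 0.3  # assumed: standard deviation of the added noise

def add_gaussian_noise(x, y):
    # Hypothetical helper: draw noise matching the shape and dtype of x,
    # so the row width does not have to be hard-coded.
    noise = tf.random.normal(tf.shape(x), stddev=corruption_level, dtype=x.dtype)
    # Noisy features become the input; the clean features are the target.
    # The original label y is dropped, as in the question's (x, x) mapping.
    return x + noise, x

features = np.random.randn(10, 5)
labels = np.random.randn(10, 1)

noisy_ds = tf.data.Dataset.from_tensor_slices((features, labels)).map(add_gaussian_noise)
```

Because the mapped function runs again on every pass over the dataset, each epoch should see freshly sampled noise unless the dataset is cached after the `map` call. The same `map` call should carry over to other `tf.data.Dataset` objects, provided the function's arguments match the element structure of that dataset.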


 