
What is the difference between a Tensor and a DataSet in TensorFlow 2.0?

I've been having trouble understanding the difference between a Tensor and a DataSet. From the definitions I managed to find in the documentation:

Tensors :

Tensors are multi-dimensional arrays with a uniform type

Datasets :

Represents a potentially large set of elements.

Is a DataSet a special type of big tensor? My understanding is that Tensors are basically pandas DataFrames: n-dimensional structures with a uniform type along each "column" (or do Tensors need to be entirely uniform?). If that's the case, then why do DataSets exist? Do they have additional functionality that Tensors do not?

I'm somewhat lost in the documentation on the different tensor types, and with the transition to TensorFlow 2.0 I'm never sure whether the information I find online is deprecated. Any help would be appreciated.

You should read more of the documentation. The links are in your question, but for clarity, here they are again: one for Tensors and one for Datasets.

Tensors are n-dimensional, meaning they can have an arbitrary number of dimensions. They are not restricted to a rectangular (2-dimensional) layout the way a pandas DataFrame is, though every element of a single Tensor does share one dtype.
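To make that concrete, here is a minimal sketch showing Tensors of several ranks; the variable names are just for illustration:

```python
import tensorflow as tf

# A scalar (rank 0), a vector (rank 1), and a rank-3 tensor --
# each has a single uniform dtype, but the number of dimensions
# is arbitrary, unlike a 2-D DataFrame.
scalar = tf.constant(3.0)
vector = tf.constant([1.0, 2.0, 3.0])
cube = tf.zeros([2, 3, 4])

print(scalar.shape)  # empty shape: rank 0
print(vector.shape)
print(cube.shape)
```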

Datasets are an API for inputting data. If your data are in Dataset form, you can use the various processing methods that the tf.data module provides, such as parallel processing, shuffling, batching, and others. You can also pair your features and labels in a single Dataset, as opposed to carrying around two separate Tensor objects.

Once you iterate over a Dataset, you get Tensor objects.
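For example (a minimal sketch, using a throwaway Dataset of three numbers):

```python
import tensorflow as tf

ds = tf.data.Dataset.from_tensor_slices([10, 20, 30])

# Iterating over the Dataset yields ordinary Tensor objects.
element = next(iter(ds))
print(isinstance(element, tf.Tensor))  # True
print(element.numpy())                 # 10
```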

Perhaps the distinction would also be clearer if you took a look at TensorFlow Datasets. That project distributes data in Dataset form.
