Multidimensional NumPy array to DataFrame, error: ValueError: Data must be 1-dimensional
I want to train a neural network, and I have both the labels (one-hot encoded) and the images as NumPy arrays. I want to put them in a DataFrame to use as input for the training. I tried to recreate the problem in a small example, which looks something like this:
import pandas as pd
import numpy as np
label_onehot_example = np.asarray([[0, 0, 0, 0, 1], [0, 0, 0, 1, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 0.], [1, 0, 0, 0, 0]])
images_example = np.random.randint(0, 1, (8, 10, 10, 3))
test_df = pd.DataFrame(data={'images': images_example, 'labels' : label_onehot_example})
The error I get is "raise ValueError("Data must be 1-dimensional") ValueError: Data must be 1-dimensional".
I guess it is due to the shape of the image input (in my example that's (8, 10, 10, 3)), but I don't know how to fix it. I thought of looping through the image array and adding the images and labels to the DataFrame one by one, but that seems very inefficient.
In this case the values in your dictionary should be lists, since (I think) pandas expects some kind of 1-dimensional iterable for your column values. You can use the normal list() constructor to change the BWHC NumPy array into a list of B elements, each of shape WHC. The same goes for the labels.
import pandas as pd
import numpy as np
label_onehot_example = np.asarray([[0, 0, 0, 0, 1], [0, 0, 0, 1, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 0.], [1, 0, 0, 0, 0]])
images_example = np.random.randint(0, 1, (8, 10, 10, 3))
test_df = pd.DataFrame(data={'images': list(images_example), 'labels' : list(label_onehot_example)})
print(test_df.head())
>>> images labels
0 [[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], ... [0.0, 0.0, 0.0, 0.0, 1.0]
1 [[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], ... [0.0, 0.0, 0.0, 1.0, 0.0]
2 [[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], ... [1.0, 0.0, 0.0, 0.0, 0.0]
3 [[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], ... [1.0, 0.0, 0.0, 0.0, 0.0]
4 [[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], ... [1.0, 0.0, 0.0, 0.0, 0.0]
print(test_df.images[0].shape)
>>> (10, 10, 3)
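A related point for the training step: because each cell now holds a whole array, both columns get dtype object, so the data usually has to be turned back into one batch array before it goes into a network. A minimal sketch of that round trip, assuming the same shapes as above (the names images_batch and labels_batch are made up for illustration, and np.zeros / np.eye stand in for real images and labels):

```python
import numpy as np
import pandas as pd

# Stand-in data with the same shapes as in the question.
labels = np.eye(5)[[4, 3, 0, 0, 0, 1, 0, 0]]       # 8 one-hot rows, shape (8, 5)
images = np.zeros((8, 10, 10, 3), dtype=np.int64)  # shape (8, 10, 10, 3)

# list() splits each array along its first axis, giving one array per row.
df = pd.DataFrame({'images': list(images), 'labels': list(labels)})

# Each cell is an ndarray, so pandas stores both columns as dtype object.
print(df.dtypes)

# Rebuild contiguous batch arrays before passing them to a training loop.
images_batch = np.stack(df['images'].tolist())
labels_batch = np.stack(df['labels'].tolist())
print(images_batch.shape, labels_batch.shape)
```

If the DataFrame is only used as a container and nothing is filtered or shuffled through it, keeping the two NumPy arrays side by side may be simpler than storing them in object columns.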
PS: Which version of pandas did you use? When I first ran your code I got a different error than the one you reported (mine was "ValueError: If using all scalar values, you must pass an index"). I used pandas 1.1.3.