Create a pandas DataFrame from a 2D NumPy array (stored as a column of 1D NumPy arrays) and a 1D NumPy array of labels

Question

For example, I have these NumPy arrays:

import pandas as pd
import numpy as np

# Points could be n-dimensional; I need a solution that covers that case
# and still lets me calculate distances between points, so flattening the data
# is not my goal.
points = np.array([[1, 2], [2, 1], [100, 100], [-2, -1], [0, 0], [-1, -2]])  # a 2d numpy array containing points in space
labels = np.array([0, 1, 1, 1, 0, 0])  # the labels of the points (not necessarily only 0 and 1)

I tried building a dictionary and creating a pandas DataFrame from it:

my_dict = {'point': points, 'label': labels}
df = pd.DataFrame(my_dict, columns=['point', 'label'])

But it did not work, and I get the following exception:

Exception: Data must be 1-dimensional

This is probably because the points array is a 2D NumPy array.

Desired result:

        point  label
0      [1, 2]      0
1      [2, 1]      1
2  [100, 100]      1
3    [-2, -1]      1
4      [0, 0]      0
5    [-1, -2]      0

Thanks in advance to everyone who helps :)

Answer 1

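You should always try to normalize your data so that each column holds only single (scalar) values, rather than values that have dimensions of their own.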

In this case, I would do something like this:

>>> df = pd.DataFrame({'x': points[:,0], 'y': points[:, 1], 'label': labels},
                      columns=['x', 'y', 'label'])
>>> df
     x    y  label
0    1    2      0
1    2    1      1
2  100  100      1
3   -2   -1      1
4    0    0      0
5   -1   -2      0
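
Since the question says the points may be n-dimensional, the same idea can be generalized by creating one column per coordinate. This is only a minimal sketch: the column names `x0`, `x1`, ... are hypothetical and chosen here for illustration. It also shows that the coordinates can be pulled back out as a 2D array for distance calculations.

```python
import numpy as np
import pandas as pd

points = np.array([[1, 2], [2, 1], [100, 100], [-2, -1], [0, 0], [-1, -2]])
labels = np.array([0, 1, 1, 1, 0, 0])

# One column per coordinate; the names x0, x1, ... are illustrative only.
coord_cols = [f"x{i}" for i in range(points.shape[1])]
df = pd.DataFrame(points, columns=coord_cols)
df["label"] = labels

# Recover the coordinates as a 2D array and compute the Euclidean
# distance from every point to the first one.
coords = df[coord_cols].to_numpy()
dist_to_first = np.linalg.norm(coords - coords[0], axis=1)
```

Because every cell is a scalar, the usual pandas operations (filtering, grouping by `label`, and so on) work on the coordinate columns directly.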

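If you really insist on keeping each point in a single cell, convert the points to a list of lists or a list of tuples before passing them to pandas to avoid this error.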
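
The list-of-lists variant mentioned above might look like the following sketch. It assumes `points.tolist()` is an acceptable way to do the conversion; the resulting `point` column has object dtype, so the points are turned back into a 2D array before any numeric work.

```python
import numpy as np
import pandas as pd

points = np.array([[1, 2], [2, 1], [100, 100], [-2, -1], [0, 0], [-1, -2]])
labels = np.array([0, 1, 1, 1, 0, 0])

# tolist() turns the 2D array into a list of lists, which pandas
# accepts as a 1-dimensional sequence of Python objects.
df = pd.DataFrame({"point": points.tolist(), "label": labels},
                  columns=["point", "label"])

# Rebuild a 2D array from the column when distances are needed.
restored = np.array(df["point"].tolist())
dist_0_1 = np.linalg.norm(restored[0] - restored[1])
```

The trade-off is that vectorized pandas operations no longer apply to the `point` column, which is why the one-column-per-coordinate layout is usually preferable.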

Solution 1
1 Accepted 2020-11-18 20:01:00
