Convert pandas dataframe of lists into numpy array
I have the following dataframe:
import pandas as pd
import numpy as np
df = pd.DataFrame([{'a': [1,3,2]},{'a': [7,6,5]},{'a': [9,8,8]}])
df
df['a'].to_numpy()
=> array([list([1, 3, 2]), list([7, 6, 5]), list([9, 8, 8])], dtype=object)
How can I get a numpy array of shape (3,3) without writing a for loop?
First convert the column to a nested list, then convert that list to an array. This only works if all the lists have the same length:
arr = np.array(df.a.tolist())
print(arr)
[[1 3 2]
 [7 6 5]
 [9 8 8]]
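The equal-length requirement mentioned above can be checked before converting. This is a minimal sketch, assuming a hypothetical helper named `column_to_2d` (it is not part of pandas or NumPy), that fails early when the lists are ragged:

```python
import numpy as np
import pandas as pd


def column_to_2d(series: pd.Series) -> np.ndarray:
    """Convert a Series of equal-length lists to a 2-D array (hypothetical helper)."""
    rows = series.tolist()
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        # Ragged input cannot form a rectangular array, so fail early.
        raise ValueError(f"lists have differing lengths: {sorted(lengths)}")
    return np.array(rows)


df = pd.DataFrame({"a": [[1, 3, 2], [7, 6, 5], [9, 8, 8]]})
arr = column_to_2d(df["a"])
print(arr.shape)  # (3, 3)
```

The explicit check is a design choice for this sketch: it gives a clear error message instead of relying on however the installed NumPy version treats ragged input.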
If the lists always have the same length:
pd.DataFrame(df.a.tolist()).values
array([[1, 3, 2],
       [7, 6, 5],
       [9, 8, 8]])
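The same-length condition matters for this approach because of how the `DataFrame` constructor treats shorter rows. The following sketch, using a made-up ragged Series for illustration, shows that missing positions are filled with NaN, which also turns the result into a float array:

```python
import numpy as np
import pandas as pd

# Illustrative input: the second list is one element shorter.
ragged = pd.Series([[1, 3, 2], [7, 6]])

# The DataFrame constructor pads the short row with NaN.
padded = pd.DataFrame(ragged.tolist()).to_numpy()

print(padded.shape)            # (2, 3)
print(np.isnan(padded[1, 2]))  # True
```

So with unequal lengths this route does not raise an error; it silently changes the dtype and inserts NaN, which may or may not be what you want.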
All of these answers are focused on a single column rather than an entire Dataframe. If you have multiple columns, where every entry at index ij is a list, you can do this:
df = pd.DataFrame({"A": [[1, 2], [3, 4]], "B": [[5, 6], [7, 8]]})
print(df)
        A       B
0  [1, 2]  [5, 6]
1  [3, 4]  [7, 8]
# Note: on pandas >= 2.1, applymap is deprecated in favor of DataFrame.map.
arrays = df.applymap(lambda x: np.array(x, dtype=np.float32)).to_numpy()
# Stack the cells of each row, then stack the rows.
result = np.stack([np.stack(row) for row in arrays])
print(result, result.shape)
[[[1. 2.]
  [5. 6.]]

 [[3. 4.]
  [7. 8.]]] (2, 2, 2)
I cannot speak to the speed of this, as I use it on very small amounts of data.
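A possible alternative for the multi-column case, offered as a sketch and not as a benchmarked recommendation, is to go through plain nested lists, mirroring the single-column `tolist()` answer. It assumes every cell holds a list of the same length:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [[1, 2], [3, 4]], "B": [[5, 6], [7, 8]]})

# to_numpy() yields a 2-D object array of lists; tolist() turns it into
# nested Python lists that np.array can read as one rectangular block.
nested = df.to_numpy().tolist()
result = np.array(nested, dtype=np.float32)

print(result.shape)  # (2, 2, 2)
```

The axes are (row, column, list position), so `result[0, 1]` holds the list stored in row 0 of column B.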