
Convert pandas dataframe of lists into numpy array

I have the following dataframe:

import pandas as pd
import numpy as np

df = pd.DataFrame([{'a': [1,3,2]},{'a': [7,6,5]},{'a': [9,8,8]}])
df

df['a'].to_numpy()

=> array([list([1, 3, 2]), list([7, 6, 5]), list([9, 8, 8])], dtype=object)

How can I get a numpy array of shape (3,3) without writing a for loop?

First convert the column to a nested list, then pass that list to np.array. This only works if all the lists have the same length:

arr = np.array(df.a.tolist())
print (arr)
[[1 3 2]
 [7 6 5]
 [9 8 8]]
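If you are not sure the lists really are the same length, a small guard in front of the conversion can make the failure explicit. This is only a sketch; the length check and the error message are my own additions, not part of the answer above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([{'a': [1, 3, 2]}, {'a': [7, 6, 5]}, {'a': [9, 8, 8]}])

# Collect the distinct list lengths found in the column.
lengths = df['a'].map(len).unique()

if len(lengths) == 1:
    # Every row has the same length, so the nested list is rectangular.
    arr = np.array(df['a'].tolist())
else:
    raise ValueError(f"column 'a' holds lists of differing lengths: {lengths}")

print(arr.shape)  # (3, 3)
```

The check costs one extra pass over the column, which should be negligible next to the conversion itself.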

If the lists always have the same length, you can also go through an intermediate DataFrame:

pd.DataFrame(df.a.tolist()).values
array([[1, 3, 2],
       [7, 6, 5],
       [9, 8, 8]])
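One thing worth knowing about this route: if the lists are not all the same length, it does not fail. Shorter rows are padded with NaN and the result becomes float. A small sketch with a made-up ragged column (the name `ragged` is just for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical column whose lists have different lengths.
ragged = pd.DataFrame({'a': [[1, 3, 2], [7, 6], [9]]})

# Missing positions are filled with NaN, which forces a float dtype.
padded = pd.DataFrame(ragged['a'].tolist()).to_numpy()

print(padded.shape)  # (3, 3)
print(padded.dtype)  # float64
```

Whether silent padding is what you want depends on the data, so it may be safer to check the lengths first.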

All of these answers focus on a single column rather than an entire DataFrame. If you have multiple columns, where every entry at index (i, j) is a list, you can do this:

df = pd.DataFrame({"A": [[1, 2], [3, 4]], "B": [[5, 6], [7, 8]]})
print(df)

        A       B
0  [1, 2]  [5, 6]
1  [3, 4]  [7, 8]

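# Turn every cell from a list into a float32 array; the result is a 2D object array of arrays.
# (In pandas >= 2.1, DataFrame.map replaces the deprecated applymap.)
arrays = df.applymap(lambda x: np.array(x, dtype=np.float32)).to_numpy()

# Stack the arrays within each row, then stack the rows into one 3D array.
result = np.stack([np.stack(a) for a in arrays])
print(result, result.shape)

[[[1. 2.]
  [5. 6.]]

 [[3. 4.]
  [7. 8.]]] (2, 2, 2)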

I cannot speak to the speed of this, as I have only used it on very small amounts of data.
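A possibly simpler variant for the multi-column case, assuming every cell holds a list of the same length, is to convert the whole frame to nested lists and let np.array build the 3D array in one call. This is a sketch of an alternative, not a benchmarked replacement:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [[1, 2], [3, 4]], "B": [[5, 6], [7, 8]]})

# to_numpy() gives a 2D object array of lists; tolist() turns it into
# plain nested lists that np.array can read as rows x columns x list length.
result = np.array(df.to_numpy().tolist(), dtype=np.float32)

print(result.shape)  # (2, 2, 2)
```

Here result[i, j] is the list that was stored in row i, column j of the DataFrame.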
