Convert a Pandas Series of 2D numpy arrays to a Pandas DataFrame with columns of 1D numpy arrays

Question

First post to Stack Overflow. I have searched and cannot find an answer to this.

I have a Pandas Series of 2D numpy arrays:

import numpy as np
import pandas as pd

x1 = np.array([[0,1],[2,3],[3,4]],dtype=np.uint8)
x2 = np.array([[5,6],[7,8],[9,10]],dtype=np.uint8)

S = pd.Series(data=[x1,x2],index=['a','b'])

The output S should look like:

a    [[0, 1], [2, 3], [3, 4]]
b    [[5, 6], [7, 8], [9, 10]]

I want to transform it into a Pandas DataFrame D in which each column of a 2D numpy array in S becomes a 1D numpy array stored in the corresponding column of D.

D should look like:

     0        1
a    [0,2,3]  [1,3,4]
b    [5,7,9]  [6,8,10]

Note: my actual data set is 1,238,500 arrays of shape (32, 8), so I was trying to avoid iterating over rows.

What is an efficient way to do this?

Answer 1

One solution uses np.stack and map. The map object is wrapped in list() because np.stack expects a sequence of arrays, not an iterator:

df = pd.DataFrame(np.stack(list(map(np.transpose, S))).tolist(), index=S.index)

print (df)

           0           1
a  [0, 2, 3]   [1, 3, 4]
b  [5, 7, 9]  [6, 8, 10]
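
The same idea can be written without map by stacking first and swapping the last two axes in a single call. This is a minimal sketch using the sample data from the question; the names `stacked` and `swapped` are illustrative only. As in the answer above, `.tolist()` means each cell ends up as a Python list rather than a numpy array.

```python
import numpy as np
import pandas as pd

x1 = np.array([[0, 1], [2, 3], [3, 4]], dtype=np.uint8)
x2 = np.array([[5, 6], [7, 8], [9, 10]], dtype=np.uint8)
S = pd.Series(data=[x1, x2], index=['a', 'b'])

# Stack into one (n_items, n_rows, n_cols) array, then swap the last two axes
stacked = np.stack(list(S))            # shape (2, 3, 2)
swapped = stacked.transpose(0, 2, 1)   # shape (2, 2, 3)

# .tolist() converts the innermost axis to Python lists, one list per cell
df = pd.DataFrame(swapped.tolist(), index=S.index)
print(df)
```

This assumes every array in S has the same shape, which np.stack requires.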

Answer 2

You can split and squeeze without ever converting the last dimension to a Python list.

df = S.apply(np.split, args=[2, 1]).apply(pd.Series).applymap(np.squeeze)

#            0           1
# a  [0, 2, 3]   [1, 3, 4]
# b  [5, 7, 9]  [6, 8, 10]

In args=[2, 1], 2 is the number of equal sections to split into (here, one per column) and 1 is the axis to split along.

Types:
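
To see what those two arguments do on a single array, here is a small sketch using `x1` from the question. The names `pieces` and `columns` are illustrative only.

```python
import numpy as np

x1 = np.array([[0, 1], [2, 3], [3, 4]], dtype=np.uint8)

# Split into 2 equal sections along axis 1: each piece has shape (3, 1)
pieces = np.split(x1, 2, axis=1)

# np.squeeze drops the length-1 axis, leaving 1D arrays of shape (3,)
columns = [np.squeeze(p) for p in pieces]

print(pieces[0].shape)   # (3, 1)
print(columns[0])        # [0 2 3]
print(columns[1])        # [1 3 4]
```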

In [280]: df.applymap(type)
Out[280]: 
                         0                        1
a  <class 'numpy.ndarray'>  <class 'numpy.ndarray'>
b  <class 'numpy.ndarray'>  <class 'numpy.ndarray'>
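
A possible alternative that also keeps numpy arrays in the cells is to stack once and then build each DataFrame column from a slice of the stacked array. This is a sketch under the assumption that all arrays in S share one shape. It is not taken from the answer above, and `df_arrays` is an illustrative name. It still creates one Python object per cell, so it is not fully vectorized.

```python
import numpy as np
import pandas as pd

x1 = np.array([[0, 1], [2, 3], [3, 4]], dtype=np.uint8)
x2 = np.array([[5, 6], [7, 8], [9, 10]], dtype=np.uint8)
S = pd.Series(data=[x1, x2], index=['a', 'b'])

stacked = np.stack(list(S))  # shape (n_items, n_rows, n_cols)

# One DataFrame column per array column; list() yields one 1D array per item
df_arrays = pd.DataFrame(
    {j: list(stacked[:, :, j]) for j in range(stacked.shape[2])},
    index=S.index,
)
print(df_arrays)
```

Here the loop runs over the columns (8 in the real data set) rather than over the rows of S.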

Answer 3

I would do it like this:

# flatten the list
S = S.apply(lambda x: [i for s in x for i in s])

# pick alternate values and create a data frame
S = S.apply(lambda x: [x[::2], x[1::2]]).reset_index()[0].apply(pd.Series)

# name index
S.index = ['a','b']

     0          1
a   [0, 2, 3]   [1, 3, 4]
b   [5, 7, 9]   [6, 8, 10]
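
The "pick alternate values" step works because flattening is row-major, so the values of column j sit at positions j, j + n_cols, j + 2*n_cols, and so on. The sketch below shows this on `x1` and generalizes the hard-coded step of 2 to any column count; `n_cols`, `flat`, and `columns` are illustrative names, not part of the answer.

```python
import numpy as np

x1 = np.array([[0, 1], [2, 3], [3, 4]], dtype=np.uint8)

n_cols = x1.shape[1]
flat = x1.ravel()  # row-major order: [0, 1, 2, 3, 3, 4]

# Every n_cols-th value, starting at offset j, belongs to column j
columns = [flat[j::n_cols] for j in range(n_cols)]

print(columns[0])  # [0 2 3]
print(columns[1])  # [1 3 4]
```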

3 answers

Answer 1: score 3, accepted, 2018-12-03 23:51:06
Answer 2: score 1, 2018-12-04 04:12:07
Answer 3: score 0, 2018-12-03 23:24:33