
Assigning multi-dimensional Numpy Array to a Pandas Series

Background

I have a numpy.ndarray with shape==(95,15). I already have the desired Series.Index names, with len(my_index)==95. I want to create a Series in which each index label is associated with one row of my 95x15 numpy.ndarray.

Variable names

  • pfit : 95x15 numpy.ndarray
  • my_index : 95x1 list(str)

Steps taken

  1. The following fails with the corresponding error:
my_series = pd.Series(index=my_index, dtype="object", data=pfit)
Traceback (most recent call last):

  File "C:\Users\gford1\AppData\Local\Temp\1/ipykernel_22244/2329315457.py", line 1, in <module>
    my_series = pd.Series(index=my_index, dtype="object", data=pfit)

  File "C:\Users\gford1\AppData\Local\Programs\Spyder\pkgs\pandas\core\series.py", line 439, in __init__
    data = sanitize_array(data, index, dtype, copy)

  File "C:\Users\gford1\AppData\Local\Programs\Spyder\pkgs\pandas\core\construction.py", line 577, in sanitize_array
    subarr = _sanitize_ndim(subarr, data, dtype, index, allow_2d=allow_2d)

  File "C:\Users\gford1\AppData\Local\Programs\Spyder\pkgs\pandas\core\construction.py", line 628, in _sanitize_ndim
    raise ValueError("Data must be 1-dimensional")

ValueError: Data must be 1-dimensional
  2. So I have to iterate over my_index and add the pfit arrays one at a time:
my_series = pd.Series(index=my_index, dtype="object")
i = 0
for idx in my_series.index:
    my_series[idx] = pfit[i]
    i+=1

#2 works, but I believe there is a better/faster way that I'm not aware of.

In [283]: pfit=np.arange(12).reshape(3,4)
In [284]: pfit
Out[284]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [285]: my_index=[1,2,3]

Your construction:

In [286]: my_series = pd.Series(index=my_index, dtype="object")
     ...: i = 0
     ...: for idx in my_series.index:
     ...:     my_series[idx] = pfit[i]
     ...:     i+=1
     ...: 
In [287]: my_series
Out[287]: 
1      [0, 1, 2, 3]
2      [4, 5, 6, 7]
3    [8, 9, 10, 11]
dtype: object
In [288]: my_series.values
Out[288]: 
array([array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([ 8,  9, 10, 11])],
      dtype=object)

My suggestion produces the same result:

In [289]: list(pfit)
Out[289]: [array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([ 8,  9, 10, 11])]
In [290]: S = pd.Series(index=my_index, data=list(pfit))
In [291]: S
Out[291]: 
1      [0, 1, 2, 3]
2      [4, 5, 6, 7]
3    [8, 9, 10, 11]
dtype: object
In [292]: S.values
Out[292]: 
array([array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([ 8,  9, 10, 11])],
      dtype=object)
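
For reference, the same idea as a standalone script rather than an interactive session. This is only a minimal sketch: the 3x4 array stands in for the 95x15 pfit, and the string labels "a", "b", "c" are made up for illustration. Iterating over a 2D array yields its rows, so list(pfit) hands pandas a list of 1-D arrays, which it stores as an object-dtype Series.

```python
import numpy as np
import pandas as pd

# Small stand-in for the 95x15 array from the question.
pfit = np.arange(12).reshape(3, 4)
# Hypothetical string labels; the question uses a list of 95 strings.
my_index = ["a", "b", "c"]

# list(pfit) iterates over the first axis: one 1-D array per row.
s = pd.Series(list(pfit), index=my_index)

print(s.dtype)   # object
print(s["b"])    # [4 5 6 7]
```

Each element of the Series is a separate ndarray, so numeric operations on the Series as a whole are not vectorized the way they would be on the original 2D array.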

Recreating the 2D array:

In [293]: np.stack(S.values)
Out[293]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
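
A quick way to check that the round trip loses nothing, again as a sketch on the small stand-in array. It assumes every row has the same length, which np.stack needs in order to join them along a new first axis.

```python
import numpy as np
import pandas as pd

pfit = np.arange(12).reshape(3, 4)
s = pd.Series(list(pfit), index=[1, 2, 3])

# Stack the stored 1-D rows back into a single 2D array.
restored = np.stack(s.to_numpy())

print(restored.shape)                   # (3, 4)
print(np.array_equal(restored, pfit))   # True
```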

A DataFrame, by contrast, accepts the 2D array directly:

In [294]: df = pd.DataFrame(index=my_index, data=pfit)
In [295]: df
Out[295]: 
   0  1   2   3
1  0  1   2   3
2  4  5   6   7
3  8  9  10  11
In [296]: df.values
Out[296]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
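
If the goal is just to look up a row by its label, the DataFrame may already be enough. A small sketch, with made-up string labels standing in for my_index:

```python
import numpy as np
import pandas as pd

pfit = np.arange(12).reshape(3, 4)
# Hypothetical labels for illustration.
df = pd.DataFrame(pfit, index=["a", "b", "c"])

# Select one row by label and get it back as a 1-D array.
row = df.loc["b"].to_numpy()

print(row)                    # [4 5 6 7]
print(df.to_numpy().shape)    # (3, 4)
```

This keeps the data in one numeric block instead of an object-dtype Series of separate arrays.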
