Pandas DataFrame to NumPy vstack array by unique column value

Question

I have a dataframe with the following structure:

import numpy as np
import pandas as pd

data = {'Group':['1', '1', '2', '2', '3', '3'], 'Value':[1, 2, 3, 4, 5, 6]} 
df = pd.DataFrame(data)

I need to convert that dataframe (which has approx. 4000 values per unique group, and 1000 groups) to a numpy array like the following one (order shall be preserved):

array([[1, 2], [3, 4], [5, 6]])

Additionally: 99% of the groups have the same count of values, but some have different counts. If padding up to the max count were possible, that would spare me from losing data.

At the moment I iterate through the unique 'Group' values and numpy.vstack them together. That is slow and far from elegant.

Answer 1

IIUC, this is just a pivot:

(df.assign(col=df.groupby('Group').cumcount())
  .pivot(index='Group', columns='col', values='Value')
  .values
)

Output:

array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int64)
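
The question also asks about groups of unequal length and about preserving order. As a rough sketch under assumptions (the sample data and the pad value 0 below are made up for illustration, not taken from the question), the same pivot leaves NaN where a group is short, and a reindex on the first-appearance order of 'Group' should undo the sorting that pivot applies to the index:

```python
import numpy as np
import pandas as pd

# Hypothetical data: group '3' has only one value, and the groups
# appear in an order that differs from their sorted order.
data = {'Group': ['2', '2', '10', '10', '3'], 'Value': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)

# Number each row within its group (0, 1, 2, ...) to get a column position.
pos = df.groupby('Group').cumcount()

wide = (df.assign(col=pos)
          .pivot(index='Group', columns='col', values='Value')
          # pivot sorts the index; reindex restores first-appearance order.
          .reindex(df['Group'].unique()))

# Float array with NaN in the slots where a group is shorter than the longest.
padded = wide.to_numpy()

# Integer array padded with 0; the pad value is an arbitrary choice.
filled = wide.fillna(0).astype(int).to_numpy()
```

Note that the NaN padding turns the result into a float array, so filling with a sentinel is only needed if an integer dtype matters.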

Accepted answer, 2020-04-21 15:03:53
