Pandas - How to shift a column based on the values of other columns

Question

I have a large Pandas dataset (46 million rows), represented here by a small sample:

    import pandas as pd

    df = pd.DataFrame([[0, 0, 0, 34],[0, 0, 1, 23],[0, 1, 0, 14],[0, 1, 1, 11],[1, 0, 0, 73],[1, 0, 1, 33],[1, 1, 0, 96],[1, 1, 1, 64],[2, 0, 0, 4],[2, 0, 1, 13],[2, 1, 0, 31],[2, 1, 1, 10]])
    df.columns = ['month','player','team','skill']

For each month we have the Cartesian product of players and teams:

id month player team skill
0   0   0   0   34
1   0   0   1   23
2   0   1   0   14
3   0   1   1   11
4   1   0   0   73
5   1   0   1   33
6   1   1   0   96
7   1   1   1   64
8   2   0   0   4
9   2   0   1   13
10  2   1   0   31
11  2   1   1   10

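I want to shift the skill column backwards by one month, so that each row holds the following month's skill for the same player and team, like this: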

0   0   0   0   73
1   0   0   1   33
2   0   1   0   96
3   0   1   1   64
4   1   0   0   4
5   1   0   1   13
6   1   1   0   31
7   1   1   1   10
8   2   0   0   NaN
9   2   0   1   NaN
10  2   1   0   NaN
11  2   1   1   NaN

How can I do this efficiently in Pandas? Thanks!

Answer 1

If I understand correctly, you want to find the skill of the same player-team combination in the following month. You can do that with groupby and transform:

# Sort the rows by `player-team-month` combination so that the
# next row is the subsequent month for the same `player-team`
# or a new `player-team`
tmp = df.sort_values(['player', 'team', 'month'])

# The groupby here serves to divide the dataframe by `player-team`
# Each group is now ordered by `month` so `skill.shift(-1)` can
# give us the `skill` in the following month
skill = tmp.groupby(['player', 'team'])['skill'].transform(lambda s: s.shift(-1))

# Combine the shifted skill with the original attributes
result = pd.concat([tmp[['month', 'player', 'team']], skill], axis=1)
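For a frame with tens of millions of rows, the per-group lambda may become the bottleneck. The sketch below is one possible variant of the same idea, not a benchmarked solution: it calls the built-in `GroupBy.shift` directly and assigns the result back to `df`, which aligns on the index and so keeps the rows in their original order. It assumes every player-team pair has a row for every consecutive month; if a month is missing, `shift(-1)` picks up the next available row rather than the next calendar month.

```python
import pandas as pd

df = pd.DataFrame(
    [[0, 0, 0, 34], [0, 0, 1, 23], [0, 1, 0, 14], [0, 1, 1, 11],
     [1, 0, 0, 73], [1, 0, 1, 33], [1, 1, 0, 96], [1, 1, 1, 64],
     [2, 0, 0, 4], [2, 0, 1, 13], [2, 1, 0, 31], [2, 1, 1, 10]],
    columns=['month', 'player', 'team', 'skill'],
)

# Sort so that, within each player-team pair, rows are in month order.
tmp = df.sort_values(['player', 'team', 'month'])

# GroupBy.shift works per group without a Python-level lambda
# and keeps the original index labels on the result.
next_skill = tmp.groupby(['player', 'team'])['skill'].shift(-1)

# Assignment aligns on the index, so `result` keeps the row order of `df`.
# The column becomes float because the last month has no successor (NaN).
result = df.assign(skill=next_skill)
print(result)
```

On the sample data this gives the following month's skill for months 0 and 1, and NaN for month 2.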

Solution 1
0 Accepted 2020-11-02 12:09:20
