How to restructure a dataframe based on column values and row values?

Question

I have a dataframe as follows:

import pandas as pd

data = {
    'Title': ['001C', '001C', '004C', '001C', '004C', '004C', '007C', '010C'],
    'Items': ['A', 'B', 'D', 'A', 'A', 'K', 'L', 'M']
}
df = pd.DataFrame(data)

df looks like this:

    Title   Items
0   001C    A
1   001C    B
2   004C    D
3   001C    A
4   004C    A
5   004C    K
6   007C    L
7   010C    M

I want to get the Items under each Title without any redundancy. The expected output is:

    001C    004C    007C    010C
0   A       D       L       M
1   B       A                
2           K

Answer 1

You can use drop_duplicates, assign a helper column that numbers the items within each Title group (0, 1, 2, ...), and then pivot:

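(df.drop_duplicates(subset=['Title', 'Items'])
   # number the de-duplicated rows within each Title; counting on the
   # original df could leave gaps when a duplicate precedes a new item
   .assign(index=lambda d: d.groupby('Title').cumcount())
   .pivot(index='index', columns='Title', values='Items')
   .rename_axis(index=None, columns=None)
  #.fillna('') # uncomment if you want empty strings in place of NaNs
)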

Output:

      001C 004C 007C 010C                 
0        A    D    L    M
1        B    A  NaN  NaN
2      NaN    K  NaN  NaN
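
To make the role of the helper column easier to see, the same chain can be unrolled into separate steps. This is only a sketch of the approach above: the names `unique_pairs`, `position`, and `wide` are hypothetical, and the counter is computed on the de-duplicated rows so that the positions within each Title stay contiguous.

```python
import pandas as pd

df = pd.DataFrame({
    'Title': ['001C', '001C', '004C', '001C', '004C', '004C', '007C', '010C'],
    'Items': ['A', 'B', 'D', 'A', 'A', 'K', 'L', 'M'],
})

# Step 1: keep only the first occurrence of each (Title, Items) pair.
unique_pairs = df.drop_duplicates(subset=['Title', 'Items'])

# Step 2: number the surviving rows within each Title (0, 1, 2, ...).
# 'position' is a hypothetical name for the helper column.
unique_pairs = unique_pairs.assign(
    position=unique_pairs.groupby('Title').cumcount()
)

# Step 3: spread the Titles across columns, one row per position.
wide = (unique_pairs
        .pivot(index='position', columns='Title', values='Items')
        .rename_axis(index=None, columns=None))

print(wide)
```

Because every (position, Title) pair is unique after step 2, pivot has exactly one value per cell and does not raise on duplicate entries.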

Answer 2

You can also use .drop_duplicates() + .pivot(). Then move the non-NaN values of each column to the top with .dropna(), as follows:

(df.drop_duplicates()
   .pivot(columns='Title', values='Items')
   .apply(lambda x: pd.Series(x.dropna().values))
   .rename_axis(columns=None)
)

Result:

  001C 004C 007C 010C
0    A    D    L    M
1    B    A  NaN  NaN
2  NaN    K  NaN  NaN
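
It may help to look at the intermediate frame this approach produces before the values are moved up. The sketch below splits the chain in two; `sparse` and `compact` are hypothetical names, and `.to_numpy()` is used in place of `.values`, which should behave the same way here.

```python
import pandas as pd

df = pd.DataFrame({
    'Title': ['001C', '001C', '004C', '001C', '004C', '004C', '007C', '010C'],
    'Items': ['A', 'B', 'D', 'A', 'A', 'K', 'L', 'M'],
})

# Pivoting on the original row index gives one row per remaining source row,
# so each row holds a single value and NaN everywhere else.
sparse = df.drop_duplicates().pivot(columns='Title', values='Items')
print(sparse)

# Dropping the NaNs column by column and rebuilding each column with a
# fresh 0..n-1 index pushes the remaining values to the top.
compact = (sparse
           .apply(lambda col: pd.Series(col.dropna().to_numpy()))
           .rename_axis(columns=None))
print(compact)
```

The columns end up with different lengths after dropna, and pandas aligns them on the new integer index, which is where the trailing NaNs in the result come from.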

Answer 1
2 Accepted 2021-09-27 17:39:33

Answer 2
1 2021-09-27 20:16:12
