Pandas: create a new column in which all entries from another column correspond to a unique value

Question

I am sorry if the question is not clear enough. Say I have this dataframe:

timestamp  source  dest  size
1          a       b     5
1          c       d     6
2          c       e     7
2          d       a     8

From this dataframe I want something like this:

timestamp  link     size
1          a b c d  5 6
2          c e d a  7 8

How can I achieve this?

Thank you

Answer 1

This is a pivot with a couple of added steps, because source and dest have to be melted into a single link column before pivoting.

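# Long format: one row per (timestamp, variable, value)
u = df.melt('timestamp')
m = u['variable'].isin(['source', 'dest'])

# Relabel source/dest so both end up in the same 'link' column
u.loc[m, 'variable'] = 'link'

# Collect the values of each timestamp/variable pair into a list
u.pivot_table(
  values='value', index='timestamp', columns='variable', aggfunc=list)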

variable           link    size
timestamp
1          [a, c, b, d]  [5, 6]
2          [c, d, e, a]  [7, 8]
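
The lists above come out in column order ([a, c, b, d]: every source, then every dest), while the question shows row order (a b c d) and plain strings. The following is a minimal sketch of one way to get that, not part of the original answer. It assumes pandas 1.1 or later (for melt's ignore_index parameter), rebuilds the sample frame from the question, and swaps pivot_table for groupby plus unstack. The names df, u and result are only illustrative.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    'timestamp': [1, 1, 2, 2],
    'source': ['a', 'c', 'c', 'd'],
    'dest': ['b', 'd', 'e', 'a'],
    'size': [5, 6, 7, 8],
})

# Keep the original row labels while melting, then stable-sort on them so
# each row's source, dest and size stay next to each other
u = df.melt('timestamp', ignore_index=False).sort_index(kind='stable')

# Relabel source/dest so both feed the same 'link' column
u['variable'] = u['variable'].replace({'source': 'link', 'dest': 'link'})

# Join the values of each timestamp/variable pair into one string
result = (
    u.groupby(['timestamp', 'variable'])['value']
     .agg(lambda s: ' '.join(map(str, s)))
     .unstack('variable')
)
print(result)
```

The stable sort matters here: it preserves the source-before-dest order that melt produces within each original row.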

An alternative that uses rename first:

d = dict(source='link', dest='link')

df.rename(columns=d).melt('timestamp').pivot_table(
  'value', 'timestamp', 'variable', aggfunc=list)

variable           link    size
timestamp
1          [a, c, b, d]  [5, 6]
2          [c, d, e, a]  [7, 8]

Answer 2

You can also use the groupby method of a pandas DataFrame. Make sure that your size column contains strings, because ' '.join raises a TypeError on integers.

# Convert size to strings so that it can be joined
df['size'] = df['size'].astype(str)
df['link'] = df['source'] + ' ' + df['dest']
df = df.drop(['source', 'dest'], axis=1)
newDf = df.groupby('timestamp').agg(lambda col: ' '.join(col))
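
For a version that can be run end to end, here is a hedged sketch of the same groupby idea using named aggregation (available from pandas 0.25), so that size is converted to strings only inside the aggregation and the original frame is left untouched. The sample data is copied from the question; the name result is illustrative.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    'timestamp': [1, 1, 2, 2],
    'source': ['a', 'c', 'c', 'd'],
    'dest': ['b', 'd', 'e', 'a'],
    'size': [5, 6, 7, 8],
})

result = (
    # Build "source dest" per row without modifying df itself
    df.assign(link=df['source'] + ' ' + df['dest'])
      .groupby('timestamp', as_index=False)
      .agg(
          link=('link', lambda s: ' '.join(s)),
          # size holds integers, so cast to str before joining
          size=('size', lambda s: ' '.join(s.astype(str))),
      )
)
print(result)
```

Because each row's source and dest are concatenated before grouping, the link strings follow the row order shown in the question.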


Answer 1
2 2019-10-24 14:27:48

Answer 2
1 Accepted 2019-10-24 14:49:50
