Pandas create new column with all the entries from another column corresponding to a unique value
I am sorry if the question is not clear enough. Say I have this dataframe:
timestamp source dest size
1 a b 5
1 c d 6
2 c e 7
2 d a 8
From this dataframe I want something like this:
timestamp link size
1 a b c d 5 6
2 c e d a 7 8
How can I achieve this?
Thank you
This is a pivot with a couple of added steps, since you want to pivot on two columns independently of each other: melt the frame to long form, relabel both source and dest as link, then pivot with list as the aggregation function.
u = df.melt('timestamp')
m = u['variable'].isin(['source', 'dest'])
u.loc[m, 'variable'] = 'link'
u.pivot_table(
'value', 'timestamp', 'variable', aggfunc=list)
variable link size
timestamp
1 [a, c, b, d] [5, 6]
2 [c, d, e, a] [7, 8]
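For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample data from the question is built by hand, and the name result is only illustrative. Keyword arguments are used in place of the positional ones above, which should be equivalent.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "timestamp": [1, 1, 2, 2],
    "source": ["a", "c", "c", "d"],
    "dest": ["b", "d", "e", "a"],
    "size": [5, 6, 7, 8],
})

# Long form: one row per (timestamp, original column name, value).
u = df.melt(id_vars="timestamp")

# Treat source and dest as the same variable, called link.
m = u["variable"].isin(["source", "dest"])
u.loc[m, "variable"] = "link"

# Collect every value for each (timestamp, variable) pair into a list.
result = u.pivot_table(
    values="value", index="timestamp", columns="variable", aggfunc=list
)
print(result)
```

Because melt stacks all source values before all dest values, each link list holds the sources first and then the destinations, which is why the output reads [a, c, b, d] and not a b c d as in the question.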
An alternative using rename first:
d = dict(source='link', dest='link')
df.rename(columns=d).melt('timestamp').pivot_table(
'value', 'timestamp', 'variable', aggfunc=list)
variable link size
timestamp
1 [a, c, b, d] [5, 6]
2 [c, d, e, a] [7, 8]
You can also use the groupby method of a pandas dataframe. Make sure that your size column contains strings, because str.join only accepts strings.
df['link'] = df['source'] + ' ' + df['dest']
df = df.drop(['source', 'dest'], axis = 1)
newDf = df.groupby('timestamp').agg(lambda col: ' '.join(col))
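A self-contained sketch of the same idea, assuming size starts out as integers as in the question and therefore needs an explicit conversion with astype(str). The name joined is only illustrative, and the original df is left unmodified here.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "timestamp": [1, 1, 2, 2],
    "source": ["a", "c", "c", "d"],
    "dest": ["b", "d", "e", "a"],
    "size": [5, 6, 7, 8],
})

joined = (
    df.assign(
        # One "source dest" string per row keeps each pair adjacent.
        link=df["source"] + " " + df["dest"],
        # str.join needs strings, so convert the integer sizes first.
        size=df["size"].astype(str),
    )
    .drop(columns=["source", "dest"])
    .groupby("timestamp")
    .agg(lambda col: " ".join(col))
)
print(joined)
```

This variant yields space-separated strings, not lists, and keeps each source next to its dest, so the link values come out as a b c d and c e d a, matching the layout asked for in the question.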