Creating a new DataFrame using existing data in a column
I am trying to create a new DataFrame from the data shown below. Basically, I need to create 6 new columns based on the value of "Keyword Type", so that each article gets one row holding all of its corresponding keyword type information. The columns would be Article ID, Sport, Competition, Context, etc., and the first row would hold Article 1's corresponding info. I need one row per article so that I can join the result to another DataFrame's article column and bring this info in. Is there an efficient way to do this?
Current Structure:
Article ID | Keyword Type | Keyword Value
Article 1 | Sport | Football
Article 1 | Team | Manchester United
Article 1 | Language | English
Article 1 | Context | News
Expected Output:
Article ID | Sport | Team | Language | Context
Article 1 | Football | Manchester United | English | News
Do the following:
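import pandas as pd
# "first" keeps the single value per (Article ID, Keyword Type) pair
res = pd.pivot_table(df, columns="Keyword Type", index="Article ID", aggfunc="first")
res = res.droplevel(0, axis="columns")  # drop the outer "Keyword Value" column level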
The result is:
Context Language Sport Team
Article ID
Article 1 News English Football Manchester United
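To try this on the sample rows from the question, here is a minimal self-contained sketch. It rebuilds the four rows by hand and uses DataFrame.pivot rather than pivot_table, which assumes that each (Article ID, Keyword Type) pair occurs only once; pivot raises an error on duplicate pairs. The variable names df and wide are just illustrative.

```python
import pandas as pd

# Sample rows copied from the "Current Structure" table in the question
df = pd.DataFrame(
    {
        "Article ID": ["Article 1"] * 4,
        "Keyword Type": ["Sport", "Team", "Language", "Context"],
        "Keyword Value": ["Football", "Manchester United", "English", "News"],
    }
)

# One row per article, one column per keyword type.
# Assumption: no article repeats a keyword type.
wide = df.pivot(index="Article ID", columns="Keyword Type", values="Keyword Value")

# Remove the leftover "Keyword Type" label from the column axis
wide = wide.rename_axis(columns=None)

print(wide)
```

If an article can carry several values for the same keyword type, pivot_table with an aggregation that combines them (for example joining the strings) would be needed instead.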
A combination of set_index and unstack could get you your desired output:
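res = (
    df.set_index(["Article ID", "Keyword Type"])
    .unstack()                  # move "Keyword Type" values into the columns
    .droplevel(0, axis=1)       # drop the outer "Keyword Value" level
    .rename_axis(None, axis=1)  # remove the "Keyword Type" axis label
)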
Context Language Sport Team
Article ID
Article 1 News English Football Manchester United
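Since the goal in the question is to join this result to another DataFrame, the following sketch shows one way that could look. The articles DataFrame and its Title column are hypothetical stand-ins for the asker's other table, which is not shown in the question; only the "Article ID" key is taken from the source.

```python
import pandas as pd

# Long-format keyword data, as in the question
df = pd.DataFrame(
    {
        "Article ID": ["Article 1"] * 4,
        "Keyword Type": ["Sport", "Team", "Language", "Context"],
        "Keyword Value": ["Football", "Manchester United", "English", "News"],
    }
)

# Hypothetical second DataFrame holding one row per article
articles = pd.DataFrame(
    {
        "Article ID": ["Article 1", "Article 2"],
        "Title": ["Match report", "Transfer rumours"],
    }
)

# Reshape to one row per article; selecting the value column first
# means unstack() returns single-level columns.
wide = (
    df.set_index(["Article ID", "Keyword Type"])["Keyword Value"]
    .unstack()
    .rename_axis(columns=None)
    .reset_index()  # turn "Article ID" back into a regular column for merging
)

# Left join keeps every article, even those without any keywords
merged = articles.merge(wide, on="Article ID", how="left")

print(merged)
```

A left join is used here so that articles without keyword rows are kept with missing values; an inner join would drop them instead.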