Create a column with particular value in pandas DataFrame
I have a DataFrame with the columns author (the author's name), hour (the hour in which the author published the topic) and number_of_topics (how many topics the author published in that hour). Here is an example:
author hour number_of_topics
0 A h01 1
1 B h02 4
2 B h04 2
3 C h04 6
4 A h05 8
5 C h05 3
My goal is to create six columns (one for each of the first six hours) and fill them with the number of topics. I tried using df.groupby to do this but did not succeed. Desired output:
author h01 h02 h03 h04 h05 h06
0 A 1 0 0 0 8 0
1 B 0 4 0 2 0 0
2 C 0 0 0 6 3 0
Code to create my DataFrame:
import pandas as pd
df = pd.DataFrame({"author": ["A", "B", "B", "C", "A", "C"],
                   "hour": ["h01", "h02", "h04", "h04", "h05", "h05"],
                   "number_of_topics": ["1", "4", "2", "6", "8", "3"]})
print(df)
Use pivot with reindex to add the missing columns:
cols = ['h{:02d}'.format(x) for x in range(1, 7)]
# pivot requires keyword arguments in pandas 2.0 and later
df = (df.pivot(index='author', columns='hour', values='number_of_topics')
        .fillna(0)
        .reindex(columns=cols, fill_value=0)
        .reset_index()
        .rename_axis(None, axis=1))
print(df)
author h01 h02 h03 h04 h05 h06
0 A 1 0 0 0 8 0
1 B 0 4 0 2 0 0
2 C 0 0 0 6 3 0
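One thing to watch for: the question's sample code stores number_of_topics as strings ("1", "4", ...), so filling the gaps with 0 leaves a mix of strings and integers in the result. A minimal sketch of one way around this, assuming every value can be parsed as an integer, is to convert the column before pivoting. The name wide is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "author": ["A", "B", "B", "C", "A", "C"],
    "hour": ["h01", "h02", "h04", "h04", "h05", "h05"],
    "number_of_topics": ["1", "4", "2", "6", "8", "3"],  # strings, as in the question
})

cols = ["h{:02d}".format(x) for x in range(1, 7)]

wide = (
    # Assumption: every value parses as an integer
    df.assign(number_of_topics=df["number_of_topics"].astype(int))
      .pivot(index="author", columns="hour", values="number_of_topics")
      .reindex(columns=cols)   # adds h03 and h06 as all-NaN columns
      .fillna(0)               # missing (author, hour) pairs become 0
      .astype(int)             # pivot introduces NaN, which makes the columns float
      .reset_index()
      .rename_axis(None, axis=1)
)
print(wide)
```

The astype(int) call comes before reset_index so that only the hour columns are converted and author stays a string column.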
Or set_index with unstack:
cols = ['h{:02d}'.format(x) for x in range(1, 7)]
df = (df.set_index(['author', 'hour'])['number_of_topics']
        .unstack(fill_value=0)
        .reindex(columns=cols, fill_value=0)
        .reset_index()
        .rename_axis(None, axis=1))
print(df)
author h01 h02 h03 h04 h05 h06
0 A 1 0 0 0 8 0
1 B 0 4 0 2 0 0
2 C 0 0 0 6 3 0
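Both pivot and unstack raise an error if the same (author, hour) pair appears more than once. Since the question mentions trying df.groupby, here is a hedged sketch of a groupby-based variant that sums duplicates before reshaping. The sample data is hypothetical (author B has two rows for h02) and assumes number_of_topics is already numeric.

```python
import pandas as pd

# Hypothetical data: author B appears twice for hour h02
df = pd.DataFrame({
    "author": ["A", "B", "B", "B", "C"],
    "hour": ["h01", "h02", "h02", "h04", "h05"],
    "number_of_topics": [1, 4, 5, 2, 3],
})

cols = ["h{:02d}".format(x) for x in range(1, 7)]

wide = (
    df.groupby(["author", "hour"])["number_of_topics"]
      .sum()                                # collapse duplicate (author, hour) rows
      .unstack(fill_value=0)                # hours become columns
      .reindex(columns=cols, fill_value=0)  # add hours with no rows at all
      .reset_index()
      .rename_axis(None, axis=1)
)
print(wide)
```

Whether summing is the right way to combine duplicates depends on the data; max or first could be swapped in where that fits better.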
What you are looking for can be achieved with the pivot function (only hours that appear in the data become columns):
df.pivot(index = 'author',columns = 'hour',values = 'number_of_topics').fillna(0)
hour h01 h02 h04 h05
author
A 1 0 0 8
B 0 4 2 0
C 0 0 6 3