Create a column with particular value in pandas DataFrame

I have a DataFrame with the columns author (the author's name), hour (the hour in which the author published the topic) and number_of_topics (how many topics each author published in that hour). Here is an example:

  author hour number_of_topics
0      A  h01                1
1      B  h02                4
2      B  h04                2
3      C  h04                6
4      A  h05                8
5      C  h05                3

My goal is to create six columns (one for each of the first six hours) and fill them with the number of topics. I tried using df.groupby to do this but did not succeed. Desired output:

  author h01 h02 h03 h04 h05 h06
0      A   1   0   0   0   8   0
1      B   0   4   0   2   0   0
2      C   0   0   0   6   3   0 

Code to create my DataFrame:

import pandas as pd
df = pd.DataFrame({"author":["A","B", "B","C","A","C"],
                   "hour":["h01","h02","h04","h04","h05","h05"],
                   "number_of_topics":["1","4","2","6","8","3"]})
print(df)

Use pivot with reindex to add the missing columns:

cols = ['h{:02d}'.format(x) for x in range(1, 7)]
df = (df.pivot(index='author', columns='hour', values='number_of_topics')
        .fillna(0)
        .reindex(columns=cols, fill_value=0)
        .reset_index()
        .rename_axis(None, axis=1))
print (df)
  author h01 h02  h03 h04 h05  h06
0      A   1   0    0   0   8    0
1      B   0   4    0   2   0    0
2      C   0   0    0   6   3    0

Or use set_index with unstack:

cols = ['h{:02d}'.format(x) for x in range(1, 7)]
df = (df.set_index(['author','hour'])['number_of_topics']
        .unstack(fill_value=0)
        .reindex(columns=cols, fill_value=0)
        .reset_index()
        .rename_axis(None, axis=1))
print (df)
  author h01 h02  h03 h04 h05  h06
0      A   1   0    0   0   8    0
1      B   0   4    0   2   0    0
2      C   0   0    0   6   3    0
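Note that number_of_topics in the question's DataFrame holds strings, so the results above mix string values with the integer fill value 0. The following is a minimal sketch, assuming integer counts are what is wanted; the name wide is used only for illustration:

```python
import pandas as pd

# DataFrame from the question; number_of_topics is stored as strings
df = pd.DataFrame({"author": ["A", "B", "B", "C", "A", "C"],
                   "hour": ["h01", "h02", "h04", "h04", "h05", "h05"],
                   "number_of_topics": ["1", "4", "2", "6", "8", "3"]})

# Column labels h01 .. h06
cols = [f"h{x:02d}" for x in range(1, 7)]

wide = (df.assign(number_of_topics=df["number_of_topics"].astype(int))  # strings -> integers
          .pivot(index="author", columns="hour", values="number_of_topics")
          .reindex(columns=cols)   # add the missing hours (h03, h06) as NaN
          .fillna(0)               # missing author/hour pairs become 0
          .astype(int)             # NaN forced floats; convert back to integers
          .reset_index()
          .rename_axis(None, axis=1))
print(wide)
```

Converting before reshaping keeps every hour column numeric, so later sums or comparisons should behave as expected.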

What you are looking for can be achieved with the pivot function:

df.pivot(index = 'author',columns = 'hour',values = 'number_of_topics').fillna(0)

hour    h01     h02     h04     h05
author              
A       1       0       0       8
B       0       4       2       0
C       0       0       6       3

