How can I fill a dataframe on specific column names of another dataframe
I have constructed a pretty basic dataframe where the column names are years:
import pandas as pd
column_names = [x for x in range(2000,2005)]
df = pd.DataFrame(columns=column_names)
This gives me a dataframe whose column names are years and which currently has no entries:
2000 2001 2002 2003 2004
I also have a different dataframe where one column holds specific dates and the second column holds the corresponding year. I call this dataframe set0 because I will continuously add more sets, which are then numbered (set1, set2, etc.).
data = {'Date': ['2001-06-08', '2002-05-23', '2002-05-24', '2003-06-23'],
'Year': [2001, 2002, 2002, 2003]}
df2 = pd.DataFrame(data)
Date Year
0 2001-06-08 2001
1 2002-05-23 2002
2 2002-05-24 2002
3 2003-06-23 2003
What I want to create is the following: take the first dataframe and add a leading column holding the name of a dataset, in this case set0. Then group that dataset by year and, for each year column, fill in the number of entries the dataset has for that year (0 if it has none):
set_name 2000 2001 2002 2003 2004
set0 0 1 2 1 0
I have found nothing similar on the web. I have done the grouping but wasn't able to add the counts to the corresponding columns. Any help or hint is much appreciated!
Does this answer your question?
import pandas as pd
column_names = [x for x in range(2000, 2005)]
df = pd.DataFrame(index=column_names)
data = {
'Date': ['2001-06-08', '2002-05-23', '2002-05-24', '2003-06-23'],
'Year': [2001, 2002, 2002, 2003]
}
df2 = pd.DataFrame(data)
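# Count the number of dates per year (a Series indexed by Year)
df2_grouped = df2.groupby('Year').count()['Date']
# Assignment aligns on the index; years without entries become NaN
df['set0'] = df2_grouped
# Replace NaN with 0, move the years into a 'set_name' column (needs pandas >= 1.5), then pivot them into columns
df = df.fillna(0).reset_index(names='set_name').pivot_table(columns='set_name')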
print(df)
Result:
set_name 2000 2001 2002 2003 2004
set0 0.0 1.0 2.0 1.0 0.0
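The counts above come out as floats because the missing years pass through NaN before being filled. If integer counts are preferred, one possible variation is to count with value_counts and align to the full list of years with reindex(fill_value=0). This is only a sketch: the `sets` dictionary and the `years` and `result` names are hypothetical and not part of the original post; further sets (set1, set2, ...) could be added to the dictionary in the same way.

```python
import pandas as pd

years = list(range(2000, 2005))

# Hypothetical container (not in the original post): one dataframe per set name.
sets = {
    "set0": pd.DataFrame({
        "Date": ["2001-06-08", "2002-05-23", "2002-05-24", "2003-06-23"],
        "Year": [2001, 2002, 2002, 2003],
    }),
}

# For each set, count rows per year, then align to the full list of years.
# fill_value=0 inserts 0 for years with no entries, so no NaN is ever created
# and the counts stay integers.
counts = {
    name: frame["Year"].value_counts().reindex(years, fill_value=0)
    for name, frame in sets.items()
}

# Each Series becomes a column; transpose so each set is a row and years are columns.
result = pd.DataFrame(counts).T
result.index.name = "set_name"
print(result)
```

Each set becomes one row, so adding another set only requires another entry in the dictionary.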