Merge Dataframe rows based on the date
I have a dataframe that holds the name of a company, a date, and the title of a headline that was published about that company on that day. Several headlines can be published on a single day, and each headline takes up its own row, even when the date is the same.
What I wish to do is merge all the title rows by date, so that the Title column holds ALL the headlines that were published on that day. I tried doing it, but just messed up my dataframe.
Any help will be greatly appreciated!
You can groupby and aggregate:
from datetime import date

import pandas as pd

df = pd.DataFrame(
    {
        "company": ["GOOG", "GOOG", "META", "META"],
        "date": [
            date(2022, 6, 1),
            date(2022, 6, 1),
            date(2022, 6, 1),
            date(2022, 6, 2),
        ],
        "title": ["google good", "google bad", "meta good", "meta bad"],
    }
)

# Collect every title for the same company and date into one list.
df.groupby(["company", "date"]).aggregate(list).reset_index()
gives
  company        date                      title
0    GOOG  2022-06-01  [google good, google bad]
1    META  2022-06-01                [meta good]
2    META  2022-06-02                 [meta bad]
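If you would rather have one string per day than a list, a possible variation is to join the titles while aggregating. This is only a sketch under a few assumptions: the column names match the example above, every title is a string with no missing values, and the "; " separator and the name `merged` are arbitrary choices made for illustration.

```python
from datetime import date

import pandas as pd

df = pd.DataFrame(
    {
        "company": ["GOOG", "GOOG", "META", "META"],
        "date": [
            date(2022, 6, 1),
            date(2022, 6, 1),
            date(2022, 6, 1),
            date(2022, 6, 2),
        ],
        "title": ["google good", "google bad", "meta good", "meta bad"],
    }
)

# Join all titles that share a company and date into a single string.
# "; " is an assumed separator; pick one that cannot appear inside a title.
merged = (
    df.groupby(["company", "date"])["title"]
    .agg("; ".join)
    .reset_index()
)

print(merged)
```

The list version keeps each headline as a separate element, which is easier to process later. The joined version is easier to read or export, but the headlines can only be split apart again if the separator never occurs in a title.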