Pandas time-series: aggregate by date and transpose

I have the following time-series dataframe:

import pandas as pd

dataframe = pd.DataFrame({
    'date': pd.to_datetime([
        '2020-04-01', '2020-04-02', '2020-04-03',
        '2020-04-01', '2020-04-02', '2020-04-03']), 
    'Ticker': ['A', 'A', 'A', 'AAPL', 'AAPL', 'AAPL'],
    'Price': ['8', '10', '12', '100', '200', '50']})
          date   Ticker   Price
0   2020-04-01        A       8
1   2020-04-02        A      10
2   2020-04-03        A      12
3   2020-04-01     AAPL     100
4   2020-04-02     AAPL     200
5   2020-04-03     AAPL      50

The final result should look like this:

dataframe_2 = pd.DataFrame({
    'date': pd.to_datetime(['2020-04-01', '2020-04-02','2020-04-03']), 
    'A': [8, 10, 12],
    'AAPL': [100, 200, 50]})
          date   A  AAPL
0   2020-04-01   8   100
1   2020-04-02  10   200
2   2020-04-03  12    50

Initially I tried using the groupby function, but without much success.

The operation you are trying to do is called pivoting: creating new columns from the categorical values of an existing column.

You can do either of these (same results):

df = dataframe.set_index("date").pivot(columns="Ticker", values="Price")

df = dataframe.pivot(index="date", columns="Ticker", values="Price")
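To get from the pivoted frame to the exact layout shown in the question, a minimal sketch might look like the following. It assumes the sample data from the question; the names `wide` and `result` are only illustrative, and the `pd.to_numeric` step is an addition here because the sample stores Price as strings while the desired output holds numbers.

```python
import pandas as pd

dataframe = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-04-01", "2020-04-02", "2020-04-03",
        "2020-04-01", "2020-04-02", "2020-04-03"]),
    "Ticker": ["A", "A", "A", "AAPL", "AAPL", "AAPL"],
    "Price": ["8", "10", "12", "100", "200", "50"]})

# Assumption: the prices should be numeric, as in the desired output.
dataframe["Price"] = pd.to_numeric(dataframe["Price"])

# One row per date, one column per ticker.
wide = dataframe.pivot(index="date", columns="Ticker", values="Price")

# Turn the date index back into a regular column and
# drop the "Ticker" label left on the column axis.
result = wide.reset_index()
result.columns.name = None

print(result)
```

Whether to keep `date` as the index or reset it is a matter of taste; for time-series work, keeping it as the index is often more convenient.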

It is important to set the index. Otherwise, pivot falls back to the existing row index, cannot tell which rows belong together, and produces extra rows filled with NaN values. For the sample data, without the index, it would not know to treat rows 0 and 3 in your base data as the same date.
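The effect of leaving out the index can be checked with a small comparison. This is only a sketch on the sample data from the question; `no_index` and `with_index` are illustrative names.

```python
import pandas as pd

dataframe = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-04-01", "2020-04-02", "2020-04-03",
        "2020-04-01", "2020-04-02", "2020-04-03"]),
    "Ticker": ["A", "A", "A", "AAPL", "AAPL", "AAPL"],
    "Price": ["8", "10", "12", "100", "200", "50"]})

# No index given: the original row labels 0..5 are kept,
# so every row stays separate and half the cells are NaN.
no_index = dataframe.pivot(columns="Ticker", values="Price")

# Index given: rows that share a date are combined.
with_index = dataframe.pivot(index="date", columns="Ticker", values="Price")

print(no_index.shape)    # (6, 2)
print(with_index.shape)  # (3, 2)
print(no_index)
```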
