
How do I condense a pandas data frame where the rows are months and I'm trying to condense them into years?

So I have a dataframe:

https://docs.google.com/spreadsheets/d/19ssG8bvkZKVDR6V5yU9fZVRJbJNfTTEYmWqLwmDwBa0/edit#gid=0

This is the output my code gives.

Here is the code:

from yahoofinancials import YahooFinancials
import pandas as pd
import datetime as datetime

df = pd.read_excel('C:/Users/User/Downloads/Div Tickers.xlsx', sheet_name='Sheet1')

tickers_list = df['Ticker'].tolist()
data = pd.DataFrame(columns=tickers_list)


yahoo_financials_ecommerce = YahooFinancials(data)

ecommerce_income_statement_data = yahoo_financials_ecommerce.get_financial_stmts('annual', 'income')

data = ecommerce_income_statement_data['incomeStatementHistory']

df_dict = dict()

for ticker in tickers_list:

    df_dict[ticker] = pd.concat([pd.DataFrame(data[ticker][x]) for x in range(len(data[ticker]))],
               sort=False, join='outer', axis=1)

df = pd.concat(df_dict, sort=True)

df_l = pd.DataFrame(df.stack())
df_l.reset_index(inplace=True)
df_l.columns = ['ticker', 'financials', 'date', 'value']

df_w = df_l.pivot_table(index=['date.year', 'financials'], columns='ticker', values='value')


export_excel = df_w.to_excel(r'C:/Users/User/Downloads/Income Statement Histories.xlsx', sheet_name="Sheet1", index= True)

How would I go about condensing the months into years, so that the data can be compared year over year?

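I'm not sure, since you haven't given us any data, but you can use the code below to convert a datetime column to years. The first bit just generates some dummy data: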

from datetime import datetime, timedelta
from random import randint

import pandas as pd

df = pd.DataFrame({
    'dates': [datetime.today() - timedelta(randint(0, 1000)) for _ in range(50)]
})

print(df.head())

                       dates
0 2019-09-02 21:01:46.702300
1 2019-11-03 21:01:46.702329
2 2019-04-01 21:01:46.702338
3 2019-03-04 21:01:46.702345
4 2019-03-28 21:01:46.702351

The important part:

df.dates.dt.to_period('Y')

0     2018
1     2018
2     2019
3     2018
4     2019
5     2020
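
As a small deterministic sketch of the same idea: `dt.to_period('Y')` yields yearly Period values, while `dt.year` yields plain integers, and either can serve as a grouping key. The fixed dates and the `amount` column here are made up for illustration and are not from the question's data.

```python
import pandas as pd

# Hypothetical sample data: fixed dates instead of random ones
df = pd.DataFrame({
    'dates': pd.to_datetime(['2018-03-31', '2018-12-31', '2019-06-30', '2020-01-15']),
    'amount': [10, 20, 30, 40],
})

# Yearly Period values, e.g. Period('2018')
df['year_period'] = df['dates'].dt.to_period('Y')

# Plain integer years, e.g. 2018
df['year'] = df['dates'].dt.year

# Collapse all rows that fall in the same year into one row
yearly = df.groupby('year')['amount'].sum()
print(yearly)
```

Grouping on `year_period` instead of `year` should give the same totals, with a Period index rather than an integer one.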

IIUC, you need to melt, then use groupby on your date column to group by year.

# Uncomment if 'date' is not already a datetime column
#df['date'] = pd.to_datetime(df['date'])

df = pd.melt(df,id_vars=['date','financials'],var_name='ticker')

df.groupby([df['date'].dt.year,df['financials'],df['ticker']])['value'].sum().unstack()

ticker                                      AEM          AGI           ALB  \
date financials                                                              
2016 costOfRevenue                 1.030000e+09  309000000.0  1.710000e+09   
     discontinuedOperations        0.000000e+00          0.0  2.020000e+08   
     ebit                          3.360000e+08   21300000.0  5.370000e+08   
     grossProfit                   1.110000e+09  173000000.0  9.700000e+08   
     incomeBeforeTax               2.680000e+08   -7600000.0  5.750000e+08   
...                                         ...          ...           ...   
2019 researchDevelopment           0.000000e+00          0.0  5.828700e+07   
     sellingGeneralAdministrative  1.210000e+08   19800000.0  4.390000e+08   
     totalOperatingExpenses        1.650000e+09  557000000.0  2.830000e+09   
     totalOtherIncomeExpenseNet   -1.000000e+08    2900000.0 -6.900000e+07   
     totalRevenue                  2.490000e+09  683000000.0  3.590000e+09  
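
A self-contained sketch of the melt-then-groupby approach, assuming a wide frame with `date` and `financials` columns plus one column per ticker. The tickers `AAA` and `BBB` and all the values are invented for illustration. Note that `sum()` adds together every row that falls in the same year, which may or may not be what you want for annual statements.

```python
import pandas as pd

# Hypothetical wide data: one column per ticker
wide = pd.DataFrame({
    'date': ['2018-03-31', '2018-12-31', '2019-12-31', '2018-12-31', '2019-12-31'],
    'financials': ['totalRevenue', 'totalRevenue', 'totalRevenue', 'ebit', 'ebit'],
    'AAA': [100.0, 150.0, 300.0, 20.0, 30.0],
    'BBB': [10.0, 15.0, 40.0, 2.0, 5.0],
})
wide['date'] = pd.to_datetime(wide['date'])

# Wide -> long: one row per (date, financials, ticker)
long = pd.melt(wide, id_vars=['date', 'financials'], var_name='ticker')

# Group by calendar year, line item and ticker, then spread tickers into columns
yearly = (
    long.groupby([long['date'].dt.year, 'financials', 'ticker'])['value']
        .sum()
        .unstack()
)
print(yearly)
```

The result is indexed by (`date`, `financials`), where `date` now holds the year, with one column per ticker, so values for the same line item in different years sit in comparable rows.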
