简体   繁体   中英

Multiply two columns in a pandas MultiIndex dataframe

I have never used multiIndex pandas dataframes, and the Python API for CalcBench, a financial program I use, returned one when I typed the following code:

dataItems = ["ROE", "StockholdersEquity", "SharesOutstandingEndOfPeriod", "EndOfPeriodStockPrice"]             
data = cb.normalized_data(tickers, dataItems, start_year=2021, end_year=2021, period_type='annual')

If I try to explore my dataframe by typing data.info() , I get:

<class 'pandas.core.frame.DataFrame'>
PeriodIndex: 1 entries, 2021 to 2021
Freq: A-DEC
Columns: 1800 entries, ('EndOfPeriodStockPrice', 'A') to ('StockholdersEquity', 'ZTS')
dtypes: float64(1800)

Typing data.columns() gives me:

MultiIndex([('EndOfPeriodStockPrice',    'A'),
            ('EndOfPeriodStockPrice',  'AAL'),
            ('EndOfPeriodStockPrice',  'AAP'),
            ('EndOfPeriodStockPrice', 'AAPL'),
            ('EndOfPeriodStockPrice', 'ABBV'),
            ('EndOfPeriodStockPrice',  'ABC'),
            ('EndOfPeriodStockPrice',  'ABT'),
            ('EndOfPeriodStockPrice',  'ACN'),
            ('EndOfPeriodStockPrice', 'ADBE'),
            ('EndOfPeriodStockPrice',  'ADI'),
            ...
            (   'StockholdersEquity', 'WYNN'),
            (   'StockholdersEquity',  'XEL'),
            (   'StockholdersEquity',  'XOM'),
            (   'StockholdersEquity', 'XRAY'),
            (   'StockholdersEquity',  'XYL'),
            (   'StockholdersEquity',  'YUM'),
            (   'StockholdersEquity',  'ZBH'),
            (   'StockholdersEquity', 'ZBRA'),
            (   'StockholdersEquity', 'ZION'),
            (   'StockholdersEquity',  'ZTS')],
           names=['metric', 'ticker'], length=1800)

I would like to create a new metric MarketCapAtEndOfPeriod in this dataframe for each firm in the year 2021 by multiplying the corresponding entries for EndofPeriodStockPrice and SharesOutstandingEndOfPeriod .

My efforts to do so have gone nowhere, and searching StackOverflow has not helped me solve my problem. Loosely speaking, I'm looking to write something like:

data["MarketCapAtEndOfPeriod"] = data["EndofPeriodStockPrice"] * data["SharesOutstandingEndOfPeriod"]
market_cap = data.EndOfPeriodStockPrice * data.SharesOutstandingEndOfPeriod
market_cap_df = pd.concat(
    {"MarketCap": market_cap},
    names=["metric"],
    axis=1,
)

data = data.join(market_cap_df)

multiplying the corresponding entries for EndofPeriodStockPrice and SharesOutstandingEndOfPeriod

Use DataFrame.xs to access MultiIndex cross-sections by specifying the label and an optional axis/level.

  • In your case, the target labels are in the first level (default), so you can leave out level=0 and just specify axis=1 for columns.
  • To assign the output as new columns, reconstruct the MultiIndex with MultiIndex.from_product before rejoining with the original dataframe.
key1 = 'EndofPeriodStockPrice'
key2 = 'SharesOutstandingEndOfPeriod'
new = 'MarketCapAtEndOfPeriod'

prod = data.xs(key1, axis=1) * data.xs(key2, axis=1)

prod.columns = pd.MultiIndex.from_product([[new], [key1, key2]])
# MultiIndex([('MarketCapAtEndOfPeriod', 'EndofPeriodStockPrice'),
#             ('MarketCapAtEndOfPeriod', 'SharesOutstandingEndOfPeriod')])

df = df.join(prod)

If hypothetically you wanted to divide corresponding AAPL and ABC entries, those are in the second column level, so specify both axis=1 and level=1 :

data.xs('AAPL', axis=1, level=1) / data.xs('ABC', axis=1, level=1)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM