df = pd.DataFrame([[100,90,80,70,36,45], [101,78,65,88,55,78], [92,77,42,79,43,32], [103,98,76,54,45,65]], index = pd.date_range(start='2022-01-01' ,periods=4))
df.columns = pd.MultiIndex.from_tuples((("mkf", "Open"), ("mkf", "Close"), ("tdf", "Open"), ("tdf","Close"), ("ghi","Open"), ("ghi", "Close")))
df
mkf tdf ghi
Open Close Open Close Open Close
2022-01-01 100 90 80 70 36 45
2022-01-02 101 78 65 88 55 78
2022-01-03 92 77 42 79 43 32
2022-01-04 103 98 76 54 45 65
The above data frame is composed of three smaller data frames that are identical in structure (ie same columns and indices). You can think of it as a large data frame that has been concatenated from three smaller ones. In reality there are thousands of these smaller df's concatenated together but for simplicity I've kept it at three.
I'd like to be able to use the split-apply-combine principle to create a new column called r
in each of the three sub data frames.
The split would be based on df.columns.levels[0]
. The apply would be np.log(['Close']).diff()
The combined data frame would produce the below:
mkf tdf ghi
Open Close r Open Close r Open Close r
2022-01-01 100 90 NaN 80 70 NaN 36 45 NaN
2022-01-02 101 78 -0.143101 65 88 0.228842 55 78 0.550046
2022-01-03 92 77 -0.012903 42 79 -0.107889 43 32 -0.890973
2022-01-04 103 98 0.241162 76 54 -0.380464 45 65 0.708651
You could use:
(pd.concat([df,
(np.log(df.xs('Close', level=1, axis=1, drop_level=False))
.rename(columns={'Close': 'r'}, level=1).diff())
], axis=1)
.sort_index(axis=1, level=0, sort_remaining=False)
)
or with a groupby
approach:
(df.stack(0)
.assign(r=lambda d:
d.groupby(level=1)['Close'].apply(lambda s: np.log(s).diff())
)
.unstack(1).swaplevel(axis=1)
.sort_index(axis=1, level=0, sort_remaining=False)
)
output:
ghi mkf tdf
Open Close r Open Close r Open Close r
2022-01-01 36 45 NaN 100 90 NaN 80 70 NaN
2022-01-02 55 78 0.550046 101 78 -0.143101 65 88 0.228842
2022-01-03 43 32 -0.890973 92 77 -0.012903 42 79 -0.107889
2022-01-04 45 65 0.708651 103 98 0.241162 76 54 -0.380464
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.