How to select this kind of data from this pandas multi-index dataframe
I'm working with futures market data. Here is an example multi-index dataframe:
import numpy as np
import pandas as pd

date_index = pd.date_range('2018-03-20', periods=10)
contract = ['ZN1805', 'ZN1806', 'ZN1807']
price = ['open', 'close']
columns = pd.MultiIndex.from_product([contract, price], names=['contract', 'price'])
df1 = pd.DataFrame(data=np.random.randint(100, 150, (10, len(columns))), index=date_index, columns=columns)
df2 = pd.DataFrame(columns=['contract', 'close'], index=df1.index)
# Set the data in the contract column randomly here for illustration
df2.contract = np.random.choice(contract, 10)
Here is what df1 looks like:
df1
Out[357]:
contract ZN1805 ZN1806 ZN1807
price open close open close open close
2018-03-20 145 144 116 127 107 128
2018-03-21 116 143 114 103 114 148
2018-03-22 101 135 143 125 140 129
2018-03-23 106 139 100 127 116 100
2018-03-24 104 101 148 132 102 140
2018-03-25 125 141 106 136 128 134
2018-03-26 148 146 142 143 108 137
2018-03-27 110 123 128 128 124 127
2018-03-28 144 143 117 116 112 140
2018-03-29 143 114 115 105 124 118
And df2 would be:
df2
Out[364]:
contract close
2018-03-20 ZN1805 NaN
2018-03-21 ZN1807 NaN
2018-03-22 ZN1806 NaN
2018-03-23 ZN1807 NaN
2018-03-24 ZN1807 NaN
2018-03-25 ZN1806 NaN
2018-03-26 ZN1807 NaN
2018-03-27 ZN1806 NaN
2018-03-28 ZN1805 NaN
2018-03-29 ZN1807 NaN
My problem is: how do I 'pythonically' fill in the close column of df2 from df1, taking for each row the value that has the same date index and contract value?
I tried this:
from pandas import IndexSlice as idx
df2['close'] = df1.loc[df2.index, idx[df2.contract.values.tolist(), 'close']]
However, I got an error:
UnsortedIndexError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (1)'
I understand I could iterate and filter row by row, but is there a more pythonic way to do it?
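As a side note on the attempt above, here is a minimal sketch (on a smaller, made-up frame) of why sorting the columns alone would not be enough. The error was reported on the asker's pandas version, and newer releases may behave differently. Even once the columns are lexsorted, selecting with a list of contracts returns a block of columns, not one value per row.

import numpy as np
import pandas as pd

idx = pd.IndexSlice
dates = pd.date_range('2018-03-20', periods=4)
columns = pd.MultiIndex.from_product(
    [['ZN1805', 'ZN1806'], ['open', 'close']], names=['contract', 'price'])
# Illustrative values only; the original frame uses random integers
df1 = pd.DataFrame(np.arange(16).reshape(4, 4), index=dates, columns=columns)

# 'open' comes before 'close' within each contract, so the columns are not sorted
print(df1.columns.is_monotonic_increasing)  # False

# Sorting the column MultiIndex makes list-based slicing safe
sorted_df1 = df1.sort_index(axis=1)
block = sorted_df1.loc[:, idx[['ZN1805', 'ZN1806'], 'close']]

# The result is a 2-D block (one close column per contract), not a row-wise lookup
print(block.shape)  # (4, 2)

So a per-row match on (date, contract) still needs a join or an equivalent lookup.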
Use join on two columns against a Series built with xs (to select the close level) and unstack (to reshape):
s = df1.xs('close', axis=1, level=1).unstack().rename('close')
# drop(columns=...) replaces the positional axis argument, which pandas 2.0 removed
df2 = (df2.drop(columns='close')
          .reset_index()
          .join(s, on=['contract', 'index'])
          .set_index('index')
          .rename_axis(None))
print (df2)
contract close
2018-03-20 ZN1805 124
2018-03-21 ZN1805 112
2018-03-22 ZN1807 118
2018-03-23 ZN1807 136
2018-03-24 ZN1805 103
2018-03-25 ZN1805 135
2018-03-26 ZN1805 138
2018-03-27 ZN1805 109
2018-03-28 ZN1805 129
2018-03-29 ZN1805 104
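To check that this join really picks the close matching each row's date and contract, here is a small self-contained sketch that compares it with a plain row-by-row lookup. The seed, the 5-row size, and the names rng, joined and expected are assumptions made for illustration.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range('2018-03-20', periods=5)
contracts = ['ZN1805', 'ZN1806', 'ZN1807']
columns = pd.MultiIndex.from_product(
    [contracts, ['open', 'close']], names=['contract', 'price'])
df1 = pd.DataFrame(rng.integers(100, 150, (5, len(columns))),
                   index=dates, columns=columns)
df2 = pd.DataFrame({'contract': rng.choice(contracts, 5)}, index=dates)

# Long Series of close prices indexed by (contract, date)
s = df1.xs('close', axis=1, level='price').unstack().rename('close')

# Match each row of df2 on its (contract, date) pair
joined = (df2.reset_index()
             .join(s, on=['contract', 'index'])
             .set_index('index')
             .rename_axis(None))

# Reference result: look up each (date, contract) pair one at a time
expected = [df1.loc[d, (c, 'close')] for d, c in df2['contract'].items()]
print(joined['close'].tolist() == expected)  # True

The loop is only there as a cross-check; the join does the same matching in one vectorized step.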
@jezrael's answer is very good, but for anyone who is not familiar with xs (like me), I came up with a more verbose way to get s first:
# idx is pandas.IndexSlice, imported as in the question
s = df1.loc[:, idx[:, 'close']]
s.columns = s.columns.droplevel(1)
s = s.unstack().rename('close')
Of course, this three-line version doesn't look that attractive. :D Then we can get df2 in the same way:
# drop(columns=...) replaces the positional axis argument, which pandas 2.0 removed
df2 = (df2.drop(columns='close')
          .reset_index()
          .join(s, on=['contract', 'index'])
          .set_index('index')
          .rename_axis(None))
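For anyone wondering whether the two ways of building s are interchangeable, here is a minimal sketch on a small made-up frame. It sorts the columns first as a precaution and uses DataFrame.droplevel in place of assigning to s.columns. Both choices are assumptions of this sketch, not part of either answer.

import numpy as np
import pandas as pd

idx = pd.IndexSlice
dates = pd.date_range('2018-03-20', periods=3)
columns = pd.MultiIndex.from_product(
    [['ZN1805', 'ZN1806'], ['open', 'close']], names=['contract', 'price'])
# Illustrative values; sort_index(axis=1) lexsorts the column MultiIndex
df1 = pd.DataFrame(np.arange(12).reshape(3, 4), index=dates,
                   columns=columns).sort_index(axis=1)

# Route 1: xs selects the 'close' level and drops it from the columns
s_xs = df1.xs('close', axis=1, level='price').unstack().rename('close')

# Route 2: slice with IndexSlice, then drop the 'price' level explicitly
s_loc = (df1.loc[:, idx[:, 'close']]
            .droplevel('price', axis=1)
            .unstack()
            .rename('close'))

# Raises an AssertionError if the two Series differ
pd.testing.assert_series_equal(s_xs, s_loc)

Either Series can then be passed to the join shown above.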