How can I subtract a column between two elements of a MultiIndex?
I have a panel data frame whose dimensions are stored in a MultiIndex: country and year. For each country in level 0 of the index, I want to subtract the United States' value of some variable (or divide by it).
In pseudocode:
for country in countries_in_level0:
    Data['new_variable'][country] = Data['variable'][country] - Data['variable']['United States']
What I tried to do is:
Data['new_variable'] = Data['variable'] - Data['variable'].loc['United States', :]
But I get NaN for every country except the United States.
If you need to subtract the "United States" rows from the entire DataFrame, you can use xs:
Data - Data.xs(("United States"))
Here is an example:
import numpy as np
import pandas as pd

arrays = [['United States', 'United States', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          [1, 2, 1, 2, 1, 2, 1, 2]]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['country', 'year'])
# Random values, so the numbers will differ from run to run
s = pd.DataFrame(10 + (10 * np.random.randn(8, 2)), index=index)
s
                            0          1
country       year
United States 1      8.012399  13.287124
              2     13.357553  -4.295128
baz           1     20.305391  12.381340
              2      0.070968   6.314961
foo           1     25.015921 -11.577952
              2      1.301654   6.000196
qux           1      4.198554  -6.915449
              2      5.071788  12.423901
s - s.xs(('United States'))
                            0          1
country       year
United States 1      0.000000   0.000000
              2      0.000000   0.000000
baz           1     12.292992  -0.905785
              2    -13.286585  10.610089
foo           1     17.003522 -24.865077
              2    -12.055899  10.295324
qux           1     -3.813845 -20.202574
              2     -8.285765  16.719028
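The same idea can be applied to the single column from the question. The sketch below is one possible way to do it, not the only one: it spells out the alignment by passing level='year' to Series.sub and Series.div, so the United States values are matched on year and broadcast across countries. The column names variable and new_variable come from the question; the sample values and the ratio_to_us column are made up for illustration.

```python
import pandas as pd

# Hypothetical panel: the values are invented for illustration only.
index = pd.MultiIndex.from_product(
    [['United States', 'baz', 'foo'], [1, 2]],
    names=['country', 'year'],
)
Data = pd.DataFrame(
    {'variable': [10.0, 20.0, 13.0, 18.0, 5.0, 40.0]},
    index=index,
)

# United States values, indexed by year only (the country level is dropped).
us = Data['variable'].xs('United States', level='country')

# Match on the 'year' level and broadcast across all countries.
Data['new_variable'] = Data['variable'].sub(us, level='year')
Data['ratio_to_us'] = Data['variable'].div(us, level='year')  # hypothetical column name

print(Data)
```

Naming the level explicitly makes the intent visible and does not depend on pandas inferring the shared level from the index names.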
PS: If the question was how to reassign only the United States rows, it is:
s.loc[['United States']]=1000