
How to drop 1st level index and then merge the remaining index values with custom logic for a pd DataFrame?

Say I have a MultiIndex DataFrame like so:

                             price     volume
year   product   city
2010   A         LA          10        7
       B         SF          7         9 
       C         NY          7         6 
                 LA          18        21
                 SF          4         8
2011   A         LA          13        5 
       B         SF          2         4 
       C         NY          9         3
                 SF          2         0
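To make the example reproducible, here is a minimal sketch that builds a frame with the same values. Building it with `pd.MultiIndex.from_tuples` is an assumption; the post does not say how the data was actually created.

```python
import pandas as pd

# Row labels in the same order as the table above: (year, product, city).
index = pd.MultiIndex.from_tuples(
    [
        (2010, "A", "LA"),
        (2010, "B", "SF"),
        (2010, "C", "NY"),
        (2010, "C", "LA"),
        (2010, "C", "SF"),
        (2011, "A", "LA"),
        (2011, "B", "SF"),
        (2011, "C", "NY"),
        (2011, "C", "SF"),
    ],
    names=["year", "product", "city"],
)

# Column values copied from the table above.
df = pd.DataFrame(
    {
        "price": [10, 7, 7, 18, 4, 13, 2, 9, 2],
        "volume": [7, 9, 6, 21, 8, 5, 4, 3, 0],
    },
    index=index,
)
```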

I want to do a somewhat complex merge: drop the first level of the DataFrame index (year), then merge the duplicate entries that appear under the new first level (product) according to some custom logic. In this case, the price column should take its values from the 2010 outer index and the volume column should take its values from the 2011 outer index, but I would like a general solution that can be applied to more columns should they exist.

The final DataFrame would look like this: the price values come from the 2010 index, the volume values come from the 2011 index, and missing values are filled with NaN.

                      price     volume
product   city
A         LA          10        5
B         SF          7         4 
C         NY          7         3 
          LA          18        NaN
          SF          4         0

You can select each year from the first level with DataFrame.xs and then combine the results with concat:

df = pd.concat([df.xs(2010)['price'], df.xs(2011)['volume']], axis=1)

It is also possible to use loc:

df = pd.concat([df.loc[2010, 'price'], df.loc[2011, 'volume']], axis=1)

print (df)
              price  volume
product city               
A       LA       10     5.0
B       SF        7     4.0
C       LA       18     NaN
        NY        7     3.0
        SF        4     0.0
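The question also asks for something that extends to more columns. One way to sketch that is to drive the same `xs` + `concat` idea from a mapping of column name to source year. The helper name `merge_by_year` and the `source_year` mapping below are hypothetical, introduced only for illustration, and the sketch assumes each (product, city) pair appears at most once per year.

```python
import pandas as pd


def merge_by_year(df, source_year, level="year"):
    """Take each column from the year named in source_year (hypothetical
    mapping of column -> year) and align the pieces on the remaining levels."""
    pieces = [
        # xs drops the selected level, leaving (product, city) as the index.
        df.xs(year, level=level)[col]
        for col, year in source_year.items()
    ]
    # concat with axis=1 aligns on the index; missing pairs become NaN.
    return pd.concat(pieces, axis=1)


# Same sample data as in the question.
index = pd.MultiIndex.from_tuples(
    [
        (2010, "A", "LA"), (2010, "B", "SF"), (2010, "C", "NY"),
        (2010, "C", "LA"), (2010, "C", "SF"),
        (2011, "A", "LA"), (2011, "B", "SF"), (2011, "C", "NY"),
        (2011, "C", "SF"),
    ],
    names=["year", "product", "city"],
)
df = pd.DataFrame(
    {
        "price": [10, 7, 7, 18, 4, 13, 2, 9, 2],
        "volume": [7, 9, 6, 21, 8, 5, 4, 3, 0],
    },
    index=index,
)

result = merge_by_year(df, {"price": 2010, "volume": 2011})
```

Adding another column would then only need one more entry in the mapping, for example `{"price": 2010, "volume": 2011, "some_other_col": 2010}`, where `some_other_col` stands in for whatever column exists in your data.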


