
How do I create a dataframe from another dataframe with only the last non-negative values for each value column?

I have a multi-indexed dataframe like so:

         year  value  value2  value3  some_other_column_i_dont_care_about
one two
a   t    2000      0       1       7  aaa
    w    2001      3      -1       4  bbb
    t    2002     -2       1      -3  ccc
b   t    2000      4       3       6  ddd
    w    2001      7       5      -1  eee
    t    2002     -8      -3       3  fff
c   t    2000     11      10       3  ggg
    w    2001    -12      -9      -1  hhh
    t    2002    -15      -6      -5  iii

How do I create a new, single-level df that has just the latest (by year) non-negative value in each value column, like so:

            value  value2 value3
 one    
 a            3      1     4
 b            7      5     3
 c           11     10     3  

One option is to melt, use query to keep values >= 0, then use pivot_table with aggfunc='last' to get back to wide format:

new_df = (
    df.reset_index('one')
        .melt(id_vars='one',
              value_vars=['value', 'value2', 'value3'],
              value_name='values')
        .query('values >= 0')
        .pivot_table(index='one', columns='variable', aggfunc='last')
        .droplevel(0, axis=1)
        .rename_axis(index=None, columns=None)
)
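To see why aggfunc='last' picks the right value, it can help to look at the intermediate long format. The following is a minimal sketch that rebuilds the question's table and stops after melt and query; the names long_df and non_negative are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'one': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'two': ['t', 'w', 't', 't', 'w', 't', 't', 'w', 't'],
    'year': [2000, 2001, 2002, 2000, 2001, 2002, 2000, 2001, 2002],
    'value': [0, 3, -2, 4, 7, -8, 11, -12, -15],
    'value2': [1, -1, 1, 3, 5, -3, 10, -9, -6],
    'value3': [7, 4, -3, 6, -1, 3, 3, -1, -5],
}).set_index(['one', 'two'])

# Step 1: long format, one row per (one, variable) observation.
long_df = df.reset_index('one').melt(
    id_vars='one',
    value_vars=['value', 'value2', 'value3'],
    value_name='values',
)

# Step 2: drop the negative observations.
non_negative = long_df.query('values >= 0')

# Rows that survive for group 'a', still in their original row order.
group_a = non_negative[non_negative['one'] == 'a']
print(group_a)
```

Within each (one, variable) pair the surviving rows keep the order they had in df, so taking the last one gives the most recent non-negative value as long as df is ordered by year.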

Alternatively, after melt and query, use groupby with last to keep the last value from each group, then unstack:

new_df = (
    df.reset_index('one')
        .melt(id_vars='one',
              value_vars=['value', 'value2', 'value3'],
              value_name='values')
        .query('values >= 0')
        .groupby(['one', 'variable'])
        .last()
        .unstack()
        .droplevel(0, axis=1)
        .rename_axis(index=None, columns=None)
)

new_df:

   value  value2  value3
a      3       1       4
b      7       5       3
c     11      10       3
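As a quick sanity check, the pivot_table chain and the groupby chain can be compared directly. This sketch assumes the sample data from the question; the names melt_non_negative, via_pivot and via_groupby are made up for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'one': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'two': ['t', 'w', 't', 't', 'w', 't', 't', 'w', 't'],
    'year': [2000, 2001, 2002, 2000, 2001, 2002, 2000, 2001, 2002],
    'value': [0, 3, -2, 4, 7, -8, 11, -12, -15],
    'value2': [1, -1, 1, 3, 5, -3, 10, -9, -6],
    'value3': [7, 4, -3, 6, -1, 3, 3, -1, -5],
}).set_index(['one', 'two'])


def melt_non_negative(frame):
    """Hypothetical helper: the shared melt + query prefix of both chains."""
    return (
        frame.reset_index('one')
        .melt(id_vars='one',
              value_vars=['value', 'value2', 'value3'],
              value_name='values')
        .query('values >= 0')
    )


# Chain 1: pivot_table with aggfunc='last'.
via_pivot = (
    melt_non_negative(df)
    .pivot_table(index='one', columns='variable', aggfunc='last')
    .droplevel(0, axis=1)
    .rename_axis(index=None, columns=None)
)

# Chain 2: groupby + last + unstack.
via_groupby = (
    melt_non_negative(df)
    .groupby(['one', 'variable'])
    .last()
    .unstack()
    .droplevel(0, axis=1)
    .rename_axis(index=None, columns=None)
)

# Raises AssertionError if the two frames differ in labels or values.
pd.testing.assert_frame_equal(via_pivot, via_groupby, check_dtype=False)
print(via_pivot)
```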

If the years are not guaranteed to be in ascending order, chain sort_values before melt:

new_df = (
    df.reset_index('one')
        .sort_values('year')  # Sort By Year
        .melt(id_vars='one',
              value_vars=['value', 'value2', 'value3'],
              value_name='values')
        .query('values >= 0')
        .pivot_table(index='one', columns='variable', aggfunc='last')
        .droplevel(0, axis=1)
        .rename_axis(index=None, columns=None)
)
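The effect of the sort is easiest to see on data that is deliberately out of order. The sketch below reverses the question's rows and runs the chain with and without sort_values; latest_non_negative and its sort_by_year flag are hypothetical names introduced only for this illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'one': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'two': ['t', 'w', 't', 't', 'w', 't', 't', 'w', 't'],
    'year': [2000, 2001, 2002, 2000, 2001, 2002, 2000, 2001, 2002],
    'value': [0, 3, -2, 4, 7, -8, 11, -12, -15],
    'value2': [1, -1, 1, 3, 5, -3, 10, -9, -6],
    'value3': [7, 4, -3, 6, -1, 3, 3, -1, -5],
}).set_index(['one', 'two'])

# Reverse the rows so the years are descending within each group.
reversed_df = df.iloc[::-1]


def latest_non_negative(frame, sort_by_year):
    """Hypothetical helper wrapping the melt/query/pivot_table chain."""
    frame = frame.reset_index('one')
    if sort_by_year:
        frame = frame.sort_values('year')
    return (
        frame.melt(id_vars='one',
                   value_vars=['value', 'value2', 'value3'],
                   value_name='values')
        .query('values >= 0')
        .pivot_table(index='one', columns='variable', aggfunc='last')
        .droplevel(0, axis=1)
        .rename_axis(index=None, columns=None)
    )


unsorted_result = latest_non_negative(reversed_df, sort_by_year=False)
sorted_result = latest_non_negative(reversed_df, sort_by_year=True)

print(unsorted_result.loc['a', 'value'])  # 0, taken from year 2000
print(sorted_result.loc['a', 'value'])    # 3, taken from year 2001
```

Without the sort, "last" means last in row order, which on the reversed data is the earliest year rather than the latest.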

Complete working example:

import pandas as pd

df = pd.DataFrame({
    'one': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'two': ['t', 'w', 't', 't', 'w', 't', 't', 'w', 't'],
    'year': [2000, 2001, 2002, 2000, 2001, 2002, 2000, 2001, 2002],
    'value': [0, 3, -2, 4, 7, -8, 11, -12, -15],
    'value2': [1, -1, 1, 3, 5, -3, 10, -9, -6],
    'value3': [3, 4, -3, -1, 3, -2, 7, -9, 3],
    'some_other_column_i_dont_care_about': ['aaa', 'bbb', 'ccc', 'ddd', 'eee',
                                            'fff', 'ggg', 'hhh', 'iii']
}).set_index(['one', 'two'])

new_df = (
    df.reset_index('one')
        .melt(id_vars='one',
              value_vars=['value', 'value2', 'value3'],
              value_name='values')
        .query('values >= 0')
        .pivot_table(index='one', columns='variable', aggfunc='last')
        .droplevel(0, axis=1)
        .rename_axis(index=None, columns=None)
)

print(new_df)
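For comparison, here is a sketch of a different technique that skips the reshape entirely. It is not part of the answer above: it masks negative values with where and relies on groupby last skipping missing values by default. The final astype(int) assumes every group has at least one non-negative value per column; if that does not hold, the cast would fail and should be dropped. The names value_cols and alt_df are illustrative.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'one': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'two': ['t', 'w', 't', 't', 'w', 't', 't', 'w', 't'],
    'year': [2000, 2001, 2002, 2000, 2001, 2002, 2000, 2001, 2002],
    'value': [0, 3, -2, 4, 7, -8, 11, -12, -15],
    'value2': [1, -1, 1, 3, 5, -3, 10, -9, -6],
    'value3': [7, 4, -3, 6, -1, 3, 3, -1, -5],
}).set_index(['one', 'two'])

value_cols = ['value', 'value2', 'value3']

alt_df = (
    df[value_cols]
    .where(df[value_cols] >= 0)   # negative values become NaN
    .groupby(level='one')
    .last()                       # last non-missing entry per column
    .astype(int)                  # assumes no group is entirely negative
    .rename_axis(index=None)
)

print(alt_df)
```

As with the chains above, this relies on the rows already being in year order within each group.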
