
How to rearrange rows in a dataframe and obtain new columns holding the percentage difference of 2 other columns in pandas?

I have a dataframe as shown below:

Case    Peak 'A'    Peak 'B'    Volume 'C'  Volume 'D'
 1       5.00       4.00         0.34         0.32
 2       5.70       6.00         0.14         0.15
 3       11.00      20.00        0.42         0.50
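
For anyone who wants to reproduce this, the sample above could be built roughly as follows. This is only a sketch: it assumes the quotes are literally part of the column names, as the table suggests.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: the quotes are part of the column names.
df = pd.DataFrame({
    "Case": [1, 2, 3],
    "Peak 'A'": [5.00, 5.70, 11.00],
    "Peak 'B'": [4.00, 6.00, 20.00],
    "Volume 'C'": [0.34, 0.14, 0.42],
    "Volume 'D'": [0.32, 0.15, 0.50],
})
print(df)
```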

The expected output is shown below:

(image of the expected output)

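where:

A 'Diff peak' column needs to be added, which is the percentage difference, i.e. ([(B-A)/B] * 100).

A 'Diff Vol' column is to be added, which is the percentage difference ([(D-C)/D] * 100).

A 'Within Range' column is to be added for 'Peak': if 'Diff peak' is in the range of -15% to 25%, the column has to be populated with Yes. If not, No, as shown in the picture.

Similarly, a 'Within Range' column is to be populated for 'Volume', based on whether 'Diff Vol' is in the range of -10% to 20%.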
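
To make the arithmetic concrete, here is a small plain-Python check of the two rules on Case 1. The helper names `pct_diff` and `within` are made up for illustration, and the range bounds are assumed to be inclusive.

```python
def pct_diff(a, b):
    """Return (b - a) / b * 100, the formula used for both 'Diff' columns."""
    return (b - a) / b * 100


def within(value, low, high):
    """Return 'Yes' if low <= value <= high, else 'No' (bounds assumed inclusive)."""
    return "Yes" if low <= value <= high else "No"


# Case 1: Peak 'A' = 5.00, Peak 'B' = 4.00
peak = pct_diff(5.00, 4.00)
print(peak, within(peak, -15, 25))

# Case 1: Volume 'C' = 0.34, Volume 'D' = 0.32
vol = pct_diff(0.34, 0.32)
print(round(vol, 2), within(vol, -10, 20))
```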

How can I do this?

Just create new columns:

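import numpy as np
# the column names contain quotes, so use bracket indexing rather than attribute access
df['Diff Peak'] = (df["Peak 'B'"] - df["Peak 'A'"]) / df["Peak 'B'"] * 100
df['Diff Vol'] = (df["Volume 'D'"] - df["Volume 'C'"]) / df["Volume 'D'"] * 100
# boolean flags: True when the difference lies inside the inclusive range
df['Within Range Peak'] = np.logical_and(df['Diff Peak'] >= -15.0, df['Diff Peak'] <= 25.0)
df['Within Range Vol'] = np.logical_and(df['Diff Vol'] >= -10.0, df['Diff Vol'] <= 20.0)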
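
Note that this produces True/False flags, while the question asks for Yes/No. One possible way to convert them is sketched below; the `flags` frame is a hypothetical stand-in for the two boolean columns, not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the two boolean columns created above.
flags = pd.DataFrame({
    "Within Range Peak": [False, True, False],
    "Within Range Vol": [True, True, True],
})

# Map each boolean to the 'Yes'/'No' label, column by column.
labels = flags.apply(lambda col: col.map({True: "Yes", False: "No"}))
print(labels)
```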

If you do not need a MultiIndex in the columns, you can use:

import numpy as np

#use formulas
df['Diff Peak'] = df["Peak 'B'"].sub(df["Peak 'A'"]).div(df["Peak 'B'"]).mul(100)
df['Diff Vol'] = df["Volume 'D'"].sub(df["Volume 'C'"]).div(df["Volume 'D'"]).mul(100)
#check range, then add Yes or No
df['Peak Within Range'] = np.where(df['Diff Peak'].between(-15, 25), 'Yes', 'No')
df['Volume Within Range'] = np.where(df['Diff Vol'].between(-10, 20), 'Yes', 'No')
#convert to string, rounding (if necessary), add %
df['Diff Peak'] = df['Diff Peak'].round(2).astype(str) + '%'
df['Diff Vol'] = df['Diff Vol'].round(2).astype(str) + '%'
print (df)
   Case  Peak 'A'  Peak 'B'  Volume 'C'  Volume 'D' Diff Peak Diff Vol  \
0     1       5.0       4.0        0.34        0.32    -25.0%   -6.25%   
1     2       5.7       6.0        0.14        0.15      5.0%    6.67%   
2     3      11.0      20.0        0.42        0.50     45.0%    16.0%   

  Peak Within Range Volume Within Range  
0                No                 Yes  
1               Yes                 Yes  
2                No                 Yes  
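
One thing to be aware of: after the conversion to strings, the 'Diff' columns can no longer be used in numeric comparisons. If that matters, a possible variation is to keep the numbers and build the percent strings only for display. The sketch below uses a standalone series holding the 'Diff Peak' values from the output above; the names `diff`, `in_range` and `shown` are illustrative only.

```python
import pandas as pd

# 'Diff Peak' values taken from the output above.
diff = pd.Series([-25.0, 5.0, 45.0], name="Diff Peak")

# Keep the numeric series for calculations; between() includes both bounds by default.
in_range = diff.between(-15, 25)

# Build the percent strings separately, for display only.
shown = diff.map("{:.2f}%".format)
print(shown.tolist())
```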

But if you do need a MultiIndex in the columns:

import pandas as pd  # needed for pd.concat further down

df = df.set_index('Case')
df['Peak Diff peak'] = df["Peak 'B'"].sub(df["Peak 'A'"]).div(df["Peak 'B'"]).mul(100)
df['Volume Diff Vol'] = df["Volume 'D'"].sub(df["Volume 'C'"]).div(df["Volume 'D'"]).mul(100)
df['Peak Within Range'] = np.where(df['Peak Diff peak'].between(-15, 25), 'Yes', 'No')
df['Volume Within Range'] = np.where(df['Volume Diff Vol'].between(-10, 20), 'Yes', 'No')
df['Peak Diff peak'] = df['Peak Diff peak'].round(2).astype(str) + '%'
df['Volume Diff Vol'] = df['Volume Diff Vol'].round(2).astype(str) + '%'

#filter columns start with Peak
df1 = df.filter(regex='^Peak')
#rename parts of columns 
df1.columns = df1.columns.str.replace('Peak', 'Peak (+25% to -15%)_')
#create MultiIndex
df1.columns = df1.columns.str.split('_ ', expand=True)
print (df1)
     Peak (+25% to -15%)                             
                     'A'   'B' Diff peak Within Range
Case                                                 
1                    5.0   4.0    -25.0%           No
2                    5.7   6.0      5.0%          Yes
3                   11.0  20.0     45.0%           No

#same as df1, only Volume    
df2 = df.filter(regex='^Volume')
df2.columns = df2.columns.str.replace('Volume', 'Volume (+20% to -10%)_')
df2.columns = df2.columns.str.split('_ ', expand=True)
print (df2)
     Volume (+20% to -10%)                            
                       'C'   'D' Diff Vol Within Range
Case                                                  
1                     0.34  0.32   -6.25%          Yes
2                     0.14  0.15    6.67%          Yes
3                     0.42  0.50    16.0%          Yes
#concat both dataframes to one
df3 = pd.concat([df1, df2], axis=1).reset_index()
print (df3)
  Case Peak (+25% to -15%)                              Volume (+20% to -10%)  \
                       'A'   'B' Diff peak Within Range                   'C'   
0    1                 5.0   4.0    -25.0%           No                  0.34   
1    2                 5.7   6.0      5.0%          Yes                  0.14   
2    3                11.0  20.0     45.0%           No                  0.42   


    'D' Diff Vol Within Range  
0  0.32   -6.25%          Yes  
1  0.15    6.67%          Yes  
2  0.50    16.0%          Yes  
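
As a possible alternative to the rename-and-split step, `pd.concat` can build the top column level directly when it is given a dict of frames. This is a sketch with hypothetical per-group frames (`peak`, `volume`) that already carry the short second-level names; it is not the approach used in the answer above.

```python
import pandas as pd

# Hypothetical per-group frames, indexed by Case, with short second-level names.
cases = pd.Index([1, 2, 3], name="Case")
peak = pd.DataFrame({"'A'": [5.0, 5.7, 11.0], "'B'": [4.0, 6.0, 20.0]}, index=cases)
volume = pd.DataFrame({"'C'": [0.34, 0.14, 0.42], "'D'": [0.32, 0.15, 0.50]}, index=cases)

# The dict keys become the top level of the column MultiIndex.
combined = pd.concat(
    {"Peak (+25% to -15%)": peak, "Volume (+20% to -10%)": volume},
    axis=1,
)
print(combined)
```

Calling `combined.reset_index()` afterwards would turn Case back into a regular column, as in the output above.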

