
Pandas: Creating a list based on the differences between 2 series

I am writing a custom error message for when two Pandas series are not equal, and I want to use '<' to point at the differences.

Here's the workflow for a failed equality check:

  1. Convert both lists to Pandas series: pd.Series(list)
  2. Side-by-side comparison in a dataframe: table = pd.concat([series1, series2], axis=1)
  3. Add column and index names: table.columns = ['...', '...'], table.index = ['...', '...']
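The three steps above might look roughly like the sketch below. The names yours, actual and table, the sample values (taken from the tables in this post) and the index labels are placeholders, not the real code.

```python
import pandas as pd

# Step 1: convert both lists to Series (sample values are placeholders)
yours = pd.Series([1, 2, 4])
actual = pd.Series([1, 2, 3])

# Step 2: put the two Series side by side in a DataFrame
table = pd.concat([yours, actual], axis=1)

# Step 3: name the columns and the index (index labels are hypothetical)
table.columns = ['Yours', 'Actual']
table.index = ['row 1', 'row 2', 'row 3']

print(table)
```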

Current output:

|Yours|Actual|
|1|1|
|2|2|
|4|3|

Desired output:

|Yours|Actual|-|
|1|1||
|2|2||
|4|3|<|

The naive solution is to iterate over each list index, append '<' to another list whenever the values are not equal, and then pass that list to pd.concat(). However, I am looking for a method that uses Pandas. For example,

error_series = '<' if (abs(yours - actual) >= 1).all(axis=None) else ''

Ideally it would append '<' to a list if the difference between the results is greater than the margin of error of 1, and otherwise append nothing.

Note: I removed the tables because StackOverflow was picky and would not let me post my question.

You can create the DataFrame and name its columns in one line:

import pandas as pd
list1 = [1,2,4]
list2 = [1,2,10]
df = pd.DataFrame(zip(list1, list2), columns=['Yours', 'Actual'])

Create a boolean mask to find the rows where the difference is too large:

margin_of_error = 1
mask = df.diff(axis=1)['Actual'].abs()>margin_of_error
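For reference, df.diff(axis=1) subtracts each column from the one to its right, so its 'Actual' column holds Actual - Yours. A small sketch, using the same sample data as this answer, that checks the mask against a plain column subtraction (the names via_diff and direct are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Yours': [1, 2, 4], 'Actual': [1, 2, 10]})
margin_of_error = 1

# Mask built as in the answer: column-wise diff, then absolute value
via_diff = df.diff(axis=1)['Actual'].abs() > margin_of_error

# Same comparison written as an explicit subtraction of the two columns
direct = (df['Actual'] - df['Yours']).abs() > margin_of_error

print(via_diff.tolist())
print(direct.tolist())
```

Both forms flag only the last row here; the explicit subtraction may be easier to read when there are just two columns.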

Add a column to the DataFrame and map the mask values to whatever markers you want:

# Map the boolean mask to the marker strings ('<' where the difference is too large)
df['too_different'] = mask.map({True: '<', False: ''})

Output:

   Yours  Actual too_different
0      1       1              
1      2       2              
2      4      10             <

Or you can do something like this:

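import pandas as pd

# Sample data inferred from the output shown below
df = pd.DataFrame({'yours': [1, 2, 4], 'actual': [1, 2, 3]})

# Row by row: mark '<' when the absolute difference is at least 1
df = df.assign(diffr=df.apply(lambda x: '<'
                              if (abs(x['yours'] - x['actual']) >= 1)
                              else '', axis=1))
print(df)
'''
   yours  actual diffr
0      1       1      
1      2       2      
2      4       3     <
'''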
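Since apply with axis=1 calls a Python function once per row, a vectorized variant may be preferable for long series. The sketch below uses numpy.where directly on two Series, which is closer to the workflow in the question. The names marker and table are illustrative, and the threshold of 1 follows the code above.

```python
import numpy as np
import pandas as pd

yours = pd.Series([1, 2, 4], name='yours')
actual = pd.Series([1, 2, 3], name='actual')

# '<' where the absolute difference is at least 1, empty string elsewhere
marker = pd.Series(
    np.where((yours - actual).abs() >= 1, '<', ''),
    index=yours.index,
    name='diffr',
)

# Place the marker column next to the two compared Series
table = pd.concat([yours, actual, marker], axis=1)
print(table)
```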
