Compare values in multiple columns and add a new value in another column in Python

Question

I have house rental price data as follows:

import pandas as pd
import numpy as np
data = {
    "HouseName": ["A", "A", "B", "B", "B"],
    "Type": ["OneRoom", "TwoRooms", "OneRoom", "TwoRooms", "ThreeRooms"],
    "Jan_S": [1100, 1776, 1228, 1640, np.nan],
    "Feb_S": [1000, 1805, 1231, 1425, 1800],
    "Mar_S": [1033, 1748, 1315, 1591, 2900],
    "Jan_L": [1005, np.nan, 1300, np.nan, 7000]
}
df = pd.DataFrame.from_dict(data)
print(df)

  HouseName        Type   Jan_S  Feb_S  Mar_S   Jan_L 
0         A     OneRoom  1100.0   1000   1033  1005.0 
1         A    TwoRooms  1776.0   1805   1748     NaN 
2         B     OneRoom  1228.0   1231   1315  1300.0 
3         B    TwoRooms  1640.0   1425   1591     NaN 
4         B  ThreeRooms     NaN   1800   2900  7000.0

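I need to accomplish two things. First, I want to find a reasonable January rental price based on the columns "Jan_S", "Feb_S", "Mar_S", and "Jan_L". Here S and L denote two different data sources; both may contain outliers and NaNs, but data from S takes priority as the final January price. Second, for the same HouseName, I need to check and make sure that the one-room price is lower than the two-room price, and the two-room price is lower than the three-room price. My final result should look like this: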

  HouseName        Type   Jan_S  Feb_S  Mar_S   Jan_L  Result(Jan)
0         A     OneRoom  1100.0   1000   1033  1005.0         1100
1         A    TwoRooms  1776.0   1805   1748     NaN         1776
2         B     OneRoom  1228.0   1231   1315  1300.0         1228
3         B    TwoRooms  1640.0   1425   1591     NaN         1640
4         B  ThreeRooms     NaN   1800   2900  7000.0         1800
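The second requirement (within one HouseName, one room cheaper than two rooms, two rooms cheaper than three) could be checked roughly as follows. This is only a sketch: `ROOM_ORDER` and `prices_increase_with_rooms` are hypothetical names, and it assumes the final price sits in a column called `Result(Jan)` and that `Type` only takes the three values shown above.

```python
import pandas as pd

# Hypothetical ranking of room types; assumes only these three values occur.
ROOM_ORDER = {"OneRoom": 1, "TwoRooms": 2, "ThreeRooms": 3}


def prices_increase_with_rooms(df, price_col="Result(Jan)"):
    """Return one boolean per HouseName: True if prices strictly rise with room count."""
    ranked = df.assign(_rank=df["Type"].map(ROOM_ORDER)).sort_values(
        ["HouseName", "_rank"]
    )
    # diff() compares each price with the previous (smaller) room type;
    # comparisons involving a missing price become NaN and are dropped.
    return ranked.groupby("HouseName")[price_col].apply(
        lambda s: bool(s.diff().dropna().gt(0).all())
    )


result_df = pd.DataFrame({
    "HouseName": ["A", "A", "B", "B", "B"],
    "Type": ["OneRoom", "TwoRooms", "OneRoom", "TwoRooms", "ThreeRooms"],
    "Result(Jan)": [1100, 1776, 1228, 1640, 1800],
})
ordering_ok = prices_increase_with_rooms(result_df)
print(ordering_ok)
```

Houses that come back False are the ones whose prices would need another look; how to repair them is a separate decision that this check does not make.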

My idea is to check whether Jan_S is within 0.95 to 1.05 times Jan_L. If it is, take Jan_S as the final result; otherwise, go on to check the value from Feb_S as a substitute for Jan_S.
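The idea above might be sketched like this. The function name `pick_january_price` and the `tol` parameter are hypothetical, and two behaviours are my assumptions rather than something stated in the question: Jan_S is kept when Jan_L is missing, and only Feb_S is used as the fallback.

```python
import numpy as np
import pandas as pd


def pick_january_price(df, tol=0.05):
    """Keep Jan_S when it is within +/- tol of Jan_L, otherwise fall back to Feb_S."""
    jan_s, jan_l = df["Jan_S"], df["Jan_L"]
    # NaN ratios (either source missing) evaluate to False here.
    within = (jan_s / jan_l).between(1 - tol, 1 + tol)
    # Assumption: with no Jan_L to compare against, Jan_S is trusted as-is.
    keep_s = jan_s.notna() & (jan_l.isna() | within)
    return jan_s.where(keep_s, df["Feb_S"])


prices = pd.DataFrame({
    "Jan_S": [1100, 1776, 1228, 1640, np.nan],
    "Feb_S": [1000, 1805, 1231, 1425, 1800],
    "Jan_L": [1005, np.nan, 1300, np.nan, 7000],
})
strict = pick_january_price(prices, tol=0.05)
loose = pick_january_price(prices, tol=0.10)
print(pd.DataFrame({"tol=0.05": strict, "tol=0.10": loose}))
```

On this sample the 5% band rejects Jan_S in rows 0 and 2 (the ratios are about 1.09 and 0.94), so those rows take the Feb_S values instead. A 10% band keeps them and gives the same numbers as the desired result table, so the tolerance probably needs tuning against real data.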

Please share any ideas you have for handling this in Python. Thanks! Here are some references that may help.

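Find the nearest value from multiple columns and add it to a new column in Python

Compare values under multiple conditions of one column in Python

Check whether the values in one column fall within the interval values of another column in Python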

Answer 1

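You can use fillna for this.

If you want to put conditions on which columns are selected, you need to work out the logic for filtering the columns to pick values from.

Here I show the logic using the minimum over all the other price columns.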

# filter out the price columns
price_cols = df.columns[~df.columns.isin(['HouseName','Type', 'Jan_S'])]

# then figure out the logic to filter the columns you need and use fillna
# here with the min of all columns as example
# DataFrame.min(axis=1) takes the row-wise minimum and skips NaN values
df['Jan_S'] = df['Jan_S'].fillna(df[price_cols].min(axis=1))
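If the fallback should follow a fixed priority instead of the row minimum, one possible variation is to chain fillna over the columns in order. The order Feb_S, Mar_S, Jan_L below is an assumption on my part (the question only says S is preferred over L), and the result goes into a new `Result(Jan)` column so that Jan_S is left untouched.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "HouseName": ["A", "A", "B", "B", "B"],
    "Type": ["OneRoom", "TwoRooms", "OneRoom", "TwoRooms", "ThreeRooms"],
    "Jan_S": [1100, 1776, 1228, 1640, np.nan],
    "Feb_S": [1000, 1805, 1231, 1425, 1800],
    "Mar_S": [1033, 1748, 1315, 1591, 2900],
    "Jan_L": [1005, np.nan, 1300, np.nan, 7000],
})

# Assumed priority: S columns first (nearest month first), then the L source.
fallback_cols = ["Feb_S", "Mar_S", "Jan_L"]

result = df["Jan_S"]
for col in fallback_cols:
    # Only rows that are still NaN are filled from the next column.
    result = result.fillna(df[col])

df["Result(Jan)"] = result
print(df[["HouseName", "Type", "Result(Jan)"]])
```

This only handles missing values; it does not detect outliers, so it would still need something like a tolerance check against Jan_L on top.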

Solution 1
1 Accepted 2018-12-30 05:54:50