
Compare whether a value in one column is between two values in another column (Python pandas)

I have two data frames as follows:

A = pd.DataFrame({"value":[3, 7, 5 ,18,23,27,21,29]})

B = pd.DataFrame({"low":[1, 6, 11 ,16,21,26], "high":[5,10,15,20,25,30], "name":["one","two","three","four","five", "six"]})

I want to find whether "value" in A is between "low" and "high" in B, and if so, I want to copy the matching value of the "name" column from B to A.

The output should look like this:

A = pd.DataFrame({"value":[3, 7, 5 ,18,23,27,21,29], "name":["one","two","one","four","five", "six", "five", "six"]})

My function uses iterrows as follows:

def func1(row):
    x = row['value']
    for index,value in B.iterrows():
        if ((value['low'] <= x) &(x<=value['high'])):
            return value['name']

But it doesn't yet achieve what I want to do.

Thank you.

You can use a list comprehension to iterate through the values in A, and then use loc to get the relevant mapped values. le is less than or equal to, and ge is greater than or equal to.

For example, v = 3 in the first row. Using simple boolean indexing:

>>> B[(B['low'].le(v)) & (B['high'].ge(v))]
   high  low name
0     5    1  one

Assuming that DataFrame B does not have any overlapping ranges, you will get back one row, as above. You then use loc to get the name column, as below. Because each returned result is a Series, you need to extract the first and only scalar value (using iat, for example).

A['name'] = [B.loc[(B['low'].le(v)) & (B['high'].ge(v)), 'name'].iat[0] 
             for v in A['value']]

>>> A
   value  name
0      3   one
1      7   two
2      5   one
3     18  four
4     23  five
5     27   six
6     21  five
7     29   six
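
One caveat with the list comprehension above: if a value falls in none of the ranges, the selection is empty and iat[0] raises an IndexError. The following is a minimal sketch of one way to guard against that, not part of the original answer. The helper name lookup_name and the extra value 42 are assumptions added purely for illustration.

```python
import pandas as pd

# Same data as in the question, plus 42 (illustrative), which matches no range.
A = pd.DataFrame({"value": [3, 7, 5, 18, 23, 27, 21, 29, 42]})
B = pd.DataFrame({"low": [1, 6, 11, 16, 21, 26],
                  "high": [5, 10, 15, 20, 25, 30],
                  "name": ["one", "two", "three", "four", "five", "six"]})


def lookup_name(v):
    """Hypothetical helper: name of the range containing v, or None if there is none."""
    matches = B.loc[B["low"].le(v) & B["high"].ge(v), "name"]
    # An empty selection means v lies outside every [low, high] range.
    return matches.iat[0] if not matches.empty else None


A["name"] = [lookup_name(v) for v in A["value"]]
```

With this guard, rows without a matching range end up with a missing value in name instead of stopping the whole assignment.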

I believe you are looking for something like this:

In [1]: import pandas as pd

In [2]: A = pd.DataFrame({"value":[3, 7, 5 ,18,23,27,21,29]})

In [3]: B = pd.DataFrame({"low":[1, 6, 11 ,16,21,26], "high":[5,10,15,20,25,30], "name":["one","two","three","four","five", "six"]})

In [4]: A
Out[4]: 
   value
0      3
1      7
2      5
3     18
4     23
5     27
6     21
7     29

In [5]: B
Out[5]: 
   high  low   name
0     5    1    one
1    10    6    two
2    15   11  three
3    20   16   four
4    25   21   five
5    30   26    six

In [6]: def func1(x):
   ...:     for row in B.itertuples():
   ...:         if row.low <= x <= row.high:
   ...:             return row.name
   ...:         

In [7]: A.value.map(func1)
Out[7]: 
0     one
1     two
2     one
3    four
4    five
5     six
6    five
7     six
Name: value, dtype: object

In [8]: A['name'] = A['value'].map(func1)

In [9]: A
Out[9]: 
   value  name
0      3   one
1      7   two
2      5   one
3     18  four
4     23  five
5     27   six
6     21  five
7     29   six

I use itertuples because it should be a little faster than iterrows, but in general this will not be very efficient. This is one solution, but there may be better ones.
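
As one possible "better" option, here is a sketch that avoids the Python-level loop over B by building a pandas IntervalIndex from the low and high columns. It is not from the original answers, and it assumes the ranges in B do not overlap (get_indexer raises an error on an overlapping IntervalIndex). The variable names intervals, pos and names are illustrative.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame({"value": [3, 7, 5, 18, 23, 27, 21, 29]})
B = pd.DataFrame({"low": [1, 6, 11, 16, 21, 26],
                  "high": [5, 10, 15, 20, 25, 30],
                  "name": ["one", "two", "three", "four", "five", "six"]})

# One closed interval [low, high] per row of B.
intervals = pd.IntervalIndex.from_arrays(B["low"], B["high"], closed="both")

# Position of the containing interval for each value; -1 means "no interval".
pos = intervals.get_indexer(A["value"].to_numpy())

# Look up the names by position and blank out the entries that had no match.
names = B["name"].to_numpy(dtype=object)
A["name"] = np.where(pos >= 0, names[pos], None)
```

The lookup for all values happens in a single get_indexer call, so the per-row work is done inside pandas rather than in a Python loop.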

Edited to add:

In [8]: timeit A['value'].map(func1)
100 loops, best of 3: 10.5 ms per loop

In [9]: timeit [B.loc[(B['low'].le(v)) & (B['high'].ge(v)), 'name'].tolist()[0] for v in A['value']]
100 loops, best of 3: 9.06 ms per loop

A quick and dirty test shows that Alexander's approach is faster. I wonder how it scales.
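
To explore the scaling question, a rough harness along these lines could be used. It is only a sketch: the wrapper names by_map and by_loc, the sizes 8, 80 and 800, and the random test values are assumptions chosen for illustration, and the timings it prints will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

B = pd.DataFrame({"low": [1, 6, 11, 16, 21, 26],
                  "high": [5, 10, 15, 20, 25, 30],
                  "name": ["one", "two", "three", "four", "five", "six"]})


def func1(x):
    # The itertuples approach from this answer.
    for row in B.itertuples():
        if row.low <= x <= row.high:
            return row.name


def by_map(frame):
    return frame["value"].map(func1).tolist()


def by_loc(frame):
    # Alexander's boolean-indexing approach.
    return [B.loc[B["low"].le(v) & B["high"].ge(v), "name"].iat[0]
            for v in frame["value"]]


rng = np.random.default_rng(0)
for n in (8, 80, 800):
    # Random integers in 1..30, so every value falls inside some range of B.
    A_n = pd.DataFrame({"value": rng.integers(1, 31, size=n)})
    assert by_map(A_n) == by_loc(A_n)  # both approaches must agree
    t_map = timeit.timeit(lambda: by_map(A_n), number=3)
    t_loc = timeit.timeit(lambda: by_loc(A_n), number=3)
    print(f"n={n}: map {t_map:.4f}s, loc {t_loc:.4f}s")
```

Both approaches do work proportional to the number of rows in A times the number of rows in B, so comparing them at several sizes gives a better picture than a single eight-row timing.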


