Python: check if a value in a dataset column is within a range of values reported in another dataset
I have read through similar posts but can't find an exact solution. I have a dataset with a column named "A" and want to check whether each value in this column falls within any of the intervals in another dataset, which has two interval columns, "Start" and "End". The result should be returned as True or False in column "B". Please see the attached image (the data is always in ascending order).
Thank you.
This is not the most efficient solution, but it should do what you are asking:
import pandas as pd

# Values to test
df1 = pd.DataFrame({"A": list(range(20))})
# Intervals to test against (both bounds inclusive)
df2 = pd.DataFrame({"START": [1, 3, 5, 7],
                    "END": [2, 4, 6, 8]})

def compare_with_df(x, df):
    # Return True as soon as x falls inside any [START, END] row of df
    for row in range(df.shape[0]):
        if x >= df.loc[row, 'START'] and x <= df.loc[row, 'END']:
            return True
    return False

df1['B'] = df1['A'].apply(lambda x: compare_with_df(x, df2))
As you can see, the compare_with_df() function loops through df2 and compares a given x to all possible ranges (this can and probably should be optimized for larger datasets). The apply() method is equivalent to looping through the values of the given column (Series).
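For larger datasets, one possible optimization is to replace the Python-level loops with NumPy operations. The sketch below is not part of the original answer. The helper names in_any_interval and in_sorted_intervals are made up for illustration. The first compares every value against every interval via broadcasting. The second assumes the intervals are sorted by START, do not overlap, and that there is at least one interval, which matches the "always in ascending order" remark in the question but should be verified for real data.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"A": list(range(20))})
df2 = pd.DataFrame({"START": [1, 3, 5, 7],
                    "END": [2, 4, 6, 8]})

def in_any_interval(values, starts, ends):
    """Hypothetical helper: True where a value lies in any [start, end] (inclusive).

    Builds a (len(values), len(starts)) boolean matrix, so memory use grows
    with the product of both lengths.
    """
    v = np.asarray(values)[:, None]      # column vector, shape (n, 1)
    s = np.asarray(starts)               # shape (m,)
    e = np.asarray(ends)                 # shape (m,)
    return ((v >= s) & (v <= e)).any(axis=1)

def in_sorted_intervals(values, starts, ends):
    """Hypothetical helper for the sorted case.

    Assumes: starts is sorted ascending, intervals do not overlap,
    and there is at least one interval.
    """
    v = np.asarray(values)
    s = np.asarray(starts)
    e = np.asarray(ends)
    # Index of the last interval whose START is <= value (-1 if none)
    idx = np.searchsorted(s, v, side="right") - 1
    has_candidate = idx >= 0
    # Replace -1 by 0 so the lookup is valid; those rows are masked out anyway
    safe_idx = np.maximum(idx, 0)
    return has_candidate & (v <= e[safe_idx])

df1["B"] = in_any_interval(df1["A"], df2["START"], df2["END"])
df1["B_sorted"] = in_sorted_intervals(df1["A"], df2["START"], df2["END"])
```

Both columns should agree with the result of the apply()-based version on this sample data. The broadcasting variant makes no ordering assumption, while the searchsorted variant avoids building the full comparison matrix.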