
Python: check if a value in a dataset column is within a range of values reported in another dataset

I have read through similar posts but can't find an exact solution. I have a dataset with a column named "A" and want to check whether each value in this column is contained within any of the intervals in another dataset, which has two interval columns, "Start" and "End". The result (True or False) should be returned in column "B". Please see the attached image (the data is always in ascending order). Thank you. [Image: data example]

This is not the most efficient solution but it should do what you are asking:

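import pandas as pd

# Values to test
df1 = pd.DataFrame({"A": list(range(20))})

# Intervals, one per row; both endpoints are treated as inclusive
df2 = pd.DataFrame({"START": [1, 3, 5, 7],
                    "END": [2, 4, 6, 8]})


def compare_with_df(x, df):
    # Walk the rows of df by label; this assumes the default 0..n-1 index
    for row in range(df.shape[0]):
        if df.loc[row, 'START'] <= x <= df.loc[row, 'END']:
            return True  # x falls inside this interval
    return False  # x is outside every interval


# Apply the check to each value in column "A"
df1['B'] = df1['A'].apply(lambda x: compare_with_df(x, df2))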

As you can see, the compare_with_df() function loops through df2 and compares a given x against every range, returning True at the first match (this can and probably should be optimized for larger datasets). The apply() method is equivalent to looping through the values of the given column (Series).
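For larger datasets, one possible optimization is to replace the Python-level loop with NumPy operations. The sketch below is not part of the original answer. The helper names in_any_interval and in_any_interval_sorted are made up for illustration. The first compares every value against every interval at once, which is simple but builds an n-by-m boolean array. The second uses a binary search and assumes that df2 is non-empty, sorted by START, and that the intervals do not overlap (the question says the data is always in ascending order, but non-overlap is an extra assumption).

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"A": list(range(20))})
df2 = pd.DataFrame({"START": [1, 3, 5, 7],
                    "END": [2, 4, 6, 8]})


def in_any_interval(values, starts, ends):
    # Hypothetical helper: broadcast values (n, 1) against intervals (1, m)
    # and report whether each value lies in at least one inclusive range.
    v = np.asarray(values)[:, None]
    s = np.asarray(starts)[None, :]
    e = np.asarray(ends)[None, :]
    return ((v >= s) & (v <= e)).any(axis=1)


def in_any_interval_sorted(values, starts, ends):
    # Hypothetical helper: assumes starts is sorted ascending, the intervals
    # do not overlap, and there is at least one interval.
    v = np.asarray(values)
    s = np.asarray(starts)
    e = np.asarray(ends)
    # Index of the last interval whose start is <= value (-1 if none)
    idx = np.searchsorted(s, v, side="right") - 1
    has_candidate = idx >= 0
    # Clip so that -1 does not wrap around; those rows are masked out anyway
    return has_candidate & (v <= e[np.clip(idx, 0, None)])


df1["B"] = in_any_interval(df1["A"], df2["START"], df2["END"])
```

Both helpers treat the interval endpoints as inclusive, matching the >= and <= comparisons in compare_with_df(). If the intervals may overlap, the broadcasting version is the safer choice of the two.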


 