Python: check whether the values in a dataset column fall within the value ranges reported in another dataset

Question

I have read through similar posts but can't find an exact solution. I have a dataset in a column named "A" and want to check whether each value in this column falls within any of the intervals in another dataset, which has two interval columns, "Start" and "End". The result, True or False, should be returned in column "B". Please see the attached image (the data is always in ascending order). Thank you.

[Image: data example]

Answer 1

This is not the most efficient solution, but it should do what you are asking:

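import pandas as pd

# Values to test
df1 = pd.DataFrame({"A": list(range(20))})

# Intervals, one per row; both bounds are inclusive
df2 = pd.DataFrame({"START": [1, 3, 5, 7],
                    "END": [2, 4, 6, 8]})


def compare_with_df(x, df):
    # Return True if x lies inside any [START, END] interval of df.
    # Row lookup by position assumes df has the default 0..n-1 index.
    for row in range(df.shape[0]):
        if df.loc[row, 'START'] <= x <= df.loc[row, 'END']:
            return True
    return False


# Run the check for every value in column A
df1['B'] = df1['A'].apply(lambda x: compare_with_df(x, df2))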

As you can see, the compare_with_df() function loops through df2 and compares a given x against every range (this can, and probably should, be optimized for larger datasets). The apply() method is equivalent to looping through the values of the given column (Series).
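For larger datasets, one possible optimization is to replace the Python-level loop with NumPy broadcasting. The following is only a sketch, not part of the original answer: it reuses the same example data, assumes inclusive bounds as in compare_with_df(), and the helper name in_any_interval is made up for illustration. It builds an n-by-m boolean matrix, so memory use grows with the number of values times the number of intervals.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"A": list(range(20))})
df2 = pd.DataFrame({"START": [1, 3, 5, 7],
                    "END": [2, 4, 6, 8]})


def in_any_interval(values, starts, ends):
    # Hypothetical helper: True where a value lies in any [start, end] interval.
    v = np.asarray(values)[:, None]   # shape (n, 1) so it broadcasts against (m,)
    s = np.asarray(starts)            # shape (m,)
    e = np.asarray(ends)              # shape (m,)
    # (n, m) matrix of "value inside interval j", reduced across intervals
    return ((v >= s) & (v <= e)).any(axis=1)


df1["B"] = in_any_interval(df1["A"], df2["START"], df2["END"])
```

With the example data, the values 1 through 8 are marked True and everything else False, which matches what the loop-based version produces.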

Solution 1
0 2020-05-21 22:46:00