Find the values within a range around a fixed value in one column, for each unique value of another column, in a pandas data frame

I have a data frame like this:

df
col1      col2
 1        50000
 1        2000
 2        51000
 3        100
 3        5000
 3        50500
 4        200
 4        51500
 5        49000

I want to identify the col2 values that lie within plus or minus 10 percent of one another and that occur for every unique value of col1.

The final output should look like this:

col1        col2
  1         50000
  2         51000
  3         50500
  4         51500
  5         49000

If values other than those around 50000 are present and also fall within the plus or minus 10 percent range for every col1 value, include them along with the values around 50000.

How can I do this in the most efficient way using pandas/Python?

Use a list comprehension to loop over all unique values of col2, filter the rows within ±10% of each value with Series.between and boolean indexing, and check whether the set of col1 values in the filtered rows equals the set of all col1 values. Finally, filter the data frame with Series.isin:

s = set(df['col1'])
print (s)
{1, 2, 3, 4, 5}

a = [x for x in df['col2'].unique() 
     if set(df.loc[df['col2'].between(x - x *.1, x + x*.1), 'col1']) == s]
print (a)
[50000, 51000, 50500, 51500, 49000]

df = df[df['col2'].isin(a)]
print (df)
   col1   col2
0     1  50000
2     2  51000
5     3  50500
7     4  51500
8     5  49000
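
For reference, the same steps can be wrapped into a self-contained, runnable sketch. The function name values_near_in_every_group and its parameters key, value and tol are hypothetical names chosen for illustration; they are not part of pandas or of the answer above. The sketch assumes the values in col2 are non-negative, since for a negative x the lower and upper bounds would be swapped.

```python
import pandas as pd


def values_near_in_every_group(df, key="col1", value="col2", tol=0.10):
    """Keep rows whose value has, within +-tol of it, a row from every key group.

    Hypothetical helper for illustration; assumes non-negative values.
    """
    all_keys = set(df[key])
    keep = []
    for x in df[value].unique():
        # Rows whose value lies in [x - tol*x, x + tol*x] (bounds inclusive).
        near = df[value].between(x - x * tol, x + x * tol)
        # x qualifies only if those rows cover every unique key.
        if set(df.loc[near, key]) == all_keys:
            keep.append(x)
    # Keep only the rows whose value qualified.
    return df[df[value].isin(keep)]


# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2, 3, 3, 3, 4, 4, 5],
    "col2": [50000, 2000, 51000, 100, 5000, 50500, 200, 51500, 49000],
})

result = values_near_in_every_group(df)
print(result)
```

Note that the loop runs one Series.between scan per unique value of col2, so the cost grows roughly with the number of unique values times the number of rows.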


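Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish this content, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.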
