Setting a maximum limit for values in a data frame in R

In a data frame (in R), I have two columns: the first is a list of species names (species), the second is the number of occurrence records I have for that species (number). There is large variation in the number column: most values are <100 but a few are very high (>100,000), and there are many rows (~4000). Here is a simplified example:

    x <- data.frame(species = c("a","b","c","d","e","f","g","h","i","j"),
                    number = c(53, 17, 67, 989, 135, 67, 13, 786, 100400, 28))

Basically, what I want to do is reduce the maximum number of records (the value in the number column) until the mean of all the values in this column stabilises.

To do this, I need to set a maximum limit for values in the number column so that any value above this limit is reduced to the limit, and then record the mean. I want to repeat this multiple times, each time reducing the maximum limit by 100.

I've not been able to find any similar questions online and am not really sure where to start with this! Any help, even just a point in the right direction, would be much appreciated! Cheers

You should use the pmin function, which takes the element-wise minimum of a vector and a limit:

    # cap every value in the number column at 1000
    pmin(x$number, 1e3)
    # to test multiple limits, compute the capped mean for each one:
    mns <- sapply(c(1e6, 1e4, 1e2), function(u) mean(pmin(x$number, u)))
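
To follow the question's idea of lowering the limit by 100 each time, the same pmin call can be applied over a decreasing sequence of limits. The following is only a sketch under some assumptions: the names limits, capped_means and result are made up for illustration, the sequence starts at the largest observed value, and it stops at 100, which is an arbitrary lower bound not given in the question.

```r
# Example data from the question
x <- data.frame(
  species = letters[1:10],
  number  = c(53, 17, 67, 989, 135, 67, 13, 786, 100400, 28)
)

# Candidate limits: start at the largest observed value and step down by 100.
# The lower bound of 100 is an assumption; adjust it to suit the data.
limits <- seq(max(x$number), 100, by = -100)

# For each limit, cap the column with pmin and record the mean
capped_means <- sapply(limits, function(u) mean(pmin(x$number, u)))

# Collect the limits and their capped means side by side for inspection
result <- data.frame(limit = limits, mean = capped_means)
head(result)
```

Plotting result$mean against result$limit is one way to judge by eye where the mean stops changing much. What counts as "stabilised" is left to the reader, since the question does not define a threshold.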
