Find minimum value greater than 0

I have a data frame that contains the numerical values 1:4, with some NAs. For each row, I would like to calculate the frequency (as a percentage) of the value with the fewest occurrences, counting only values that occur at least once (that is, the smallest count greater than 0).

Here is a sample data frame to work with.

    df = as.data.frame(rbind(c(1,2,1,2,2,2,2,1,NA,2),c(2,3,3,2,3,3,NA,2,NA,NA),c(4,1,NA,NA,NA,1,1,1,4,4),c(3,3,3,4,4,4,NA,4,3,4)))

      V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
    1  1  2  1  2  2  2  2  1 NA   2
    2  2  3  3  2  3  3 NA  2 NA  NA
    3  4  1 NA NA NA  1  1  1  4   4
    4  3  3  3  4  4  4 NA  4  3   4

I am struggling with two points: 1) finding the lowest frequency of a value greater than 0, and 2) applying the function to each row of my data frame. When I started working on this function I implemented it using the code below, but it did not appear to be applied to every row. My results for value.1, value.2, etc. were the same for every row.

    Low_Freq = function(x){
      value.1 = sum(x==1, na.rm=TRUE) #count the number of 1's per row
      value.2 = sum(x==2, na.rm=TRUE) #count the number of 2's per row
      value.3 = sum(x==3, na.rm=TRUE) #count the number of 3's per row
      value.4 = sum(x==4, na.rm=TRUE) #count the number of 4's per row
      num.values = rowSums(!is.na(x), na.rm=TRUE) #count total number of non-NA values in each row

      #what is the minimum frequency value greater than 0 among value.1, value.2, value.3, and value.4 for EACH row?
      min.value.freq = min(cbind(value.1,value.2,value.3,value.4)) 

      out = min.value.freq/num.values #calculate the percentage of the minimum value for each row
    }

    df$Low_Freq = apply(df, 1, function(x))

Then I started using rowSums() to compute value.1, value.2, value.3, and value.4. This fixed my problem of counting value.1, value.2, etc. for each row. However, I then had to call the function directly, without apply(), for it to run:

    Low_Freq = function(x){
      value.1 = rowSums(x==1, na.rm=TRUE) #count the number of 1's per row
      value.2 = rowSums(x==2, na.rm=TRUE) #count the number of 2's per row
      value.3 = rowSums(x==3, na.rm=TRUE) #count the number of 3's per row
      value.4 = rowSums(x==4, na.rm=TRUE) #count the number of 4's per row
      num.values = rowSums(!is.na(x), na.rm=TRUE) #count total number of non-NA values in each row

      #what is the minimum frequency value greater than 0 among value.1, value.2, value.3, and value.4 for EACH row?
      min.value.freq = min(cbind(value.1,value.2,value.3,value.4)) 

      out = min.value.freq/num.values #calculate the percentage of the minimum value for each row
    }

    df$Low_Freq = Low_Freq(df)

So the per-row work now seems to happen inside the function itself. That is fine, but when I make the final calculation that produces my output, I cannot figure out how to identify which of the values 1, 2, 3, or 4 has the lowest frequency in each row. That frequency must then be divided by the number of non-NA values in the row.

My desired result should look like this:

      V1 V2 V3 V4 V5 V6 V7 V8 V9 V10  Low_Freq
    1  1  2  1  2  2  2  2  1 NA   2 0.3333333
    2  2  3  3  2  3  3 NA  2 NA  NA 0.4285714
    3  4  1 NA NA NA  1  1  1  4   4 0.4285714
    4  3  3  3  4  4  4 NA  4  3   4 0.4444444

I feel like I am going in circles with this seemingly simple function. Any help would be appreciated.

Thank you.

The table function returns the frequency of each value that appears, ignoring NA values. Values that never occur in a row have no entry in the table, so every count is already greater than 0. Therefore, the min of the table result is the minimum frequency of a value that shows up in your row, and the sum is the number of non-NA values in your row.

    Low_Freq = function(x){
      tab = table(x)                 # counts of each value present in the row; NA is dropped
      return(min(tab) / sum(tab))    # smallest count divided by the number of non-NA values
    }
    df$Low_Freq = apply(df, 1, Low_Freq)
    df
    #   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10  Low_Freq
    # 1  1  2  1  2  2  2  2  1 NA   2 0.3333333
    # 2  2  3  3  2  3  3 NA  2 NA  NA 0.4285714
    # 3  4  1 NA NA NA  1  1  1  4   4 0.4285714
    # 4  3  3  3  4  4  4 NA  4  3   4 0.4444444
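To see why no explicit "greater than 0" filter is needed, it may help to look at what table returns for a single row. The sketch below uses the first row of the sample data; the names row1 and tab are only illustrative.

```r
# First row of the sample data frame
row1 <- c(1, 2, 1, 2, 2, 2, 2, 1, NA, 2)

tab <- table(row1)   # counts only the values that occur; NA is dropped by default
names(tab)           # "1" "2"  (3 and 4 never occur, so they have no entry)
as.vector(tab)       # 3 6

min(tab) / sum(tab)  # 3 / 9, the first Low_Freq value above
```

A row consisting entirely of NA would give an empty table, so the function may need a guard if such rows can occur in your data.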

If you did not want 5s to be considered for the numerator, but still wanted them counted in the denominator, you could do:

    df = as.data.frame(rbind(c(1,2,1,2,2,2,2,1,NA,2),c(2,3,3,2,3,3,NA,2,NA,NA),c(4,1,NA,NA,NA,1,1,1,4,4),c(3,3,3,4,4,4,5,4,3,4)))
    Low_Freq = function(x){
      tab = table(x[x != 5])               # count every value except 5; NA is dropped by table
      return(min(tab) / sum(!is.na(x)))    # the denominator still includes the 5s
    }
    df$Low_Freq = apply(df, 1, Low_Freq)
    df
    #   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10  Low_Freq
    # 1  1  2  1  2  2  2  2  1 NA   2 0.3333333
    # 2  2  3  3  2  3  3 NA  2 NA  NA 0.4285714
    # 3  4  1 NA NA NA  1  1  1  4   4 0.4285714
    # 4  3  3  3  4  4  4  5  4  3   4 0.4000000
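If you would rather stay with the rowSums() approach from the question, the missing piece is a per-row minimum: min() applied to the combined counts returns one overall minimum rather than one per row. Below is a minimal sketch of one way to do this, assuming the only values of interest are 1 to 4; the names counts, num_values, and low_freq are illustrative and not part of the original code.

```r
df <- as.data.frame(rbind(
  c(1, 2, 1, 2, 2, 2, 2, 1, NA, 2),
  c(2, 3, 3, 2, 3, 3, NA, 2, NA, NA),
  c(4, 1, NA, NA, NA, 1, 1, 1, 4, 4),
  c(3, 3, 3, 4, 4, 4, NA, 4, 3, 4)
))

# One column of counts per candidate value (1 to 4), one row per data-frame row
counts <- sapply(1:4, function(v) rowSums(df == v, na.rm = TRUE))

# A count of 0 means the value is absent from that row; mask it so min() skips it
counts[counts == 0] <- NA

# Number of non-NA entries in each row
num_values <- rowSums(!is.na(df))

# Row-wise minimum of the remaining counts, divided by the row's non-NA total
low_freq <- apply(counts, 1, min, na.rm = TRUE) / num_values
low_freq
```

For the sample data this gives the same four values as the table-based function. Compute it before adding a Low_Freq column to df, otherwise that column is counted as well.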
