
Compare the two rows of a dataframe in R

I have a dataset in which I want to find and display rows with any invalid data, e.g. the rate value is not within the range of MinCI and MaxCI, MinCI is larger than MaxCI, etc. If such rows exist, I want to change the MinCI and MaxCI values in those rows to NA.

   MinCI      MaxCI   City
    2.0        6.0    ABC
    4.2        8.0    XYZ
    3.6        1.2    CRS
    6.4        8.9    WUI
    7.8        5.4    IRK

So, in rows 3 and 5 MinCI is greater than MaxCI, and we want the values of both columns in those rows to be filled with NA using R. This should be applied to the complete columns of the dataset.

We can create a logical index that flags the rows where MaxCI is less than MinCI, and use that index to assign NA to both columns in those rows.

i1 <- with(df1, MaxCI < MinCI)
df1[i1, c('MaxCI', 'MinCI')] <- NA
df1
#  MinCI MaxCI City
#1   2.0   6.0  ABC
#2   4.2   8.0  XYZ
#3    NA    NA  CRS
#4   6.4   8.9  WUI
#5    NA    NA  IRK
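
The question also asks to display the invalid rows before changing them. A minimal sketch of that step, assuming the same sample data; the names `bad_rows` and `invalid_rows` are introduced here for illustration only. It uses `which()` so that rows where the comparison is NA are skipped, since a logical index containing NA is rejected in data frame assignment.

```r
# Rebuild the sample data from the question so the sketch is self-contained
df1 <- data.frame(
  MinCI = c(2, 4.2, 3.6, 6.4, 7.8),
  MaxCI = c(6, 8, 1.2, 8.9, 5.4),
  City  = c("ABC", "XYZ", "CRS", "WUI", "IRK")
)

# Row numbers where the bounds are reversed; which() skips NA comparisons
bad_rows <- which(df1$MinCI > df1$MaxCI)

# Keep a copy of the offending rows for display before overwriting them
invalid_rows <- df1[bad_rows, ]
print(invalid_rows)

# Set both bounds to NA in those rows only
df1[bad_rows, c("MinCI", "MaxCI")] <- NA
```

Keeping `invalid_rows` as a separate object means the original values remain available for reporting after the bounds have been blanked out.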

data

df1 <- structure(list(MinCI = c(2, 4.2, 3.6, 6.4, 7.8), MaxCI = c(6, 
8, 1.2, 8.9, 5.4), City = c("ABC", "XYZ", "CRS", "WUI", "IRK"
)), class = "data.frame", row.names = c(NA, -5L))

A dplyr option:

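library(dplyr)
df1 %>% 
  # flag the invalid rows first, so both columns are tested against the original values
  mutate(invalid = MinCI > MaxCI,
         across(MinCI:MaxCI, ~ replace(.x, invalid, NA))) %>% 
  select(-invalid)

#  MinCI MaxCI City
#1   2.0   6.0  ABC
#2   4.2   8.0  XYZ
#3    NA    NA  CRS
#4   6.4   8.9  WUI
#5    NA    NA  IRK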
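
The question also mentions a rate value that should lie between MinCI and MaxCI, but the sample data has no such column. The following base R sketch shows how the same flagging idea might extend to that check. The `Rate` column and its values are invented for illustration and are not part of the original data.

```r
# Sample data from the question, plus a hypothetical Rate column (assumption)
df2 <- data.frame(
  MinCI = c(2, 4.2, 3.6, 6.4, 7.8),
  MaxCI = c(6, 8, 1.2, 8.9, 5.4),
  Rate  = c(4, 9, 2, 7, 6),
  City  = c("ABC", "XYZ", "CRS", "WUI", "IRK")
)

# A row is invalid if the bounds are reversed or Rate lies outside them
invalid <- with(df2, MinCI > MaxCI | Rate < MinCI | Rate > MaxCI)

# which() skips rows where the combined condition evaluates to NA
bad_rows <- which(invalid)

# Display the offending rows before changing anything
print(df2[bad_rows, ])

# Blank out the interval bounds in those rows only; Rate and City are untouched
df2[bad_rows, c("MinCI", "MaxCI")] <- NA
```

Combining the conditions with `|` keeps a single flag per row, so further validity rules can be added to the same expression.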
