
Compare two columns of a data frame in R

I have a dataset in which I want to find and display rows with invalid data, e.g. the rate value is not within the range of MinCI and MaxCI, or MinCI is larger than MaxCI. If such rows exist, I want to change the MinCI and MaxCI values in those rows to NA.

   MinCI      MaxCI   City
    2.0        6.0    ABC
    4.2        8.0    XYZ
    3.6        1.2    CRS
    6.4        8.9    WUI
    7.8        5.4    IRK

In rows 3 and 5, MinCI is greater than MaxCI, so I want both values in those rows to be set to NA using R. The check should be applied to every row of the dataset.

We can create a logical index that flags the rows where MaxCI is less than MinCI, then use that index to assign NA to both columns in those rows

i1 <- with(df1, MaxCI < MinCI)
df1[i1, c('MaxCI', 'MinCI')] <- NA
df1
#  MinCI MaxCI City
#1   2.0   6.0  ABC
#2   4.2   8.0  XYZ
#3    NA    NA  CRS
#4   6.4   8.9  WUI
#5    NA    NA  IRK
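
The question also asks to display the offending rows before they are blanked out. A minimal sketch of that step, using the same sample data (redefined here so the block runs on its own). It assumes that wrapping the comparison in which() is acceptable: which() drops comparisons that evaluate to NA, so rows where a bound is already missing are skipped instead of ending up as NA in the row index.

```r
# Same sample data as in the "data" section
df1 <- data.frame(MinCI = c(2, 4.2, 3.6, 6.4, 7.8),
                  MaxCI = c(6, 8, 1.2, 8.9, 5.4),
                  City  = c("ABC", "XYZ", "CRS", "WUI", "IRK"))

# Row numbers where the interval is inverted; which() ignores NA comparisons
bad <- which(df1$MaxCI < df1$MinCI)

# Display the invalid rows before changing anything
print(df1[bad, ])

# Set both bounds to NA in those rows only
df1[bad, c("MinCI", "MaxCI")] <- NA
```

Keeping the row numbers in bad also makes it easy to report afterwards which rows were modified.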

data

df1 <- structure(list(MinCI = c(2, 4.2, 3.6, 6.4, 7.8), MaxCI = c(6, 
8, 1.2, 8.9, 5.4), City = c("ABC", "XYZ", "CRS", "WUI", "IRK"
)), class = "data.frame", row.names = c(NA, -5L))

A dplyr option:

library(dplyr)
df1 %>% 
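  mutate(invalid = MinCI > MaxCI,  # flag inverted intervals once, up front
         across(MinCI:MaxCI, ~ replace(.x, invalid, NA)),
         invalid = NULL)           # drop the helper column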

# A tibble: 5 x 3
  MinCI MaxCI City 
  <dbl> <dbl> <chr>
1   2     6   ABC  
2   4.2   8   XYZ  
3  NA    NA   CRS  
4   6.4   8.9 WUI  
5  NA    NA   IRK
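
The question also mentions a rate value that should fall between MinCI and MaxCI, but the sample data contains no such column. The sketch below assumes a hypothetical numeric column named Rate, with values invented purely for illustration, and combines both checks into a single condition. It is one possible way to extend the logical-index approach, not something taken from the original data.

```r
# Sample data plus a hypothetical Rate column (values are made up)
df2 <- data.frame(Rate  = c(4, 9, 2, 7, 6),
                  MinCI = c(2, 4.2, 3.6, 6.4, 7.8),
                  MaxCI = c(6, 8, 1.2, 8.9, 5.4),
                  City  = c("ABC", "XYZ", "CRS", "WUI", "IRK"))

# A row is invalid if the interval is inverted or Rate lies outside it
invalid <- with(df2, MinCI > MaxCI | Rate < MinCI | Rate > MaxCI)

# which() skips rows where the condition is NA
bad <- which(invalid)

# Blank out the bounds; Rate itself is left untouched
df2[bad, c("MinCI", "MaxCI")] <- NA
```

With these made-up values, row 2 is flagged because its Rate exceeds MaxCI, while rows 3 and 5 are flagged because their intervals are inverted.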
