Count the number of rows where all columns have identical values

I have a dataframe and I want to count the number of rows which have the same value for all the columns, within each row.

For example, I have this data:

cmp <- read.table(text = "
A B C D
1 1 1 0
1 1 1 1
2 2 2 2
3 3 3 0", header = TRUE)

Here, the count is 2, because the second and third rows each contain only one unique value (all 1s and all 2s, respectively).

Thanks in advance.

This, which uses apply() to count the number of distinct elements in each row, should do the trick:

sum(apply(cmp, 1, function(x) length(unique(x))==1))
## [1] 2
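To see what the one-liner is doing, it can help to look at the per-row distinct counts before they are collapsed into a single number. The sketch below rebuilds the example data and uses the names n_distinct_per_row and n_uniform, which are only illustrative and not part of the original answer. Note that apply() coerces the data frame to a matrix first, so columns of mixed types are converted to a common type before the comparison.

```r
# Rebuild the example data from the question
cmp <- read.table(text = "
A B C D
1 1 1 0
1 1 1 1
2 2 2 2
3 3 3 0", header = TRUE)

# Number of distinct values in each row (MARGIN = 1 means "by row")
n_distinct_per_row <- apply(cmp, 1, function(x) length(unique(x)))

# A row is uniform when it has exactly one distinct value
n_uniform <- sum(n_distinct_per_row == 1)
print(n_distinct_per_row)
print(n_uniform)
```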

Count the number of values per row which are equal to the first value. If this count is equal to the number of columns, then all values in the row are identical.

sum(rowSums(cmp == cmp[ , 1]) == ncol(cmp))
#[1] 2
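The same expression can be unpacked step by step. This is a sketch on the example data; the intermediate names matches, per_row and uniform are illustrative and do not appear in the original answer.

```r
# Rebuild the example data from the question
cmp <- read.table(text = "
A B C D
1 1 1 0
1 1 1 1
2 2 2 2
3 3 3 0", header = TRUE)

# Compare every column against the first column; the result is a logical matrix
matches <- cmp == cmp[, 1]

# Number of cells in each row that equal that row's first value
per_row <- rowSums(matches)

# A row is uniform when every one of its cells matches the first
uniform <- per_row == ncol(cmp)
n_uniform <- sum(uniform)
print(per_row)
print(n_uniform)
```

If the data may contain NA, the comparison returns NA for those cells and rowSums() propagates it, so missing values would need to be handled explicitly before counting.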

You could check whether the maximum and minimum values in each row are the same:

sum(do.call(pmax, cmp) == do.call(pmin, cmp))
#[1] 2

To obtain the indices of the rows in which all values are identical:

which(do.call(pmax, cmp) == do.call(pmin, cmp))
#[1] 2 3
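Once the indices are known, they can be used to pull the matching rows out of the data frame. A minimal sketch on the example data follows; idx and uniform_rows are illustrative names, not part of the original answer. This approach assumes all columns are numeric (or otherwise orderable), since pmax() and pmin() compare values element-wise across the columns.

```r
# Rebuild the example data from the question
cmp <- read.table(text = "
A B C D
1 1 1 0
1 1 1 1
2 2 2 2
3 3 3 0", header = TRUE)

# Row-wise maximum and minimum, computed across the columns
row_max <- do.call(pmax, cmp)
row_min <- do.call(pmin, cmp)

# Indices of rows whose maximum equals their minimum
idx <- which(row_max == row_min)

# Subset the data frame to just those rows
uniform_rows <- cmp[idx, ]
print(uniform_rows)
```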

The tidyverse way:

library(dplyr)

# cmp is the data frame from the question
cmp %>% 
  rowwise() %>% 
  mutate(unique_vals = length(unique(c_across(everything()))))

This gives you the number of unique values for the selected columns -- feel free to change everything() to whatever you need. You can then filter/sum this variable as you please.
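To get from the per-row counts to the number asked for in the question, the new column can be summed or filtered. The following is a sketch that assumes dplyr 1.0.0 or later (for c_across()); the names result, n_uniform and uniform_rows are illustrative.

```r
library(dplyr)

# Rebuild the example data from the question
cmp <- read.table(text = "
A B C D
1 1 1 0
1 1 1 1
2 2 2 2
3 3 3 0", header = TRUE)

# Count distinct values per row, then drop the rowwise grouping
result <- cmp %>%
  rowwise() %>%
  mutate(unique_vals = length(unique(c_across(everything())))) %>%
  ungroup()

# Number of rows with a single distinct value
n_uniform <- sum(result$unique_vals == 1)

# The rows themselves
uniform_rows <- filter(result, unique_vals == 1)
print(n_uniform)
```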

