How to generate a list/table of all the observations with a given value in R

Question

I have a large dataset (asv_ar2) indicating the number of times a given species has been recorded at a given location. It looks like the following:

Specie	loc1	loc2	loc4
sp1	0	1	4
sp2	7	3	2
sp3	3	1	0

I would like to get for each species a list/table with the locations where it's been found (where the value of that variable is not 0). Something like:

sp1 loc2, loc4
sp2 loc1, loc2, loc4
sp3 loc1, loc2

or the other way around, with the species found in a location.

I can select rows with values > 0 with dplyr's filter function, but only one location at a time:

a1 <- filter(asv_ar2, asv_ar2[,2] > 0)[, c(1,2,8)]

I tried writing a loop that joins them all together, but it only shows the first location:

for (i in 2:1156) { locs <- filter(asv_ar2, asv_ar2[,i] > 0)[c(1,i)] }

I don't know how to join all the iterations, or whether there is a better way to do all this.

Any suggestions?

Thank you

Answer 1

I hope this is what you have in mind:

library(dplyr)
library(tidyr)
library(purrr)

df %>%
  mutate(data = pmap(df %>% select(!Specie), ~ names(c(...)[c(...) != 0]))) %>%
  unnest_wider(data)


# A tibble: 3 x 8
  Specie  loc1  loc2  loc3  loc4 ...1  ...2  ...3 
  <chr>  <int> <int> <int> <int> <chr> <chr> <chr>
1 sp1        0     1     0     4 loc2  loc4  NA   
2 sp2        7     3     0     2 loc1  loc2  loc4 
3 sp3        3     1     0     0 loc1  loc2  NA

Answer 2

You can add a new column that lists, for each row, the names of the columns whose value is greater than 0.

asv_ar2$locs <- apply(asv_ar2[-1] > 0, 1, function(x) 
                      toString(names(asv_ar2[-1])[x])) 

asv_ar2

#  Specie loc1 loc2 loc3 loc4             locs
#1    sp1    0    1    0    4       loc2, loc4
#2    sp2    7    3    0    2 loc1, loc2, loc4
#3    sp3    3    1    0    0       loc1, loc2

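In dplyr you can use rowwise. Run this on the original data frame, before locs has been added, because starts_with('loc') would also match a locs column: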

library(dplyr)

asv_ar2 %>%
  rowwise() %>%
  mutate(locs  = toString(names(.[-1])[c_across(starts_with('loc')) > 0]))
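
The question also asks for the reverse view, the species found at each location. A minimal base R sketch of that, assuming the sample data shown in this thread (the names counts, species_by_loc and locs_by_species are illustrative, not from the answer above):

```r
# Sample data mirroring the example in this thread (loc3 is all zeros).
asv_ar2 <- data.frame(
  Specie = c("sp1", "sp2", "sp3"),
  loc1 = c(0L, 7L, 3L),
  loc2 = c(1L, 3L, 1L),
  loc3 = c(0L, 0L, 0L),
  loc4 = c(4L, 2L, 0L),
  stringsAsFactors = FALSE
)

# Work on the count columns only, so the comparison stays numeric.
counts <- asv_ar2[-1]

# Species per location: for each column, keep the species with a count > 0.
species_by_loc <- sapply(counts, function(x) toString(asv_ar2$Specie[x > 0]))

# Locations per species, as a named vector instead of a new column.
locs_by_species <- setNames(
  apply(counts > 0, 1, function(x) toString(names(counts)[x])),
  asv_ar2$Specie
)

print(species_by_loc)
print(locs_by_species)
```

A location with no records (loc3 here) comes back as an empty string rather than being dropped.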

Answer 3

We could do this in tidyverse in a more vectorized way, i.e. without using rowwise. Here, we loop across the 'loc' columns, return the column name (cur_column()) if the value is not 0 (the default case_when return is NA), specify .names to create new columns by adding a suffix (_new), then use unite to collapse those '_new' columns into a single one.

library(dplyr)
library(tidyr)
df1 %>% 
   mutate(across(starts_with('loc'), ~ case_when(. != 0 ~ cur_column()), 
     .names = '{.col}_new')) %>% 
   unite(locs, ends_with('new'), sep=", ", na.rm = TRUE)
#  Specie loc1 loc2 loc3 loc4             locs
#1    sp1    0    1    0    4       loc2, loc4
#2    sp2    7    3    0    2 loc1, loc2, loc4
#3    sp3    3    1    0    0       loc1, loc2

data

df1 <- structure(list(Specie = c("sp1", "sp2", "sp3"), loc1 = c(0L, 
7L, 3L), loc2 = c(1L, 3L, 1L), loc3 = c(0L, 0L, 0L), loc4 = c(4L, 
2L, 0L)), class = "data.frame", row.names = c(NA, -3L))
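
A different tidyverse route, not taken from the answer above, is to reshape the data to long format first. This is only a sketch under the assumption that all location columns start with 'loc'; the names long, by_species and by_location are illustrative. Once the data is long, both directions (locations per species and species per location) are the same grouped summary:

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(
  Specie = c("sp1", "sp2", "sp3"),
  loc1 = c(0L, 7L, 3L),
  loc2 = c(1L, 3L, 1L),
  loc3 = c(0L, 0L, 0L),
  loc4 = c(4L, 2L, 0L),
  stringsAsFactors = FALSE
)

# One row per (species, location) pair, keeping only pairs with a record.
long <- df1 %>%
  pivot_longer(starts_with("loc"), names_to = "location", values_to = "count") %>%
  filter(count > 0)

# Locations where each species was found.
by_species <- long %>%
  group_by(Specie) %>%
  summarise(locs = toString(location))

# Species found at each location (locations with no records drop out).
by_location <- long %>%
  group_by(location) %>%
  summarise(species = toString(Specie))

print(by_species)
print(by_location)
```

The long table is also a convenient starting point if the counts themselves are needed later, since they are kept in the count column.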

Answer 4

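You can do the following. The comparison is made on df[-1] so that the counts stay numeric; calling apply on the whole data frame would first convert every value to a character string.

paste(df$Specie, apply(df[-1] > 0, 1, function(x) paste(names(which(x)), collapse = ", ")))
[1] "sp1 loc2, loc4"       "sp2 loc1, loc2, loc4" "sp3 loc1, loc2"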
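
As for the loop in the question: each iteration assigns to the same locs object, so earlier results are overwritten. One way to keep all iterations, sketched here in base R on the sample data (the names pieces and long are illustrative), is to store each result in a list and bind the pieces together at the end:

```r
asv_ar2 <- data.frame(
  Specie = c("sp1", "sp2", "sp3"),
  loc1 = c(0L, 7L, 3L),
  loc2 = c(1L, 3L, 1L),
  loc3 = c(0L, 0L, 0L),
  loc4 = c(4L, 2L, 0L),
  stringsAsFactors = FALSE
)

# One list slot per location column; column 1 holds the species names.
pieces <- vector("list", ncol(asv_ar2) - 1)

for (i in 2:ncol(asv_ar2)) {
  keep <- asv_ar2[[i]] > 0
  # Store this iteration's result instead of overwriting a single object.
  pieces[[i - 1]] <- data.frame(
    Specie = asv_ar2$Specie[keep],
    location = rep(names(asv_ar2)[i], sum(keep)),
    stringsAsFactors = FALSE
  )
}

# Stack all per-location results into one long table.
long <- do.call(rbind, pieces)

# Split the long table either way.
locs_by_species <- split(long$location, long$Specie)
species_by_loc <- split(long$Specie, long$location)

print(locs_by_species)
print(species_by_loc)
```

With 1155 location columns, the vectorized answers above are likely to be simpler, but the list-then-rbind pattern is the general fix for a loop that only keeps one iteration.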


4 answers

solution1
2 2021-05-14 09:27:38

solution2
1 ACCEPTED 2021-05-14 09:22:02

solution3
1 2021-05-14 17:54:59

solution4
0 2021-05-14 09:23:31
