
How to index the first element of a list, and apply it to each row of a dataframe in R?

I have a list column in which each element is a vector of dates. I want to select the first date in each row and add it as a new column with mutate, but I have run into an indexing problem.

I have tried indexing the list, but it does not work row by row: every row shows the first element of the first row. The code is shown below:

> head(data$Date)
[[1]]
 [1] "2016-06-08" "2016-06-08" "2016-06-13" "2016-06-13" "2016-06-13" "2016-06-14"
 [7] "2016-06-14" "2016-06-14" "2016-06-14" "2016-06-14" "2016-06-14" "2016-09-15"
[13] "2016-10-31"

[[2]]
[1] "2016-10-02"

[[3]]
[1] "2016-09-25"

[[4]]
[1] "2017-02-16"

> data %>%
+     mutate(time1 = Date[[1]][1])%>%
+     select(time1)
# A tibble: 29,036 x 1
   time1     
   <chr>     
 1 2016-06-08
 2 2016-06-08
 3 2016-06-08
 4 2016-06-08
 5 2016-06-08
 6 2016-06-08

We can also use pluck with reduce, which makes sure the Dates are not coerced to numeric:

library(tidyverse)
data %>%
    mutate(time1 =  map(Date, pluck, 1) %>%
                       reduce(c))
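To illustrate why reduce(c) is used rather than unlist(), here is a minimal sketch on a made-up list column of real Date objects. The toy tibble and the names kept and flattened are assumptions for illustration, and base `[[` stands in for pluck so that the sketch depends only on base R behaviour:

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: a list column holding Date vectors
data <- tibble(
  id   = 1:3,
  Date = list(
    as.Date(c("2016-06-08", "2016-06-13", "2016-10-31")),
    as.Date("2016-10-02"),
    as.Date("2016-09-25")
  )
)

# First element of each row's vector, still a list at this point
firsts <- map(data$Date, ~ .x[[1]])

kept      <- reduce(firsts, c)  # c() on Date objects keeps the Date class
flattened <- unlist(firsts)     # unlist() drops the class, leaving plain numbers

class(kept)
class(flattened)
```

If the column already holds character strings, as in the printed output of the question, both approaches give the same character vector and the distinction does not matter.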

If we need the last element, pluck it with last:

data %>%
   mutate(time1 = map(Date, pluck, last) %>% 
                    reduce(c))

unnest_wider() from tidyr worked well for me:

data %>%
  tidyr::unnest_wider(col = Date) %>%
  select(1, Date = 2)

If you want one row per list element instead (the second and later elements of each vector become additional rows), just use unnest():

data %>%
  tidyr::unnest(cols = Date)
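As a rough sketch of what unnest() does to the shape of the data, the example below uses a made-up three-row tibble (the id column and the values are assumptions, not the asker's data) and then keeps the first date per id to get back to one row each:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one vector of date strings per row
data <- tibble(
  id   = 1:3,
  Date = list(
    c("2016-06-08", "2016-06-13", "2016-10-31"),
    "2016-10-02",
    "2016-09-25"
  )
)

# One row per list element: 3 + 1 + 1 = 5 rows
long <- data %>%
  unnest(cols = Date)

# Keep only the first row of each id to recover the first date
first_per_id <- long %>%
  group_by(id) %>%
  slice_head(n = 1) %>%
  ungroup()
```

This relies on a column that identifies the original row (id here); without one, the original row boundaries are lost after unnesting.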

Try using the map function from the tidyverse package purrr:

data %>%
  mutate(time1 = map(Date, ~ .[[1]]) %>% unlist()) %>% 
  select(time1)

The map() function applies the extraction to each element of the list column, so you get the first element of every row's vector rather than only the first row's. Because map() always returns a list, you need to unlist() the output to store it as an ordinary vector column.
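A possible variation, assuming the dates are stored as character strings as in the question's output, is purrr's typed variant map_chr(), which returns a character vector directly and makes the unlist() step unnecessary. The toy tibble below is made up for illustration:

```r
library(dplyr)
library(purrr)

# Hypothetical toy data standing in for the asker's `data`
data <- tibble(
  id   = 1:3,
  Date = list(
    c("2016-06-08", "2016-06-13", "2016-10-31"),
    "2016-10-02",
    "2016-09-25"
  )
)

result <- data %>%
  mutate(time1 = map_chr(Date, 1))  # a bare 1 extracts each vector's first element

result$time1
```

map_chr() raises an error if an extracted value is not a single string, which can be a useful check on messy data.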

An alternative that only uses dplyr is to just use the rowwise() function, which applies functions within mutate separately in each row.

library(dplyr)

iris %>%
  group_by(Species) %>%
  summarise(petals = list(Petal.Length)) %>%
  rowwise() %>%
  mutate(first = first(petals), last = last(petals))
#> # A tibble: 3 × 4
#> # Rowwise: 
#>   Species    petals     first  last
#>   <fct>      <list>     <dbl> <dbl>
#> 1 setosa     <dbl [50]>   1.4   1.4
#> 2 versicolor <dbl [50]>   4.7   4.1
#> 3 virginica  <dbl [50]>   6     5.1

