
Count rows of dataframes within a list of dataframes

I have a list of dataframes; str(datalist, max.level = 1) shows:

List of 9
 $ :'data.frame':   200 obs. of  21 variables:
 $ :'data.frame':   200 obs. of  21 variables:
 $ :'data.frame':   200 obs. of  21 variables:
 $ :'data.frame':   200 obs. of  21 variables:
 $ :'data.frame':   200 obs. of  21 variables:
 $ :'data.frame':   200 obs. of  21 variables:
 $ :'data.frame':   200 obs. of  21 variables:
 $ :'data.frame':   200 obs. of  21 variables:
 $ :'data.frame':   41 obs. of  21 variables:

Some of the 21 variables in each dataframe are themselves dataframes. For example, the 18th variable, topics, holds dataframes that each contain 3 variables. How do I get the number of rows in each topics dataframe?

I tried the map() function from the purrr package, x <- map(datalist, ~.x[["topics"]]), followed by sapply(x, NROW), but this gives me the number of rows of the original dataframe and not of the topics dataframes. Any help would be appreciated.

To give you an example of what a topics dataframe looks like, here is datalist[[1]]$topics[[1]]:

                  urlkey                   name    id
1            selfdefense           Self-Defense   443
2                martial           Martial Arts   681
3                jujitsu              Jiu Jitsu  9615
4     mixed-martial-arts     Mixed Martial Arts 15514
5             kickboxing             Kickboxing 18225
6              jiu-jitsu              Jiu-jitsu 21219
7     brazilian-jiujitsu    Brazilian Jiu-Jitsu 22237
8 mma-mixed-martial-arts MMA Mixed Martial Arts 35023
9    brazilian-jiu-jitsu    Brazilian Jiu Jitsu 46818

The approach you described works for me, as long as NROW is applied to each element of the topics column.

First, a reproducible example:

datalist <- list(
  data.frame(V1 = 1:2, topics = I(list(mtcars, mtcars))),
  data.frame(V1 = 1:2, topics = I(list(mtcars, mtcars)))
)
str(datalist)
# List of 2
#  $ :'data.frame': 2 obs. of  2 variables:
# ..$ V1    : int [1:2] 1 2
# ..$ topics:List of 2
# .. ..$ :'data.frame': 32 obs. of  11 variables:
# .. .. ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# .. .. ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
# .. .. ..$ disp: num [1:32] 160 160 108 258 360 ...
# .. .. ..$ hp  : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
# .. .. ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# .. .. ..$ wt  : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
# .. .. ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
# .. .. ..$ vs  : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
# .. .. ..$ am  : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
# .. .. ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
# .. .. ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
# .. ..$ :'data.frame': 32 obs. of  11 variables:
# .. .. ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# .. .. ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
# .. .. ..$ disp: num [1:32] 160 160 108 258 360 ...
# .. .. ..$ hp  : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
# .. .. ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# .. .. ..$ wt  : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
# .. .. ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
# .. .. ..$ vs  : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
# .. .. ..$ am  : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
# .. .. ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
# .. .. ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
# .. ..- attr(*, "class")= chr "AsIs"
# $ :'data.frame':  2 obs. of  2 variables:
# ..$ V1    : int [1:2] 1 2
# ..$ topics:List of 2
# .. ..$ :'data.frame': 32 obs. of  11 variables:
# .. .. ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# .. .. ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
# .. .. ..$ disp: num [1:32] 160 160 108 258 360 ...
# .. .. ..$ hp  : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
# .. .. ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# .. .. ..$ wt  : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
# .. .. ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
# .. .. ..$ vs  : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
# .. .. ..$ am  : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
# .. .. ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
# .. .. ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
# .. ..$ :'data.frame': 32 obs. of  11 variables:
# .. .. ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# .. .. ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
# .. .. ..$ disp: num [1:32] 160 160 108 258 360 ...
# .. .. ..$ hp  : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
# .. .. ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# .. .. ..$ wt  : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
# .. .. ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
# .. .. ..$ vs  : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
# .. .. ..$ am  : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
# .. .. ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
# .. .. ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
# .. ..- attr(*, "class")= chr "AsIs"

Your solution, with sapply() called inside map() so that NROW runs on each nested dataframe:

library(purrr)
map(datalist, ~ sapply(.x[["topics"]], NROW))
# [[1]]
# [1] 32 32
# 
# [[2]]
# [1] 32 32
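The difference from the attempt in the question can be seen on a small example. This is a minimal sketch: the data frame `df` and its contents are hypothetical stand-ins, assuming `topics` is a list column holding one data frame per row. Applied to the list column itself, NROW() reports the length of the list, which equals the number of rows of the outer data frame; applied to each element, it reports the rows of each nested data frame.

```r
# Hypothetical miniature of the structure in the question:
# one data frame whose `topics` column is a list of data frames.
df <- data.frame(V1 = 1:3)
df$topics <- list(head(mtcars, 9), head(mtcars, 4), head(mtcars, 0))

# NROW() on the list column returns the length of the list,
# i.e. the number of rows of the outer data frame.
outer_n <- NROW(df[["topics"]])

# NROW() on each element returns the rows of each nested data frame.
inner_n <- sapply(df[["topics"]], NROW)

print(outer_n)
print(inner_n)
```

This is probably why sapply(x, NROW) in the question returned 200 for the first eight list elements and 41 for the last.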
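
Another option, using base R only:

count_rows <- function(dfs) {
  # topics is a list column, so count the rows of each nested dataframe
  sapply(dfs$topics, NROW)
}
count <- lapply(datalist, count_rows)

The count_rows function extracts the topics column from each dataframe in the list and applies NROW to every dataframe stored in it. Calling nrow(dfs$topics) directly would return NULL here, because a list column has no dim attribute.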
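If a single vector of counts is more convenient than a nested list, the per-element count can be written with vapply(), which also checks that every result is a single integer. This is a sketch under the assumption that the real data has the same shape as the reproducible example; the helper name `topic_rows` and the truncated mtcars copies are illustrative only.

```r
# Same shape as the reproducible example (an assumption about the real data),
# with nested data frames of different sizes so the counts are distinguishable.
datalist <- list(
  data.frame(V1 = 1:2, topics = I(list(mtcars, head(mtcars, 5)))),
  data.frame(V1 = 1:2, topics = I(list(head(mtcars, 3), mtcars)))
)

# Hypothetical helper: rows of every nested `topics` data frame in one data frame.
# vapply() fails loudly if any element does not yield exactly one integer.
topic_rows <- function(df) vapply(df[["topics"]], NROW, integer(1))

per_df <- lapply(datalist, topic_rows)  # one integer vector per data frame
flat <- unlist(per_df)                  # one vector across all data frames

print(per_df)
print(flat)
```

Here per_df keeps the grouping by outer data frame, while flat is easier to summarise, for example with sum() or table().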
