dplyr is fast, and I would like to use its %.% piping a lot. I want the equivalent of table() (a frequency count) that preserves the column name and returns a data.frame.
How can I achieve the same result as the code below using only dplyr functions? (Imagine a huge data.table, BIGiris, with 6M rows.)
> out<-as.data.frame(table(iris$Species))
> names(out)[1]<-'Species'
> names(out)[2]<-'my_cnt1'
> out
The output is shown below. Notice that I have to rename column 1 back to its original name. Also, in a dplyr mutate or similar call, I would like to be able to specify the name of my new count column.
Species my_cnt1
1 setosa 50
2 versicolor 50
3 virginica 50
Imagine joining the counts to a table like this (assume the iris data.frame has 6M rows and Species is more like a "species_ID"):
> habitat<-data.frame(Species=c('setosa'),lives_in='sea')
Final join and output (for joining, I need the column names to be preserved at all times):
> left_join(out,habitat)
Joining by: "Species"
Species my_cnt1 lives_in
1 setosa 50 sea
2 versicolor 50 <NA>
3 virginica 50 <NA>
For the first part you can use dplyr like this:
library(dplyr)
out <- iris %>% group_by(Species) %>% summarise(my_cnt1 = n())
out
Source: local data frame [3 x 2]
Species my_cnt1
1 setosa 50
2 versicolor 50
3 virginica 50
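The question asks for a plain data.frame, while summarise() returns dplyr's own table class. A minimal sketch of the conversion, assuming as.data.frame() is acceptable at the end of the chain (the name out_df is only illustrative):

```r
library(dplyr)

# Count rows per species and name the count column in the same step.
out_df <- iris %>%
  group_by(Species) %>%
  summarise(my_cnt1 = n()) %>%
  as.data.frame()   # drop the dplyr table class, keep the column names

class(out_df)
names(out_df)
```

The grouping column keeps its name Species, so no renaming is needed afterwards.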
To continue in one chain do this:
out <- iris %>% group_by(Species) %>% summarise(my_cnt1 = n()) %>% left_join(habitat)
out
Source: local data frame [3 x 3]
Species my_cnt1 lives_in
1 setosa 50 sea
2 versicolor 50 NA
3 virginica 50 NA
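If the "Joining by" message is unwanted, the join key can be named explicitly. This is a sketch that rebuilds the habitat table from the question; stringsAsFactors = FALSE is an assumption made here so that the key is a plain character column:

```r
library(dplyr)

# Lookup table from the question, with the key stored as character.
habitat <- data.frame(Species = "setosa", lives_in = "sea",
                      stringsAsFactors = FALSE)

out <- iris %>%
  group_by(Species) %>%
  summarise(my_cnt1 = n()) %>%
  left_join(habitat, by = "Species")   # explicit key, no guessing

out
```

Species with no match in habitat get NA in lives_in, as in the output above.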
By the way, dplyr now uses %>% in place of %.%. It does the same thing and is part of the magrittr package as well.
Or you can simply attach the dataframe and then run the table function. This will display the column names too.
> attach(iris)
> table(Species)
Species
setosa versicolor virginica
50 50 50
count() may be a convenient option to get behavior similar to table():
iris %>%
  group_by(Species) %>%
  count(name = "my_cnt1")
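The grouping step can also be folded into count() itself by passing the column directly. A short sketch, assuming a dplyr version recent enough that count() accepts the name argument (the variable cnt is only illustrative):

```r
library(dplyr)

# count() groups by the given column, tallies, and ungroups in one call.
cnt <- iris %>%
  count(Species, name = "my_cnt1")

cnt
```

Because iris is a plain data.frame, the result here is a data.frame with columns Species and my_cnt1.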
For table()-like output with two factors:
library(dplyr)
library(tidyr)  # provides pivot_wider()
iris %>%
  group_by(Species) %>%
  count(Petal.Width) %>%
  pivot_wider(names_from = Petal.Width, values_from = n)
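Combinations that never occur come out as NA in the wide result, whereas table() reports them as 0. A sketch of one way to match that, assuming a tidyr version whose pivot_wider() accepts a scalar values_fill (the name wide is only illustrative):

```r
library(dplyr)
library(tidyr)

wide <- iris %>%
  count(Species, Petal.Width) %>%
  pivot_wider(names_from = Petal.Width,
              values_from = n,
              values_fill = 0)   # absent combinations become 0, not NA

wide
```

The first column holds Species and every other column is one Petal.Width value, so the counts across the whole result add up to the number of rows in iris.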