The following returns a data.table with 150 rows:
library(data.table)
library(dplyr)  # provides %>% and row_number()
irisDT <- iris %>% data.table
irisDT[Sepal.Width > 3, Petal.Width_rank := row_number(Petal.Width),
       by = "Species"]
However, I want to apply the subset Sepal.Width > 3 at the same time, so that only the matching rows are returned, instead of doing a "conditional mutate". That is, I'm trying to do something like:
library(dplyr)
iris %>%
  filter(Sepal.Width > 3) %>%
  group_by(Species) %>%
  mutate(Petal.Width_rank = row_number(Petal.Width))
What's the idiomatic way to do this in data.table?
Chain your calls:
data.table(iris)[
  Sepal.Width > 3
][,
  Petal.Width_rank := rank(Petal.Width, ties.method = "first"),
  by = Species
][]
This produces 67 rows.
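To make the difference between the two forms concrete, here is a minimal sketch that runs both on copies of iris and compares the row counts. The object names inPlace and chained are illustrative only, and base rank() stands in for dplyr's row_number() so that the example needs nothing beyond data.table.

```r
library(data.table)

# Work on a copy so the built-in iris data frame is left untouched.
irisDT <- as.data.table(iris)

# Subset-assign: `:=` with a filter in `i` keeps every row and leaves
# the new column as NA on rows that fail the filter.
inPlace <- copy(irisDT)
inPlace[Sepal.Width > 3,
        Petal.Width_rank := rank(Petal.Width, ties.method = "first"),
        by = Species]

# Chained: the first [ ] drops rows, the second adds the column
# to the already filtered table.
chained <- irisDT[Sepal.Width > 3][
  , Petal.Width_rank := rank(Petal.Width, ties.method = "first"),
  by = Species]

nrow(inPlace)                         # 150
nrow(chained)                         # 67
sum(is.na(inPlace$Petal.Width_rank))  # 83
```

The ranks themselves are the same in both results; only the rows that are kept differ.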
You could try:
DT1 <- setDT(iris)[Sepal.Width > 3,
                   c(.SD, list(Petal.Width_rank = row_number(Petal.Width))),
                   by = Species]
dim(DT1)
#[1] 67 6
In data.table 1.9.5, you can also use frank, which offers different options for ties (as mentioned by @docendo discimus in the comments):
DT2 <- setDT(iris)[Sepal.Width > 3,
                   c(.SD, list(Petal.Width_rank = frank(Petal.Width, ties.method = 'first'))),
                   by = Species]
dim(DT2)
#[1] 67 6
identical(DT1, DT2)
#[1] TRUE
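As a rough illustration of how the ties options differ, the sketch below ranks a small made-up vector (x is not taken from iris) with several ties.method values. With "first", tied values are ranked in order of appearance, which is why it matches row_number() above.

```r
library(data.table)

# Toy vector with one tie: the two 0.2 values.
x <- c(0.2, 0.4, 0.2, 0.1)

r_first   <- frank(x, ties.method = "first")    # 2 4 3 1
r_min     <- frank(x, ties.method = "min")      # 2 4 2 1
r_dense   <- frank(x, ties.method = "dense")    # 2 3 2 1
r_average <- frank(x, ties.method = "average")  # 2.5 4.0 2.5 1.0
```

Only "dense" leaves no gaps after a tie, so it is the one to pick when consecutive rank values are needed.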