I imported a big dataset (~6 million rows) into R using the ffbase package. It lists people enrolled in high school in Brazil. Essentially, it has two columns: Id (student ID number) and University (the institution's name).
I would like to create a column - named Group in my example - relating each university to its educational group:
Id University Group
000001 Anhanguera Kroton
000002 Unopar Kroton
000003 Anhembi Laureate
000004 FMU Laureate
PS: My dataset contains no information about educational groups, but I do know which group each university belongs to. I need to attach that detail to my data.
PS2: The class of the University column is ff_vector.
I appreciate any contribution you might make.
If you have a long list of groups, this may not be the quickest way, but you can use mutate from the dplyr package:
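library(dplyr)
# Build the Ids as strings: 000001:000004 is parsed as 1:4 and loses the leading zeros
data <- data.frame("Id" = sprintf("%06d", 1:4), "University" = c("Anhanguera", "Unopar", "Anhembi", "FMU"))
# Map each university to its group; universities not listed get NA
data <- mutate(data, Group = as.factor(
ifelse(University %in% "Anhanguera", "Kroton",
ifelse(University %in% "Unopar", "Kroton",
ifelse(University %in% "Anhembi", "Laureate",
ifelse(University %in% "FMU", "Laureate", NA))))))
data
str(data)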
I used a plain University column here; just substitute your ff_vector column for it.
If you would like to keep Group as character, remove the as.factor().
I'm not familiar with ffbase, but see ffbase2 for using dplyr together with ffbase.
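For a long list of universities, the nested ifelse calls get hard to maintain. A lookup table is one alternative. The sketch below uses base R's match() on a plain data.frame; the lookup object and its contents are only an illustration built from the four example rows in the question, and I have not tested this on an ff_vector, so treat it as a starting point rather than a drop-in solution.

```r
# Hypothetical lookup table: one row per university, with its group.
lookup <- data.frame(
  University = c("Anhanguera", "Unopar", "Anhembi", "FMU"),
  Group      = c("Kroton", "Kroton", "Laureate", "Laureate"),
  stringsAsFactors = FALSE
)

# Example data mirroring the question (Ids kept as strings to preserve zeros).
data <- data.frame(
  Id         = sprintf("%06d", 1:4),
  University = c("Anhanguera", "Unopar", "Anhembi", "FMU"),
  stringsAsFactors = FALSE
)

# match() gives, for each University in data, its row position in lookup
# (NA when the university is not listed), so unlisted universities get NA.
data$Group <- lookup$Group[match(data$University, lookup$University)]
data
```

With this layout, adding a university means adding one row to lookup instead of another ifelse level.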