I have some messy data representing the feedback from the PO creation process:
library(data.table)
PO <- c(1, 1, 2, 2, 3, 4, 5, 6)
Rating <- c(3, 0, 0, 1, 3, 4, 5, 4)
dt <- data.table(PO, Rating)
> dt
PO Rating
1: 1 3
2: 1 0
3: 2 0
4: 2 1
5: 3 3
6: 4 4
7: 5 5
8: 6 4
PO #1 has two ratings, 3 and 0, and PO #2 has ratings of 0 and 1. In all such cases, I want to set every row for that PO to the maximum rating for that PO:
PO Rating
1: 1 3
2: 1 3 <- changed from 0
3: 2 1 <- changed from 0
4: 2 1
5: 3 3
6: 4 4
7: 5 5
8: 6 4
The first step is to detect the POs that have this issue. I have the following R code for this:
t <- dt[, .(U=length(unique(Rating))), by=.(PO)]
> t
PO U
1: 1 2
2: 2 2
3: 3 1
4: 4 1
5: 5 1
6: 6 1
This shows that PO #1 and #2 have two unique ratings. Now, my task is to find the max of these unique ratings and assign them back into the data table dt.
How do I do this in R?
Using data.table functions:
# group by PO, find the max Rating within each group, and assign
# that max value back to Rating for every row of the group
dt[ , Rating := max(Rating, na.rm = TRUE), by = PO]
Cheers!
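As a quick sanity check on the grouped assignment above, here is a minimal, self-contained sketch that rebuilds the example data, applies the update, and then re-runs a detection step like the one in the question. It uses `uniqueN()` from data.table in place of `length(unique(...))`; the column name `U` simply mirrors the question. One caveat: with `na.rm = TRUE`, a PO whose ratings are all `NA` would yield `-Inf` with a warning, which is not handled here.

```r
library(data.table)

# Rebuild the example data from the question
dt <- data.table(PO     = c(1, 1, 2, 2, 3, 4, 5, 6),
                 Rating = c(3, 0, 0, 1, 3, 4, 5, 4))

# Replace each Rating with the maximum Rating of its PO (modifies dt by reference)
dt[, Rating := max(Rating, na.rm = TRUE), by = PO]

# Re-run the detection step: every PO should now have one distinct rating
check <- dt[, .(U = uniqueN(Rating)), by = PO]
all_fixed <- all(check$U == 1L)
all_fixed
# TRUE
```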
We can also order by PO and by descending Rating, and then assign the first element of each group:
dt[order(PO, -Rating), Rating := Rating[1], PO]
dt
# PO Rating
#1: 1 3
#2: 1 3
#3: 2 1
#4: 2 1
#5: 3 3
#6: 4 4
#7: 5 5
#8: 6 4
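If you would rather touch only the POs flagged by the detection step in the question, one possible sketch is to collect those POs first and restrict the update to them. The helper name `multi` is made up for illustration; the end result is the same as updating every group, so this is mainly useful when you also want the list of affected POs.

```r
library(data.table)

# Rebuild the example data from the question
dt <- data.table(PO     = c(1, 1, 2, 2, 3, 4, 5, 6),
                 Rating = c(3, 0, 0, 1, 3, 4, 5, 4))

# POs that have more than one distinct rating (hypothetical helper name)
multi <- dt[, .(U = uniqueN(Rating)), by = PO][U > 1, PO]

# Update only the rows belonging to those POs, group by group
dt[PO %in% multi, Rating := max(Rating), by = PO]
```

Here `multi` holds POs 1 and 2, and only their rows are rewritten.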