I have a column in a data frame that consists of letters describing wind directions. I need to find the most common direction for each row, which would involve counting the number of occurrences of each letter, and then selecting the letter that was most common. This is an example of the data frame:
structure(list(Day = c("15", "16", "17", "18", "19", "20"), Month = structure(c(4L,
4L, 4L, 4L, 4L, 4L), .Label = c("Dec", "Nov", "Oct", "Sep"), class = "factor"),
Year = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("2012",
"2013", "2014", "2015", "2018", "2019", "2020"), class = "factor"),
Time = structure(c(10L, 10L, 10L, 10L, 10L, 10L), .Label = c("1-2pm",
"10-11am", "11-12am", "12-1pm", "2-3pm", "3-4pm", "4-5pm",
"5-6pm", "7-8am", "8-9am", "9-10am"), class = "factor"),
Direction_Abrev = c("S-SE", "S-SE", "SW-S", "W-SE", "W-SW",
"SW-S")), row.names = c(NA, 6L), class = "data.frame")
I would like the resulting data frame to be like the following:
Day Month Year Time Direction_Abrev
1 15 Sep 2013 8-9am S
2 16 Sep 2013 8-9am S
3 17 Sep 2013 8-9am S
4 18 Sep 2013 8-9am W-SE
5 19 Sep 2013 8-9am W
6 20 Sep 2013 8-9am S
That is, each row should return its most common letter. There is an issue in cases like row 4, where all letters are equally common. In these cases I would like to return the original value, if that is possible. Thanks in advance.
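# dat is the data frame from the question
sapply(dat$Direction_Abrev, function(s) {
  # split into single characters and drop the "-" separator, keeping duplicates
  chars <- strsplit(s, "")[[1]]
  counts <- sort(table(chars[chars != "-"]), decreasing = TRUE)
  # on a tie for first place (or only one distinct letter), keep the original value
  if (length(counts) < 2 || counts[1] == counts[2]) s else names(counts)[1]
})
#   S-SE   S-SE   SW-S   W-SE   W-SW   SW-S
#    "S"    "S"    "S" "W-SE"    "W"    "S"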
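The counting idea can also be wrapped in a named helper and written back into the column. This is only a minimal sketch: the helper name `most_common_letter` is hypothetical, the data frame is assumed to be called `dat`, and only the `Direction_Abrev` column from the question is rebuilt so the example runs on its own.

```r
# Only the relevant column from the question's example data
dat <- data.frame(
  Direction_Abrev = c("S-SE", "S-SE", "SW-S", "W-SE", "W-SW", "SW-S"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the single most frequent letter,
# or the original string when several letters tie for first place.
most_common_letter <- function(s) {
  chars <- strsplit(s, "")[[1]]       # split into single characters
  chars <- chars[chars != "-"]        # drop the separator, keep duplicates
  counts <- table(chars)
  if (length(counts) == 0) return(s)  # empty or NA input: leave unchanged
  top <- names(counts)[counts == max(counts)]
  if (length(top) == 1) top else s
}

# vapply guarantees a character vector with one value per row
dat$Direction_Abrev <- vapply(dat$Direction_Abrev, most_common_letter,
                              character(1), USE.NAMES = FALSE)
dat$Direction_Abrev
# [1] "S"    "S"    "S"    "W-SE" "W"    "S"
```

Using `vapply` with `character(1)` rather than `sapply` is a design choice here: it fails loudly if the helper ever returns something other than a single string.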
Here is a base R option using strsplit + intersect:
transform(
  df,
  Direction_Abrev = unlist(
    ifelse(
      lengths(
        v <- sapply(
          strsplit(Direction_Abrev, "-"),
          function(x) do.call(intersect, strsplit(x, ""))
        )
      ),
      v,
      Direction_Abrev
    )
  )
)
which gives
Day Month Year Time Direction_Abrev
1 15 Sep 2013 8-9am S
2 16 Sep 2013 8-9am S
3 17 Sep 2013 8-9am S
4 18 Sep 2013 8-9am W-SE
5 19 Sep 2013 8-9am W
6 20 Sep 2013 8-9am S
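To see what the intersect step does, the intermediate values can be inspected for single entries. This is just an illustrative sketch; the variable names `parts` and `letters_by_part` are made up for the example. It also assumes each value has exactly two parts separated by "-", since base `intersect()` takes two arguments.

```r
# Split one value at the "-" separator
parts <- strsplit("S-SE", "-")[[1]]      # "S" "SE"

# Split each part into its letters
letters_by_part <- strsplit(parts, "")   # list("S", c("S", "E"))

# Letters shared by both parts
shared <- do.call(intersect, letters_by_part)
shared
# [1] "S"

# When the two parts share no letter, the result is empty,
# which is why the answer falls back to the original value
no_overlap <- do.call(intersect, strsplit(strsplit("W-SE", "-")[[1]], ""))
length(no_overlap)
# [1] 0
```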