I have a dataset that looks like this:
library(dplyr)
Data <- tibble(
Area1 = rep(c("A1 1AA", "B3 4TT","D1 1AA", "A10 6TY","A2 9GG"),2),
Area2 = c("A2 7BB", "B11 5TT","A14 9SS","A4 4HH","V6 9FF", "A11 6TT","B4 3DD","D1 4FF","G5 7DD","A2 7YY"))
I would like to sort it by Area1 and then by Area2; however, arrange does not produce the desired result because it sorts in lexicographical order.
Data %>% arrange(Area1,Area2) #not the desired order
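To illustrate the problem on a smaller scale, here is a minimal sketch using a hypothetical three-element vector x (not part of the dataset above). It uses base R's sort with method = "radix", which compares strings character by character in the C locale, so "A10" lands before "A2" because "1" sorts before "2":

```r
# Hypothetical example vector, not taken from the dataset above
x <- c("A2 9GG", "A10 6TY", "A1 1AA")

# Character-by-character (C-locale) comparison: "A10" sorts before "A2"
lexicographic <- sort(x, method = "radix")
print(lexicographic)
# [1] "A1 1AA"  "A10 6TY" "A2 9GG"
```

What I want instead is for the digits to be compared as numbers, so that A2 comes before A10.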
Is there a way, using dplyr, to produce the output below, which is in the desired order?
Output <- tibble(
Area1 = c("A1 1AA", "A1 1AA", "A2 9GG","A2 9GG","A10 6TY","A10 6TY", "B3 4TT","B3 4TT","D1 1AA","D1 1AA"),
Area2 = c("A2 7BB", "A11 6TT","A2 7YY","V6 9FF","A4 4HH", "G5 7DD","B4 3DD","B11 5TT","A14 9SS","D1 4FF"))
One option is to use mixedorder from gtools together with slice:
library(dplyr)
library(gtools)
library(stringr)
Output2 <- Data %>%
slice(mixedorder(str_c(Area1, Area2)))
Another option is to split each column into its non-numeric and numeric parts and use both as sort keys in arrange:
Output3 <- Data %>%
arrange(str_remove_all(Area1, "\\d+"),
readr::parse_number(Area1),
str_remove_all(Area2, "\\d+"),
readr::parse_number(Area2))
Checking against the OP's expected output:
identical(Output, Output2)
#[1] TRUE
identical(Output, Output3)
#[1] TRUE
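To see what the second option actually sorts by, it may help to print the two keys it builds. The sketch below uses a hypothetical vector x rather than the original data. One caveat, stated as an observation rather than a guarantee: parse_number only picks up the first number in each string, and the letters after the space stay in the text key, so this approach gives the expected order for the data above but may not for every set of postcodes.

```r
library(stringr)

# Hypothetical example vector, not taken from the original data
x <- c("A1 1AA", "A10 6TY", "A2 9GG")

# Key 1: the string with every run of digits removed
text_key <- str_remove_all(x, "\\d+")
print(text_key)
# [1] "A AA" "A TY" "A GG"

# Key 2: the first number found in each string
num_key <- readr::parse_number(x)
print(num_key)
# [1]  1 10  2

# arrange(text_key, num_key) is equivalent to ordering by both keys in turn
sorted <- x[order(text_key, num_key)]
print(sorted)
# [1] "A1 1AA"  "A2 9GG"  "A10 6TY"
```

Here the text key alone already separates the three values, which is why the result matches; if two rows shared the same leading letter and differed only in the first number and the trailing letters, the text key would be compared first.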
Here is another option using arrange() and str_sort():
library(dplyr)
library(stringr)
Data %>%
arrange(across(starts_with("Area"), ~match(.x, str_sort(unique(.x), numeric = TRUE))))
# A tibble: 10 x 2
Area1 Area2
<chr> <chr>
1 A1 1AA A2 7BB
2 A1 1AA A11 6TT
3 A2 9GG A2 7YY
4 A2 9GG V6 9FF
5 A10 6TY A4 4HH
6 A10 6TY G5 7DD
7 B3 4TT B4 3DD
8 B3 4TT B11 5TT
9 D1 1AA A14 9SS
10 D1 1AA D1 4FF
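The idea behind the match() call can be sketched on a single vector. The example below uses a hypothetical vector x (not the original data): str_sort(..., numeric = TRUE) puts the unique values in natural order, and match() then replaces each value with its position in that ordering, giving an integer key that arrange (or order) can sort on.

```r
library(stringr)

# Hypothetical example vector, not taken from the original data
x <- c("B3 4TT", "A10 6TY", "A2 9GG", "A1 1AA", "A2 9GG")

# Unique values in natural order: digit runs are compared as numbers
lvls <- str_sort(unique(x), numeric = TRUE)
print(lvls)
# [1] "A1 1AA"  "A2 9GG"  "A10 6TY" "B3 4TT"

# Position of each element of x within the naturally sorted values
key <- match(x, lvls)
print(key)
# [1] 4 3 2 1 2

# Ordering by the integer key yields the natural order
sorted <- x[order(key)]
print(sorted)
# [1] "A1 1AA"  "A2 9GG"  "A2 9GG"  "A10 6TY" "B3 4TT"
```

Inside across(), the same transformation is applied to each Area column in turn, so ties in Area1 are broken by the key computed for Area2.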