Arrange data frame columns by class: numeric before character
Good afternoon,
Assume we have the following dataset:
dput(head(cylinder))
structure(list(X19910108 = c("19910109", "19910104", "19910104",
"19910111", "19910104", "19910111"), X126 = c("X266", "B7", "T133",
"J34", "T218", "X249"), TVGUIDE = c("TVGUIDE", "MODMAT", "MASSEY",
"KMART", "MASSEY", "ROSES"), X25503 = c(25503L, 47201L, 39039L,
37351L, 38039L, 35751L), YES = c("YES", "YES", "YES", "NO", "YES",
"NO"), KEY = c("KEY", "KEY", "KEY", "KEY", "KEY", "KEY"), YES.1 = c("YES",
"YES", "YES", "YES", "YES", "YES"), BENTON = c("BENTON", "BENTON",
"BENTON", "BENTON", "BENTON", "BENTON"), GALLATIN = c("GALLATIN",
"GALLATIN", "GALLATIN", "GALLATIN", "GALLATIN", "GALLATIN"),
UNCOATED = c("UNCOATED", "UNCOATED", "UNCOATED", "UNCOATED",
"UNCOATED", "COATED"), UNCOATED.1 = c("UNCOATED", "COATED",
"UNCOATED", "COATED", "UNCOATED", "COATED"), NO = c("NO",
"NO", "NO", "NO", "NO", "NO"), LINE = c("LINE", "LINE", "LINE",
"LINE", "LINE", "LINE"), YES.2 = c("YES", "YES", "YES", "YES",
"YES", "YES"), Motter94 = c("Motter94", "WoodHoe70", "WoodHoe70",
"WoodHoe70", "WoodHoe70", "Motter94"), X821 = c(821L, 815L,
816L, 816L, 816L, 827L), X2 = c(2, 9, 9, 2, 2, 2), TABLOID = c("TABLOID",
"CATALOG", "CATALOG", "TABLOID", "CATALOG", "TABLOID"), NorthUS = c("NorthUS",
"NorthUS", "NorthUS", NA, "NorthUS", "CANADIAN"), X1911 = c(NA,
NA, 1910L, 1910L, 1910L, 1911L), X55 = c(55, 62, 52, 50,
50, 50), X46 = c(46L, 40L, 40L, 46L, 40L, 46L), X0.2 = c("0.3",
"0.433", "0.3", "0.3", "0.267", "0.3"), X17 = c(15, 16, 16,
17, 16.8, 16.5), X78 = c(80L, 80L, 75L, 80L, 76L, 75L), X0.75 = c(0.75,
NA, 0.3125, 0.75, 0.4375, 0.75), X20 = c(20L, 30L, 30L, 30L,
28L, 30L), X13.1 = c(6.6, 6.5, 5.6, 0, 8.6, 0), X1700 = c(1900L,
1850L, 1467L, 2100L, 1467L, 2600L), X50.5 = c(54.9, 53.8,
55.6, 57.5, 53.8, 62.5), X36.4 = c(38.5, 39.8, 38.8, 42.5,
37.6, 37.5), X0 = c(0, 0, 0, 5, 5, 6), X0.1 = c(0, 0, 0,
0, 0, 0), X2.5 = c(2.5, 2.8, 2.5, 2.3, 2.5, 2.5), X1 = c(0.7,
0.9, 1.3, 0.6, 0.8, 0.6), X34 = c(34, 40, 40, 35, 40, 30),
X40 = c(40L, 40L, 40L, 40L, 40L, 40L), X105 = c(105, 103.87,
108.06, 106.67, 103.87, 106.67), X100 = c(100L, 100L, 100L,
100L, 100L, 100L), band = c("noband", "noband", "noband",
"noband", "noband", "noband")), row.names = c(NA, 6L), class = "data.frame")
The column types are:
sapply(cylinder,class)
X19910108 X126 TVGUIDE X25503 YES KEY YES.1 BENTON GALLATIN
"character" "character" "character" "integer" "character" "character" "character" "character" "character"
UNCOATED UNCOATED.1 NO LINE YES.2 Motter94 X821 X2 TABLOID
"character" "character" "character" "character" "character" "character" "integer" "numeric" "character"
NorthUS X1911 X55 X46 X0.2 X17 X78 X0.75 X20
"character" "integer" "numeric" "integer" "character" "numeric" "integer" "numeric" "integer"
X13.1 X1700 X50.5 X36.4 X0 X0.1 X2.5 X1 X34
"numeric" "integer" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
X40 X105 X100 band
"integer" "numeric" "integer" "character"
I want to reorder the dataset columns so that the "numeric" columns (numeric and integer) come first, on the left. The "character" columns must be on the right.
Thank you for your help!
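We could also use a single where() call with the | logical operator, which combines the two expressions in a lambda. Note that a single where() returns the matching columns in their existing order, so on its own it selects the numeric and character columns but does not move the numeric ones to the left: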
library(dplyr)
df %>%
select(where(~ is.character(.)|is.numeric(.)))
Something along these lines would work:
library(dplyr)
df %>% select(where(is.numeric), where(is.character))
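To make the difference between the two dplyr calls concrete, here is a minimal sketch on a small made-up data frame (toy and its column names are assumptions for illustration, not the asker's cylinder data). It assumes dplyr 1.0.0 or later, where where() is available.

```r
library(dplyr)

# Hypothetical toy data, not the asker's `cylinder` data set.
toy <- data.frame(
  id    = c("a", "b"),
  n     = c(1L, 2L),
  label = c("x", "y"),
  value = c(0.5, 1.5),
  stringsAsFactors = FALSE
)

# Two where() calls: numeric/integer columns first, then character columns.
reordered <- toy %>% select(where(is.numeric), where(is.character))
names(reordered)   # "n" "value" "id" "label"

# One where() with `|`: the same columns, but in their original order.
unchanged <- toy %>% select(where(~ is.character(.) | is.numeric(.)))
names(unchanged)   # "id" "n" "label" "value"
```

Columns of any other class (factor, Date, logical) are dropped by both calls, so another where() or everything() would be needed at the end to keep them.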
Here is one option:
cylinder[
order(
as.integer(
factor(sapply(cylinder, class),
levels = c("numeric", "integer", "character")
)
)
)
]
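The snippet above ranks each column by its class through the factor levels and then orders the columns by that rank. Because "numeric" comes before "integer" in levels, double columns end up ahead of integer columns. The sketch below replays the idea on a made-up data frame (toy is an assumption, not the asker's data) and adds a variant that treats integer and double as one group while keeping their original relative order.

```r
# Hypothetical toy data, not the asker's `cylinder` data set.
toy <- data.frame(
  id    = c("a", "b"),
  n     = c(1L, 2L),
  label = c("x", "y"),
  value = c(0.5, 1.5),
  stringsAsFactors = FALSE
)

# Rank each column by its (first) class. A class missing from `levels`
# becomes NA, and order() puts NA last by default.
class_rank <- as.integer(factor(
  vapply(toy, function(col) class(col)[1], character(1)),
  levels = c("numeric", "integer", "character")
))
by_class <- toy[order(class_rank)]
names(by_class)        # "value" "n" "id" "label"

# Variant: integer and double form one group; ties keep their original order.
is_num <- vapply(toy, is.numeric, logical(1))
numeric_first <- toy[order(!is_num)]
names(numeric_first)   # "n" "value" "id" "label"
```

The variant relies on order() leaving tied elements in their original positions, so columns within each group are not shuffled.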
A compact option with collapse functions:
library(collapse)
colorderv(d, nv(d, 2))
# same but more explicit
colorderv(d, num_vars(d, return = "names"))
Using nv / num_vars together with data.table::setcolorder, we can update the column order by reference:
setcolorder(d, nv(d, 2))
I found a possible solution, but I would like to know whether someone can suggest a better one:
unique(sapply(cylinder,class))
[1] "character" "integer" "numeric"
> my.order <-unique(sapply(cylinder,class))
> head(cylinder %>%
+ select(sapply(., class) %>% .[order(match(., my.order))] %>% names))
X19910108 X126 TVGUIDE YES KEY YES.1 BENTON GALLATIN UNCOATED UNCOATED.1 NO LINE YES.2 Motter94 TABLOID NorthUS
1 19910109 X266 TVGUIDE YES KEY YES BENTON GALLATIN UNCOATED UNCOATED NO LINE YES Motter94 TABLOID NorthUS
2 19910104 B7 MODMAT YES KEY YES BENTON GALLATIN UNCOATED COATED NO LINE YES WoodHoe70 CATALOG NorthUS
3 19910104 T133 MASSEY YES KEY YES BENTON GALLATIN UNCOATED UNCOATED NO LINE YES WoodHoe70 CATALOG NorthUS
4 19910111 J34 KMART NO KEY YES BENTON GALLATIN UNCOATED COATED NO LINE YES WoodHoe70 TABLOID <NA>
5 19910104 T218 MASSEY YES KEY YES BENTON GALLATIN UNCOATED UNCOATED NO LINE YES WoodHoe70 CATALOG NorthUS
6 19910111 X249 ROSES NO KEY YES BENTON GALLATIN COATED COATED NO LINE YES Motter94 TABLOID CANADIAN
X0.2 band X25503 X821 X1911 X46 X78 X20 X1700 X40 X100 X2 X55 X17 X0.75 X13.1 X50.5 X36.4 X0 X0.1 X2.5 X1
1 0.3 noband 25503 821 NA 46 80 20 1900 40 100 2 55 15.0 0.7500 6.6 54.9 38.5 0 0 2.5 0.7
2 0.433 noband 47201 815 NA 40 80 30 1850 40 100 9 62 16.0 NA 6.5 53.8 39.8 0 0 2.8 0.9
3 0.3 noband 39039 816 1910 40 75 30 1467 40 100 9 52 16.0 0.3125 5.6 55.6 38.8 0 0 2.5 1.3
4 0.3 noband 37351 816 1910 46 80 30 2100 40 100 2 50 17.0 0.7500 0.0 57.5 42.5 5 0 2.3 0.6
5 0.267 noband 38039 816 1910 40 76 28 1467 40 100 2 50 16.8 0.4375 8.6 53.8 37.6 5 0 2.5 0.8
6 0.3 noband 35751 827 1911 46 75 30 2600 40 100 2 50 16.5 0.7500 0.0 62.5 37.5 6 0 2.5 0.6
X34 X105
1 34 105.00
2 40 103.87
3 40 108.06
4 35 106.67
5 40 103.87
6 30 106.67
By luck, I found a built-in function for this purpose:
df %>% relocate(where(is.numeric), .after = where(is.character))
https://dplyr.tidyverse.org/reference/relocate.html
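One thing to watch with the relocate() call above: .after = where(is.character) moves the numeric columns to the right of the last character column, so the character columns end up on the left. Since relocate() moves the selected columns to the front when neither .before nor .after is given, the layout asked for in the question (numeric on the left) can be sketched as follows, again on a made-up data frame (toy is an assumption, not the asker's data).

```r
library(dplyr)

# Hypothetical toy data, not the asker's `cylinder` data set.
toy <- data.frame(
  id    = c("a", "b"),
  n     = c(1L, 2L),
  label = c("x", "y"),
  value = c(0.5, 1.5),
  stringsAsFactors = FALSE
)

# Default behaviour: the selected columns are moved to the front (left).
numeric_left <- toy %>% relocate(where(is.numeric))
names(numeric_left)    # "n" "value" "id" "label"

# The call from the answer: numeric columns go after the last character column.
numeric_right <- toy %>%
  relocate(where(is.numeric), .after = where(is.character))
names(numeric_right)   # "id" "label" "n" "value"
```

Unlike select(), relocate() keeps columns of every other class in place, which may be preferable when the data also contains factors or dates.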
This solution requires the columns to have names. For example:
colnames(df)<-NULL
df %>% relocate(where(is.numeric), .after = where(is.character))
This gives the following error:
Error: Can't select within an unnamed vector.
Run `rlang::last_error()` to see where the error occurred.