How can I sort a dataframe by a predetermined order of factor levels in R?
How do I create a factor with three levels in a dataframe in R?
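I created a factor for 19 different distances, and I need to define three levels: one for the direct impact (DirImp), another for my respective indirect-impact distances (Dist = "1km_","2km_","3km_","4km_","5km_","6km_","7km_","8km_","9km_","10km_","10km","20km","30km","40km","50km","60km","70km"), and a third for my control area (Contrl). The distances start at 0 (DirImp) and increase one kilometre at a time up to 10 km; from that point they increase in 10 km steps up to 70 km, and the last distance is the control.
So, to clarify: in my data frame I have a column (Dist) that contains these distances, alongside other columns with other information, and I used the code below to convert it to a factor.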
Structure of the Dist column:
levels(MY.DATAFRAME$Dist)
[1] "DirImp" "10km" "10km_" "1km_" "20km" "2km_" "30km"
[8] "3km_" "40km" "4km_" "50km" "5km_" "60km" "6km_"
[15] "70km" "7km_" "8km_" "9km_" "control"
How I would like it to be:
level 1 = Direct impact ("DirImp")
level 2 = Distances ("1km_","2km_","3km_","4km_","5km_","6km_","7km_","8km_","9km_","10km_","10km","20km","30km","40km","50km","60km","70km")
level 3 = Control area ("Contrl")
Column Dist = ("DirImp", "1km_","2km_","3km_","4km_","5km_","6km_","7km_","8km_","9km_","10km_","10km","20km","30km","40km","50km","60km","70km", "control")
MY.DATAFRAME$DistFact <- factor(MY.DATAFRAME$Dist, ordered = TRUE)
levels(MY.DATAFRAME$DistFact)
[1] "DirImp" "10km" "10km_" "1km_" "20km" "2km_" "30km"
[8] "3km_" "40km" "4km_" "50km" "5km_" "60km" "6km_"
[15] "70km" "7km_" "8km_" "9km_" "control"
Is something like the following what the question is asking for?
forcats::fct_collapse(y,
  DirImp = grep("DirImp", y, ignore.case = TRUE, value = TRUE),
  Distances = grep("km", y, ignore.case = TRUE, value = TRUE),
  Control = grep("control", y, ignore.case = TRUE, value = TRUE)
)
# [1] Distances Distances Distances Distances Distances Distances
# [7] Distances Distances Distances Distances Distances Distances
#[13] Distances Distances Distances Distances Distances Distances
#[19] Distances Distances Distances Distances Distances Distances
#[25] Distances Distances Distances Distances Control Distances
#Levels: DirImp Distances Control
Or, perhaps more readably,
grep_tmp <- function(pattern, x){
  grep(pattern, x, ignore.case = TRUE, value = TRUE)
}
forcats::fct_collapse(y,
  DirImp = grep_tmp("DirImp", y),
  Distances = grep_tmp("^\\d+km", y),
  Control = grep_tmp("control", y)
)
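If you would rather not depend on forcats, the same three-group collapse can be sketched in base R by assigning a named list to levels(). This is only an illustration: the vector dist below is made-up sample data that reuses level names from the question, and it assumes every distance level contains "km".

```r
# Hypothetical sample values, reusing level names from the question
dist <- c("DirImp", "1km_", "10km", "70km", "control", "5km_")
f <- factor(dist)

# Assumption: every distance level contains "km"
km_levels <- grep("km", levels(f), value = TRUE)

# Assigning a named list to levels() merges the old levels into the
# new ones; the new levels follow the order of the list
levels(f) <- list(
  DirImp    = "DirImp",
  Distances = km_levels,
  Control   = "control"
)

levels(f)
table(f)
```

The order of the list entries sets the order of the resulting levels, so DirImp comes first and Control last.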
Data
Here is sample data based on the levels posted in the question.
set.seed(1234)
x <- scan(text = '"DirImp" "10km" "10km_" "1km_" "20km" "2km_" "30km"
"3km_" "40km" "4km_" "50km" "5km_" "60km" "6km_"
"70km" "7km_" "8km_" "9km_" "control"', what = character())
y <- factor(sample(x, 30, TRUE), levels = x)
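If the goal is also to keep all 19 distances in the predetermined order (rather than the order in which R lists them), one possible approach is to pass that order explicitly through the levels argument of factor() and then sort with order(). The sketch below assumes the order given in the question; df is a hypothetical stand-in for MY.DATAFRAME, and its value column is invented for illustration.

```r
# Desired order, copied from the question
dist_order <- c("DirImp",
                "1km_", "2km_", "3km_", "4km_", "5km_",
                "6km_", "7km_", "8km_", "9km_", "10km_",
                "10km", "20km", "30km", "40km", "50km", "60km", "70km",
                "control")

# Hypothetical data frame standing in for MY.DATAFRAME
df <- data.frame(Dist = c("control", "10km", "2km_", "DirImp", "10km_"),
                 value = 1:5)

# Explicit levels fix the order; ordered = TRUE also allows comparisons
df$DistFact <- factor(df$Dist, levels = dist_order, ordered = TRUE)

# order() on a factor sorts by level position, not alphabetically
df_sorted <- df[order(df$DistFact), ]
as.character(df_sorted$Dist)
```

Any value of Dist that does not appear in dist_order becomes NA in DistFact, so it is worth checking anyNA(df$DistFact) on real data.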