Sum columns with similar names
I have a named numeric vector var (the output of predict.cv.glmnet):
var <- c(5.74, 0.00, 0.15, 0.00, 0.04, 0.00, 0.00, 0.00, 1.81, 0.00)
# c() builds a plain character vector; cbind() would build a 1 x 10 matrix
names(var) <- c("(Intercept)", "as.factor(holiday)1", "as.factor(season)2", "as.factor(season)3", "as.factor(season)4", "as.factor(weathersit)2", "as.factor(weathersit)3", "windspeed", "temp", "hum")
(Intercept) as.factor(holiday)1 as.factor(season)2 as.factor(season)3 as.factor(season)4 as.factor(weathersit)2
5.74 0.00 0.15 0.00 0.04 0.00
as.factor(weathersit)3 windspeed temp hum
0.00 0.00 1.81 0.00
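I want to extract the names of the variables with non-zero values and also aggregate over factor levels: if at least one level of a factor is non-zero, the whole factor should be included. The output should omit the factor levels. I am looking for a piece of code that gives me this result: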
"(Intercept)" "as.factor(season)" "temp"
I also have a variable fac with the available factor names:
fac <- c("as.factor(holiday)", "as.factor(season)", "as.factor(weathersit)")
"as.factor(holiday)" "as.factor(season)" "as.factor(weathersit)"
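I thought about aggregating the factors with similar names while dropping their levels, and then checking whether the aggregated sum is greater than 0, but I could not work out how to code it.
I played around with which and regular expressions: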
var <- c(5.74, 0.00, 0.15, 0.00, 0.04, 0.00, 0.00, 0.00, 1.81, 0.00)
names(var) <- c("(Intercept)", "as.factor(holiday)1", "as.factor(season)2", "as.factor(season)3", "as.factor(season)4", "as.factor(weathersit)2", "as.factor(weathersit)3", "windspeed", "temp", "hum")
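# Names of the non-zero components
X <- names(var)[which(var != 0)]
# Indices of the entries that are factor levels
n <- grep("as[.]factor.*", X)
# Strip the trailing level number; the closing parenthesis is escaped explicitly
X[n] <- gsub("\\)[0-9]+$", ")", X[n])
# Collapse several levels of the same factor into one entry
X <- unique(X)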
X
#[1] "(Intercept)" "as.factor(season)" "temp"
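which selects the non-zero components. grep is used to find the indices of the factor entries. Then gsub removes the factor levels, and unique collapses the levels of the same factor into a single entry.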
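The question also describes a second route: group the coefficients by the factor names in fac, sum each group, and keep the groups whose sum is greater than 0. A minimal sketch of that idea follows. It is not part of the original answer; the names group, totals and selected are made up for illustration, and it assumes that no entry of fac is a prefix of another variable name. Absolute values are summed so that levels with opposite signs cannot cancel each other out.

```r
# Named coefficient vector and factor names from the question
var <- c(5.74, 0.00, 0.15, 0.00, 0.04, 0.00, 0.00, 0.00, 1.81, 0.00)
names(var) <- c("(Intercept)", "as.factor(holiday)1", "as.factor(season)2",
                "as.factor(season)3", "as.factor(season)4",
                "as.factor(weathersit)2", "as.factor(weathersit)3",
                "windspeed", "temp", "hum")
fac <- c("as.factor(holiday)", "as.factor(season)", "as.factor(weathersit)")

# Map every coefficient to its group: the factor name if the coefficient
# name starts with an entry of fac, otherwise the coefficient name itself.
group <- names(var)
for (f in fac) {
  group[startsWith(names(var), f)] <- f
}

# Sum the absolute values per group, keeping the order of first appearance.
totals <- tapply(abs(var), factor(group, levels = unique(group)), sum)

# Keep the groups with a non-zero total.
selected <- names(totals)[as.vector(totals > 0)]
selected
```

Unlike the regular-expression version, this does not depend on the levels being numeric, since the level suffix is never parsed; it only needs fac to list the factor terms used in the model.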