Sum columns with similar names

I have a named numeric vector var (the output of predict.cv.glmnet):

var <- c(5.74, 0.00, 0.15, 0.00, 0.04, 0.00, 0.00, 0.00, 1.81, 0.00)
names(var) <- c("(Intercept)", "as.factor(holiday)1", "as.factor(season)2", "as.factor(season)3", "as.factor(season)4", "as.factor(weathersit)2", "as.factor(weathersit)3", "windspeed", "temp", "hum")

           (Intercept)    as.factor(holiday)1     as.factor(season)2     as.factor(season)3     as.factor(season)4 as.factor(weathersit)2
                  5.74                   0.00                   0.15                   0.00                   0.04                   0.00
as.factor(weathersit)3              windspeed                   temp                    hum
                  0.00                   0.00                   1.81                   0.00

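I want to extract the names of the variables with non-zero values and also collapse the factor levels: if at least one level of a factor is non-zero, the whole factor should be included, and the output should omit the factor level. I am looking for a piece of code that gives me a result like this: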

"(Intercept)"        "as.factor(season)"         "temp"   

I also have a variable fac holding the available factor names:

fac<-c("as.factor(holiday)","as.factor(season)","as.factor(weathersit)")


 "as.factor(holiday)"    "as.factor(season)"     "as.factor(weathersit)"

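I thought about summing the entries whose names share a factor name, dropping their levels, and then checking whether each factor's sum is greater than 0, but I could not manage to code it.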
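The idea described above could be sketched roughly as follows, using fac to group the entries. This is only a sketch under a few assumptions: the helper names group, keep and result are made up for illustration, every factor term is assumed to start with exactly one of the names in fac, and the check is "any non-zero value" rather than "sum greater than 0", because glmnet coefficients can be negative and might cancel out in a sum.

```r
# Sample data from the question
var <- c(5.74, 0.00, 0.15, 0.00, 0.04, 0.00, 0.00, 0.00, 1.81, 0.00)
names(var) <- c("(Intercept)", "as.factor(holiday)1", "as.factor(season)2",
                "as.factor(season)3", "as.factor(season)4",
                "as.factor(weathersit)2", "as.factor(weathersit)3",
                "windspeed", "temp", "hum")
fac <- c("as.factor(holiday)", "as.factor(season)", "as.factor(weathersit)")

# Map every name to its factor name if it starts with one of the entries
# in fac; non-factor terms keep their own name (hypothetical helper: group)
group <- names(var)
for (f in fac) {
  group[startsWith(names(var), f)] <- f
}

# For each group, check whether at least one entry is non-zero.
# factor(..., levels = unique(group)) preserves the original ordering.
keep <- tapply(var != 0, factor(group, levels = unique(group)), any)

# Names of the groups that have at least one non-zero entry
result <- names(keep)[keep]
result
```

Because the grouping is a plain prefix match, it would not depend on the levels being numeric, but it would misbehave if one factor name were a prefix of another.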

I played around with which and a regular expression:

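var <- c(5.74, 0.00, 0.15, 0.00, 0.04, 0.00, 0.00, 0.00, 1.81, 0.00)
names(var) <- c("(Intercept)", "as.factor(holiday)1", "as.factor(season)2", "as.factor(season)3", "as.factor(season)4", "as.factor(weathersit)2", "as.factor(weathersit)3", "windspeed", "temp", "hum")

# Names of the non-zero components
X <- names(var)[which(var != 0)]
# Indices of the factor terms among them
n <- grep("as[.]factor.*", X)
# Strip the trailing numeric level, keeping the closing parenthesis
X[n] <- gsub("\\)[0-9]+$", ")", X[n])

# Collapse factors that appear more than once
X <- unique(X)
X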

#[1] "(Intercept)"       "as.factor(season)" "temp"  

which selects the non-zero components. grep finds the indices of the factor terms. gsub then strips the factor levels.
