Sum columns with similar names

Question

I have a named numeric vector var (the output of predict.cv.glmnet):

var<-c(5.74,0.00,0.15,0.00,0.04,0.00,0.00,0.00,1.81,0.00)
names(var) <- c("(Intercept)","as.factor(holiday)1","as.factor(season)2","as.factor(season)3","as.factor(season)4","as.factor(weathersit)2","as.factor(weathersit)3","windspeed","temp","hum")

           (Intercept)    as.factor(holiday)1     as.factor(season)2     as.factor(season)3     as.factor(season)4 as.factor(weathersit)2
                  5.74                   0.00                   0.15                   0.00                   0.04                   0.00
as.factor(weathersit)3              windspeed                   temp                    hum
                  0.00                   0.00                   1.81                   0.00

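I want to extract the names of the variables that have non-zero values, and I also want to aggregate the factor levels: if at least one level of a factor is non-zero, the whole factor should be included. The output should omit the factor levels. I am looking for a piece of code that gives me this result: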

"(Intercept)"        "as.factor(season)"         "temp"

I also have a variable fac that holds the available factor names:

fac<-c("as.factor(holiday)","as.factor(season)","as.factor(weathersit)")


 "as.factor(holiday)"    "as.factor(season)"     "as.factor(weathersit)"

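I thought about summing the entries of factors with similar names while dropping their levels, and then checking whether the sum for each factor is greater than 0, but I could not work out how to code it.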

Answer 1

I played around with which and regular expressions:

var<-c(5.74,0.00,0.15,0.00,0.04,0.00,0.00,0.00,1.81,0.00)
names(var) <- c("(Intercept)","as.factor(holiday)1","as.factor(season)2","as.factor(season)3","as.factor(season)4","as.factor(weathersit)2","as.factor(weathersit)3","windspeed","temp","hum")

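# Names of the non-zero components
X <- names(var)[which(var != 0)]
# Positions of the factor terms among those names
n <- grep("as[.]factor.*", X)
# Strip the trailing numeric level, e.g. "as.factor(season)2" -> "as.factor(season)"
X[n] <- gsub("\\)[0-9]+$", ")", X[n])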

X <- unique(X)
X

#[1] "(Intercept)"       "as.factor(season)" "temp"

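which selects the non-zero components, grep finds the indices of the factor terms among them, and gsub then strips the factor levels. Finally, unique collapses the repeated factor names into one entry each.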
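If the levels are not plain trailing digits, the fac vector from the question could be used instead of a regular expression. The following is only a sketch of that idea, not part of the original answer: base_name is a hypothetical helper introduced here, and it assumes that no factor name in fac is a prefix of another factor name or of an unrelated variable. It checks for any non-zero entry rather than for a sum greater than 0, because negative and positive coefficients could otherwise cancel out.

```r
# Same data as in the question
var <- c(5.74, 0.00, 0.15, 0.00, 0.04, 0.00, 0.00, 0.00, 1.81, 0.00)
names(var) <- c("(Intercept)", "as.factor(holiday)1", "as.factor(season)2",
                "as.factor(season)3", "as.factor(season)4",
                "as.factor(weathersit)2", "as.factor(weathersit)3",
                "windspeed", "temp", "hum")
fac <- c("as.factor(holiday)", "as.factor(season)", "as.factor(weathersit)")

# Hypothetical helper: return the factor name that `nm` starts with,
# or `nm` itself when it does not belong to any factor in `fac`.
base_name <- function(nm, fac) {
  hit <- fac[startsWith(nm, fac)]
  if (length(hit) > 0) hit[1] else nm
}

# One group label per coefficient, in the original order
group <- vapply(names(var), base_name, character(1), fac = fac,
                USE.NAMES = FALSE)

# Keep every group that has at least one non-zero entry;
# unique() preserves the order of first appearance.
keep <- unique(group[var != 0])
keep
```

With the data from the question, keep should contain "(Intercept)", "as.factor(season)" and "temp", the same result as the regular-expression version.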

Solution 1
0 Accepted 2016-02-12 13:32:22
