
Remove factor level from dataframe

I downloaded the Titanic train dataset from Kaggle.

My code is:

df = read.csv('titanic.csv', header=TRUE)
df$Pclass = as.factor(df$Pclass)
df$Survived = as.factor(df$Survived)
# keep only Survived, Pclass, Sex, Age and Embarked
df = df[,c(2,3,5,6,12)]
# drop rows with missing values, then renumber the rows
df = na.omit(df)
rownames(df) <- 1:nrow(df)
# recode Age into three groups (Age becomes character after the first assignment)
df$Age[df$Age <= 18] = "child"
df$Age[(df$Age > 18) & (df$Age <= 60) & (df$Age != "child")] = "adult"
df$Age[(df$Age != "child") & (df$Age != "adult")] = "senior"
df$Age = as.factor(df$Age)
summary(df)

At this point the result of summary is:

 Survived Pclass      Sex          Age      Embarked
 0:424    1:186   female:261   adult :553    :  2   
 1:290    2:173   male  :453   child :139   C:130   
          3:355                senior: 22   Q: 28   
                                            S:554 

My problem is the Embarked variable:

barplot(table(df$Embarked), xlab="Port of Embarkment", ylab="Frequency", main="Histograma de la variable \n Embarked")

[Figure: barplot output]

The levels of Embarked:

> levels(df$Embarked)
[1] ""  "C" "Q" "S"

Here is my problem: the first level is "" (empty), and I can't find a way to remove it. I have tried several approaches I found on Stack Overflow without being able to solve my problem.

Remove the rows with empty values for Embarked, then re-create the factor so the now-unused level is dropped:

df <- df[df$Embarked!="",]
df$Embarked <- factor(df$Embarked)
barplot(table(df$Embarked), xlab="Port of Embarkment", 
        ylab="Frequency", main="Histograma de la variable \n Embarked")
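
To see why both steps are needed, here is a minimal sketch on a made-up data frame (the values are invented for illustration and are not the Titanic data). It assumes Embarked is a factor, as in the question. Subsetting alone removes the rows but leaves the "" level behind with a count of zero; calling factor() again rebuilds the levels from the values that are actually present.

```r
# Toy data standing in for the Titanic columns (values are made up).
df <- data.frame(
  Embarked = factor(c("S", "C", "", "Q", "S", "")),
  Survived = factor(c(0, 1, 1, 0, 1, 0))
)

# Subsetting removes the rows, but the "" level is still attached to the factor.
kept <- df[df$Embarked != "", ]
levels(kept$Embarked)   # "" "C" "Q" "S"
table(kept$Embarked)    # the "" level is still listed, with a count of 0

# Re-creating the factor keeps only the levels that still occur.
kept$Embarked <- factor(kept$Embarked)
levels(kept$Embarked)   # "C" "Q" "S"
```

This is why the barplot still shows an empty bar if you only filter the rows.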

Alternatively, after removing those rows, you could use droplevels instead of calling factor again:

df <- droplevels(df)

New levels of Embarked:

> levels(df$Embarked)
[1] "C" "Q" "S"

The advantage of droplevels is that it drops all unused levels from a factor in one call. It also works on a whole data frame, where it drops the unused levels from every factor column.
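
As a rough illustration of the data-frame case, the sketch below uses a small invented data frame (not the Titanic data) with two factor columns and one numeric column. After filtering, both factors carry unused levels, and a single droplevels call cleans up both while leaving the numeric column alone.

```r
# Toy data frame (values are made up for illustration).
df <- data.frame(
  Embarked = factor(c("S", "C", "", "Q")),
  Sex      = factor(c("male", "female", "male", "male")),
  Age      = c(22, 38, 26, 35)
)

# Keep male passengers with a known port: "female", "" and "C" become unused levels.
sub <- df[df$Sex == "male" & df$Embarked != "", ]
sapply(sub[c("Embarked", "Sex")], nlevels)   # still 4 and 2

# One call drops the unused levels from every factor column.
sub <- droplevels(sub)
levels(sub$Embarked)   # "Q" "S"
levels(sub$Sex)        # "male"
```

droplevels can also be applied to a single column, e.g. sub$Embarked <- droplevels(sub$Embarked), if you only want to touch one factor.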
