How can I change the factors to numeric variables or otherwise deal with this error I'm getting in my linear regression?

I'm trying to run a linear regression model with this dataset, mro.csv, but when I run lm() it gives the error message:

1: In model.response(mf, "numeric") :
  using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors

I'm not sure which parts of the dataset are factors rather than numeric; all the data are numbers except the column names. I'm also unsure what the "'-' not meaningful for factors" part is about, because there are no -'s in the dataset either.

I'm not sure how to share the dataset, but here's the CSV in a Google Sheet: mro.csv

> raw <- read.csv("/Users/cpt.jack/Downloads/mro.csv", header = FALSE, sep = ",")
> colnames(raw) <- c("inlf", "hours", "kidslt6", "kidsge6", "age", "educ", "wage", "repwage", "hushrs", "husage", "huseduc", "huswage", "faminc", "mtr", "motheduc", "fatheduc", "unem", "city", "exper", "nwifeinc", "lwage", "expersq")
> 
> 
> dim(raw)
[1] 753  22
> 
> set.seed(88)
> raw  <- raw[sample(nrow(raw)),]
> 
> 
> raw1<-raw[raw$inlf==1,]
> dim(raw)
[1] 753  22
> dim(raw1)
[1] 428  22
> 
> 
> reg1 <- lm(wage~ hours + kidslt6 + kidsge6 + age + educ + hushrs + husage + huseduc + huswage
+mtr+motheduc+fatheduc+unem
+exper+nwifeinc, data=raw1)

Warning messages:

1: In model.response(mf, "numeric") :
  using type = "numeric" with a factor response will be ignored

2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
> reg1 <- lm(wage~ hours,data=raw1)

wage and lwage are being read as factors because they contain the value ".", which can't be parsed as numeric. This value can be handled manually.

df <- read.csv(
  "~/Downloads/mro.csv",
  header = FALSE,
  stringsAsFactors = FALSE,
  col.names = c(
    "inlf", "hours", "kidslt6", "kidsge6", "age", "educ",  "wage",
    "repwage", "hushrs", "husage", "huseduc", "huswage",  "faminc",
    "mtr",  "motheduc",  "fatheduc", "unem", "city", "exper",
    "nwifeinc", "lwage", "expersq"
  )
)

df$wage <- as.numeric(ifelse(df$wage == ".", 0, df$wage))
df$lwage <- as.numeric(ifelse(df$lwage == ".", 0, df$lwage))
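
As a possible alternative to recoding by hand, and assuming the "." entries in this file mark missing wages, read.csv can be told to treat "." as NA through its na.strings argument. The sketch below uses a small inline table with made-up values as a stand-in for mro.csv, so the names toy, toy1 and fit are illustrative only.

```r
# Toy stand-in for mro.csv (made-up values); "." marks a missing wage.
csv_text <- paste(
  "1,1610,3.354",
  "1,1656,1.3889",
  "1,1980,4.5918",
  "1,456,1.0965",
  "0,0,.",
  "0,0,.",
  sep = "\n"
)

toy <- read.csv(
  text = csv_text,
  header = FALSE,
  col.names = c("inlf", "hours", "wage"),
  na.strings = "."   # read "." as NA instead of as text
)

str(toy)             # wage is parsed as numeric, with NA where "." appeared

# Keep only the rows in the labour force, as in the question, then fit.
toy1 <- toy[toy$inlf == 1, ]
fit <- lm(wage ~ hours, data = toy1)
```

The difference from the recoding above is that missing wages stay missing rather than becoming 0. With R's default na.action (na.omit), lm() drops rows that have NA in a model variable, so whether this matters depends on which rows end up in the regression.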

Now the lm should run without issues.

df <- df[sample(nrow(df)), ]
df1 <- df[df$inlf == 1, ]

reg1 <- lm(
  wage ~ hours + kidslt6 + kidsge6 + age + educ + hushrs + husage + huseduc +
         huswage + mtr + motheduc + fatheduc + unem + exper + nwifeinc,
  data = df1
)
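
To see which columns are factors and where the '-' message comes from, a small check like the following may help. It is a sketch on made-up values, not the real mro.csv, and it passes stringsAsFactors = TRUE explicitly to reproduce the behaviour in the question (this was the default before R 4.0.0). The call shown in the warning, Ops.factor(y, z$residuals), suggests that lm() performs a subtraction involving the response internally, which is undefined for a factor.

```r
# Toy stand-in for the asker's data (made-up values), with text columns
# converted to factors as in the question.
toy <- read.csv(
  text = "1,1610,3.354\n1,1656,1.3889\n0,0,.\n1,1980,4.5918",
  header = FALSE,
  col.names = c("inlf", "hours", "wage"),
  stringsAsFactors = TRUE
)

# List each column's class to spot the non-numeric ones.
col_classes <- sapply(toy, class)
print(col_classes)

# Arithmetic on a factor is undefined: R warns that '-' is not meaningful
# for factors and returns NA for every element.
diff_f <- suppressWarnings(toy$wage - 1)

# as.numeric() on a factor returns the internal level codes, not the values,
# so convert through character instead; "." becomes NA (with a warning).
wage_num <- suppressWarnings(as.numeric(as.character(toy$wage)))
print(wage_num)
```

The as.numeric(as.character(x)) step is the usual way to turn a factor whose labels look like numbers back into a numeric vector, which is what the title of the question asks for.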
