
R - create new column in data frame based on conditional

I need to create a column in a data frame, next to a column of years, that automatically identifies each year as "leap" or "reg" (regular).

This is what I have thus far:

Delimit the time period

year<-(2009:2017)

Create a data frame with a single column for that time period

prd_df<-data.frame(year)

Create an empty column where "leap" and "reg" years will be identified

prd_df["leap"]<-NA

Base the identification on a conditional loop

for(i in 1:length(prd_df$year)){
  if((prd_df$year[i]%%4==0)&(prd_df$year[i]%%100!=0)){
    prd_df$leap<-'leap'
  }else if((prd_df$year[i]%%4==0)&(prd_df$year[i]%%100==0)&(prd_df$year[i]%%400==0)){
    prd_df$leap<-'leap' 
  }else{
    prd_df$leap<-'reg'
  }
}

Create a table from the resulting data frame.

write.table(prd_df,
          file = "prd.csv",
          row.names = F, col.names = T,
          sep = "\t")

This is what I get:

"year"  "leap"
2009    "reg"
2010    "reg"
2011    "reg"
2012    "reg"
2013    "reg"
2014    "reg"
2015    "reg"
2016    "reg"
2017    "reg"

In the example above, 2012 and 2016 should be identified as "leap" in the second column, but that is not happening. The conditional has worked fine before as part of other code, but I can't get it to work now. Could it be that prd_df$year is not being recognized as numeric?

Any suggestions will be most appreciated.

Thanks

We can use ifelse

prd_df$leap <- with(prd_df, ifelse((year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0, "leap", "reg"))
prd_df$leap
#[1] "reg"  "reg"  "reg"  "leap" "reg"  "reg"  "reg"  "leap" "reg" 
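
The 2009:2017 range contains no century years, so it may be worth checking the full Gregorian rule (divisible by 4 and not by 100, or divisible by 400) against years such as 1900 and 2000. A minimal sketch, where is_leap and check are hypothetical names that do not appear in the original post:

```r
# Hypothetical helper: TRUE for Gregorian leap years.
is_leap <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

# Small test frame that includes two century years.
check <- data.frame(year = c(1900, 2000, 2012, 2017))
check$leap <- ifelse(is_leap(check$year), "leap", "reg")
check$leap
# [1] "reg"  "leap" "leap" "reg"
```

Because the helper is vectorised, it can be dropped into ifelse or case_when without a loop.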

Or with case_when from dplyr

library(dplyr)
prd_df %>%
       mutate(leap = case_when((year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0 ~ "leap",
                               TRUE ~ "reg"))
#   year leap
#1 2009  reg
#2 2010  reg
#3 2011  reg
#4 2012 leap
#5 2013  reg
#6 2014  reg
#7 2015  reg
#8 2016 leap
#9 2017  reg

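In your code, you missed the [i] when assigning the new value to the leap column. Without it, each iteration overwrites the entire column, so every row ends up with the value from the last iteration ("reg" for 2017).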

year<-(2009:2017)
prd_df<-data.frame(year)
prd_df["leap"]<-NA

for(i in 1:length(prd_df$year)){
    if((prd_df$year[i]%%4==0)&(prd_df$year[i]%%100!=0)){
        prd_df$leap[i]<-'leap' # add [i] here
    }else if((prd_df$year[i]%%4==0)&(prd_df$year[i]%%100==0)&(prd_df$year[i]%%400==0)){
        prd_df$leap[i]<-'leap' # add [i] here
    }else{
        prd_df$leap[i]<-'reg' # add [i] here
    }
}


prd_df
  year leap
1 2009  reg
2 2010  reg
3 2011  reg
4 2012 leap
5 2013  reg
6 2014  reg
7 2015  reg
8 2016 leap
9 2017  reg
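
To see why the index matters, here is a small sketch on a toy data frame (demo and flag are illustrative names, not part of the original code). It contrasts assigning to the whole column with assigning to a single element:

```r
# Toy data frame; `demo` and `flag` are illustrative names only.
demo <- data.frame(x = 1:3)
demo$flag <- NA

# Assigning without an index recycles the value over every row.
demo$flag <- "a"
demo$flag
# [1] "a" "a" "a"

# Assigning with an index changes only that row.
demo$flag[2] <- "b"
demo$flag
# [1] "a" "b" "a"
```

In the original loop the unindexed assignment ran once per year, so only the result for the last year survived.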

ifelse with multiple conditions

with(prd_df, ifelse(year %% 4 == 0 & year %% 100 != 0, "leap", ifelse(year %% 4 == 0 & year %% 100 == 0 & year %% 400 == 0, "leap", "reg")))
[1] "reg"  "reg"  "reg"  "leap" "reg"  "reg"  "reg"  "leap" "reg" 

Try the lubridate package: it has a function, leap_year, that checks whether a year is a leap year. For the conditions, use mutate from the dplyr package (with ifelse or case_when).

The whole code should be no longer than 5 lines :)

library(dplyr)
library(lubridate)

# tibble() replaces the deprecated data_frame()
year_df <- tibble(year = 1999:2017)

year_df <- year_df %>%
    mutate(leap = ifelse(leap_year(year), "leap", "reg"))
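
As a quick sanity check, leap_year can also be called directly on a numeric vector of years. The sketch below assumes lubridate is installed; the years vector is chosen here for illustration and includes two century years:

```r
library(lubridate)

# Illustrative years: 1900 is not a leap year, 2000 is.
years <- c(1900, 2000, 2012, 2017)
leap_year(years)
# [1] FALSE  TRUE  TRUE FALSE

ifelse(leap_year(years), "leap", "reg")
# [1] "reg"  "leap" "leap" "reg"
```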
