
How to use a for-loop to apply a function to specific values in a column in a dataframe

I have a data frame in which one column contains the time between two events, expressed in years. I'd like R to add a new column that holds, for observations with a value of less than 1 year, that same value expressed in days.

I have tried using lapply to solve this, but the result comes back as a matrix, which is not ideal for me. I'd like to use a for loop instead, but my experience with for loops is limited.

dataframe <- data.frame(id = c(1, 2, 3, 4, 5),
                        names = c('a', 'b', 'c', 'd', 'e'),
                        time_in_years = c(5.81, 0.39, 5.66, 4.18, 0.16),
                        other_variable = c(3, 4, 23, 0.7, 76))

How would I go about constructing a for loop that adds a column containing the values of "time_in_years" between 0 and 1, multiplied by 365.25? Thanks!

Here is a base R solution using ifelse()

dataframe <- within(dataframe, days <- 365.25*ifelse(time_in_years<1, time_in_years,NA))

which gives

> dataframe
  id names time_in_years other_variable     days
1  1     a          5.81            3.0       NA
2  2     b          0.39            4.0 142.4475
3  3     c          5.66           23.0       NA
4  4     d          4.18            0.7       NA
5  5     e          0.16           76.0  58.4400

Use mutate from dplyr

You can use something like this:

library(dplyr)
dataframe <- dataframe %>%
   mutate(days = ifelse(between(time_in_years, 0, 1), time_in_years * 365.25, NA))

> dataframe
  id names time_in_years other_variable     days
1  1     a          5.81            3.0       NA
2  2     b          0.39            4.0 142.4475
3  3     c          5.66           23.0       NA
4  4     d          4.18            0.7       NA
5  5     e          0.16           76.0  58.4400
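
One detail worth knowing about the between() call above: dplyr documents between(x, left, right) as a shortcut for x >= left & x <= right, so both endpoints are included. With the sample data this makes no difference, but a value of exactly 1 year would be converted too, whereas the question asks for values below 1 year. The sketch below uses plain base R comparisons to illustrate the two conditions; the vector years is made up for illustration and is not part of the original data.

```r
# Hypothetical values chosen to hit the boundaries 0 and 1
years <- c(0, 0.39, 1, 5.81)

# Inclusive on both ends, which is how dplyr documents between(x, 0, 1)
inclusive <- ifelse(years >= 0 & years <= 1, years * 365.25, NA)

# Strictly below one year, as the question asks
strict <- ifelse(years < 1, years * 365.25, NA)

print(data.frame(years, inclusive, strict))
```

If the boundary matters for your data, writing the comparison out explicitly inside mutate() avoids any ambiguity.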

Without external libraries

If you don't want to install any external package, you can use something like this:

dataframe$time_in_days <- ifelse(dataframe$time_in_years > 0 & dataframe$time_in_years < 1, 
                                 dataframe$time_in_years * 365.25, 
                                 NA)

Hope this helps.
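
Another base R option, sketched here as an alternative to ifelse(), is to create the column filled with NA first and then assign only to the rows that match a logical index. The helper name short and the column name days are choices made for this example; the !is.na() guard is an added assumption to keep missing values from producing an NA index.

```r
dataframe <- data.frame(id = c(1, 2, 3, 4, 5),
                        names = c('a', 'b', 'c', 'd', 'e'),
                        time_in_years = c(5.81, 0.39, 5.66, 4.18, 0.16),
                        other_variable = c(3, 4, 23, 0.7, 76))

# Start with a numeric column of NAs
dataframe$days <- NA_real_

# TRUE for rows shorter than one year; missing values count as FALSE
short <- !is.na(dataframe$time_in_years) & dataframe$time_in_years < 1

# Fill in only the matching rows
dataframe$days[short] <- dataframe$time_in_years[short] * 365.25

print(dataframe)
```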

What I like about data.table is that you don't need ifelse in this situation:

library(data.table)
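# setDT() converts dataframe to a data.table by reference (no copy is made)
datatable <- setDT(dataframe)
# Assign days only for the filtered rows; all other rows are left as NA
datatable[time_in_years < 1, days := time_in_years * 365.25]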

   id names time_in_years other_variable     days
1:  1     a          5.81            3.0       NA
2:  2     b          0.39            4.0 142.4475
3:  3     c          5.66           23.0       NA
4:  4     d          4.18            0.7       NA
5:  5     e          0.16           76.0  58.4400

You could also do this with filter and a join in dplyr (note that the row order changes):

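library(dplyr)
dataframe %>%
  filter(time_in_years < 1) %>%            # keep only rows shorter than one year
  mutate(days = time_in_years * 365.25) %>%
  full_join(., dataframe)                  # add the remaining rows back, with days = NA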

  id names time_in_years other_variable     days
1  2     b          0.39            4.0 142.4475
2  5     e          0.16           76.0  58.4400
3  1     a          5.81            3.0       NA
4  3     c          5.66           23.0       NA
5  4     d          4.18            0.7       NA
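
Since the question asks specifically for a for loop, here is a minimal sketch of what that could look like in base R. It is generally slower and more verbose than the vectorized answers above, so treat it as an illustration of the loop mechanics rather than the recommended approach. The column name days and the is.na() guard are choices made for this example.

```r
dataframe <- data.frame(id = c(1, 2, 3, 4, 5),
                        names = c('a', 'b', 'c', 'd', 'e'),
                        time_in_years = c(5.81, 0.39, 5.66, 4.18, 0.16),
                        other_variable = c(3, 4, 23, 0.7, 76))

# Pre-allocate the new column with numeric NAs
dataframe$days <- NA_real_

# seq_len(nrow(...)) also behaves correctly for a data frame with zero rows
for (i in seq_len(nrow(dataframe))) {
  years <- dataframe$time_in_years[i]
  # Convert only non-missing values below one year; other rows stay NA
  if (!is.na(years) && years < 1) {
    dataframe$days[i] <- years * 365.25
  }
}

print(dataframe)
```

Rows with time_in_years of 1 or more keep NA in days, matching the output of the ifelse() and data.table answers.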
