Adding a factor to a dataframe depending on values in other columns

Question

Consider this dataframe.

    data <- structure(list(Sample1 = structure(1:10, .Label = c("100", "101", 
    "102", "103", "104", "105", "106", "107", "108", "109"), class = "factor"), 
    Sample2 = structure(1:10, .Label = c("1", "10", "100", "101", 
    "102", "103", "104", "105", "106", "107"), class = "factor"), 
    Bray = c(0, -0.093229941171876, -0.101979485248057, -0.109527276554936, 
    -0.107218514918197, -0.12034240232431, -0.0867499433287722, 
    -0.0805681841664597, -0.086656413429741, -0.0871426867635103
    ), Space = c(0, 6.6986864383997, 6.6053482118659, 6.01295268566118, 
    6.43471833105382, 7.43673483458971, 7.78171093012327, 8.97899771689469, 
    9.32053646524705, 10.2821447179078), Time = c(0, 0, 42, 42, 
    42, 42, 42, 42, 42, 42)), .Names = c("Sample1", "Sample2", 
    "Bray", "Space", "Time"), row.names = c(NA, 10L), class = "data.frame")

I would like to add a new column, a factor "Color" with the levels "Yes" and "No", depending on whether certain values appear in Sample1 or Sample2. In this case, every row with a value between 100 and 104 in Sample1 or Sample2 should get "Yes". How can I do that?

Answer 1

We convert the 'Sample' columns to numeric, use </> to build a logical vector, add 1 to turn it into a numeric index (FALSE becomes 1, TRUE becomes 2), and use that index to pick "No" or "Yes".

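    # Factor -> character -> numeric, so the labels (not the level codes) are compared
    data[1:2] <- lapply(data[1:2], function(x) as.numeric(as.character(x)))
    # Logical + 1 gives index 1 ("No") or 2 ("Yes")
    data$Color <- with(data, factor(c("No", "Yes")[((Sample1 < 104 & Sample1 > 100) |
                    (Sample2 < 104 & Sample2 > 100)) + 1]))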

NOTE: To include 100 and 104 themselves, change </> to <=/>=.
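
To make the indexing trick above easier to follow, here is a minimal sketch on a toy vector. The vector `x` and the names `in_range`, `idx` and `color` are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy vector, not the data from the question.
x <- c(99, 101, 103, 104)

# Strict comparison yields a logical vector.
in_range <- x > 100 & x < 104

# Adding 1 coerces FALSE/TRUE to 1/2, which then indexes c("No", "Yes").
idx <- in_range + 1
color <- factor(c("No", "Yes")[idx], levels = c("No", "Yes"))
print(color)
```

Setting `levels` explicitly is optional here, but it keeps both levels even if one of them never occurs in the data.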

Or, as @Frank mentioned, %in% works on factor columns as well (without converting the 'Sample' columns to numeric). Note that 100:104 includes both 100 and 104.

    data$Color <- with(data, factor(c("No", "Yes")[((Sample1 %in% 100:104) |
                    (Sample2 %in% 100:104)) + 1]))
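
A small sketch of why the first approach goes through as.character() while %in% can be used on the factor directly. The factor `f` is a hypothetical stand-in shaped like the 'Sample' columns, and `codes`, `values` and `hit` are illustrative names.

```r
# Hypothetical factor column, similar in shape to Sample1/Sample2.
f <- factor(c("1", "10", "100", "104", "105"))

# as.numeric() on a factor returns the internal level codes, not the labels.
codes <- as.numeric(f)

# Going through as.character() recovers the numbers stored in the labels.
values <- as.numeric(as.character(f))

# %in% matches on the labels, so no conversion is needed.
hit <- f %in% 100:104
print(data.frame(f, codes, values, hit))
```

This is why a plain as.numeric(x) would silently give wrong results in the first approach, whereas the %in% version needs no conversion at all.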
