R: for loop output saves only the last result in the output dataframe
I have the following for loop script:
# Create example data
dataKM <- data.frame(x1 = 1:5,
                     x2 = 6:10,
                     x3 = 11:15)
# Duplicate dataframe
datatest <- dataKM[c(1:3)]
# for loop
for(i in colnames(dataKM[,2:ncol(dataKM)])) {
  # median of each single column of dataframe
  median <- median(dataKM[,i])
  # add column in duplicated dataframe with 'High' or 'Low' based on median for each column
  datatest$median[dataKM[,i] <= median ] <- "Low"
  datatest$median[dataKM[,i] > median ] <- "High"
}
I'm trying to repeat the for loop for each column of the dataKM dataframe and save the results as columns in the datatest dataframe. My script saves only the last iteration. I probably get a single output because I overwrite the previous value on each pass through the loop. I'd like to know how I can save the output of every iteration in its own column. Can anyone help me? Thank you so much, I hope this can be useful for someone else trying to do something similar.
We can just use the lapply function:
datatest <- dataKM[c(2:3)]
datatest[] <- lapply(dataKM[-1] , function(x) ifelse(x <= median(x) , "Low" , "High"))
colnames(datatest) <- c("x2Median" , "x3Median")
cbind(dataKM , datatest)
  x1 x2 x3 x2Median x3Median
1  1  6 11      Low      Low
2  2  7 12      Low      Low
3  3  8 13      Low      Low
4  4  9 14     High     High
5  5 10 15     High     High
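The column names above are typed out by hand, which stops scaling once there are more than a couple of columns. A minimal sketch of the same lapply idea with the names derived from the source columns; the names cols, labels and result are only illustrative, and stringsAsFactors = FALSE is passed so the labels stay character on older R versions:

```r
# Example data from the question
dataKM <- data.frame(x1 = 1:5, x2 = 6:10, x3 = 11:15)

# Columns to classify: everything except the first
cols <- names(dataKM)[-1]

# One "Low"/"High" vector per column, compared against that column's own median
labels <- lapply(dataKM[cols], function(x) ifelse(x <= median(x), "Low", "High"))

# Build the new column names from the old ones instead of hard-coding them
names(labels) <- paste0(cols, "Median")

result <- cbind(dataKM, as.data.frame(labels, stringsAsFactors = FALSE))
names(result)
```

With the example data this should give the same five columns as the output above.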
If you insist on using a for loop, try this:
datatest <- dataKM[c(1:3)]
for(i in colnames(dataKM[-1])) {
  median <- median(dataKM[,i])
  datatest[[paste0(i,"median")]][dataKM[,i] <= median ] <- "Low"
  datatest[[paste0(i,"median")]][dataKM[,i] > median ] <- "High"
}
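The change that matters compared with the loop in the question is the column name: it is now built from the loop variable rather than fixed. A small sketch of the difference, using ifelse for brevity; the data frames fixed and per_col are illustrative names, not part of the question:

```r
dataKM <- data.frame(x1 = 1:5, x2 = 6:10, x3 = 11:15)

# Fixed name: every pass writes to the same column, so only the last pass survives
fixed <- dataKM
for (col in names(dataKM)[-1]) {
  fixed[["median"]] <- ifelse(dataKM[[col]] <= median(dataKM[[col]]), "Low", "High")
}

# Name built from the loop variable: each pass gets its own column
per_col <- dataKM
for (col in names(dataKM)[-1]) {
  per_col[[paste0(col, "median")]] <- ifelse(dataKM[[col]] <= median(dataKM[[col]]), "Low", "High")
}

names(fixed)    # "x1" "x2" "x3" "median"
names(per_col)  # "x1" "x2" "x3" "x2median" "x3median"
```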
I am not sure what is being compared with what, but here is an example where each x2 or x3 value is compared with its column median.
Here is a dplyr approach:
library(dplyr)
dataKM %>%
  mutate(across(-1, ~case_when(. <= median(., na.rm = TRUE) ~ "Low",
                               . > median(., na.rm = TRUE) ~ "High"),
                .names = "Median_{.col}"))
  x1 x2 x3 Median_x2 Median_x3
1  1  6 11       Low       Low
2  2  7 12       Low       Low
3  3  8 13       Low       Low
4  4  9 14      High      High
5  5 10 15      High      High
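The na.rm = TRUE argument makes no difference for the example data, which has no missing values. A base R sketch of where it would matter, using a made-up vector x (not from the question) that contains one NA:

```r
# Hypothetical column with a missing value
x <- c(6, 7, NA, 9, 10)

# Without na.rm the median itself is NA, so every comparison is NA
labels_bad <- ifelse(x <= median(x), "Low", "High")

# With na.rm = TRUE the median is taken over 6, 7, 9, 10 (which is 8);
# only the missing entry stays NA
labels_ok <- ifelse(x <= median(x, na.rm = TRUE), "Low", "High")

labels_bad
labels_ok
```

The missing entry is still unlabelled in the second result; whether it should be treated as a separate category depends on the analysis.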
Currently, you are updating a single new column, median. Simply adjust the loop to create a new median column on each iteration, named by concatenating the current column name and "_median".
# for loop
for(col in colnames(dataKM[,2:ncol(dataKM)])) {
  curr_col <- dataKM[[col]]
  # median of each single column of dataframe
  col_median <- median(curr_col)
  # add column in duplicated dataframe with 'High' or 'Low' based on median for each column
  datatest[[paste0(col, "_median")]][curr_col <= col_median] <- "Low"
  datatest[[paste0(col, "_median")]][curr_col > col_median] <- "High"
}
Alternatively, with ifelse:
for(col in colnames(dataKM[,2:ncol(dataKM)])) {
  curr_col <- dataKM[[col]]
  col_median <- median(curr_col)
  datatest[[paste0(col, "_median")]] <- ifelse(
    curr_col <= col_median, "Low", "High"
  )
}
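If the same labelling is needed for several data frames, the loop can be wrapped in a small function. This is only a sketch: add_median_labels and its arguments cols and suffix are hypothetical names introduced here, and it assumes every selected column is numeric with no missing values.

```r
# Hypothetical helper: adds one "<column><suffix>" label column per selected column
add_median_labels <- function(df, cols = names(df)[-1], suffix = "_median") {
  for (col in cols) {
    values <- df[[col]]
    # "Low" at or below the column median, "High" above it
    df[[paste0(col, suffix)]] <- ifelse(values <= median(values), "Low", "High")
  }
  df
}

dataKM <- data.frame(x1 = 1:5, x2 = 6:10, x3 = 11:15)
datatest <- add_median_labels(dataKM)
names(datatest)
```

Returning a modified copy rather than changing a global datatest keeps the original data frame untouched, which makes the function easier to reuse.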