R, how to replace only the numeric values of a dataframe?
I am working with R 3.4.3 on Windows 10. I have a data frame that contains both numeric and character values. I would like to replace only the numeric values, but when I do that, the characters are changed and replaced as well.
How can I edit my function so that it affects only the numeric values and not the characters?
Here is the code of my function:
dataframeChange <- function(dFrame){
thresholdVal <- 20
dFrame[dFrame >= thresholdVal] <- -1
return(dFrame)
}
Here is an example data frame:
example_df <- data.frame(
myNums = c (1:5),
myChars = c("A","B","C","D","E"),
stringsAsFactors = FALSE
)
Thanks for the help!
As Tim's comment points out, you need to know where the numeric columns are. You can locate them with ind <- sapply(dFrame, is.numeric) and then restrict the replacement to those columns:
dataframeChange <- function(dFrame){
  thresholdVal <- 20
  # TRUE for each numeric column, FALSE otherwise
  ind <- sapply(dFrame, is.numeric)
  # compare and replace within the numeric columns only
  dFrame[ind][dFrame[ind] >= thresholdVal] <- -1
  return(dFrame)
}
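To see why the original function also touched the characters, and how a column mask copes with several numeric columns, here is a minimal base-R sketch. The names demo_df and replace_numeric are hypothetical and do not come from the question; the sketch assumes the intended rule is "values at or above the threshold become -1".

```r
# Why the original version changes characters: comparing a data frame with a
# number is done column by column, and for a character column the number is
# coerced to the string "20", so the comparison is a string comparison.
example_df <- data.frame(
  myNums = 1:5,
  myChars = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)
mask <- example_df >= 20   # logical matrix, one column per data frame column

# Hypothetical helper: apply the rule to numeric columns only.
replace_numeric <- function(df, threshold = 20, replacement = -1) {
  is_num <- vapply(df, is.numeric, logical(1))
  df[is_num] <- lapply(df[is_num], function(col) {
    # skip NA so that missing values stay missing
    col[!is.na(col) & col >= threshold] <- replacement
    col
  })
  df
}

demo_df <- data.frame(
  a = c(10, 20, 30),
  b = c(5L, 50L, 15L),
  label = c("20", "x", "99"),
  stringsAsFactors = FALSE
)
result <- replace_numeric(demo_df)
```

Here mask[, "myChars"] is TRUE in every row while mask[, "myNums"] is FALSE throughout, which is why only the letters were overwritten. In result, column a becomes 10, -1, -1 and column b becomes 5, -1, 15, while label keeps "20", "x", "99".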
Use mutate_if from dplyr:
library(dplyr)
example_df %>% mutate_if(is.numeric, funs(if_else(. >= thresh, repl, .)))
myNums myChars
1 10 A
2 -1 B
3 -1 C
4 5 D
5 -1 E
Explanation:
The mutate family of functions is for creating or updating variables.
With mutate_if, the functions (specified within funs()) are applied only to columns that satisfy the first argument (in this case, is.numeric()). Note that funs() is deprecated as of dplyr 0.8.0.
The if_else clause implements the OP's rule: values at or above thresh become repl, and everything else is left unchanged.
Data:
thresh <- 20
repl <- -1.0
example_df <- data.frame(
myNums = c(10,20,30,5,70),
myChars = c("A","B","C","D","E"),
stringsAsFactors = FALSE
)
example_df
myNums myChars
1 10 A
2 20 B
3 30 C
4 5 D
5 70 E
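In current dplyr releases (1.0.0 and later), the same idea can be written with across() and where(). This is a sketch that assumes such a version is installed; it repeats the definitions of thresh, repl and example_df from the data section so that it runs on its own, and the name result is only for illustration.

```r
library(dplyr)

thresh <- 20
repl <- -1.0
example_df <- data.frame(
  myNums = c(10, 20, 30, 5, 70),
  myChars = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# Apply if_else() only to the columns for which is.numeric() is TRUE
result <- example_df %>%
  mutate(across(where(is.numeric), ~ if_else(.x >= thresh, repl, .x)))
```

Both branches of if_else() need compatible types, which is why repl is defined as a double to match myNums.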
Using data.table, we can avoid explicit loops, and it is faster. Here I've set the threshold value to 2:
library(data.table)
# convert to a data.table by reference
setDT(example_df)
# get the names of the numeric columns
num_cols <- names(example_df)[sapply(example_df, is.numeric)]
# update all numeric columns at once
example_df[, (num_cols) := lapply(.SD, function(x) ifelse(x > 2, -1, x)), .SDcols = num_cols]
print(example_df)
myNums myChars
1: 1 A
2: 2 B
3: -1 C
4: -1 D
5: -1 E
Another data.table solution:
library(data.table)
dataframeChange_dt <- function(dFrame){
  setDT(dFrame)
  for(j in seq_along(dFrame)){
    # replaces values below 20 by reference (the reverse of the question's >= rule)
    set(dFrame, i = which(dFrame[[j]] < 20), j = j, value = -1)
  }
}
dataframeChange_dt(example_df)
example_df
# myNums myChars
# 1: -1 A
# 2: 20 B
# 3: 30 C
# 4: -1 D
# 5: 70 E
It does not explicitly restrict itself to numeric columns; however, I tested it on multiple datasets and it did not affect the non-numeric columns.
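One caveat is worth illustrating: for a character column, dFrame[[j]] < 20 is a string comparison, because 20 is coerced to "20", so whether a character value is left alone depends on the data. The sketch below uses only base R; the vector chars and the helper rows_below are hypothetical names introduced for illustration.

```r
# Character values compared with a number are compared as strings
chars <- c("A", "1x", "05")
hits <- which(chars < 20)   # positions that an unguarded loop would overwrite

# Hypothetical guard: only numeric columns can yield rows to replace
rows_below <- function(col, threshold) {
  if (!is.numeric(col)) {
    return(integer(0))
  }
  which(col < threshold)
}

safe_hits <- rows_below(chars, 20)
num_hits <- rows_below(c(10, 20, 30, 5, 70), 20)
```

With common collation settings, hits contains positions 2 and 3, since "1x" and "05" sort before "20", whereas the guarded helper returns no rows for the character vector. Inside the loop, the i argument of set() could use rows_below(dFrame[[j]], 20) in place of the bare which() call.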