
R, how to replace only the numeric values of a dataframe?

I am working with R 3.4.3 on Windows 10. I have a dataframe made up of numeric values and characters. I would like to replace only the numeric values, but when I do that the characters are changed and replaced as well.

How can I edit my function so that it affects only the numeric values and not the characters?

Here is the code of my function:

dataframeChange <- function(dFrame){
  thresholdVal <- 20
  dFrame[dFrame >= thresholdVal] <- -1
  return(dFrame)
  }

Here is an example dataframe:

example_df <- data.frame(
   myNums = c (1:5), 
   myChars = c("A","B","C","D","E"),
   stringsAsFactors = FALSE
 )

Thanks for the help!

As Tim's comment points out, you need to know which columns are numeric. We can locate them with ind <- sapply(dFrame, is.numeric):

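dataframeChange <- function(dFrame){
  thresholdVal <- 20
  # logical vector marking the numeric columns
  ind <- sapply(dFrame, is.numeric)
  # compare and replace within the numeric columns only
  # (this also works when there is more than one numeric column)
  dFrame[ind][dFrame[ind] >= thresholdVal] <- -1
  return(dFrame)
}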
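To see why comparing the whole data frame also touches the character column, and to check a numeric-only variant, here is a minimal base R sketch. The helper name replace_numeric_only and the sample data are made up for illustration; the threshold and replacement value follow the question.

```r
# Hypothetical sample data: one numeric and one character column
df <- data.frame(
  myNums  = c(10, 20, 30, 5, 70),
  myChars = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# Comparing the whole data frame also compares the character column:
# R coerces 20 to the string "20" and compares as text, so letters can match too.
mask_all <- df >= 20

# Hypothetical helper: apply the replacement column by column, numeric columns only
replace_numeric_only <- function(dFrame, thresholdVal = 20, replacement = -1) {
  ind <- vapply(dFrame, is.numeric, logical(1))
  dFrame[ind] <- lapply(
    dFrame[ind],
    function(x) replace(x, x >= thresholdVal, replacement)
  )
  dFrame
}

result <- replace_numeric_only(df)
```

Inspecting mask_all shows which cells the original function would overwrite, while result keeps myChars untouched.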

Use mutate_if from dplyr:

library(dplyr)

example_df %>% mutate_if(is.numeric, funs(if_else(. >= thresh, repl, .)))

  myNums myChars
1     10       A
2     -1       B
3     -1       C
4      5       D
5     -1       E

Explanation:

  • The mutate family of functions is for variable assignment or updating.
  • With mutate_if, the functions specified within funs() are applied only to columns that satisfy the first argument (in this case, is.numeric()).
  • The updating function is a simple if_else clause based on the OP's rules.
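In more recent dplyr releases, funs() is deprecated and the scoped verbs such as mutate_if are superseded, so the same idea can be sketched with across() and where(). This assumes dplyr 1.0.0 or later is installed; the data and the names thresh and repl mirror the ones used in this answer.

```r
library(dplyr)

thresh <- 20
repl <- -1.0

example_df <- data.frame(
  myNums = c(10, 20, 30, 5, 70),
  myChars = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# Apply the replacement to numeric columns only;
# non-numeric columns pass through unchanged
result <- example_df %>%
  mutate(across(where(is.numeric), ~ if_else(.x >= thresh, repl, .x)))
```

if_else() can be strict about types, which is why repl is kept as a double to match the numeric column here.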

Data:

thresh <- 20
repl <- -1.0

example_df <- data.frame(
   myNums = c(10,20,30,5,70), 
   myChars = c("A","B","C","D","E"),
   stringsAsFactors = FALSE
 ) 

example_df
  myNums myChars
1     10       A
2     20       B
3     30       C
4      5       D
5     70       E

With data.table, we can avoid explicit loops, and it is faster. Here I've set the threshold value to 2:

library(data.table)

# convert to a data.table by reference
setDT(example_df)

# get numeric columns
num_cols <- names(example_df)[sapply(example_df, is.numeric)]

# loop over all columns at once
example_df[,(num_cols) := lapply(.SD, function(x) ifelse(x>2,-1, x)), .SDcols=num_cols]

print(example_df)

   myNums myChars
1:      1       A
2:      2       B
3:     -1       C
4:     -1       D
5:     -1       E

Another data.table solution.

library(data.table)

dataframeChange_dt <- function(dFrame){
    setDT(dFrame)
    for(j in seq_along(dFrame)){
       set(dFrame, i= which(dFrame[[j]] < 20), j = j, value = -1)
    }
}

dataframeChange_dt(example_df)

example_df
#    myNums myChars
# 1:     -1       A
# 2:     20       B
# 3:     30       C
# 4:     -1       D
# 5:     70       E

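It does not explicitly restrict itself to numeric columns. In my tests on multiple datasets it did not affect the non-numeric columns, but that depends on how R compares strings with numbers: for a character column, 20 is coerced to "20" and compared as text, so a character value that sorts before "20" (for example "1") would also match the condition.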
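A variant that checks the column type before calling set() avoids relying on string comparison. This is a minimal sketch, not the answer's original code: the helper name replace_numeric_dt and its default arguments are hypothetical, and it uses >= as in the question rather than < as in the loop above.

```r
library(data.table)

# Hypothetical helper: replace values >= threshold in numeric columns, by reference
replace_numeric_dt <- function(dt, threshold = 20, replacement = -1) {
  for (j in seq_along(dt)) {
    # skip character, factor and other non-numeric columns
    if (is.numeric(dt[[j]])) {
      set(dt, i = which(dt[[j]] >= threshold), j = j, value = replacement)
    }
  }
  invisible(dt)
}

dt <- data.table(
  myNums  = c(10, 20, 30, 5, 70),
  myChars = c("A", "B", "C", "D", "E")
)
replace_numeric_dt(dt)
```

Because set() modifies the table by reference, dt is updated in place and no reassignment is needed.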
