Apply a function to each element of a data frame in R
I want to run the following function, which normalizes a single number, on every element of my data frame:
norm_fn <- function(raw_score, min_score, max_score){
  if(raw_score <= min_score){
    norm_score <- 1
  } else if (raw_score >= max_score){
    norm_score <- 1
  } else {
    norm_score <- ((raw_score - min_score)/(max_score - min_score))
  }
  return(norm_score)
}
set.seed(123)
dat <- data.frame(ID = 1:10,
                  col1 = runif(10),
                  col2 = runif(10),
                  col3 = runif(10))
mn <- 0.01;mx <- 0.8
dat[, 2:4] <- apply(dat[, 2:4], MARGIN = 2, FUN = norm_fn, min_score = mn, max_score = mx)
I get the following warning messages, and the function does not seem to be applied to col2 and col3:
1: In if (raw_score <= min_score) { :
the condition has length > 1 and only the first element will be used
2: In if (raw_score >= max_score) { :
the condition has length > 1 and only the first element will be used
3: In if (raw_score <= min_score) { :
the condition has length > 1 and only the first element will be used
4: In if (raw_score >= max_score) { :
the condition has length > 1 and only the first element will be used
5: In if (raw_score <= min_score) { :
the condition has length > 1 and only the first element will be used
6: In if (raw_score >= max_score) { :
the condition has length > 1 and only the first element will be used
A tidyverse approach:
library(tidyverse)
dat %>%
  rowwise() %>%
  mutate(
    across(.cols = col1:col3, norm_fn, min_score = mn, max_score = mx)
  ) %>%
  ungroup()
#> # A tibble: 10 × 4
#> ID col1 col2 col3
#> <int> <dbl> <dbl> <dbl>
#> 1 1 0.351 1 1
#> 2 2 0.985 0.561 0.864
#> 3 3 0.505 0.845 0.798
#> 4 4 1 0.712 1
#> 5 5 1 0.118 0.817
#> 6 6 0.0450 1 0.884
#> 7 7 0.656 0.299 0.676
#> 8 8 1 0.0406 0.739
#> 9 9 0.685 0.402 0.353
#> 10 10 0.565 1 0.174
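We can wrap the function in Vectorize, because it relies on if/else, which is not vectorized: `if` expects a single TRUE or FALSE, while apply and lapply pass a whole column at once.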
dat[2:4] <- lapply(dat[2:4], Vectorize(norm_fn), min_score = mn, max_score = mx)
Output:
> dat
ID col1 col2 col3
1 1 0.35136395 1.00000000 1.0000000
2 2 0.98519637 0.56118248 0.8643081
3 3 0.50503408 0.84502612 0.7981099
4 4 1.00000000 0.71219418 1.0000000
5 5 1.00000000 0.11762618 0.8173491
6 6 0.04500823 1.00000000 0.8842158
7 7 0.65582973 0.29884523 0.6760329
8 8 1.00000000 0.04058169 0.7394203
9 9 0.68536078 0.40243129 0.3533668
10 10 0.56533511 1.00000000 0.1735616
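To make the effect of Vectorize concrete, here is a minimal sketch on a short hand-picked vector. The name `norm_each` and the `scores` values are illustrative assumptions, not part of the original answer; `vectorize.args` is used here only to make explicit that the bounds stay scalar while `raw_score` is processed one element at a time.

```r
# Scalar function from the question, reproduced so the sketch is self-contained.
norm_fn <- function(raw_score, min_score, max_score) {
  if (raw_score <= min_score) {
    norm_score <- 1
  } else if (raw_score >= max_score) {
    norm_score <- 1
  } else {
    norm_score <- (raw_score - min_score) / (max_score - min_score)
  }
  norm_score
}

# Vectorize() returns a wrapper that calls norm_fn once per element
# (via mapply). Only raw_score is looped over; the bounds are passed as-is.
norm_each <- Vectorize(norm_fn, vectorize.args = "raw_score")

# Hypothetical inputs: one below the lower bound, one inside, one above.
scores <- c(0.005, 0.4, 0.9)
result <- norm_each(scores, min_score = 0.01, max_score = 0.8)
print(result)
```

Each element now reaches `if` as a single value, so the out-of-range scores take the branches that return 1 and the middle one is rescaled.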
Or the same Vectorize approach with across:
library(dplyr)
dat <- dat %>%
  mutate(across(-ID, Vectorize(norm_fn), min_score = mn, max_score = mx))
dat
ID col1 col2 col3
1 1 0.35136395 1.00000000 1.0000000
2 2 0.98519637 0.56118248 0.8643081
3 3 0.50503408 0.84502612 0.7981099
4 4 1.00000000 0.71219418 1.0000000
5 5 1.00000000 0.11762618 0.8173491
6 6 0.04500823 1.00000000 0.8842158
7 7 0.65582973 0.29884523 0.6760329
8 8 1.00000000 0.04058169 0.7394203
9 9 0.68536078 0.40243129 0.3533668
10 10 0.56533511 1.00000000 0.1735616
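As an alternative to wrapping the scalar function, the rule itself can be written with the vectorized `ifelse`, so a whole column is handled in one call. This is a sketch, not the answer's method: `norm_vec` is a hypothetical name, and it deliberately keeps the question's behaviour of returning 1 at both bounds (if 0 was intended for scores at or below `min_score`, that branch would need to change).

```r
# Vectorized variant of the normalization rule.
# Assumption: scores at or beyond either bound map to 1, as in norm_fn.
norm_vec <- function(raw_score, min_score, max_score) {
  scaled <- (raw_score - min_score) / (max_score - min_score)
  ifelse(raw_score <= min_score | raw_score >= max_score, 1, scaled)
}

# Same example data as in the question.
set.seed(123)
dat <- data.frame(ID = 1:10,
                  col1 = runif(10),
                  col2 = runif(10),
                  col3 = runif(10))
mn <- 0.01; mx <- 0.8

# lapply passes each column as a vector; norm_vec handles it directly.
dat[2:4] <- lapply(dat[2:4], norm_vec, min_score = mn, max_score = mx)
print(dat)
```

Because `ifelse` evaluates the condition element by element, no `if` ever receives a vector of length greater than one, and the per-element function-call overhead of Vectorize is avoided.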