Aggregating via dplyr - mutating a single column from factor to numeric
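Hi, thanks for reading.
I have been trying to aggregate some data and have managed to do so with the aggregate function, but I would also like to do the same thing with a dplyr pipeline. However, I keep getting this error:
Error in mutate_impl(.data, dots) : Evaluation error: could not find function "15.2".
I currently have this dataset p: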
sample gene ct
1 s001 gapdh 15.2
2 s001 gapdh 16
3 s001 gapdh 14.8
4 s002 gapdh 16.2
5 s002 gapdh 17
6 s002 gapdh 16.7
7 s003 gapdh Undetermined
8 s003 gapdh 14.6
9 s003 gapdh 15
10 s001 actb 24.5
11 s001 actb 24.2
12 s001 actb 24.7
13 s002 actb 25
14 s002 actb 25.7
15 s002 actb 25.5
16 s003 actb 27.3
17 s003 actb 27.4
18 s003 actb Undetermined
and would like to get it to this:
p2$sample p2$gene p2$ct.mean p2$ct.sd
1 s001 actb 24.46666667 0.25166115
2 s002 actb 25.40000000 0.36055513
3 s003 actb 27.35000000 0.07071068
4 s001 gapdh 15.33333333 0.61101009
5 s002 gapdh 16.63333333 0.40414519
6 s003 gapdh 14.80000000 0.28284271
This is the code I am currently using, which produces the error above:
library(dplyr)
p_ave_sd <- p %>%
filter(p$ct != "Undetermined") %>%
mutate_at(as.character(p$ct), as.numeric, rm.na = TRUE) %>%
group_by(p$gene) %>%
summarise(mean=mean(p$ct), sd=sd(p$ct))
It is definitely the "mutate" step that is tripping me up. I have tried mutate_all(), mutate_if(is.factor, is.numeric), etc., but each one comes with its own errors.
Thanks for your help!
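Besides the mutate step, the code above refers to p$ct inside the pipeline. A minimal sketch of why that matters, using a small made-up data frame (df, g and x are hypothetical names, not the poster's data): p$ct always points at the original, ungrouped and unfiltered column, while the bare column name is evaluated on the data flowing through the pipe.

```r
library(dplyr)

# Toy data (hypothetical, not the poster's dataset)
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 10))

# df$x is the full original column, so the grouping is ignored
# and every group receives the overall mean
wrong <- df %>%
  group_by(g) %>%
  summarise(m = mean(df$x))

# The bare column name is evaluated within each group
right <- df %>%
  group_by(g) %>%
  summarise(m = mean(x))
```

Here wrong$m holds the overall mean twice, while right$m holds one mean per group, which is presumably what the question is after.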
Here is a way to do it with mutate_at. If there is only one column to convert, mutate also works and is more direct.
library(dplyr)
dat2 <- dat %>%
  # Drop the rows where no Ct value was measured
  filter(!ct %in% "Undetermined") %>%
  # mutate(ct = as.numeric(ct)) %>%   # <<< This will also work
  # Pass the function directly; funs() is deprecated in newer dplyr versions
  mutate_at(vars(ct), as.numeric) %>%
  group_by(sample, gene) %>%
  summarise(mean = mean(ct), sd = sd(ct)) %>%
  ungroup()
dat2
# # A tibble: 6 x 4
# sample gene mean sd
# <chr> <chr> <dbl> <dbl>
# 1 s001 actb 24.5 0.252
# 2 s001 gapdh 15.3 0.611
# 3 s002 actb 25.4 0.361
# 4 s002 gapdh 16.6 0.404
# 5 s003 actb 27.4 0.0707
# 6 s003 gapdh 14.8 0.283
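In dplyr 1.0.0 and later, the scoped verbs such as mutate_at are superseded by across(). The following is a sketch of the same pipeline in that style, assuming a recent dplyr is installed; the data frame is rebuilt inline so that the block runs on its own, and res is just an illustrative name.

```r
library(dplyr)

# Same values as the example data, rebuilt compactly
dat <- data.frame(
  sample = rep(rep(c("s001", "s002", "s003"), each = 3), times = 2),
  gene   = rep(c("gapdh", "actb"), each = 9),
  ct     = c("15.2", "16", "14.8", "16.2", "17", "16.7",
             "Undetermined", "14.6", "15",
             "24.5", "24.2", "24.7", "25", "25.7", "25.5",
             "27.3", "27.4", "Undetermined"),
  stringsAsFactors = FALSE
)

res <- dat %>%
  filter(ct != "Undetermined") %>%
  # across() applies as.numeric to the selected column(s)
  mutate(across(ct, as.numeric)) %>%
  group_by(sample, gene) %>%
  # .groups = "drop" returns an ungrouped tibble, replacing ungroup()
  summarise(mean = mean(ct), sd = sd(ct), .groups = "drop")
```

For a single column, plain mutate(ct = as.numeric(ct)) remains the shortest option; across() pays off when several columns need the same conversion.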
Data
dat <- read.table(text = " sample gene ct
1 s001 gapdh 15.2
2 s001 gapdh 16
3 s001 gapdh 14.8
4 s002 gapdh 16.2
5 s002 gapdh 17
6 s002 gapdh 16.7
7 s003 gapdh Undetermined
8 s003 gapdh 14.6
9 s003 gapdh 15
10 s001 actb 24.5
11 s001 actb 24.2
12 s001 actb 24.7
13 s002 actb 25
14 s002 actb 25.7
15 s002 actb 25.5
16 s003 actb 27.3
17 s003 actb 27.4
18 s003 actb Undetermined",
header = TRUE, stringsAsFactors = FALSE)
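Note that the data above is read with stringsAsFactors = FALSE, so ct is a character column. If ct is a factor instead, as the question title suggests, calling as.numeric() on it directly returns the internal level codes rather than the measured values. A small sketch with a made-up vector:

```r
# Hypothetical factor holding a few of the ct values
ct <- factor(c("15.2", "16", "14.8", "Undetermined"))

# Directly converting a factor gives its integer level codes,
# not the numbers printed in the column
codes <- as.numeric(ct)

# Converting to character first recovers the values;
# "Undetermined" becomes NA (the coercion warning is silenced here)
values <- suppressWarnings(as.numeric(as.character(ct)))
```

With a factor column, the conversion step in the pipeline would therefore be mutate(ct = as.numeric(as.character(ct))).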
I am not sure I understand your question, but maybe:
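p_ave_sd <- p %>%
  # Match the capitalisation used in the data ("Undetermined")
  filter(ct != "Undetermined") %>%
  # Go through as.character() in case ct is a factor
  mutate(ct = as.numeric(as.character(ct))) %>%
  group_by(gene, sample) %>%
  summarise(mean = mean(ct), sd = sd(ct))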
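An alternative to filtering is to let "Undetermined" turn into NA during the conversion and skip the missing values when summarising. The argument is spelled na.rm (not rm.na, as in the question) and belongs to mean() and sd(), not to as.numeric(). A sketch on a three-row subset of the data, assuming a dplyr version that supports the .groups argument:

```r
library(dplyr)

# Three rows taken from the question's data (sample s003, gene gapdh)
p <- data.frame(
  sample = c("s003", "s003", "s003"),
  gene   = "gapdh",
  ct     = c("Undetermined", "14.6", "15"),
  stringsAsFactors = FALSE
)

res <- p %>%
  # "Undetermined" cannot be parsed and becomes NA; the warning is silenced
  mutate(ct = suppressWarnings(as.numeric(as.character(ct)))) %>%
  group_by(gene, sample) %>%
  # na.rm = TRUE drops the NA values inside each group
  summarise(mean = mean(ct, na.rm = TRUE),
            sd   = sd(ct, na.rm = TRUE),
            .groups = "drop")
```

This keeps every row in the data until the summary step, which can be handy if the number of undetermined wells per group also needs to be reported.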