How do I use a function I created on values within a dataframe and replace the values with outcome of the function?
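I created a function called getExpressionLevel, and the exercise asks me to use that function to replace the numbers with the labels below. What do I need to use to achieve this?
The getExpressionLevel function: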
function(a) {
  if (a<5) {
    cat ("none")
  }
  if (a>=5&a<20) {
    cat ("low")
  }
  if (a>=20&a<60) {
    cat ("medium")
  }
  if (a>=60) {
    cat ("high")
  }
}
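The exercise is:
Create a data.frame called expression_levels with 10 rows (one per gene) and 3 columns (one per cell line). Then compute the mean expression of each gene in each cell line, and label that expression accordingly using the getExpressionLevel function.
This is my current data.frame. The data in it needs to be replaced with the results of the getExpressionLevel function.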
genename Kc167 BG3 S2
1 Clic 7.333333 48.33333 75.00000
2 Treh 24.666667 12.66667 52.33333
3 bib 31.333333 79.33333 82.00000
4 CalpC 65.000000 69.33333 63.66667
5 tud 59.666667 81.66667 16.33333
6 cort 74.333333 50.66667 28.66667
7 S2P 72.000000 39.66667 50.66667
8 Mitofilin 38.333333 29.00000 54.66667
9 Oxp 73.666667 49.33333 42.66667
10 Ada1-2 87.333333 42.00000 28.00000
This is the expected data.frame:
Kc167 BG3 S2
Clic low medium high
Treh medium low medium
bib medium high high
CalpC high high high
tud medium high low
cort high medium medium
S2P high medium medium
Mitofilin medium medium medium
Oxp high medium medium
Ada1-2 high medium medium
Hope this helps!
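# Bin edges and labels: [-Inf, 5) none, [5, 20) low, [20, 60) medium, [60, Inf) high
bin_breaks <- c(-Inf, 5, 20, 60, Inf)
bin_labels <- c("none", "low", "medium", "high")
# Bin every column except genename; right = FALSE closes each interval on the left
df[,-1] <- sapply(df[,-1], function(x) cut(x,
                                           breaks = bin_breaks,
                                           labels = bin_labels,
                                           right = FALSE))
df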
Output:
genename Kc167 BG3 S2
1 Clic low medium high
2 Treh medium low medium
3 bib medium high high
4 CalpC high high high
5 tud medium high low
6 cort high medium medium
7 S2P high medium medium
8 Mitofilin medium medium medium
9 Oxp high medium medium
10 Ada1-2 high medium medium
Sample data:
df <- structure(list(genename = c("Clic", "Treh", "bib", "CalpC", "tud",
"cort", "S2P", "Mitofilin", "Oxp", "Ada1-2"), Kc167 = c(7.333333,
24.666667, 31.333333, 65, 59.666667, 74.333333, 72, 38.333333,
73.666667, 87.333333), BG3 = c(48.33333, 12.66667, 79.33333,
69.33333, 81.66667, 50.66667, 39.66667, 29, 49.33333, 42), S2 = c(75,
52.33333, 82, 63.66667, 16.33333, 28.66667, 50.66667, 54.66667,
42.66667, 28)), .Names = c("genename", "Kc167", "BG3", "S2"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
Edit: added the appropriate right argument to the code so that the boundary conditions match the OP's requirements (courtesy of @drf).
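To make the effect of the right argument concrete, here is a small sketch that bins only the boundary values of the question's thresholds. The vector x and the names left_closed and right_closed are made up for illustration; the breaks and labels are the ones used above.

```r
# Values sitting on or just below each threshold (illustrative, not from the OP's data)
x <- c(4.99, 5, 19.99, 20, 59.99, 60)
bin_breaks <- c(-Inf, 5, 20, 60, Inf)
bin_labels <- c("none", "low", "medium", "high")

# right = FALSE: intervals are [lower, upper), matching a >= 5, a >= 20, a >= 60
left_closed <- as.character(cut(x, breaks = bin_breaks, labels = bin_labels,
                                right = FALSE))

# Default right = TRUE: intervals are (lower, upper], so 5, 20 and 60 fall one bin lower
right_closed <- as.character(cut(x, breaks = bin_breaks, labels = bin_labels))

print(data.frame(x, left_closed, right_closed))
```

With right = FALSE the value 5 is labelled "low", 20 "medium" and 60 "high", which is what the OP's conditions ask for; with the default they would come out as "none", "low" and "medium".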
A function-based approach. It always helps to know how to use functions.
library(data.table)

## sample data
df <- data.table(genename = c('Clic','Treh','bib','CalpC'),
                 Kc167 = c(7.333,24.666,31.3333,65),
                 BG3 = c(48.33,12.66,79.33,69.33),
                 S2 = c(75.00,52.33,82.00,63.66))
## this function maps a single value to its label
get_values <- function(x)
{
  if (x < 5) return('none')
  else if ((x >= 5) && (x < 20)) return('low')
  else if ((x >= 20) && (x < 60)) return('medium')
  else if (x >= 60) return('high')
}
## creating a new data.table with the answers
## (start from a data.table, not a bare character vector, so columns can be added)
df2 <- data.table(genename = df$genename)
df2$Kc167 <- sapply(df$Kc167, get_values)
df2$BG3 <- sapply(df$BG3, get_values)
df2$S2 <- sapply(df$S2, get_values)
df2
genename Kc167 BG3 S2
1: Clic low medium high
2: Treh medium low medium
3: bib medium high high
4: CalpC high high high
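The same idea can be applied to the OP's own getExpressionLevel in base R. One thing to note: cat() only prints its argument and returns NULL, so a function built on cat() cannot be used to fill a data.frame; it has to return the label instead. The following is a minimal sketch under that assumption, using only the first three rows of the question's data; the rewritten function body is a suggestion, not the OP's original code.

```r
# Same thresholds as the question, but returning the label instead of printing it
getExpressionLevel <- function(a) {
  if (a < 5) {
    "none"
  } else if (a < 20) {
    "low"
  } else if (a < 60) {
    "medium"
  } else {
    "high"
  }
}

# First three genes from the question's data.frame
expression_levels <- data.frame(
  genename = c("Clic", "Treh", "bib"),
  Kc167 = c(7.333333, 24.666667, 31.333333),
  BG3 = c(48.33333, 12.66667, 79.33333),
  S2 = c(75, 52.33333, 82)
)

# The function handles one value at a time, so apply it element by element
# within each numeric column; vapply guarantees a character result
expression_levels[-1] <- lapply(
  expression_levels[-1],
  function(column) vapply(column, getExpressionLevel, character(1))
)
print(expression_levels)
```

Because if/else works on a single value, the element-wise vapply is needed here; the cut() approach in the other answer is vectorised and avoids that inner loop.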