How to calculate conditional entropy by multiple groups in R: where did I go wrong?
I've read many questions related to mine, but I can't figure out what's wrong with my code.
The packages I use are dplyr and infotheo.
The infotheo call I use is condentropy(time2, time1).
My data looks like this:
id <- c("1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "3", "3")
cond <- c("1", "2", "1", "2", "1", "2", "1", "2", "1", "2", "1", "2")
time1 <- c("1", "3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1")
time2 <- c("3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1" ,"1")
df <- data.frame(id, cond, time1, time2)
I want to calculate it by id and condition, which means I should get 6 entropy values from 3 people under two conditions. Here is my code:
df %>%
group_by(df$id, df$cond) %>%
summarize(condentropy(df$time2, df$time1))
Why do I get only one value for all the groups?
Thanks in advance for the help!
Something like this.
First, convert your data to numeric:
library(readr)  # type_convert() comes from readr
df <- df %>% type_convert()
-- Column specification ------------------------------------------
cols(
id = col_double(),
cond = col_double(),
time1 = col_double(),
time2 = col_double()
)
Second, compute the relevant means by group:
df %>%
group_by(id, cond) %>%
summarise(mean = mean(id))
`summarise()` has grouped output by 'id'. You can override using the `.groups` argument.
# A tibble: 6 x 3
# Groups:   id [3]
     id  cond  mean
  <dbl> <dbl> <dbl>
1     1     1     1
2     1     2     1
3     2     1     2
4     2     2     2
5     3     1     3
6     3     2     3
Third, study this page for additional examples.
Convert the time columns to numeric, perform the grouping and summarize. Do not use df$ with dplyr verbs, and be sure to assign the value of condentropy(...) to a column name. Inside summarize(), df$time1 refers to the whole column of the original data frame rather than the rows of the current group, so every group receives the same value. The subject of the question refers to the mean, but the code suggests you want to calculate the conditional entropy, so we provide both.
library(dplyr)
library(infotheo)
df %>%
mutate(time1 = as.numeric(time1), time2 = as.numeric(time2)) %>%
group_by(id, cond) %>%
summarize(cond_ent = condentropy(time2, time1),
mean1 = mean(time1), mean2 = mean(time2), .groups = "drop")
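To see why df$ defeats the grouping, here is a minimal sketch that only counts how many values each expression sees inside summarise(). It rebuilds the question's data frame; the column names n_bare and n_dollar are made up for illustration.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  id    = rep(c("1", "2", "3"), each = 4),
  cond  = rep(c("1", "2"), times = 6),
  time1 = c("1", "3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1"),
  time2 = c("3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1", "1")
)

res <- df %>%
  group_by(id, cond) %>%
  summarise(
    n_bare   = length(time1),     # rows of the current group only
    n_dollar = length(df$time1),  # the whole column of the original df
    .groups = "drop"
  )
res
```

In every one of the 6 groups n_bare is 2 and n_dollar is 12, so any statistic computed from df$time1 and df$time2 is the same for all groups.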
Use type.convert(as.is = TRUE) to get numeric variables and then summarise with across. You don't have to use $.
This one:
library(dplyr)
library(infotheo)
df %>%
as_tibble() %>%
type.convert(as.is=TRUE) %>%
group_by(id, cond) %>%
summarise(mean = mean(c(time1, time2)))
Output:
     id  cond  mean
  <int> <int> <dbl>
1     1     1  2.25
2     1     2  2.75
3     2     1  2
4     2     2  1.5
5     3     1  1.5
6     3     2  1.5
OR
library(dplyr)
df %>%
as_tibble() %>%
type.convert(as.is=TRUE) %>%
group_by(id, cond) %>%
summarise(across(starts_with("time"), mean))
Output:
     id  cond time1 time2
  <int> <int> <dbl> <dbl>
1     1     1   2     2.5
2     1     2   2.5   3
3     2     1   2     2
4     2     2   2     1
5     3     1   1.5   1.5
6     3     2   1.5   1.5
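The two snippets above report group means. For the conditional entropy the question asks about, a rough base R cross-check could look like the sketch below. It uses the identity H(X|Y) = H(X,Y) - H(Y) with empirical frequencies and natural logarithms. The helpers entropy_emp and cond_entropy_emp are hypothetical names, not part of infotheo, and whether they match condentropy()'s default estimator exactly is an assumption.

```r
library(dplyr)

# Empirical Shannon entropy (natural log) of one or more discrete vectors.
entropy_emp <- function(...) {
  p <- table(...) / length(..1)  # relative frequency of each observed combination
  p <- p[p > 0]                  # drop empty cells so log() is defined
  -sum(p * log(p))
}

# H(x | y) = H(x, y) - H(y)
cond_entropy_emp <- function(x, y) entropy_emp(x, y) - entropy_emp(y)

# Same data as in the question
df <- data.frame(
  id    = rep(c("1", "2", "3"), each = 4),
  cond  = rep(c("1", "2"), times = 6),
  time1 = c("1", "3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1"),
  time2 = c("3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1", "1")
)

res_ent <- df %>%
  group_by(id, cond) %>%
  summarise(cond_ent = cond_entropy_emp(time2, time1), .groups = "drop")
res_ent
```

With this sample data each group has only two rows and the two time1 values always differ, so time1 fully determines time2 within a group and every cond_ent comes out as 0 (up to floating point). More rows per group are needed before the grouped values become informative.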