How to calculate conditional entropy by multiple groups in R: where did I go wrong?

I've read many related questions, but I can't figure out what's wrong with my code.

The packages I use are dplyr and infotheo.

The infotheo call I need is condentropy(time2, time1).

My data looks like this:
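For reference, the infotheo documentation describes condentropy(X, Y) as the conditional entropy H(X|Y) in nats, so the call above estimates H(time2 | time1). The following is a minimal base-R sketch of the plug-in (empirical) estimate, using the identity H(Y|X) = H(X,Y) - H(X). The helper names emp_entropy and cond_entropy are hypothetical and not part of infotheo, and the sketch is only expected to agree with infotheo's default empirical estimator, not its other methods.

```r
# Plug-in entropy of a table of counts, in nats (natural log).
emp_entropy <- function(counts) {
  p <- counts[counts > 0] / sum(counts)  # drop empty cells, normalise to probabilities
  -sum(p * log(p))
}

# Conditional entropy H(Y | X) = H(X, Y) - H(X) for two discrete vectors.
cond_entropy <- function(y, x) {
  emp_entropy(table(x, y)) - emp_entropy(table(x))
}

# x and y are independent and uniform here, so H(Y | X) = H(Y) = log(2).
cond_entropy(y = c(1, 2, 1, 2), x = c(1, 1, 2, 2))  # log(2), about 0.693
```

When y is fully determined by x, the same function returns 0, which is the smallest value a conditional entropy can take.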

id <- c("1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "3", "3")
cond <- c("1", "2", "1", "2", "1", "2", "1", "2", "1", "2", "1", "2")
time1 <- c("1", "3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1")
time2 <- c("3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1" ,"1")
df <- data.frame(id, cond, time1, time2)

I want to calculate it by id and condition, so I should get 6 entropy values from 3 people under two conditions. Here is my code:

df %>%
group_by(df$id, df$cond) %>%
summarize(condentropy(df$time2, df$time1))

Why do I get only one value for all the groups?

Thanks in advance for the help!

Something like this:

First, convert your data to numeric:

library(dplyr)
library(readr)  # provides type_convert()

df <- df %>% type_convert()

-- Column specification ------------------------------------------
cols(
  id = col_double(),
  cond = col_double(),
  time1 = col_double(),
  time2 = col_double()
)

Second, compute the relevant means by group:

df  %>%
    group_by(id, cond) %>%
    summarise(mean = mean(id))
`summarise()` has grouped output by 'id'. You can override using the `.groups` argument.
# A tibble: 6 x 3
# Groups:   id [3]
     id  cond  mean
  <dbl> <dbl> <dbl>
1     1     1     1
2     1     2     1
3     2     1     2
4     2     2     2
5     3     1     3
6     3     2     3

Third, study this page for additional examples.

Convert the time columns to numeric, perform the grouping, and summarize. Do not use df$ inside dplyr verbs: df$time1 always refers to the full column and ignores the grouping, which is why every group got the same value. Also be sure to assign the value of condentropy(...) to a column name. The subject of the question refers to the mean, but the code suggests you want the conditional entropy, so we provide both.

library(dplyr)
library(infotheo)

df %>%
  mutate(time1 = as.numeric(time1), time2 = as.numeric(time2)) %>%
  group_by(id, cond) %>%
  summarize(cond_ent = condentropy(time2, time1), 
            mean1 = mean(time1), mean2 = mean(time2), .groups = "drop")
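As a cross-check that does not depend on dplyr or infotheo, the grouped calculation can be sketched in base R with split() and sapply(). This is an illustration under assumptions, not the answer's method: emp_entropy and cond_entropy are hypothetical helpers implementing the plug-in estimate H(Y|X) = H(X,Y) - H(X) in nats, which should correspond to infotheo's default empirical estimator only.

```r
# Sample data from the question.
df <- data.frame(
  id    = c("1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "3", "3"),
  cond  = c("1", "2", "1", "2", "1", "2", "1", "2", "1", "2", "1", "2"),
  time1 = c("1", "3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1"),
  time2 = c("3", "3", "2", "3", "3", "1", "1", "1", "2", "2", "1", "1")
)

# Plug-in entropy of a table of counts, in nats.
emp_entropy <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Conditional entropy H(Y | X) = H(X, Y) - H(X).
cond_entropy <- function(y, x) {
  emp_entropy(table(x, y)) - emp_entropy(table(x))
}

# Split the rows by (id, cond) and compute one value per group.
groups <- split(df, list(df$id, df$cond), drop = TRUE)
result <- sapply(groups, function(g) cond_entropy(g$time2, g$time1))
print(result)
```

With this sample data every (id, cond) group has only two rows, and the two time1 values in each group differ, so time2 is fully determined by time1 within a group and all six plug-in estimates come out as 0. Getting six identical zeros here therefore reflects the tiny group size, not a grouping mistake.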

Use type.convert(as.is = TRUE) to get numeric variables and then summarise, optionally with across(). You don't have to use $.

This one:

library(dplyr)
library(infotheo)
df %>% 
  as_tibble() %>% 
  type.convert(as.is=TRUE) %>% 
  group_by(id, cond) %>% 
  summarise(mean = mean(c(time1, time2)))

Output:

     id  cond  mean
  <int> <int> <dbl>
1     1     1  2.25
2     1     2  2.75
3     2     1  2   
4     2     2  1.5 
5     3     1  1.5 
6     3     2  1.5 

Or:

library(dplyr)
df %>% 
  as_tibble() %>% 
  type.convert(as.is=TRUE) %>% 
  group_by(id, cond) %>% 
  summarise(across(starts_with("time"), mean))

Output:

     id  cond time1 time2
  <int> <int> <dbl> <dbl>
1     1     1   2     2.5
2     1     2   2.5   3  
3     2     1   2     2  
4     2     2   2     1  
5     3     1   1.5   1.5
6     3     2   1.5   1.5
