R: frequencies across rows
Consider this data frame:
library(dplyr)
library(tidyr)  # provides pivot_wider()
one <- c("no", "no", "no", "no", "yes", "yes", "yes", "yes")
two <- c("apple", "banana", "orange", "carrot", "apple", "banana", "orange", "carrot")
three <- c(4, 5, 6, 7, 3, 4, 5, 6)
df <- data.frame(one, two, three)
df
one two three
1 no apple 4
2 no banana 5
3 no orange 6
4 no carrot 7
5 yes apple 3
6 yes banana 4
7 yes orange 5
8 yes carrot 6
Then I pivot wider:
df2 <- df %>%
  pivot_wider(names_from = one, values_from = three)
two no yes
<chr> <dbl> <dbl>
1 apple 4 3
2 banana 5 4
3 orange 6 5
4 carrot 7 6
Now, I want the relative frequencies across rows, but I cannot figure out how to get there. These are the desired columns:
desired_column_no <- c(4/7,5/9,6/11,7/13)
desired_column_yes <- c(3/7,4/9,5/11,6/13)
df2 %>%
  cbind(desired_column_no,
        desired_column_yes)
two no yes desired_column_no desired_column_yes
1 apple 4 3 0.5714286 0.4285714
2 banana 5 4 0.5555556 0.4444444
3 orange 6 5 0.5454545 0.4545455
4 carrot 7 6 0.5384615 0.4615385
I've been playing around with group_by(), summarize() and across(), but haven't gotten it to work correctly. Any help is greatly appreciated!
Use proportions() within each group, before pivot_wider():
library(dplyr)
library(tidyr)
df %>%
  group_by(two) %>%
  mutate(prop = proportions(three)) %>%
  pivot_wider(names_from = one, values_from = c(three, prop))
two three_no three_yes prop_no prop_yes
<chr> <dbl> <dbl> <dbl> <dbl>
1 apple 4 3 0.571 0.429
2 banana 5 4 0.556 0.444
3 orange 6 5 0.545 0.455
4 carrot 7 6 0.538 0.462
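For readers who have not met proportions() before: it is a base R function (available since R 4.0.0 as a newer name for prop.table()) that divides each value by the sum of its input. A minimal sketch of what happens inside a single `two` group, using the apple values from the question; the names `apple`, `apple_prop` and `manual` are only illustrative:

```r
# Values of `three` for the "apple" group: no = 4, yes = 3
apple <- c(no = 4, yes = 3)

# proportions() divides each element by the total (4 + 3 = 7)
apple_prop <- proportions(apple)
print(apple_prop)

# The same calculation written out by hand
manual <- apple / sum(apple)
```

Because group_by(two) puts exactly one "no" and one "yes" value in each group, applying this per group gives the row-wise shares once the data is pivoted.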
Don't use data.frame(cbind(.)): you're corrupting your data by converting numbers to strings. While it's reversible (and in general, "mostly" reversible but not always), it's also perfectly avoidable. Just use data.frame(.).
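To see why cbind() causes this, here is a small sketch using shortened versions of two vectors from the question. cbind() on plain vectors builds a matrix, and a matrix can hold only one type, so the numbers are coerced to character before data.frame() ever sees them. The names `bad` and `good` are just illustrative.

```r
one   <- c("no", "no", "yes", "yes")
three <- c(4, 5, 3, 4)

# cbind() first builds a character matrix, so `three` loses its numeric type
bad <- data.frame(cbind(one, three))

# Passing the vectors directly keeps each column's own type
good <- data.frame(one, three)

is.numeric(bad$three)   # FALSE
is.numeric(good$three)  # TRUE
```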
We can use across() on your wider format.
df <- data.frame(one, two, three) %>%
  pivot_wider(names_from = one, values_from = three)
df %>%
  mutate(
    across(c(no, yes), ~ . / (no + yes),
           .names = "desired_column_{.col}")
  )
# # A tibble: 4 x 5
# two no yes desired_column_no desired_column_yes
# <chr> <dbl> <dbl> <dbl> <dbl>
# 1 apple 4 3 0.571 0.429
# 2 banana 5 4 0.556 0.444
# 3 orange 6 5 0.545 0.455
# 4 carrot 7 6 0.538 0.462
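If there are more than two count columns, spelling out `no + yes` gets tedious. One possible alternative, sketched here in base R and not taken from the answers above, is to treat the count columns as a matrix and let prop.table() normalise each row (margin = 1). The data frame `wide` below is a hand-built copy of the pivoted data, and the names `counts`, `props` and `result` are introduced only for illustration.

```r
# Hand-built copy of the pivoted data from the question
wide <- data.frame(
  two = c("apple", "banana", "orange", "carrot"),
  no  = c(4, 5, 6, 7),
  yes = c(3, 4, 5, 6)
)

# Numeric matrix of the count columns only
counts <- as.matrix(wide[c("no", "yes")])

# margin = 1 divides every entry by its row total
props <- prop.table(counts, margin = 1)
colnames(props) <- paste0("desired_column_", colnames(props))

# data.frame() spreads the matrix columns out as ordinary numeric columns
result <- data.frame(wide, props)
print(result)
```

Adding another count column would then only require extending the column selection in `counts`.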