Counting "1" in multiple columns in a data frame
Given the following data frame:
fruit <- c("Orange", "Apple", "Apple", "Orange", "Banana", "Orange", "Banana", "Pear", "Banana", "Pear", "Pear", "Apple")
col2 <- c("0", "0", "1", "0", "0", "0", "1", "0", "0", "0", "0", "1")
col3 <- c("1", "1", "0", "0", "0", "1", "1", "0", "0", "0", "0", "0")
col4 <- c("0", "1", "0", "1", "1", "1", "0", "0", "0", "1", "1", "0")
df <- data.frame(fruit, col2, col3, col4)
    fruit col2 col3 col4
1  Orange    0    1    0
2   Apple    0    1    1
3   Apple    1    0    0
4  Orange    0    0    1
5  Banana    0    0    1
6  Orange    0    1    1
7  Banana    1    1    0
8    Pear    0    0    0
9  Banana    0    0    0
10   Pear    0    0    1
11   Pear    0    0    1
12  Apple    1    0    0
I want to count how often "1" appears in each column for each fruit and display the counts in a table like this:
   fruit col2 col3 col4
1 Orange    0    2    2
2  Apple    2    1    1
3 Banana    1    1    1
4   Pear    0    0    2
I tried to get this result with dplyr, but could not make it work across multiple columns. This is the code I used:
df %>%
group_by(fruit) %>%
summarise(Count = n()) %>%
group_by_all %>%
spread(fruit, Count, fill = 0)
I am a beginner in R. Can anyone help?
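After grouping by 'fruit', we can use summarise with across to loop over the remaining columns and take the sum of the logical vector produced by comparing each value to '1':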
library(dplyr)
df %>%
group_by(fruit) %>%
summarise(across(everything(), ~ sum(. == '1')))
-output
# A tibble: 4 x 4
# fruit col2 col3 col4
#* <chr> <int> <int> <int>
#1 Apple 2 1 1
#2 Banana 1 1 1
#3 Orange 0 2 2
#4 Pear 0 0 2
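To see why summing works as a count here: comparing a character column to '1' gives a logical vector, and sum() treats TRUE as 1 and FALSE as 0. A minimal sketch in base R, using the col4 values of the three Pear rows (the variable names pear_col4, is_one and pear_count are illustrative only and do not appear in the answer above):

```r
# col4 values for the three Pear rows (rows 8, 10 and 11 of df)
pear_col4 <- c("0", "1", "1")

# The comparison yields a logical vector: FALSE TRUE TRUE
is_one <- pear_col4 == "1"

# sum() coerces TRUE to 1 and FALSE to 0, so this counts the "1" entries
pear_count <- sum(is_one)
print(pear_count)
```

The same comparison is what `~ sum(. == '1')` applies to each column within each fruit group.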
Using base R aggregate:
aggregate(.~fruit, df, function(x) sum(x == '1'))
# fruit col2 col3 col4
#1 Apple 2 1 1
#2 Banana 1 1 1
#3 Orange 0 2 2
#4 Pear 0 0 2
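As a possible alternative in base R (not part of the answers above, and only a sketch), the columns could first be turned into a 0/1 matrix and then summed per group with rowsum. The name counts is illustrative, and the result is a matrix with the fruit names as row names rather than a data frame:

```r
# Rebuild the example data so the snippet runs on its own
fruit <- c("Orange", "Apple", "Apple", "Orange", "Banana", "Orange",
           "Banana", "Pear", "Banana", "Pear", "Pear", "Apple")
col2 <- c("0", "0", "1", "0", "0", "0", "1", "0", "0", "0", "0", "1")
col3 <- c("1", "1", "0", "0", "0", "1", "1", "0", "0", "0", "0", "0")
col4 <- c("0", "1", "0", "1", "1", "1", "0", "0", "0", "1", "1", "0")
df <- data.frame(fruit, col2, col3, col4)

# df[-1] == "1" is a logical matrix; adding 0L converts it to integers
# rowsum() then adds up the rows that share the same fruit
counts <- rowsum((df[-1] == "1") + 0L, group = df$fruit)
print(counts)
```

This avoids the formula interface, at the cost of returning a matrix; wrapping the result in as.data.frame() would be one way to get back to a data frame if that is needed.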