
How to summarize and spread data in R

I want to summarize my data into just three columns, as follows: col_1 = name of the country, col_2 = percentage of 0s, col_3 = percentage of 1s.

Here is the data:

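country = rep(c("USA", "UK", "AUS", "ARM", "BEL", "BRA", "CHN", "EGY", "FIN", "FRA"),
              times = c(10, 5, 15, 10, 10, 10, 5, 15, 10, 10))
# sample() without a size only shuffles c(0, 1); data.frame() then recycles
# that length-2 vector over all 100 rows, so the scores alternate 0/1 or 1/0.
score = sample(c(0, 1), replace = FALSE)
dat = data.frame(country, score)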

Many thanks.

Using reshape2:

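library(reshape2)
# Count rows per country and score. fun.aggregate = length is what dcast falls
# back to here anyway; stating it avoids the missing-aggregation message.
dat2 = dcast(dat, country ~ score, value.var = "score", fun.aggregate = length)
# Divide each count by its row total to turn counts into proportions.
dat2[, c("0", "1")] = dat2[, c("0", "1")] / rowSums(dat2[, c("0", "1")])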

   country         0         1
1      ARM 0.5000000 0.5000000
2      AUS 0.5333333 0.4666667
3      BEL 0.5000000 0.5000000
4      BRA 0.5000000 0.5000000
5      CHN 0.4000000 0.6000000
6      EGY 0.5333333 0.4666667
7      FIN 0.5000000 0.5000000
8      FRA 0.5000000 0.5000000
9       UK 0.4000000 0.6000000
10     USA 0.5000000 0.5000000
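
As a cross-check that needs no extra packages, the same row-wise proportions can be sketched in base R with table() and prop.table(). This is only an illustration, not part of the answer above: the fixed alternating score vector and the names props, res, perc0s and perc1s are assumptions chosen so the result is reproducible.

```r
# Same country vector as in the question.
country <- rep(c("USA", "UK", "AUS", "ARM", "BEL", "BRA", "CHN", "EGY", "FIN", "FRA"),
               times = c(10, 5, 15, 10, 10, 10, 5, 15, 10, 10))
# Assumption: a fixed alternating 0/1 pattern replaces the random shuffle,
# so the result is the same on every run.
score <- rep(c(0, 1), length.out = length(country))
dat <- data.frame(country, score)

# Count each (country, score) pair, then divide every row by its row total.
props <- prop.table(table(dat$country, dat$score), margin = 1)

# Turn the table into a plain three-column data frame.
res <- data.frame(country = rownames(props),
                  perc0s = props[, "0"],
                  perc1s = props[, "1"],
                  row.names = NULL)
res
```

Multiplying perc0s and perc1s by 100 would give percentages rather than proportions.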

Another possible solution, based on tidyverse:

library(tidyverse)

country = rep(c("USA", "UK", "AUS", "ARM", "BEL", "BRA", "CHN", "EGY", "FIN", "FRA"),
              times = c(10, 5, 15, 10, 10, 10, 5, 15, 10, 10))
score = sample(c(0, 1), replace = FALSE)
dat = data.frame(country, score)

# score is 0/1, so sum(score) / n() is the share of 1s within each country.
dat %>%
  group_by(country) %>%
  summarise(perc0s = 1 - sum(score) / n(), perc1s = 1 - perc0s, .groups = "drop")

#> # A tibble: 10 × 3
#>    country perc0s perc1s
#>    <chr>    <dbl>  <dbl>
#>  1 ARM      0.5    0.5  
#>  2 AUS      0.467  0.533
#>  3 BEL      0.5    0.5  
#>  4 BRA      0.5    0.5  
#>  5 CHN      0.6    0.4  
#>  6 EGY      0.467  0.533
#>  7 FIN      0.5    0.5  
#>  8 FRA      0.5    0.5  
#>  9 UK       0.6    0.4  
#> 10 USA      0.5    0.5
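
Since the title asks about spreading, here is a rough sketch of a count-then-widen variant using dplyr::count() and tidyr::pivot_wider(). It is an alternative shape for the same calculation, not the method from the answer above, and would extend to more than two score values. The fixed alternating score vector, the name wide and the "perc_" column prefix are assumptions made for the example.

```r
library(dplyr)
library(tidyr)

# Same country vector as in the question.
country <- rep(c("USA", "UK", "AUS", "ARM", "BEL", "BRA", "CHN", "EGY", "FIN", "FRA"),
               times = c(10, 5, 15, 10, 10, 10, 5, 15, 10, 10))
# Assumption: a fixed alternating 0/1 pattern replaces the random shuffle,
# so the result is the same on every run.
score <- rep(c(0, 1), length.out = length(country))
dat <- data.frame(country, score)

wide <- dat %>%
  count(country, score) %>%        # one row per country/score pair, count in n
  group_by(country) %>%
  mutate(prop = n / sum(n)) %>%    # share of each score within its country
  ungroup() %>%
  select(-n) %>%
  # Spread the score values into columns perc_0 and perc_1;
  # a score that never occurs for a country is filled with 0.
  pivot_wider(names_from = score, values_from = prop,
              names_prefix = "perc_", values_fill = 0)
wide
```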

