Sorting data to get size, mean and SD for each group in R
A small sample of the data is shown below. Please note that my real data has more columns.
dat<-read.table (text="Tall1 Group1 Tall2 Group2 Tall3 Group3 Tall4 Group4
25 M 24 M 23 N 33 N
34 N 16 M 23 M 43 N
41 N 20 M 44 N 60 N
25 M 24 N 44 N 55 M
26 N 12 N 44 M 90 M
", header=TRUE)
I want to get the following table:
NoM MeanM SDM NoN MeanN SDN
2 25 0 3 33.66 7.5
3 20 4 2 18 8.45
2 33.5 14.48 3 37 12.12
2 72.5 24.74 3 45.33 13.65
I want to get N, mean and SD for each group according to its Tall column, i.e., Tall1 with Group1, Tall2 with Group2, and so on.
library(dplyr)
library(tidyr)
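dat %>%
  # Reshape so each Tall<k>/Group<k> pair becomes rows, with the index k stored in `set`
  pivot_longer(cols = everything(), names_to = c(".value", "set"), names_pattern = "([A-Za-z]*)(\\d+)$") %>%
  group_by(set, Group) %>%
  summarise(No = n(), Mean = mean(Tall), SD = sd(Tall), .groups = "drop") %>%
  # Spread the groups back out into NoM, NoN, MeanM, MeanN, SDM, SDN
  pivot_wider(names_from = Group, values_from = c(No, Mean, SD), names_sep = "") %>%
  select(-set)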
# A tibble: 4 × 6
    NoM   NoN MeanM MeanN   SDM   SDN
  <int> <int> <dbl> <dbl> <dbl> <dbl>
1     2     3  25    33.7   0    7.51
2     3     2  20    18     4    8.49
3     2     3  33.5  37    14.8 12.1
4     2     3  72.5  45.3  24.7 13.7
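To see what the first pivot_longer() step does before any summarising, here is a minimal sketch of the intermediate long table. The name `long` is only illustrative and is not part of the answer above; the data are the sample from the question.

```r
library(tidyr)

# Sample data copied from the question.
dat <- read.table(text = "Tall1 Group1 Tall2 Group2 Tall3 Group3 Tall4 Group4
25 M 24 M 23 N 33 N
34 N 16 M 23 M 43 N
41 N 20 M 44 N 60 N
25 M 24 N 44 N 55 M
26 N 12 N 44 M 90 M
", header = TRUE)

# ".value" tells pivot_longer() to use the first capture group (Tall or Group)
# as the name of an output column; the second capture group (the digits)
# goes into the `set` column. `long` is a hypothetical name for this sketch.
long <- pivot_longer(
  dat,
  cols = everything(),
  names_to = c(".value", "set"),
  names_pattern = "([A-Za-z]*)(\\d+)$"
)

# One row per original cell pair: 5 rows x 4 pairs.
head(long)
```

Each row of `long` holds one Tall value together with its Group label and the pair index, which is why grouping by `set` and `Group` afterwards gives the per-pair summaries.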
This should work. I couldn't think of a good way to get the pairs of columns together, so I had to kind of force it.
library(dplyr)
library(tidyr)
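# Stack the Tall columns; `category` records which Tall column each value came from
dat_value <- dat %>%
  select(contains("Tall")) %>%
  pivot_longer(cols = everything(), names_to = "category", values_to = "values")
# Stack the Group columns the same way and keep only the group labels
dat_group <- dat %>%
  select(contains("Group")) %>%
  pivot_longer(cols = everything(), names_to = "category", values_to = "group") %>%
  select(group)
# The two long tables line up row by row because Tall and Group columns appear in the same order
new_dat <- dat_value %>%
  bind_cols(dat_group) %>%
  group_by(category) %>%
  summarize(
    NoM = sum(group == "M"),
    MeanM = mean(values[group == "M"]),
    SDM = sd(values[group == "M"]),
    NoN = sum(group == "N"),
    MeanN = mean(values[group == "N"]),
    SDN = sd(values[group == "N"])
  ) %>%
  ungroup()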
new_dat
#> # A tibble: 4 × 7
#>   category   NoM MeanM   SDM   NoN MeanN   SDN
#>   <chr>    <int> <dbl> <dbl> <int> <dbl> <dbl>
#> 1 Tall1        2  25     0       3  33.7  7.51
#> 2 Tall2        3  20     4       2  18    8.49
#> 3 Tall3        2  33.5  14.8     3  37   12.1
#> 4 Tall4        2  72.5  24.7     3  45.3 13.7
Created on 2022-01-14 by the reprex package (v2.0.1)
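For comparison, a base R sketch of the same pairing idea, assuming every value column is named Tall<k> and has a matching Group<k> column, and that the groups are always M and N. `ids`, `res` and `summarise_pair()` are hypothetical names introduced for this example, not functions from any package.

```r
# Sample data copied from the question.
dat <- read.table(text = "Tall1 Group1 Tall2 Group2 Tall3 Group3 Tall4 Group4
25 M 24 M 23 N 33 N
34 N 16 M 23 M 43 N
41 N 20 M 44 N 60 N
25 M 24 N 44 N 55 M
26 N 12 N 44 M 90 M
", header = TRUE)

# Assumption: pairs are identified by the numeric suffix of the Tall columns.
ids <- sub("^Tall", "", grep("^Tall[0-9]+$", names(dat), value = TRUE))

# Hypothetical helper: summarise one Tall<k>/Group<k> pair into a one-row data frame.
summarise_pair <- function(k) {
  tall  <- dat[[paste0("Tall", k)]]
  group <- dat[[paste0("Group", k)]]
  data.frame(
    NoM   = sum(group == "M"),
    MeanM = mean(tall[group == "M"]),
    SDM   = sd(tall[group == "M"]),
    NoN   = sum(group == "N"),
    MeanN = mean(tall[group == "N"]),
    SDN   = sd(tall[group == "N"])
  )
}

# One row per pair, in the column order requested in the question.
res <- do.call(rbind, lapply(ids, summarise_pair))
res
```

Because the pairs are found from the column names, this should extend to additional Tall/Group pairs without changes, as long as the naming assumption holds.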