
How to transform hierarchical data (from Beyond 20/20) from a partially wide format to long format in R?

I have some census data exported from Beyond 20/20 that is partly in long format and partly in wide format. I am struggling to reshape it because it is not fully wide as it stands. The data set below gives the number of males and females for each state and education level. For example:

> have
  State Education Male Female
1    CA         1    3      4
2    CA         2    4      6
3    NV         1    7      8
4    NV         2    9     19

However, I want the data in fully long format, with one row giving the number of individuals for each unique combination of state, education, and sex. For example:

> want
  State Education    Sex Number
1    CA         1   Male      3
2    CA         1 Female      4
3    CA         2   Male      4
4    CA         2 Female      6
5    NV         1   Male      7
6    NV         1 Female      8
7    NV         2   Male      9
8    NV         2 Female     19

Thanks in advance for any thoughts or advice.

We could use pivot_longer() from the tidyr package:

library(tidyr)
library(dplyr)  # provides the %>% pipe

# Stack the Male and Female columns into Sex/Number pairs
have %>%
  pivot_longer(
    cols = c(Male, Female),
    names_to = "Sex",
    values_to = "Number"
  )

  State Education Sex    Number
  <chr>     <int> <chr>   <int>
1 CA            1 Male        3
2 CA            1 Female      4
3 CA            2 Male        4
4 CA            2 Female      6
5 NV            1 Male        7
6 NV            1 Female      8
7 NV            2 Male        9
8 NV            2 Female     19
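To make this easy to try, here is a minimal self-contained sketch. It assumes the question's data frame can be rebuilt by hand with data.frame() (the values are copied from the printed `have` table), and the name `long` is only an illustrative choice. The final check is a quick sanity test that reshaping did not drop or duplicate any counts.

```r
library(tidyr)

# Rebuild the example data from the question (values copied from `have`)
have <- data.frame(
  State     = c("CA", "CA", "NV", "NV"),
  Education = c(1L, 2L, 1L, 2L),
  Male      = c(3L, 4L, 7L, 9L),
  Female    = c(4L, 6L, 8L, 19L)
)

# Move the two count columns into a Sex/Number pair of columns
long <- pivot_longer(
  have,
  cols      = c(Male, Female),
  names_to  = "Sex",
  values_to = "Number"
)

# Reshaping should preserve the total head count
stopifnot(sum(long$Number) == sum(have$Male) + sum(have$Female))
```

Each input row yields one output row per column listed in `cols`, so four rows become eight. If the real export has more sex or category columns, they could be added to `cols` in the same way.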
