
How do I convert factor levels to variables in R?

I am relatively new to R and am trying to build a population pyramid. I need the population data for males and females side by side in two variables (popMale, popFemale). Currently Sex is a factor with 2 levels. How do I convert these two factor levels into two new variables (popMale, popFemale)? I would appreciate any help. Here is a dput snippet of my data:

structure(list(V1 = c("Location", "Dominican Republic", "Dominican Republic", 
"Dominican Republic", "Dominican Republic"), V2 = c("Sex", "Female", 
"Female", "Male", "Male"), V3 = c("Age", "0-4", "5-9", "0-4", 
"5-9"), V4 = c(1950L, 217L, 164L, 223L, 167L), V5 = c(1955L, 
277L, 199L, 286L, 204L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -5L))

Since your data contains the column names in the first row, the first step is to name the columns according to that row and then drop it. After that, convert your data to long (tidy) format, i.e. move the years and the population numbers into separate columns, e.g. with tidyr::pivot_longer. Finally, you can use tidyr::pivot_wider to spread the data for males and females into separate columns.

Note: Depending on the next steps in your analysis, the last step isn't really needed and may actually complicate plotting a population pyramid.

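library(tidyr)

# Promote the first row to column names, then drop it
names(df) <- as.character(df[1, ])
df <- df[-1, ]

df %>%
  # Long format: one row per Location / Sex / Age / Year
  pivot_longer(matches("^\\d+"), names_to = "Year", values_to = "pop") %>%
  # Wide by Sex: one population column per level (popFemale, popMale)
  pivot_wider(names_from = Sex, values_from = pop, names_glue = "pop{Sex}")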
#> # A tibble: 4 × 5
#>   Location           Age   Year  popFemale popMale
#>   <chr>              <chr> <chr>     <int>   <int>
#> 1 Dominican Republic 0-4   1950        217     223
#> 2 Dominican Republic 0-4   1955        277     286
#> 3 Dominican Republic 5-9   1950        164     167
#> 4 Dominican Republic 5-9   1955        199     204
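
As the note above suggests, the long format may be the more convenient shape for a pyramid. The following is only a minimal sketch of that idea: it rebuilds the sample data with the header row already promoted, stops after pivot_longer, and adds a signed population column. The column name signed_pop and the convention of putting males on the negative side are assumptions made for illustration, not part of the original answer.

```r
library(tidyr)

# Sample data from the question, with the header row already promoted
df <- data.frame(
  Location = "Dominican Republic",
  Sex = c("Female", "Female", "Male", "Male"),
  Age = c("0-4", "5-9", "0-4", "5-9"),
  `1950` = c(217L, 164L, 223L, 167L),
  `1955` = c(277L, 199L, 286L, 204L),
  check.names = FALSE  # keep the year column names as "1950" and "1955"
)

# Long format: one row per Location / Sex / Age / Year
long <- pivot_longer(df, matches("^\\d+"), names_to = "Year", values_to = "pop")

# Assumption for illustration: males are counted to the left of zero
long$signed_pop <- ifelse(long$Sex == "Male", -long$pop, long$pop)

long
```

With the data in this shape, a bar plot that maps Age to one axis and signed_pop to the other, filled by Sex, would give the two sides of the pyramid without needing separate popMale and popFemale columns.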
