How to combine columns based on contingencies?

Question

I have the following df:

SUMLEV STATE COUNTY AGEGRP TOT_POP TOT_MALE
    50     1      1      0   55601    26995
    50     7     33      0  218022   105657
    50    14    500      0   24881    13133
    50     4     70      0   22400    11921
    50     3    900      0   57840    28500
    50    22     11      0   10138     5527

I would like to make a new column named CODE based on the columns STATE and COUNTY. I would like to paste the number from STATE in front of the number from COUNTY. However, if COUNTY is a one- or two-digit number, I would like it to be padded with leading zeroes to three digits, like 001 and 033.

Ideally the final df would look like:

SUMLEV STATE COUNTY AGEGRP TOT_POP TOT_MALE  CODE
    50     1      1      0   55601    26995  1001
    50     7     33      0  218022   105657  7033
    50    14    500      0   24881    13133 14500
    50     4     70      0   22400    11921  4070
    50     3    900      0   57840    28500  3900
    50    22     11      0   10138     5527 22011

Is there a short, elegant way of doing this?

Answer 1

We can use sprintf. In the format string, '%d' prints STATE as is, and '%03d' pads COUNTY with leading zeroes to a width of three.

library(dplyr)
df %>%
    mutate(CODE = sprintf('%d%03d', STATE, COUNTY))
# SUMLEV STATE COUNTY AGEGRP TOT_POP TOT_MALE  CODE
#1     50     1      1      0   55601    26995  1001
#2     50     7     33      0  218022   105657  7033
#3     50    14    500      0   24881    13133 14500
#4     50     4     70      0   22400    11921  4070
#5     50     3    900      0   57840    28500  3900
#6     50    22     11      0   10138     5527 22011
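
As a quick check of what the format string does, here is a minimal sketch on a few values taken from the data. The vectors state and county are stand-ins for the columns, not part of the original answer. Note that sprintf returns a character vector; if a numeric CODE is wanted, as.integer converts it, and the arithmetic form gives the same numbers under the assumption that COUNTY never exceeds 999.

```r
# Stand-in vectors for df$STATE and df$COUNTY (illustrative only)
state  <- c(1L, 7L, 14L)
county <- c(1L, 33L, 500L)

# '%03d' pads to width 3 with zeroes; values already 3 digits wide are unchanged
code <- sprintf("%d%03d", state, county)
code
# [1] "1001"  "7033"  "14500"

# sprintf gives character; convert if an integer column is needed
as.integer(code)

# Arithmetic alternative, assuming county is always below 1000
state * 1000L + county
```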

If we need to split the column 'CODE' into two, we can use separate, with a regex lookahead that splits at the position before the last three characters

library(tidyr)
df %>%
    mutate(CODE = sprintf('%d%03d', STATE, COUNTY)) %>% 
    separate(CODE, into = c("CODE1", "CODE2"), sep= "(?=...$)")

Or use extract to capture the substrings as groups

df %>%
    mutate(CODE = sprintf('%d%03d', STATE, COUNTY)) %>% 
    extract(CODE, into = c("CODE1", "CODE2"), "^(.*)(...)$")
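
To see what the two capture groups pick up, the same pattern can be tried in base R with sub, or the split can be done by position with substr. This is only a sketch on a stand-in vector code, not part of the original answer; it assumes every CODE has at least four characters.

```r
# Stand-in for the CODE column (illustrative only)
code <- c("1001", "7033", "14500")

# Same regex as in extract(): group 1 is everything before the last
# three characters, group 2 is the last three characters
code1 <- sub("^(.*)(...)$", "\\1", code)
code2 <- sub("^(.*)(...)$", "\\2", code)

# Position-based equivalent without a regex
n <- nchar(code)
code1_pos <- substr(code, 1, n - 3)
code2_pos <- substr(code, n - 2, n)

code1  # "1"   "7"   "14"
code2  # "001" "033" "500"
```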

Or with str_pad, which pads on the left by default

library(stringr)
df %>%
    mutate(CODE = str_c(STATE, str_pad(COUNTY, width = 3, pad = '0')))
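
If stringr is not available, a similar left-padding can be sketched in base R with formatC and paste0. The vectors below are stand-ins for the columns, and this variant is an alternative suggestion rather than part of the original answer.

```r
# Stand-in vectors for df$STATE and df$COUNTY (illustrative only)
state  <- c(1L, 7L, 14L)
county <- c(1L, 33L, 500L)

# flag = "0" pads with zeroes up to the given width
county_padded <- formatC(county, width = 3, flag = "0")

# paste0 concatenates element-wise without a separator
code <- paste0(state, county_padded)
code
# [1] "1001"  "7033"  "14500"
```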

Or in base R

df$CODE <- sprintf('%d%03d', df$STATE, df$COUNTY)

data

df <- structure(list(SUMLEV = c(50L, 50L, 50L, 50L, 50L, 50L), STATE = c(1L, 
7L, 14L, 4L, 3L, 22L), COUNTY = c(1L, 33L, 500L, 70L, 900L, 11L
), AGEGRP = c(0L, 0L, 0L, 0L, 0L, 0L), TOT_POP = c(55601L, 218022L, 
24881L, 22400L, 57840L, 10138L), TOT_MALE = c(26995L, 105657L, 
13133L, 11921L, 28500L, 5527L)), class = "data.frame", row.names = c(NA, 
-6L))
