How can I create two numeric columns, each containing one of the numbers in a character variable?

sum = c("40 Da + 12 Da de primes", "40 Da + 12 Da de primes", "50 Da", "50 Da", "50 Da")

How do I separate such a variable into the following columns:

Price Bonus
40 12
40 12
50 0
50 0
50 0

Suppose the character vector is stored as the column sum in a data.frame called df:

library(stringr)
library(dplyr)

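df %>% 
  mutate(
    # Price: digits at the start of the string, followed by " Da"
    Price = as.numeric(coalesce(str_extract(sum, "^\\d+(?=\\sDa)"), "0")),
    # Bonus: digits followed by " Da de primes"; 0 when there is no bonus
    Bonus = as.numeric(coalesce(str_extract(sum, "\\d+(?=\\sDa de primes)"), "0"))
  )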

This returns:

                      sum Price Bonus
1 40 Da + 12 Da de primes    40    12
2 40 Da + 12 Da de primes    40    12
3                   50 Da    50     0
4                   50 Da    50     0
5                   50 Da    50     0
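
To see why coalesce() is needed, here is a minimal sketch that runs the two patterns from the answer on a pair of sample strings. The names x, price and bonus are only illustrative and do not appear in the original answer.

```r
library(stringr)

# Two sample values: one with a bonus, one without
x <- c("40 Da + 12 Da de primes", "50 Da")

# Digits at the start of the string that are followed by " Da"
price <- str_extract(x, "^\\d+(?=\\sDa)")

# Digits that are followed by " Da de primes"; NA when there is no such part
bonus <- str_extract(x, "\\d+(?=\\sDa de primes)")

price  # "40" "50"
bonus  # "12" NA
```

The lookahead (?=...) checks the text after the digits without including it in the match, so only the number is returned. The NA for the second string is what coalesce(..., "0") replaces before the conversion to numeric.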

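Alternatively, we could use extract() from tidyr. The first capture group takes the leading number and the optional second group takes the next number, if there is one. Rows without a second number get NA in Bonus, which replace_na() then turns into 0: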

library(dplyr)
library(tidyr)
tibble(sum) %>% 
  extract(sum, into = c("Price", "Bonus"),
          "^(\\d+)\\D+(\\d+)?.*", convert = TRUE) %>% 
  mutate(Bonus = replace_na(Bonus, 0))
# A tibble: 5 × 2
  Price Bonus
  <int> <dbl>
1    40    12
2    40    12
3    50     0
4    50     0
5    50     0

data

sum = c("40 Da + 12 Da de primes", "40 Da + 12 Da de primes", "50 Da", "50 Da", "50 Da")
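
For comparison, the same split can probably be done without any packages. This is only a sketch of a base R alternative, not something either answer proposes, and it assumes each string holds at most two numbers, with the price first. The helper name nums is illustrative.

```r
# Same data as in the question (note that this name masks base::sum in the session)
sum <- c("40 Da + 12 Da de primes", "40 Da + 12 Da de primes",
         "50 Da", "50 Da", "50 Da")

# Pull every run of digits out of each string; the result is a list of character vectors
nums <- regmatches(sum, gregexpr("\\d+", sum))

# First number is the price; second is the bonus (NA when there is only one number)
Price <- as.numeric(sapply(nums, `[`, 1))
Bonus <- as.numeric(sapply(nums, `[`, 2))
Bonus[is.na(Bonus)] <- 0

data.frame(Price, Bonus)
```

Unlike the stringr patterns, this version does not check for the "Da" or "de primes" text, so it relies purely on the order of the numbers.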
