
How do I separate strings in a variable and place them into new variables?


I'm starting with a character column, Specs:

df$Specs: chr [1:28752] "4 GB RAM | 64 GB ROM | ExpandableUpto256GB", "..."


My goal is to create three new variables, "RAM", "ROM", and "ExpandableUpto", each holding the corresponding string for that row (for example "4GBRAM", "64GBROM", "ExpandableUpto256GB"). I then want to strip the letters so that only the digits remain as characters, convert those to numbers, and express everything in GB.

Here's where I'm at currently.

df$RAM: chr [1:28752] "4GBRAM", "..."

df$ROM: chr [1:28752] "64GBROM", "..."

df$ExpandableUpto: chr [1:28752] "ExpandableUpto256GB", "..."


I can get the strings into the new variables "RAM", "ROM", and "ExpandableUpto", but not every row has all three specs (some have only one or two). The strings fill the variables one at a time from the left, starting with "RAM", so some rows end up with "4GBROM" in the "RAM" variable. Is there a way to get only the "RAM" strings into the "RAM" variable, only the "ROM" strings into "ROM", and so on?

What I started with:


# Remove all whitespace from "Specs"
Mobile_phones7 <- Mobile_phones6 %>% mutate(Specs = stringr::str_remove_all(Specs, "\\s+"))

# Split "Specs" on "|", giving a list column of character vectors with 1 to 3 strings each
Mobile_phones8 <- Mobile_phones7 %>% mutate(Specs = stringr::str_split(Specs, stringr::coll("|")))

# Paste the pieces of each row back together, separated by single spaces
Mobile_phones9 <- Mobile_phones8 %>% rowwise() %>% mutate(Specs = Reduce(paste, Specs))

# Separate "Specs" into 3 new variables, splitting on whitespace
Mobile_phones10 <- Mobile_phones9 %>% separate(Specs, c("RAM", "ROM", "ExpandableUpto"), sep = "\\s+")

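This produced the misaligned result described above: rows with fewer than three specs are filled from the left, so a ROM string can land in the "RAM" variable.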
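The misalignment can be reproduced on a tiny example. This is only a sketch: the two rows below are hypothetical stand-ins for the already cleaned "Specs" strings, and `fill = "right"` is added just to silence the warning that `separate()` would otherwise emit for the short row.

```r
library(tidyr)

# Hypothetical rows: one with all three specs, one with only a ROM spec
df <- data.frame(
  Specs = c("4GBRAM 64GBROM ExpandableUpto256GB", "128GBROM"),
  stringsAsFactors = FALSE
)

# separate() assigns pieces by position, not by content
out <- separate(df, Specs, c("RAM", "ROM", "ExpandableUpto"),
                sep = "\\s+", fill = "right")

# The ROM string of the second row ends up in the RAM column
out$RAM
```

Because the split is purely positional, any fix has to look at what each piece says (RAM, ROM, Upto) rather than where it sits.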

Thanks Ben

Please check the code below.

code

library(tidyverse)
library(stringr)

data.frame(spec = "4 GB RAM | 64 GB ROM | ExpandableUpto256GB") %>%
  # capture the three "|"-separated fields into their own columns
  tidyr::extract(col = spec, into = c('RAM', 'ROM', 'ExpandableUpto'),
                 regex = '(.*)\\|(.*)\\|(.*)', remove = TRUE) %>%
  # strip the remaining whitespace from each new column
  mutate(across(c(RAM, ROM, ExpandableUpto), ~ str_remove_all(.x, '\\s')))

Created on 2023-01-20 with reprex v2.0.2

output

     RAM     ROM      ExpandableUpto
1 4GBRAM 64GBROM ExpandableUpto256GB
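One caveat worth checking: the pattern '(.*)\\|(.*)\\|(.*)' needs two "|" characters to match, so it only applies to rows that list all three specs. The sketch below tests the same pattern on a hypothetical two-field row; for such rows `tidyr::extract()` would return NA in every new column.

```r
library(stringr)

# First string is from the example above; the second is a hypothetical two-field row
specs <- c("4 GB RAM | 64 GB ROM | ExpandableUpto256GB",
           "6 GB RAM | 128 GB ROM")

pattern <- "(.*)\\|(.*)\\|(.*)"

# TRUE where the row has at least two "|" separators, FALSE otherwise
has_three_fields <- str_detect(specs, pattern)
has_three_fields
```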


Please check the updated code, revised as per your comments.

data & code

library(tidyverse)

data.frame(spec = c("4 GB RAM | 64 GB ROM | ExpandableUpto256GB",
                    "6 GB RAM | 128 GB ROM",
                    "128 GB ROM",
                    "0 MB ROM | Expandable Upto 32 GB",
                    "8 GB RAM | 128 GB ROM | Expandable Upto 1TB")) %>%
  transmute(
    # digits that are followed by a unit and then the word RAM (or ROM)
    RAM = as.numeric(str_extract(spec, '\\d+(?=\\s\\w+\\sRAM)')),
    ROM = as.numeric(str_extract(spec, '\\d+(?=\\s\\w+\\sROM)')),
    # digits that follow the literal word "Upto"; the unit is dropped, so "1TB" yields 1
    ExpandableUpto = as.numeric(str_match(spec, 'Upto\\s*(\\d+)')[, 2])
  )

Created on 2023-01-22 with reprex v2.0.2

output

  RAM ROM ExpandableUpto
1   4  64            256
2   6 128             NA
3  NA 128             NA
4  NA   0             32
5   8 128              1
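
The output above keeps only the digits, so "1TB" comes out as 1 and "0 MB" as 0 regardless of unit. Since the goal stated in the question is to express everything in GB, one possible extension is to capture the unit as well and scale by it. This is a sketch, not part of the answer above: the helper name `to_gb`, the unit table `gb_per_unit`, and the choice of 1 TB = 1024 GB are assumptions made for illustration.

```r
library(stringr)

specs <- c("4 GB RAM | 64 GB ROM | ExpandableUpto256GB",
           "6 GB RAM | 128 GB ROM",
           "128 GB ROM",
           "0 MB ROM | Expandable Upto 32 GB",
           "8 GB RAM | 128 GB ROM | Expandable Upto 1TB")

# Assumed conversion factors to GB (binary multiples)
gb_per_unit <- c(MB = 1 / 1024, GB = 1, TB = 1024)

# Hypothetical helper: `pattern` must capture the number first and the unit second
to_gb <- function(spec, pattern) {
  m <- str_match(spec, pattern)  # columns: full match, number, unit
  as.numeric(m[, 2]) * unname(gb_per_unit[m[, 3]])
}

unit <- "(MB|GB|TB)"
result <- data.frame(
  RAM            = to_gb(specs, paste0("(\\d+)\\s*", unit, "\\s*RAM")),
  ROM            = to_gb(specs, paste0("(\\d+)\\s*", unit, "\\s*ROM")),
  ExpandableUpto = to_gb(specs, paste0("Upto\\s*(\\d+)\\s*", unit))
)
result
```

Rows that lack a given spec stay NA, because `str_match()` returns NA for both the number and the unit when the pattern does not match.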

