R: split a column based on another column

Question

I want to split one column based on another, as explained below.
Here is part of my data:

brand    products
APPLE    IPHONE6SPlus_16G
APPLE    IPHONE6S_64G
APPLE    IPHONE6S_16G
APPLE    IPhone6_32G
APPLE    iPadAir2_64G
APPLE    iPadmini2_16G
APPLE    iPadmini4_64G
HTC      ONEX
Samsung  SamsungGalaxy
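
For anyone who wants to reproduce this, the table above can be rebuilt roughly as follows. This is only a sketch: the name df1 is borrowed from the first answer, and keeping both columns as character vectors (stringsAsFactors = FALSE) is an assumption, since the post does not say how the data was loaded.

```r
# Rebuild the sample table. stringsAsFactors = FALSE is an assumption: it keeps
# brand as a character column, so new values can be assigned without first
# adding factor levels.
df1 <- data.frame(
  brand = c(rep("APPLE", 7), "HTC", "Samsung"),
  products = c("IPHONE6SPlus_16G", "IPHONE6S_64G", "IPHONE6S_16G",
               "IPhone6_32G", "iPadAir2_64G", "iPadmini2_16G",
               "iPadmini4_64G", "ONEX", "SamsungGalaxy"),
  stringsAsFactors = FALSE
)
str(df1)
```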

I want to split brand based on products. Here is what I actually want:

brand       products
iPhone6S    IPHONE6SPlus_16G
iPhone6S    IPHONE6S_64G
iPhone6S    IPHONE6S_16G
iPhone6     IPhone6_32G
APPLE       iPadAir2_64G
APPLE       iPadmini2_16G
APPLE       iPadmini4_64G
HTC         ONEX
Samsung     SamsungGalaxy

I just want to split APPLE into three new values (APPLE, iPhone6S, iPhone6) based on products. If the name in products contains IPHONE6SPlus or IPHONE6S, change brand to iPhone6S. If the name in products contains IPhone6, change brand to iPhone6. The remaining rows do not change.

I think I can use ifelse for this, but the product names also contain sizes (e.g. 16G, 64G).
How can I ignore the sizes and split the data?

Answer 1

We can do this in a couple of ways. Here is one with sub and ==:

v1 <- sub("^(.)(.)(.{5})(.).*", "\\L\\1\\U\\2\\L\\3\\U\\4", df1$products, perl = TRUE)
df1$brand[v1=="iPhone6S"] <- v1[v1 == "iPhone6S"]
df1
#     brand         products
#1 iPhone6S IPHONE6SPlus_16G
#2 iPhone6S     IPHONE6S_64G
#3 iPhone6S     IPHONE6S_16G
#4    APPLE      IPhone6_32G
#5    APPLE     iPadAir2_64G
#6    APPLE    iPadmini2_16G
#7    APPLE    iPadmini4_64G
#8      HTC             ONEX
#9  Samsung    SamsungGalaxy

sub matches the first character from the start of the string ( ^ ) as a capture group ( (.) ), the next character as a second group, the next five characters as a third group ( (.{5}) ), one more character as a fourth group, and then the rest of the string ( .* ). In the replacement, each backreference ( \\1 to \\4 ) is preceded by \\L or \\U, which convert what follows to lower or upper case; these case operators require perl = TRUE. Note that IPhone6_32G becomes iPhone6_ under this pattern, so the == "iPhone6S" comparison does not match it and row 4 keeps APPLE.
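
To see what the substitution produces, here is a small sketch on three of the product names. The vector name products is only for illustration; the pattern and replacement are the ones from the answer.

```r
# Three product names from the question, as a plain character vector
products <- c("IPHONE6S_64G", "IPhone6_32G", "ONEX")

# Same pattern and replacement as above; \\L and \\U need perl = TRUE
v1 <- sub("^(.)(.)(.{5})(.).*", "\\L\\1\\U\\2\\L\\3\\U\\4",
          products, perl = TRUE)
print(v1)
# "iPhone6S" "iPhone6_" "ONEX"
```

The second element ends in an underscore because the eighth character of IPhone6_32G is "_", and ONEX is returned unchanged because it is too short for the pattern to match.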

An easier option is grepl:

df1$brand[grepl("IPHONE6S", df1$products)] <- "iPhone6S"

If the column has both lower- and upper-case characters, convert it to a single case with tolower or toupper before matching:

df1$brand[grepl("IPHONE6S", toupper(df1$products))] <- "iPhone6S"
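
As an alternative to toupper, grepl also accepts ignore.case = TRUE. A minimal sketch, using a small made-up vector (the lower-case entry "iphone6s_16G" is hypothetical and not in the original data):

```r
# Hypothetical mixed-case product names and their current brand
products <- c("IPHONE6S_64G", "iphone6s_16G", "iPadAir2_64G")
brand <- c("APPLE", "APPLE", "APPLE")

# Case-insensitive match, so the pattern can be written in any case
hit <- grepl("iphone6s", products, ignore.case = TRUE)
brand[hit] <- "iPhone6S"
print(brand)
```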

To change multiple values, loop over the patterns:

nm1 <- c("IPAD", "IPHONE", "SAMSUNG")
for(j in nm1) df1$brand[grepl(j, toupper(df1$products))] <- j
df1
#   brand         products
#1  IPHONE IPHONE6SPlus_16G
#2  IPHONE     IPHONE6S_64G
#3  IPHONE     IPHONE6S_16G
#4  IPHONE      IPhone6_32G
#5    IPAD     iPadAir2_64G
#6    IPAD    iPadmini2_16G
#7    IPAD    iPadmini4_64G
#8     HTC             ONEX
#9 SAMSUNG    SamsungGalaxy
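
The same loop idea can be adapted to the labels the question asks for. The sketch below is one possible variation, not part of the original answer: the rules vector and its names are assumptions, and the order matters because the pattern IPHONE6 also matches every IPHONE6S product, so the more specific rule is applied last.

```r
# Sample data rebuilt as character columns (assumption: brand is not a factor)
df1 <- data.frame(
  brand = c(rep("APPLE", 7), "HTC", "Samsung"),
  products = c("IPHONE6SPlus_16G", "IPHONE6S_64G", "IPHONE6S_16G",
               "IPhone6_32G", "iPadAir2_64G", "iPadmini2_16G",
               "iPadmini4_64G", "ONEX", "SamsungGalaxy"),
  stringsAsFactors = FALSE
)

# Names are the new brand labels, values are upper-case patterns.
# General rule first, specific rule last, so iPhone6S overwrites iPhone6.
rules <- c(iPhone6 = "IPHONE6", iPhone6S = "IPHONE6S")

for (label in names(rules)) {
  hit <- grepl(rules[[label]], toupper(df1$products), fixed = TRUE)
  df1$brand[hit] <- label
}
df1
```

Rows that match neither pattern (the iPad, HTC and Samsung rows) keep their original brand.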

Answer 2

'Dirty' solution, but I hope it helps :)

x <- c('IPHONE6SPlus','IPHONE6S')
b$new <- grepl(paste(x, collapse = "|"), b$products)
b$brand[b$new==TRUE] <- "Iphone6S"
b$new <- NULL
y <- c('IPhone6')
b$new <- grepl(paste(y, collapse = "|"), b$products)
b$brand[b$new==TRUE] <- "Iphone6"
b$new <- NULL

     brand         products
1 Iphone6S IPHONE6SPlus_16G
2 Iphone6S     IPHONE6S_64G
3 Iphone6S     IPHONE6S_16G
4  Iphone6      IPhone6_32G
5    APPLE     iPadAir2_64G
6    APPLE    iPadmini2_16G
7    APPLE    iPadmini4_64G
8      HTC             ONEX
9  Samsung    SamsungGalaxy
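
Since the question mentions ifelse, the same result could also be written as a nested ifelse without the temporary column. This is only a sketch on a four-row subset of the data; it uses the spelling iPhone6S / iPhone6 from the question rather than the labels above, and it assumes brand is a character column.

```r
# Four-row subset of the question's data, stored as character columns
b <- data.frame(
  brand = c("APPLE", "APPLE", "APPLE", "HTC"),
  products = c("IPHONE6SPlus_16G", "IPhone6_32G", "iPadAir2_64G", "ONEX"),
  stringsAsFactors = FALSE
)

# Test the more specific pattern first; the size suffix (_16G, _64G) is
# ignored because grepl only needs the pattern to occur somewhere in the name.
is_6s <- grepl("IPHONE6S", b$products, ignore.case = TRUE)
is_6  <- grepl("IPHONE6",  b$products, ignore.case = TRUE)

b$brand <- ifelse(is_6s, "iPhone6S",
                  ifelse(is_6, "iPhone6", b$brand))
b
```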

Answer 1
1 accepted 2017-04-21 09:03:34

Answer 2
1 2017-04-21 09:59:48
