How to split a column into two columns by extraction?

Question

I would like to split a column into two, extracting the numbers and keeping them on their own in one column.

df <- data.frame(V1 = c("[1] Strongly disagree", "[2] Somewhat disagree", "[3] Neither", "[4] Somewhat agree", "[5] Strongly agree"))

                  V1
 [1] Strongly disagree
 [2] Somewhat disagree
 [3] Neither
 [4] Somewhat agree
 [5] Strongly agree

I tried using the separate function from tidyr:

tidyr::separate(df, V1, into = c("Value", "Label"), sep = "] ")

Value   Label
[1      Strongly disagree           
[2      Somewhat disagree           
[3      Neither         
[4      Somewhat agree          
[5      Strongly agree

I might be able to remove the [ with another function, but I was wondering whether this can be fixed in one step, or whether another function does the job.

This is what I am trying to get in the end:

        Label        Value
 Strongly disagree     1
 Somewhat disagree     2
 Neither               3
 Somewhat agree        4
 Strongly agree        5

Answer 1

If you prefer base R, here is a base R solution:

df <- data.frame(V1 = c("[1] Strongly disagree", "[2] Somewhat disagree", "[3] Neither", "[4] Somewhat agree", "[5] Strongly agree"))

df$value = as.numeric(regmatches(df$V1, regexpr(r"(\d)", df$V1)))

df$V1 = regmatches(df$V1, regexpr("(?<=] ).*", df$V1, perl=TRUE))
df
#>                  V1 value
#> 1 Strongly disagree     1
#> 2 Somewhat disagree     2
#> 3           Neither     3
#> 4    Somewhat agree     4
#> 5    Strongly agree     5

Created on 2020-09-05 by the reprex package (v0.3.0)

regmatches is a base R function that returns the matched substrings from a vector. It takes a character vector and the match data returned by regexpr as input.

In the first case (the value column), \d is used to extract the digit. The code writes it as the raw string r"(\d)", which needs R 4.0.0 or later; "\\d" is the equivalent ordinary string. In the second case, (?<=] ).* is a lookbehind that returns everything following "] ", which is why perl=TRUE is set.
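
To make the two pieces concrete, here is a small sketch of what regexpr returns and how regmatches consumes it. The names x, m and res are only illustrative. The last call uses utils::strcapture, a base R function that the answer does not mention, as one possible way to do both extractions at once; treat it as an alternative under that assumption rather than the answer's method.

```r
x <- c("[1] Strongly disagree", "[3] Neither")

# regexpr() gives the start position of the first match in each string,
# with the match length stored in the "match.length" attribute.
m <- regexpr("\\d", x)
as.integer(m)            # 2 2
attr(m, "match.length")  # 1 1

# regmatches() uses those positions to pull out the matched text.
regmatches(x, m)         # "1" "3"

# Assumption: strcapture() as a single-call alternative. Each capture
# group fills one column of `proto`, which also sets the column types.
res <- strcapture("^\\[(\\d+)\\]\\s*(.*)$", x,
                  proto = data.frame(Value = integer(),
                                     Label = character(),
                                     stringsAsFactors = FALSE))
res
```

Because the column types come from proto, Value is an integer column here without a separate as.numeric step.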

Answer 2

Try this approach:

library(tidyverse)
#Data
df <- data.frame(V1 = c("[1] Strongly disagree",
                        "[2] Somewhat disagree",
                        "[3] Neither", 
                        "[4] Somewhat agree",
                        "[5] Strongly agree"))
#Mutate
df %>% separate(V1,into = c('V1','V2'),sep = ']') %>%
  mutate(V1=gsub("[[:punct:]]",'',V1))

Output:

  V1                 V2
1  1  Strongly disagree
2  2  Somewhat disagree
3  3            Neither
4  4     Somewhat agree
5  5     Strongly agree

If you also want different column names, you can use rename():

#Mutate 2
df %>% separate(V1,into = c('V1','V2'),sep = ']') %>%
  mutate(V1=gsub("[[:punct:]]",'',V1)) %>%
  rename(Label=V2,Value=V1) %>% select(c(2,1))

Output:

               Label Value
1  Strongly disagree     1
2  Somewhat disagree     2
3            Neither     3
4     Somewhat agree     4
5     Strongly agree     5
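
Splitting on ']' alone leaves a leading space in front of every label, which is visible in the output above. The following is a minimal base R sketch of the same split that needs no packages and, assuming the leading space is unwanted, strips it with trimws. The names parts, value and label are illustrative, and converting the value to integer is an extra step not taken in the answer.

```r
x <- c("[1] Strongly disagree", "[3] Neither")

# Split on "]" (fixed = TRUE treats it as a literal character).
parts <- strsplit(x, "]", fixed = TRUE)

# First piece: drop punctuation such as "[" and convert to integer.
value <- as.integer(gsub("[[:punct:]]", "", sapply(parts, `[`, 1)))

# Second piece: it still starts with a space, so trim it.
label <- trimws(sapply(parts, `[`, 2))

out <- data.frame(Label = label, Value = value)
out
```

In the tidyverse pipeline, the same trimming could be done by adding V2 = trimws(V2) to the mutate() call.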

Answer 3

Another way: use str_extract to get the value and str_remove to strip the square brackets from the label column.

library(dplyr)
library(stringr)
df %>% 
  transmute(value = str_extract(V1, "\\d+"),
         label = str_remove(V1, "\\[.*\\]"))
#    value              label
# 1      1  Strongly disagree
# 2      2  Somewhat disagree
# 3      3            Neither
# 4      4     Somewhat agree
# 5      5     Strongly agree
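
Two details of the output above are easy to miss: str_extract returns character strings, so value is not numeric, and the pattern "\\[.*\\]" does not consume the space after the bracket, so each label keeps a leading space. A hedged variant, assuming an integer value and trimmed labels are wanted (the stricter pattern is my own, not the answer's):

```r
library(stringr)

x <- c("[1] Strongly disagree", "[3] Neither")

# str_extract() returns character, so convert explicitly.
value <- as.integer(str_extract(x, "\\d+"))

# Remove the bracketed prefix together with any whitespace after it.
label <- str_remove(x, "^\\[.*?\\]\\s*")

out <- data.frame(value = value, label = label)
out
```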

Answer 4

An option with extract:

library(tidyr)
library(dplyr)
df %>% 
   extract(V1, into = c("Value", "Label"), "^\\[(\\d+)\\]\\s*(.*)")
#  Value             Label
#1     1 Strongly disagree
#2     2 Somewhat disagree
#3     3           Neither
#4     4    Somewhat agree
#5     5    Strongly agree
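
extract fills one new column per capture group, so the regular expression above does the split and the bracket removal in one step. As a sketch, assuming a numeric Value and the Label-first column order from the question are wanted, the convert argument of extract and dplyr's select can be added. The name res is illustrative.

```r
library(tidyr)
library(dplyr)

df <- data.frame(V1 = c("[1] Strongly disagree", "[3] Neither"))

res <- df %>%
  # Group 1 (\\d+) becomes Value, group 2 (.*) becomes Label;
  # convert = TRUE applies type conversion, so Value becomes an integer.
  extract(V1, into = c("Value", "Label"),
          regex = "^\\[(\\d+)\\]\\s*(.*)", convert = TRUE) %>%
  # Reorder to match the layout requested in the question.
  select(Label, Value)

res
```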


4 answers

Answer 1
3 2020-09-04 18:38:00

Answer 2
2 2020-09-04 17:53:11

Answer 3
2 Accepted 2020-09-04 19:48:48

Answer 4
1 2020-09-04 23:31:07
