Extracting information from strings using regular expressions in R

Question

I have data like this, and I want to extract some information from x and y:

x= "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}" 
y= '{"percent_incoming_nighttime": 0.88, "percent_outgoing_daytime": 9.29}'

The desired result:

device_codename   brand     percent_incoming_nighttime percent_outgoing_daytime
nikel             Xiaomi    0.88                       9.29

I have tried using grep, but I am getting errors. Any suggestions?

grep("device_codename", x, perl=TRUE, value=TRUE)

Answer 1

This looks like JSON. There are tools for handling that format.

library(jsonlite)

x = "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}" 
y = '{"percent_incoming_nighttime": 0.88, "percent_outgoing_daytime": 9.29}'

> unlist(fromJSON(x))
device_codename           brand 
        "nikel"        "Xiaomi" 
> unlist(fromJSON(y))
percent_incoming_nighttime   percent_outgoing_daytime 
                      0.88                       9.29
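To get the single-row table the question asks for, the two parsed objects can be concatenated and converted to a data frame. This is a minimal sketch, assuming the jsonlite package is installed and that each string is a flat JSON object of scalar values; the name `combined` is only illustrative.

```r
library(jsonlite)

x <- "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}"
y <- "{\"percent_incoming_nighttime\": 0.88, \"percent_outgoing_daytime\": 9.29}"

# fromJSON() turns each flat JSON object into a named list;
# c() joins the two lists into one list with four named elements
parsed <- c(fromJSON(x), fromJSON(y))

# Each list element becomes one column, giving a one-row data frame
combined <- as.data.frame(parsed, stringsAsFactors = FALSE)
combined
```

The numeric fields stay numeric and the text fields stay character, which unlist() would not preserve if both strings were flattened into one vector.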

Answer 2

After removing the braces ({}) and double quotes with gsub, read the substrings that follow each : into a data.frame using read.csv, and then set the column names to the substrings that come before each :

v1 <- gsub('"|[{}]', "", c(x, y))
out <- read.csv(text=paste(gsub("\\w+:\\s+", "", v1), collapse=", "),
       header=FALSE, stringsAsFactors = FALSE)
colnames(out) <- unlist(regmatches(v1, gregexpr("\\w+(?=:)", v1, perl = TRUE)))


out
#  device_codename   brand percent_incoming_nighttime percent_outgoing_daytime
#1           nikel  Xiaomi                       0.88                     9.29

NOTE: No external packages are used.
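As for the grep attempt in the question: grep(..., value = TRUE) returns the whole element that matches, not the matched part, so it cannot pull out a single field. A minimal base-R sketch for extracting one value by key is shown here; `extract_value` is a hypothetical helper written for illustration, and it assumes flat JSON whose values contain no commas, quotes, or braces.

```r
x <- "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}"
y <- "{\"percent_incoming_nighttime\": 0.88, \"percent_outgoing_daytime\": 9.29}"

# Hypothetical helper: return the value stored under `key` as a string
extract_value <- function(json_string, key) {
  # "key": then an optional quote, the captured value, and an optional quote
  pattern <- paste0('"', key, '":\\s*"?([^",}]+)"?')
  m <- regmatches(json_string, regexec(pattern, json_string))
  # m[[1]][1] is the full match, m[[1]][2] is the captured group
  m[[1]][2]
}

brand <- extract_value(x, "brand")
night <- as.numeric(extract_value(y, "percent_incoming_nighttime"))
```

Values always come back as character, so numeric fields need as.numeric(); a key that is not present yields NA. For anything beyond flat key-value pairs, a JSON parser is the safer choice.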

Or, using RJSONIO and tidyverse:

library(tidyverse)
library(RJSONIO)
list(x, y) %>%
    map(~ fromJSON(.x) %>% 
            as.list %>%
            as_tibble) %>%
       bind_cols
# A tibble: 1 x 4
#  device_codename brand  percent_incoming_nighttime percent_outgoing_daytime
#  <chr>           <chr>                       <dbl>                    <dbl>
#1 nikel           Xiaomi                       0.88                     9.29

Data:

x <- "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}"
y <- "{\"percent_incoming_nighttime\": 0.88, \"percent_outgoing_daytime\": 9.29}"

Answer 3

A completed version of the jsonlite solution (Roman Luštrik):

library(jsonlite)
library(dplyr)
library(tidyr)  # provides spread(), used below

xx_x= "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}" 
xx_y= "{\"percent_incoming_nighttime\": 0.88, \"percent_outgoing_daytime\": 9.29}"

c(jsonlite::fromJSON(xx_x), jsonlite::fromJSON(xx_y)) %>% 
  reshape2::melt() %>% mutate(myrow = 1) %>% 
  spread(L1, value)

Result:

  myrow  brand device_codename percent_incoming_nighttime percent_outgoing_daytime
1     1 Xiaomi           nikel                       0.88                     9.29


Answer 1: score 3, accepted, 2018-08-19 11:12:37
Answer 2: score 0, 2018-08-19 11:03:27
Answer 3: score 0, 2018-08-19 12:47:18
