Extract information from string using regex in R
I have data like this, and I want to extract some information from x and y:
x= "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}"
y= {"percent_incoming_nighttime": 0.88, "percent_outgoing_daytime": 9.29}
The desired result:
device_codename brand percent_incoming_nighttime percent_outgoing_daytime
nikel Xiaomi 0.88 9.29
I have tried using grep, but I am getting errors. Any suggestions?
grep("device_codename", x, perl=TRUE, value=TRUE)
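For context on the attempt above: grep() with value=TRUE returns the whole element that matches, not the piece of interest, so it cannot pull a single field out of the string. A minimal base-R sketch using regexpr() and regmatches() instead, assuming the value is a plain double-quoted string with no escaped quotes inside it (the names pat and codename are made up for illustration):

```r
x <- "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}"

# grep(value = TRUE) hands back the entire matching element, not the field value
whole <- grep("device_codename", x, perl = TRUE, value = TRUE)

# Look behind for the key and its opening quote, then take everything
# up to (but not including) the next double quote
pat <- "(?<=\"device_codename\": \")[^\"]+"
codename <- regmatches(x, regexpr(pat, x, perl = TRUE))
codename
```

This only covers one hard-coded key; for anything beyond a quick one-off, a JSON parser is likely the safer route.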
This is possibly JSON format. There are tools to handle it.
library(jsonlite)
x = "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}"
y = '{"percent_incoming_nighttime": 0.88, "percent_outgoing_daytime": 9.29}'
> unlist(fromJSON(x))
device_codename brand
"nikel" "Xiaomi"
> unlist(fromJSON(y))
percent_incoming_nighttime percent_outgoing_daytime
0.88 9.29
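To get from the two named vectors above to the single-row layout asked for in the question, one option is to skip unlist() and combine the parsed lists directly. This is a sketch, assuming both strings are flat JSON objects holding only scalar values (the name res is made up for illustration):

```r
library(jsonlite)

x <- "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}"
y <- '{"percent_incoming_nighttime": 0.88, "percent_outgoing_daytime": 9.29}'

# fromJSON() returns a named list for a flat JSON object;
# c() joins the two lists and as.data.frame() turns them into one row
res <- as.data.frame(c(fromJSON(x), fromJSON(y)), stringsAsFactors = FALSE)
res
```

Because the lists are not flattened with unlist(), the numeric fields stay numeric instead of being coerced to character.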
After removing the braces ({}) and double quotes with gsub, read the substrings after each : into a data.frame using read.csv, then set the column names to the substrings before each :
v1 <- gsub('"|[{}]', "", c(x, y))
out <- read.csv(text=paste(gsub("\\w+:\\s+", "", v1), collapse=", "),
header=FALSE, stringsAsFactors = FALSE)
colnames(out) <- unlist(regmatches(v1, gregexpr("\\w+(?=:)", v1, perl = TRUE)))
out
# device_codename brand percent_incoming_nighttime percent_outgoing_daytime
#1 nikel Xiaomi 0.88 9.29
NOTE: No external packages used
Or using RJSONIO and tidyverse:
library(tidyverse)
library(RJSONIO)
list(x, y) %>%
map(~ fromJSON(.x) %>%
as.list %>%
as_tibble) %>%
bind_cols
# A tibble: 1 x 4
# device_codename brand percent_incoming_nighttime percent_outgoing_daytime
# <chr> <chr> <dbl> <dbl>
#1 nikel Xiaomi 0.88 9.29
x <- "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}"
y <- "{\"percent_incoming_nighttime\": 0.88, \"percent_outgoing_daytime\": 9.29}"
Completed jsonlite solution (Roman Luštrik):
library(jsonlite)
library(dplyr)
library(tidyr)  # spread() comes from tidyr, not dplyr
xx_x= "{\"device_codename\": \"nikel\", \"brand\": \"Xiaomi\"}"
xx_y= "{\"percent_incoming_nighttime\": 0.88, \"percent_outgoing_daytime\": 9.29}"
c(jsonlite::fromJSON(xx_x), jsonlite::fromJSON(xx_y)) %>%
reshape2::melt() %>% mutate(myrow = 1) %>%
spread(L1, value)
Result:
myrow brand device_codename percent_incoming_nighttime percent_outgoing_daytime
1 1 Xiaomi nikel 0.88 9.29