Escape quotes in R before assigning a string
I am trying to load a JSON file and do some analysis in R.
The JSON file contains parts like this:
'{"property":"blabla \"some goofy name\" more blabla"}'
This means there are a couple of double quotes inside the string value of a property. This is supposed to be valid JSON (or is it not?).
The problem is that if I try to parse it with jsonlite or any other library, I need to assign it to a string variable in R, like this:
a = '{"property":"blabla \"some goofy name\" more blabla"}'
But then, if I type a and press Enter, I get this back:
[1] "{\"property\":\"blabla \"some goofy name\" more blabla\"}"
This means that the already existing \" instances are now equal to the actual " instances, so I can't even replace them with a regular expression. If I feed this to any JSON parsing library, I get errors about invalid characters etc.
Is there any way to 'catch' those nasty \" instances before R considers them the same as a plain ", so that I can eliminate the \" and continue the JSON parsing?
The difference from a similar issue is that the inner quotes are already escaped, forming valid JSON. My ultimate challenge is to parse this JSON: http://next.openspending.org/api/3/cubes/ba94aabb80080745688ad38ccad9bfea:at-austria-at11-burgenland/facts?pagesize=30
Updated answer following the OP's update
I may still not have understood 100% what you want to accomplish, so let me know if this is not your intended output. I didn't deal with the newline characters in your file, since that doesn't seem relevant. Your file contains strings that contain "\"Bienenkorb\"", as you described.
url <- "http://next.openspending.org/api/3/cubes/ba94aabb80080745688ad38ccad9bfea:at-austria-at11-burgenland/facts?pagesize=30"
parsed <- jsonlite::fromJSON(url)
print(parsed$data$activity_project_id.project_name[3])
#[1] "Neugestaltung und\nModernisierung des\nRestaurants \"Bienenkorb\""
cat(parsed$data$activity_project_id.project_name[3])
#Neugestaltung und
#Modernisierung des
#Restaurants "Bienenkorb"
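The difference between the print() and cat() output above can be checked in isolation. The following is a minimal base-R sketch, assuming a shortened stand-in string `x` (not taken from the API response): print() shows a string in escaped form, while cat() writes the characters it actually contains, so the backslashes in the print() output are only display.

```r
# Hypothetical stand-in for one parsed value; it holds two real double quotes.
x <- "Restaurants \"Bienenkorb\""

print(x)      # escaped display: the quotes are shown as \"
cat(x, "\n")  # actual characters: plain double quotes, no backslashes

# Count the characters and look for a literal backslash in the string itself.
n_chars <- nchar(x)
has_backslash <- grepl("\\", x, fixed = TRUE)  # pattern is one backslash
has_backslash
```

If `has_backslash` is FALSE, the parsed value contains plain quotes only, which is what a JSON parser should return for an escaped quote.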
If you want to assign it to a string and then parse it, you can do s <- readLines(url); parsed <- jsonlite::fromJSON(s).
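As for the literal typed in the question, the backslashes disappear because \" inside an R string literal is simply a double quote. A minimal sketch of ways around that, assuming jsonlite is installed and, for the raw string, R 4.0.0 or later; the temporary file is a hypothetical stand-in for the real JSON file:

```r
library(jsonlite)  # same parser the answer uses

# In an R string literal, \" is just a double quote: the backslashes are lost.
lost <- '{"property":"blabla \"some goofy name\" more blabla"}'

# Doubling each backslash keeps one real backslash in the string.
kept <- '{"property":"blabla \\"some goofy name\\" more blabla"}'

# Raw string syntax (R >= 4.0.0) takes the text verbatim.
raw_lit <- r"({"property":"blabla \"some goofy name\" more blabla"})"

same_text <- identical(kept, raw_lit)  # both spellings give the same characters
lost_ok <- isTRUE(validate(lost))      # the version without backslashes is not valid JSON

parsed <- fromJSON(kept)
cat(parsed$property, "\n")

# Text read from a file (or URL) never passes through R's literal syntax,
# so the escapes arrive intact without any extra work.
tmp <- tempfile(fileext = ".json")  # hypothetical stand-in for the real file
writeLines(kept, tmp)
s <- readLines(tmp)
from_file <- fromJSON(s)
```

This is why reading the file or URL directly, as above, sidesteps the problem: the escaping issue only arises when JSON text is pasted into R source code.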