
Escape quotes in R before assigning a string

I am trying to load a JSON file and do some analysis in R.

The JSON file contains parts like this:

 '{"property":"blabla \"some goofy name\" more blabla"}'

Which means there are a couple of double quotes inside the string value of a property. This is supposed to be valid JSON (or not?).

The problem is that if I try to parse it with jsonlite or any other library, I need to have it assigned to a string variable in R, like this:

 a = '{"property":"blabla \"some goofy name\" more blabla"}'

But then, if I type a and press Enter, I get this back:

[1] "{\"property\":\"blabla \"some goofy name\" more blabla\"}"

Which means that the already existing \" instances are now identical to the actual " instances, so I can't even replace them with a regular expression. If I feed this to any JSON parsing library, there are errors about invalid characters etc.

Is there any way to 'catch' those nasty \" instances before R considers them the same as a plain ", so that I can eliminate the \" and continue the JSON parsing?

The difference from a similar issue is that the inner quotes are already escaped, forming valid JSON. My ultimate challenge is to parse this JSON: http://next.openspending.org/api/3/cubes/ba94aabb80080745688ad38ccad9bfea:at-austria-at11-burgenland/facts?pagesize=30

Updated answer following the OP's update

I may still not have understood exactly what you want to accomplish, so let me know if this is not your intended output. I didn't deal with the newline characters in your file since they don't seem relevant. Your file contains strings that contain \"Bienenkorb\", as you described.

url <- "http://next.openspending.org/api/3/cubes/ba94aabb80080745688ad38ccad9bfea:at-austria-at11-burgenland/facts?pagesize=30"
parsed <- jsonlite::fromJSON(url)
print(parsed$data$activity_project_id.project_name[3])
#[1] "Neugestaltung und\nModernisierung des\nRestaurants \"Bienenkorb\""
cat(parsed$data$activity_project_id.project_name[3])
#Neugestaltung und
#Modernisierung des
#Restaurants "Bienenkorb"
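
The difference between print() and cat() above is worth isolating. The following is a minimal base-R sketch, not part of the original answer, that uses the sample string from the question. The variable names (a, b, has_backslash_a, has_backslash_b) are illustrative assumptions. It shows that R's parser consumes the backslashes when the JSON is typed as a literal, and that doubling them keeps a real backslash in the string.

```r
# Typed as a literal, each \" is read by R's parser as a plain ",
# so no backslash survives in the resulting string.
a <- '{"property":"blabla \"some goofy name\" more blabla"}'
has_backslash_a <- grepl("\\", a, fixed = TRUE)

# Doubling the backslash in the literal keeps one real backslash
# in front of each inner quote, which is what JSON requires.
b <- '{"property":"blabla \\"some goofy name\\" more blabla"}'
has_backslash_b <- grepl("\\", b, fixed = TRUE)

print(has_backslash_a)  # FALSE
print(has_backslash_b)  # TRUE

# print(b) would show the string with R's own escapes added;
# cat() writes the characters actually stored in the string.
cat(b, "\n")
# {"property":"blabla \"some goofy name\" more blabla"}
```

This only matters for JSON typed directly into R source code. Text read from a file or URL never passes through R's string-literal parser, so its backslashes arrive intact.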

If you want to assign it to a string first and then parse it, you can do s <- readLines(url); parsed <- jsonlite::fromJSON(s).
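
To try the read-then-parse route without network access, here is a small sketch under stated assumptions: a temporary file (the name path is hypothetical) stands in for the downloaded JSON, and its content is the sample from the question. It assumes the jsonlite package is installed. Collapsing the lines with paste() is an extra precaution for multi-line files and is not something the original answer does.

```r
# Hypothetical stand-in for the remote file: a temp file holding the
# sample JSON from the question. The doubled backslashes in this R
# literal become single backslashes in the file on disk.
path <- tempfile(fileext = ".json")
writeLines('{"property":"blabla \\"some goofy name\\" more blabla"}', path)

# readLines() returns the text exactly as stored, backslashes included.
s <- readLines(path)

# Collapse to a single string in case the file spans several lines.
s <- paste(s, collapse = "\n")

parsed <- jsonlite::fromJSON(s)

# The JSON escapes are resolved by the parser; the value holds plain quotes.
cat(parsed$property, "\n")
# blabla "some goofy name" more blabla
```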
