
R gsub special characters

I have a data frame. In one column I have a string like this:

"\t\tStatus: {\\id\\:\\d6b084be-9429-4b4b-8141-1cb5f5a84d2d\\,\\device\\:\\lge LG-H955 (z2_global_com)\\,\\result\\:\\1\\,\\script\\:[{\\timestamp\\:\\1519033801850\\,\\step\\:\\step1\\,\\answer\\:\\1\\},{\\timestamp\\:\\1519033879798\\,\\step\\:\\step2\\,\\answer\\:\\1\\}]},"

I want to remove some special characters. The desired output is:

Status: {"id":"d6b084be-9429-4b4b-8141-1cb5f5a84d2d","device":"lge LG-H955 (z2_global_com)","result":"1","script":[{"timestamp":"1519033801850","step":"step1","answer":"1"},{"timestamp":"1519033879798","step":"step2","answer":"1"}]}

I want to change every \\ to ", remove the \t\t from the start, and remove the first " and the last ," symbols.

I tried gsub, but it did not work correctly.

UPDATE: Thanks, it works! But I have one more question. I have the same problem with the string below, which is more complicated because it contains a lot of \t:

"Script": "{\t\"id\": \"hh-d6b084be-9429-4b4b-8141-1cb5f5a84d2d\",\t\t\t\t   \"version\": \"1.0.0\",\t\t\t\t\t\"start_step\": \"step0\",\t\t\t\t\t\"script\": [\t\t\t{\t\t\t\t\"id\": \"step0\",\t\t\t\t\"text\": \"hh?\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"cc \", \"action\": { \"goto\": \"step1\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"hh \", \"action\": {\"goto\": \"step2\"} }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step1\",\t\t\t\t\"text\": \"Chh?\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"hhgo hh\", \"action\": { \"goto\": \"step3\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"jj\", \"action\": { \"goto\": \"step4\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"jj aa z jj jj \", \"action\": { \"goto\": \"step5\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step2\",\t\t\t\t\"text\": \"jjj\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"jjjj\", \"action\": { \"deeplink\": \"pl://app/nn\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"jj\", \"action\": { \"deeplink\":\"pl://app/xx/nn/nn\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"nnn\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"nn\", \"action\": { \"finish\": \"1\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step3\",\t\t\t\t\t\"text\": \"nnnel. 
<a href='https://www.dd.pl/dd/dd.pdf'>  </a>\",\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"fff\", \"action\": { \"deeplink\": \"pl://app/nn/apply?nn=KG&nn=*\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"hhh\", \"action\": { \"goto\": \"step6\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"hh \", \"action\": { \"goto\": \"step7\" } }\t\t\t\t]\t\t\t},\t\t   {\t\t\t\t\"id\": \"step4\",\t\t\t\t\"text\": \"hh\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"ff\", \"action\": { \"deeplink\": \"https://www.k.uk/hh/hh.html\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"ss\", \"action\": { \"deeplink\": \"pl://app/ddd\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"ss\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"ss\", \"action\": { \"finish\": \"1\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step5\",\t\t\t\t\t\"text\": \"sss?\",\t\t\t\t\t\"interaction\":  {\t\t\t\t\"type\": \"poll\",\t\t\t\t\t\t\"data\": {\t\t\t\t\t\"minimum_checked\": \"1\",\t\t\t\t\t\t\t\"maximum_checked\": \"1\",\t\t\t\t\t\t\t\"fields\": [\t\t\t\t\t{ \"id\": \"1\", \"text\": \"fff\" },\t\t\t\t\t{ \"id\": \"2\", \"text\": \"ff ff\" },\t\t\t\t\t{ \"id\": \"3\", \"text\": \"ff\" }\t\t\t\t\t\t]\t\t\t\t}\t\t\t},\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"dd\", \"action\": { \"goto\": \"step8\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step6\",\t\t\t\t\"text\": \"fff dd ddd dd i dd ff.\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"Ok, aa\", \"action\": { \"deeplink\": \"l://app/ff\"} },\t\t\t\t{ \"id\": \"2\", \"text\": \"dddd\", \"action\": { \"deeplink\": \"ff://app/contact/ff/ff\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"ddd\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"ddd\", \"action\": { \"finish\": \"1\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step7\",\t\t\t\t\t\"text\": 
\"ddd\",\t\t\t\t\t\"interaction\": {\t\t\t\t\"type\": \"poll\",\t\t\t\t\t\t\"data\": {\t\t\t\t\t\"minimum_checked\": \"1\",\t\t\t\t\t\t\t\"maximum_checked\": \"3\",\t\t\t\t\t\t\t\"fields\": [\t\t\t\t\t{ \"id\": \"1\", \"text\": \"dddd\" },\t\t\t\t\t{ \"id\": \"2\", \"text\": \"Kssss\" },\t\t\t\t\t{ \"id\": \"3\", \"text\": \"ss ss\" }\t\t\t\t\t\t]\t\t\t\t}\t\t\t},\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"ss\", \"action\": { \"goto\": \"step9\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step8\",\t\t\t\t\"text\": \"sss.\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"Ok, aaa\", \"action\": { \"deeplink\": \"ss://app/call\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"sss\", \"action\": { \"deeplink\": \"ss://app/ss/ss/chat\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"ssss\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"ss\", \"action\": { \"finish\": \"1\" } }\t\t\t\t]\t\t\t},\t\t\t{\t\t\t\t\"id\": \"step9\",\t\t\t\t\"text\": \"ss.\",\t\t\t\t\"interaction\": null,\t\t\t\t\t\"options\": [\t\t\t\t{ \"id\": \"1\", \"text\": \"Ok, aa\", \"action\": { \"deeplink\": \"ss://app/ss\" } },\t\t\t\t{ \"id\": \"2\", \"text\": \"ss\", \"action\": { \"deeplink\": \"ss://app/ss/cvc/ss\" } },\t\t\t\t{ \"id\": \"3\", \"text\": \"ss\", \"action\": { \"finish\": \"1\" } },\t\t\t\t{ \"id\": \"4\", \"text\": \"aaa\", \"action\": {\"finish\": \"1\" } }\t\t\t\t]\t\t\t}\t   ]\t}"

When I try this with the same code from DJack's answer:

text <- gsub("\\\\",'"', gsub("\t|,$","", text))

It looks like this:

"Script": "{    \"id\": \"d6b084be-9429-4b4b-8141-1cb5f5a84d2d\",    \"start_step\": \"step1\",    \"script\": [\t{            \"id\": \"step1\",            \"text\": \"ggg\",            \"interaction\": null,            \"options\": [{ \t\t\t\t\t\"id\": \"1\",\t\t\t\t\t\"text\": \"gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"goto\": \"step2\"\t\t\t\t\t}\t\t\t\t}, { \t\t\t\t\t\"id\": \"2\",\t\t\t\t\t\"text\": \"gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"goto\": \"step3\"\t\t\t\t\t}\t\t\t\t}, { \t\t\t\t\t\"id\": \"3\",\t\t\t\t\t\"text\": \"gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\"goto\": \"step4\"\t\t\t\t\t}\t\t\t\t}            ]       },\t\t{            \"id\": \"step2\",            \"text\": \"gg?\",            \"interaction\": null,            \"options\": [{ \t\t\t\t\t\"id\": \"1\",\t\t\t\t\t\"text\": \"gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"gg://app/gg/apply?type=KG&gg=*\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"2\",\t\t\t\t\t\"text\": \"gg z gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"ww://app/ww\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"3\",\t\t\t\t\t\"text\": \"ww gg\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/aa/cvc/aaa\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t}            ]       },{            \"id\": \"step3\",            \"text\": \"ggg\",            \"interaction\": null,            \"options\": [{ \t\t\t\t\t\"id\": \"1\",\t\t\t\t\t\"text\": \"ww\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"ww://app/ww/apply?type=KG&dd=*\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"2\",\t\t\t\t\t\"text\": \"aaa\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"ww://app/c2c\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": 
\"3\",\t\t\t\t\t\"text\": \"aaa\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/aa/ss/ss\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t}            ]       },{            \"id\": \"step4\",            \"text\": \"ddd\",            \"interaction\": null,            \"options\": [{ \t\t\t\t\t\"id\": \"1\",\t\t\t\t\t\"text\": \"ss\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/oneclick/apply?type=KG&profile=*\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"2\",\t\t\t\t\t\"text\": \"sss\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/dd\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t\t},{ \t\t\t\t\t\"id\": \"3\",\t\t\t\t\t\"text\": \"aaa\",\t\t\t\t\t\"action\": {\t\t\t\t\t\t\"deeplink\": \"dd://app/aa/cvc/aa\",\t\t\t\t\t\t\"finish\": \"1\",\t\t\t\t\t\t\"goto\": \"step10\"\t\t\t\t\t}\t\t\t\t}            ]       },\t\t{\t\t\t\"id\": \"step10\",\t\t\t\"text\": \"aaa\",\t\t\t\"interaction\": null,\t\t\t\"options\": null\t\t}]}"

And when I try this:

fromJSON(substr(text, 9, nchar(text)))

I get an error:

Error: lexical error: invalid char in json text.
            "script": ["t{            "id": "step1",            "text"
                     (right here) ------^

As mentioned in the comments, I am not sure what you mean by removing the first and last ". The quotes just indicate the data type (character). Here is a solution (it uses ' in place of ", but in R they have the same meaning):

text <- "\t\tStatus: {\\id\\:\\d6b084be-9429-4b4b-8141-1cb5f5a84d2d\\,\\device\\:\\lge LG-H955 (z2_global_com)\\,\\result\\:\\1\\,\\script\\:[{\\timestamp\\:\\1519033801850\\,\\step\\:\\step1\\,\\answer\\:\\1\\},{\\timestamp\\:\\1519033879798\\,\\step\\:\\step2\\,\\answer\\:\\1\\}]},"

text <- gsub("\\\\","'", gsub("\t|,$","", text))

text

"Status: {'id':'d6b084be-9429-4b4b-8141-1cb5f5a84d2d','device':'lge LG-H955 (z2_global_com)','result':'1','script':[{'timestamp':'1519033801850','step':'step1','answer':'1'},{'timestamp':'1519033879798','step':'step2','answer':'1'}]}"
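The surrounding quotes and the backslash escapes are only how R prints a character value. The sketch below illustrates this with a shortened, made-up stand-in for the string above (the name `cleaned` is illustrative only). It also shows `fixed = TRUE` as a possible alternative to the four-backslash regex:

```r
# Shortened, hypothetical stand-in for the original string
s <- "\t\tStatus: {\\id\\:\\abc\\},"

# "\\" in R source code is ONE backslash character
nchar("\\")
# [1] 1

# Replace every literal backslash with a double quote;
# fixed = TRUE treats the pattern as plain text, so no regex escaping is needed
cleaned <- gsub("\\", "\"", s, fixed = TRUE)

# Drop the leading tabs and the trailing comma
cleaned <- gsub("^\t+|,$", "", cleaned)

print(cleaned)       # printed form, with surrounding quotes and \" escapes
writeLines(cleaned)  # actual characters: Status: {"id":"abc"}
```

`print` shows the escaped representation, while `writeLines` (or `cat`) shows what the string really contains, so there are no outer quotes to remove.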

EDIT based on Dienow's answer

If you are looking for valid JSON (a format I am not familiar with), as suggested in Dienow's answer, the fromJSON function requires ". You can therefore adapt the code as:

text <- gsub("\\\\",'"', gsub("\t|,$","", text))

This gives the same output as Dienow's answer:

library(jsonlite)
fromJSON(substr(text, 9, nchar(text)))

$id
[1] "d6b084be-9429-4b4b-8141-1cb5f5a84d2d"

$device
[1] "lge LG-H955 (z2_global_com)"

$result
[1] "1"

$script
      timestamp  step answer
1 1519033801850 step1      1
2 1519033879798 step2      1
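
Since the strings live in a data frame column, the same cleaning can probably be applied to the whole column at once, because gsub and sub are vectorised. A rough sketch, assuming a hypothetical data frame `df` with a column named `raw` (both names and the values are made up). It uses `sub("^[^{]*", "", ...)` to drop everything before the first `{` rather than relying on the hard-coded offset 9:

```r
# Hypothetical data frame; the column name `raw` and its values are made up
df <- data.frame(
  raw = c("\t\tStatus: {\\id\\:\\a1\\,\\result\\:\\1\\},",
          "\t\tStatus: {\\id\\:\\b2\\,\\result\\:\\0\\},"),
  stringsAsFactors = FALSE
)

# Same two replacements as in the answer, applied to every row
clean <- gsub("\\\\", "\"", gsub("\t|,$", "", df$raw))

# Keep only the JSON part: drop everything before the first "{"
df$json <- sub("^[^{]*", "", clean)

writeLines(df$json)
# {"id":"a1","result":"1"}
# {"id":"b2","result":"0"}

# Parse each row, but only if jsonlite is installed
if (requireNamespace("jsonlite", quietly = TRUE)) {
  parsed <- lapply(df$json, jsonlite::fromJSON)
  print(parsed[[1]]$id)
}
```

Dropping the prefix by pattern means the code does not depend on the label before the JSON being exactly eight characters long.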
s <- "\t\tStatus: {\\id\\:\\d6b084be-9429-4b4b-8141-1cb5f5a84d2d\\,\\device\\:\\lge LG-H955 (z2_global_com)\\,\\result\\:\\1\\,\\script\\:[{\\timestamp\\:\\1519033801850\\,\\step\\:\\step1\\,\\answer\\:\\1\\},{\\timestamp\\:\\1519033879798\\,\\step\\:\\step2\\,\\answer\\:\\1\\}]},"
r <- gsub("\t", "", gsub("\\\\", "\"", s))

Here is the proof that the result is valid JSON:

library(jsonlite)
fromJSON(substr(r, 9, nchar(r) - 1))

This outputs:

$id
[1] "d6b084be-9429-4b4b-8141-1cb5f5a84d2d"

$device
[1] "lge LG-H955 (z2_global_com)"

$result
[1] "1"

$script
      timestamp  step answer
1 1519033801850 step1      1
2 1519033879798 step2      1
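
For the longer "Script" string in the question's update, replacing backslashes may be the wrong tool. If that value is shown as R prints it (so \" is a real double quote and \t is a real tab), the data contains no backslashes at all, and tabs between tokens are ordinary JSON whitespace. Under that assumption, which cannot be verified from the post, it might be enough to cut out the part between the first { and the last } and parse it. A sketch with a short made-up stand-in (`x` and `json_part` are illustrative names):

```r
# Short, hypothetical stand-in for the long "Script" value:
# real tabs and real double quotes, no backslashes in the data itself
x <- "\"Script\": \"{\t\"id\": \"step0\",\t\t\"options\": [\t{ \"id\": \"1\" }\t]\t}\""

# Greedy match from the first "{" to the last "}"
json_part <- regmatches(x, regexpr("\\{.*\\}", x))

# Tabs are legal whitespace between JSON tokens, so no gsub is needed here
if (requireNamespace("jsonlite", quietly = TRUE)) {
  parsed <- jsonlite::fromJSON(json_part)
  print(parsed$id)
}
```

If the column instead holds the two-character sequence backslash + t, something like `gsub("\\t", "", x, fixed = TRUE)` would remove it; checking with `writeLines(x)` shows which case applies.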
