
Properly escaping JSON special characters for use in a CSV file

When retrieving tweets from Twitter, here is a snippet of the raw JSON received (captured via Fiddler):

[{"text":"\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"\r\nhttp:\/\/url.com\/6jd5j5"}]

After some operations on it, including deserializing and then re-serializing it (via JSON.NET), it ends up in the database like this:

{"text": "\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"\r\nhttp://url.com/6jd5j5"}

The only difference is that the URLs no longer have backslash escapes before the forward slashes. (I'm not sure if this is a big deal; please chime in if it is.)
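On the forward-slash question: in JSON, `\/` is an optional escape for `/`, so both forms decode to the same string. A quick check, using Python only as a neutral JSON parser (it is not part of the original setup):

```python
import json

# The URL as a JSON string literal in the raw Twitter form (slashes escaped)
# and in the re-serialized form (slashes unescaped).
escaped = r'"http:\/\/url.com\/6jd5j5"'
unescaped = '"http://url.com/6jd5j5"'

# Both literals decode to the same Python string.
same = json.loads(escaped) == json.loads(unescaped)
print(same)
```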

My confusion is actually about how to handle these escaped control characters. When I run a SELECT query against my table in the MySQL client using MySQL's JSON_UNQUOTE function, it unescapes the characters. The \r\n are properly unescaped, but it keeps the double quotes around the text, which is interesting...

+----------+-------------------------------------------------------------------------------------------------------+
| user_id  | JSON_UNQUOTE(JSON_EXTRACT(tw.tweet_json, '$.text'))                                                   |
+----------+-------------------------------------------------------------------------------------------------------+
| 12844052 | "California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother"
http://url.com/6jd5j5 |
+----------+-------------------------------------------------------------------------------------------------------+

Here's what it looks like when I don't use the JSON_UNQUOTE function:

+-------------------------------------------------------------------------------------------------------------+
| JSON_EXTRACT(tw.tweet_json, '$.text')                                                                       |
+-------------------------------------------------------------------------------------------------------------+
| "\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"\r\nhttp://url.com/6jd5j5" |
+-------------------------------------------------------------------------------------------------------------+
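The double quotes that remain after JSON_UNQUOTE appear to be part of the tweet text itself (the `\"` escapes inside the JSON string), not the JSON string delimiters. Decoding the same stored value with Python's `json` module, as a rough stand-in for the unquoting step rather than a model of MySQL's implementation, illustrates this:

```python
import json

# The stored JSON document from the question, kept as a raw string so the
# backslash escapes reach the JSON parser unchanged.
doc = r'{"text": "\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"\r\nhttp://url.com/6jd5j5"}'

text = json.loads(doc)["text"]

# The decoded text begins with a literal double quote and contains a real CR LF.
print(text.startswith('"'))
print("\r\n" in text)
```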

I need to export these tweets to a CSV file, to be used by Excel or Google Sheets.

I use the following specifier after my query:

INTO OUTFILE 'C:/ProgramData/MySQL/MySQL Server 5.7/Uploads/so.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';

Opening the CSV file with Excel displays the following (the second row/entry uses the JSON_UNQUOTE function):

Notice how the second entry, while using the JSON_UNQUOTE function, shows excessive slashes.

[Image: the CSV file opened in Excel]

Here's the CSV file opened in Notepad:

  "\"\\\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\\\"\\r\\nhttp://url.com/6jd5j5\""
"\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"
\
http://url.com/6jd5j5"

Question: How can I properly escape the tweet here, so that it can be read as originally intended? (Original Tweet Link)

Edit: The advice from @Michael - sqlbot to use ESCAPED BY '"' has brought me closer, but now when opening the CSV, the second part of the tweet (the URL) is in a new cell. I've verified that this happens in both Excel and Google Sheets:

[Image: the CSV file opened in a spreadsheet]

Rendered CSV image (copying and pasting the text doesn't work well): [image]
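As a rough illustration of why ESCAPED BY '"' helps with the embedded quotes: CSV readers in the RFC 4180 style expect a quote inside a quoted field to be doubled, not backslash-escaped. The sketch below uses Python's `csv` module as a stand-in for a spreadsheet's parser (an assumption; Excel and Google Sheets may differ in details), with a shortened hypothetical field value:

```python
import csv
import io

intended = '"Hello" world'

# Backslash-escaped quotes, similar in style to the first export attempt.
backslash_line = '"\\"Hello\\" world"\r\n'
# Doubled quotes, the style expected for quotes embedded in a quoted field.
doubled_line = '"""Hello"" world"\r\n'

def parse_single_field(line):
    # newline='' lets the csv module handle line endings itself.
    return next(csv.reader(io.StringIO(line, newline='')))[0]

# Only the doubled-quote form parses back to the intended text.
print(parse_single_field(backslash_line) == intended)
print(parse_single_field(doubled_line) == intended)
```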

After some digging around, some helpful comments from @Michael - sqlbot, and this answer, I got it working properly in Google Sheets and Excel with the following statement:

    SELECT REPLACE(JSON_UNQUOTE(JSON_EXTRACT({JSON_COL}, {JSON_PROP_TO_RETRIEVE})), '\r\n', '\n')
      ...
    INTO OUTFILE 'C:/ProgramData/MySQL/MySQL Server 5.7/Uploads/{FILE_NAME}.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '"' LINES TERMINATED BY '\r\n';
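For comparison, here is a minimal sketch of the same target format produced with Python's `csv` module: every field quoted, embedded quotes doubled, a bare LF inside the tweet, and CR LF between rows. It is not part of the MySQL solution, only a way to see what a well-formed row for this tweet looks like and to confirm that it round-trips:

```python
import csv
import io

# Tweet text after replacing CR LF with LF, mirroring the REPLACE(...) call.
tweet = '"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother"\nhttp://url.com/6jd5j5'

# Write one row: all fields quoted, rows terminated by CR LF.
buf = io.StringIO(newline='')
writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator='\r\n')
writer.writerow([12844052, tweet])
csv_text = buf.getvalue()
print(csv_text)

# Read it back; the embedded LF stays inside the single quoted field.
rows = list(csv.reader(io.StringIO(csv_text, newline='')))
print(len(rows), len(rows[0]))
```

The LF inside the quoted field is kept as part of the cell, while the CR LF outside the quotes ends the row, which matches the combination used in the statement above.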
