Properly escaping JSON special characters for use in a CSV file
When retrieving tweets from Twitter, here is a snippet of the raw JSON received (captured via Fiddler):
[{"text":"\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"\r\nhttp:\/\/url.com\/6jd5j5"}]
After some operations on it, including deserializing and then re-serializing it (via JSON.NET), it ends up in the database like this:
{"text": "\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"\r\nhttp://url.com/6jd5j5"}
The only difference is that the URLs no longer have backslash escapes before the forward slashes. (I'm not sure if this is a big deal; please chime in if it is.)
My confusion is actually about how to handle these escaped control characters. When I run a SELECT query against my table in the MySQL client using MySQL's JSON_UNQUOTE function, it unescapes the characters. The \r\n is properly unescaped, but it keeps the double quotes around the text, which is interesting...
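On the forward-slash question: in JSON, escaping / as \/ is optional, so both forms should decode to the same string. A minimal sketch in Python (used here only for illustration; the original pipeline uses JSON.NET and MySQL, and the shortened payloads below are made up from the URL in the tweet):

```python
import json

# Form as received from Twitter: forward slashes escaped as \/
escaped = r'{"text": "http:\/\/url.com\/6jd5j5"}'
# Form after re-serialization: forward slashes left unescaped
plain = r'{"text": "http://url.com/6jd5j5"}'

decoded_escaped = json.loads(escaped)["text"]
decoded_plain = json.loads(plain)["text"]

# Both spellings decode to the identical string value.
print(decoded_escaped == decoded_plain)
```

So losing the backslashes before the slashes changes only the serialized spelling, not the data.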
+----------+-------------------------------------------------------------------------------------------------------+
| user_id | JSON_UNQUOTE(JSON_EXTRACT(tw.tweet_json, '$.text')) |
+----------+-------------------------------------------------------------------------------------------------------+
| 12844052 | "California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother"
http://url.com/6jd5j5 |
+----------+-------------------------------------------------------------------------------------------------------+
Here's what it looks like when I don't use the JSON_UNQUOTE function:
+-------------------------------------------------------------------------------------------------------------+
| JSON_EXTRACT(tw.tweet_json, '$.text') |
+-------------------------------------------------------------------------------------------------------------+
| "\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"\r\nhttp://url.com/6jd5j5" |
+-------------------------------------------------------------------------------------------------------------+
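The two outputs above can be read as the same value in two encodings: JSON_EXTRACT returns the value still written as a JSON string literal, while JSON_UNQUOTE decodes it. The sketch below is a rough Python analogy under that assumption, not MySQL's implementation; the variable names are hypothetical.

```python
import json

# The document as stored in the database (raw string keeps the backslashes).
stored = r'{"text": "\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"\r\nhttp://url.com/6jd5j5"}'

doc = json.loads(stored)

# Roughly what JSON_EXTRACT shows: the value as a JSON string literal,
# with surrounding quotes and backslash escapes.
extracted = json.dumps(doc["text"])

# Roughly what JSON_UNQUOTE shows: the decoded text, with a real CR LF
# and the quotation marks that are part of the tweet itself.
unquoted = json.loads(extracted)

print(extracted)
print(repr(unquoted))
```

Under this reading, the double quotes that remain after JSON_UNQUOTE are not leftover JSON delimiters; they are the quotation marks the tweet author typed around the headline.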
I need to export these tweets to a CSV file, to be used by Excel or Google Sheets.
I use the following clause after my query:
INTO OUTFILE 'C:/ProgramData/MySQL/MySQL Server 5.7/Uploads/so.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';
Opening the CSV file with Excel displays the following (the second row/entry uses the JSON_UNQUOTE function):
Notice how the second entry, while using the JSON_UNQUOTE function, shows excessive slashes.
Here's the CSV file opened in Notepad:
"\"\\\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\\\"\\r\\nhttp://url.com/6jd5j5\""
"\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"
\
http://url.com/6jd5j5"
Question: How can I properly escape the tweet here, so that it can be read as originally intended? Original Tweet Link
Edit: The advice from @Michael - sqlbot to use ESCAPED BY '"' has brought me closer, but now when opening the CSV, the second part of the tweet (the URL) is in a new cell. I've verified that this happens in both Excel and Google Sheets:
Rendered CSV image (copying and pasting the text doesn't work well)
After some digging around, some helpful comments from @Michael - sqlbot, and this answer, I got it working properly in Google Sheets and Excel with the following statement:
SELECT REPLACE(JSON_UNQUOTE(JSON_EXTRACT({JSON_COL}, {JSON_PROP_TO_RETRIEVE})), '\r\n', '\n')
...
INTO OUTFILE 'C:/ProgramData/MySQL/MySQL Server 5.7/Uploads/{FILE_NAME}.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '"' LINES TERMINATED BY '\r\n';
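The statement above appears to work because ESCAPED BY '"' makes embedded quotes come out doubled, which is the quoting convention Excel and Google Sheets expect, instead of backslash-escaped. As a minimal sketch of that target format, assuming Python's csv module as a stand-in for the spreadsheet's parser, the same field can be written and read back like this:

```python
import csv
import io
import json

stored = r'{"text": "\"California GOP Files FEC Complaint Over Obama Visit to Dying Grandmother\"\r\nhttp://url.com/6jd5j5"}'

# Decode the JSON string, then normalize CR LF inside the field to LF,
# mirroring the REPLACE(..., '\r\n', '\n') call in the statement above.
text = json.loads(stored)["text"].replace("\r\n", "\n")

# Write one fully quoted field; rows end with CR LF, and embedded
# quotes are doubled rather than backslash-escaped.
buf = io.StringIO(newline="")
writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
writer.writerow([text])
csv_data = buf.getvalue()

# Read it back: the quoted field survives as a single cell,
# including its embedded line break.
rows = list(csv.reader(io.StringIO(csv_data, newline="")))

print(repr(csv_data))
print(rows == [[text]])
```

The resulting file contains no backslashes at all, which is consistent with the stray slashes disappearing once the default backslash escaping is replaced.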