Encoding while extracting data from a JSON file
I have a simple Python script as shown below.
with open(fname, 'r+') as f:
    json_data = json.load(f)
    message = json_data['Info']
    for line in message.split('<br>'):
        if(len(line) < 25):
            print(line)
            if ':' in line:
                k,v = line.strip().split(':')
                print(k,v)
I get k,v in the following format:
(u'Images', u' 23')
(u'Links', u' 225')
The message output looks as below.
Title: Worlds best websit | mywebsite.com
Links: 225
Images: 23
Browser: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Ubuntu Chromium/41.0.2272.76 Chrome/41.0.2272.76 Safari/537.36
CPUs: 8
I want to extract the data Images:23 and Links:225 and write it back to the same JSON file f in the script.
If I were to do
json_data[k] = v
json.dump(json_data,f)
it corrupts the JSON file. That is, if I add the above two lines to my code and run
cat output.json | python -m json.tool
from the command line, I get the following error:
Extra data: line 2 column 1 - line 2 column 45376 (char 2139 - 47514)
I don't understand what the 'u' in the output is. Is it some kind of encoding? If yes, how do I process it?
Try this:
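import sys
import json
import re

fname = sys.argv[1]

# Read the whole file as text.
with open(fname, 'r') as openedFile:
    content = openedFile.read()

# Assumes the two fields are separated by a newline in the raw file text.
# \d+ is greedy, so each group captures the whole number.
pattern = r"Links: (\d+)\nImages: (\d+)"
matchObj = re.search(pattern, content)
if matchObj:
    # Note: this overwrites the file with only these two keys.
    newContent = {'Links': matchObj.group(1), 'Images': matchObj.group(2)}
    with open(fname, 'w') as openedFile:
        json.dump(newContent, openedFile)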
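
On the two side questions: the u prefix is not an encoding problem. In Python 2 it is only how a unicode string is displayed when it is printed as part of a tuple, and print(k,v) prints a tuple there. The "Extra data" error is consistent with json.dump writing at the current file position, which is the end of the file after json.load has read it in 'r+' mode, so the file ends up holding more than one JSON document. Below is a minimal sketch of updating the same file in place while keeping the existing keys. The helper names parse_counts and update_json_file are made up for illustration, and it assumes the Links and Images values are always integers.

```python
import json


def parse_counts(message, keys=("Links", "Images")):
    """Hypothetical helper: pick the wanted 'Key: value' lines out of a <br>-separated message."""
    counts = {}
    for line in message.split("<br>"):
        # partition splits on the first ':' only, so extra colons in a value do not raise
        key, sep, value = line.strip().partition(":")
        if sep and key in keys:
            counts[key] = int(value)  # assumption: these values are integers
    return counts


def update_json_file(fname):
    """Hypothetical helper: add the parsed counts to the JSON document stored in fname."""
    with open(fname, "r+") as f:
        json_data = json.load(f)
        json_data.update(parse_counts(json_data["Info"]))
        # json.load left the file position at the end of the file; rewind and
        # truncate so the new document replaces the old one instead of following it
        f.seek(0)
        f.truncate()
        json.dump(json_data, f)
    return json_data
```

Calling update_json_file(fname) once, after the loop rather than inside it, writes a single JSON document, so python -m json.tool should be able to parse the result.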