Iterate through json file to get specific attribute values using python
I have a JSON file that looks like this:
[
{
"contributors": null,
"coordinates": null,
"created_at": "Fri Aug 04 21:12:59 +0000 2017",
"entities": {
"hashtags": [
{
"indices": [
32,
39
],
"text": "\ubd80\uc0b0\ucd9c\uc7a5\uc548\ub9c8"
},
{
"indices": [
40,
48
],
"text": "\ubd80\uc0b0\ucd9c\uc7a5\ub9c8\uc0ac\uc9c0"
}
]
},
"text": "\uaedb",
"retweeted_status": {
"contributors": null,
"coordinates": null,
"created_at": "Fri Aug 04 20:30:06 +0000 2017",
"display_text_range": [
0,
0
],
"text": "hjhfbsdjsdbjsd"
},
"extended_tweet": {
"display_text_range": [
0,
137
],
"entities": {
"hashtags": [
{
"indices": [
62,
75
],
"text": "2ndAmendment"
},
{
"indices": [
91,
104
],
"text": "1stAmendment"
}
]
}
}
}
]
I wrote the following Python code to count the number of text attributes in the whole JSON file.
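import json

with open('1.json') as data_file:
    data = json.load(data_file)

cnt = 0
for tweet in data:  # the top level of the file is a list of tweet objects
    for key, value in tweet.items():
        if key == "text":
            cnt += 1
        elif key == "retweeted_status":
            for k, v in value.items():
                if k == "text":
                    cnt += 1
        elif key == "entities":
            if "hashtags" in value:
                for hashtag in value["hashtags"]:
                    pass  # Difficult to loop further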
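Because the data structure is not consistent from one object to the next, it is hard to iterate over. I would also like to access the values of the text attributes and display them. Is there a simpler way to do this without multiple nested loops?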
How about using a regular expression?
import re

regex_chain = re.compile(r'(text)\": \"(.*)\"')
text_ocurrences = []
with open('1.json') as file:
    for line in file:
        match = regex_chain.search(line)
        if match:
            text_ocurrences.append({match.group(1): match.group(2)})
print(text_ocurrences)
You will get a list of dictionaries, each holding the key and the value of one text occurrence:
[{'text': '\\ubd80\\uc0b0\\ucd9c\\uc7a5\\uc548\\ub9c8'}, {'text': '\\ubd80\\uc0b0\\ucd9c\\uc7a5\\ub9c8\\uc0ac\\uc9c0'}, {'text': '\\uaedb'}, {'text': 'hjhfbsdjsdbjsd'}, {'text': '2ndAmendment'}, {'text': '1stAmendment'}]
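Note that the values in this output still contain the raw \uXXXX escape sequences, because the regex reads the file as plain text and never decodes the JSON strings. A minimal sketch of the difference, using one value from the sample file (the variable names here are only illustrative):

```python
import json

# The JSON source text of one string value, exactly as it appears in the file:
# a quote, a backslash, "uaedb", and a closing quote.
raw = r'"\uaedb"'

# json.loads decodes the escape sequence into the actual character.
decoded = json.loads(raw)

print(len(raw))      # length of the undecoded source text
print(len(decoded))  # length of the decoded string
print(decoded)
```

So if the goal is to display the text, the regex results would still need a decoding step.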
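I'm not sure how safe it is to naively parse JSON with a regular expression, especially since (text)\": \"(.*)\" can technically match text": "abc", "text": "another", with group 1 being text and group 2 being abc", "text": "another.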
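That greedy match is easy to reproduce. The snippet below is a small sketch that reuses the pattern from the regex answer; the input lines are made up for illustration, and full_text is a hypothetical key name, not one taken from the sample file:

```python
import re

# Same pattern as in the regex answer.
regex_chain = re.compile(r'(text)\": \"(.*)\"')

# Hypothetical line with two "text" keys on the same line,
# as happens when the JSON is not pretty-printed.
line = '{"text": "abc", "text": "another"}'
match = regex_chain.search(line)
print(match.group(1))
print(match.group(2))  # the greedy .* runs to the last quote on the line

# The pattern is not anchored to the opening quote of the key, so it also
# matches any key that merely ends in "text" (hypothetical key name).
other = regex_chain.search('"full_text": "hello"')
print(other.group(2))
```

The approach only works when every attribute sits on its own line and no other key ends in text.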
It is much safer to parse the JSON with Python's standard json library and then traverse the resulting data recursively.
import json

def count_key(selected_key, obj):
    count = 0
    if isinstance(obj, list):
        # Recurse into every element of a list
        for item in obj:
            count += count_key(selected_key, item)
    elif isinstance(obj, dict):
        for key in obj:
            if key == selected_key:
                count += 1
            # Recurse into the value, whatever the key is
            count += count_key(selected_key, obj[key])
    return count

with open("my-json-file", "r") as json_file:
    print(count_key("text", json.load(json_file)))
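The question also asks for the values themselves, not just the count. One possible variation on the same recursive idea is a generator that yields each value as it is found. This is only a sketch: iter_values is a name chosen here for illustration, and the inline sample is a trimmed-down stand-in for the real file.

```python
import json

def iter_values(selected_key, obj):
    """Yield every value stored under selected_key, at any nesting depth."""
    if isinstance(obj, list):
        for item in obj:
            yield from iter_values(selected_key, item)
    elif isinstance(obj, dict):
        for key, value in obj.items():
            if key == selected_key:
                yield value
            # Keep descending: the value may itself contain the key.
            yield from iter_values(selected_key, value)

# Trimmed-down stand-in for the file in the question (illustrative data).
sample = json.loads('''
[{"entities": {"hashtags": [{"indices": [32, 39], "text": "a"}]},
  "text": "b",
  "retweeted_status": {"text": "c"}}]
''')

texts = list(iter_values("text", sample))
print(len(texts))
for text in texts:
    print(text)
```

Collecting the values into a list gives both the count (its length) and the strings to display, and because json has already decoded them, escaped characters print as real text.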