Recursive Generator Function Python Nested JSON Data
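I'm trying to write a recursive generator function to flatten a nested JSON object made up of mixed types, lists and dictionaries. I'm doing this partly for my own learning, so I have avoided grabbing an example from the internet to make sure I understand what is happening. I've got stuck, though, on what I think is the correct placement of the yield statement in the function in relation to the loop.
The data passed to the generator function is the output of an outer loop that iterates over a mongo collection.
When I use a print statement in the same place as the yield statement, I get the results I expect. When I switch it to a yield statement, the generator seems to yield only one item per iteration of the outer loop.
Hopefully someone can show me where I am going wrong.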
from pymongo import MongoClient

columns = ['_id'
    , 'name'
    , 'personId'
    , 'status'
    , 'explorerProgress'
    , 'isSelectedForReview'
    ]
db = MongoClient().abcDatabase
coll = db.abcCollection

def dic_recurse(data, fields, counter, source_field):
    counter += 1
    if isinstance(data, dict):
        for k, v in data.items():
            if k in fields and isinstance(v, list) is False and isinstance(v, dict) is False:
                # print "{0}{1}".format(source_field, k)[1:], v
                yield "{0}{1}".format(source_field, k)[1:], v
            elif isinstance(v, list):
                source_field += "_{0}".format(k)
                [dic_recurse(l, fields, counter, source_field) for l in data.get(k)]
            elif isinstance(v, dict):
                source_field += "_{0}".format(k)
                dic_recurse(v, fields, counter, source_field)
    elif isinstance(data, list):
        [dic_recurse(l, fields, counter, '') for l in data]

for item in coll.find():
    for d in dic_recurse(item, columns, 0, ''):
        print d
Below is a sample of the data being iterated over, although the nesting goes deeper than what is shown.
{
    "_id" : ObjectId("5478464ee4b0a44213e36eb0"),
    "consultationId" : "54784388e4b0a44213e36d5f",
    "modules" : [
        {
            "_id" : "FF",
            "name" : "Foundations",
            "strategyHeaders" : [
                {
                    "_id" : "FF_Money",
                    "description" : "Let's see where you're spending your money.",
                    "name" : "Managing money day to day",
                    "statuses" : [
                        {
                            "pid" : "54784388e4b0a44213e36d5d",
                            "status" : "selected",
                            "whenUpdated" : NumberLong(1425017616062)
                        },
                        {
                            "pid" : "54783da8e4b09cf5d82d4e11",
                            "status" : "selected",
                            "whenUpdated" : NumberLong(1425017616062)
                        }
                    ],
                    "strategies" : [
                        {
                            "_id" : "FF_Money_CF",
                            "description" : "This option helps you get a picture of how much you're spending",
                            "name" : "Your spending and savings.",
                            "relatedGoals" : [
                                {
                                    "_id" : ObjectId("54784581e4b0a44213e36e2f")
                                },
                                {
                                    "_id" : ObjectId("5478458ee4b0a44213e36e33")
                                },
                                {
                                    "_id" : ObjectId("547845a5e4b0a44213e36e37")
                                },
                                {
                                    "_id" : ObjectId("54784577e4b0a44213e36e2b")
                                },
                                {
                                    "_id" : ObjectId("5478456ee4b0a44213e36e27")
                                }
                            ],
                            "soaTrashWarning" : "Understanding what you are spending and saving is crucial to helping you achieve your goals. Without this in place, you may be spending more than you can afford. ",
                            "statuses" : [
                                {
                                    "personId" : "54784388e4b0a44213e36d5d",
                                    "status" : "selected",
                                    "whenUpdated" : NumberLong(1425017616062)
                                },
                                {
                                    "personId" : "54783da8e4b09cf5d82d4e11",
                                    "status" : "selected",
                                    "whenUpdated" : NumberLong(1425017616062)
                                }
                            ],
                            "trashWarning" : "This option helps you get a picture of how much you're spending and how much you could save.\nAre you sure you don't want to take up this option now?\n\n",
                            "weight" : NumberInt(1)
                        },
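Update: I have made some changes to the generator function, although I'm not sure they have really changed anything, and I have been stepping through both the print version and the yield version line by line in the debugger. The new code is below.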
def dic_recurse(data, fields, counter, source_field):
    print 'Called'
    if isinstance(data, dict):
        for k, v in data.items():
            if isinstance(v, list):
                source_field += "_{0}".format(k)
                [dic_recurse(l, fields, counter, source_field) for l in v]
            elif isinstance(v, dict):
                source_field += "_{0}".format(k)
                dic_recurse(v, fields, counter, source_field)
            elif k in fields and isinstance(v, list) is False and isinstance(v, dict) is False:
                counter += 1
                yield "L{0}_{1}_{2}".format(counter, source_field, k.replace('_', ''))[1:], v
    elif isinstance(data, list):
        for l in data:
            dic_recurse(l, fields, counter, '')
When debugging, the main difference between the two versions seems to be what happens after this section of code is hit.
    elif isinstance(data, list):
        for l in data:
            dic_recurse(l, fields, counter, '')
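If I am testing the yield version, the line that calls dic_recurse(l, fields, counter, '') gets hit, but it doesn't seem to call the function, because none of the print statements I set at the start of the function are hit. If I do the same thing using print, then when the code reaches the same section it happily calls the function and runs back through the whole function.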
I'm sure I am probably misunderstanding something fundamental about generators and the use of the yield statement.
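The behaviour described here can be reproduced in isolation. Calling a generator function only creates a generator object; its body does not start running until something iterates over that object. The sketch below is not the code from the question: the names inner, broken and fixed are made up for illustration, and print is written with parentheses so it runs on Python 3.

```python
def inner():
    # This line runs only once the generator is iterated.
    print("inner body started")
    yield 1
    yield 2

def broken():
    yield 0
    # Builds a generator object and throws it away:
    # the body of inner() never runs, so nothing is printed or yielded.
    inner()

def fixed():
    yield 0
    # Iterating the inner generator runs its body and passes
    # each of its items up to the caller.
    for item in inner():
        yield item

broken_result = list(broken())
fixed_result = list(fixed())
print(broken_result)
print(fixed_result)
```

Here broken() yields only its own item and "inner body started" is printed once, by fixed(). That matches the symptom of a print at the top of the function never being hit on the recursive call.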
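In lieu of any replies to this, I just wanted to post my updated solution in case it is useful to anyone else.
I needed to add additional yield statements to the function so that the results of each recursive call to the generator function are yielded up to the caller for the next level to use, at least that is my understanding. Happy to be corrected.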
def dic_recurse(data, fields, counter, source_field):
    if isinstance(data, dict):
        counter += 1
        for k, v in data.items():
            if isinstance(v, list):
                for field_data in v:
                    for list_field in dic_recurse(field_data, fields, counter, source_field):
                        yield list_field
            elif isinstance(v, dict):
                for dic_field in dic_recurse(v, fields, counter, source_field):
                    yield dic_field
            elif k in fields and isinstance(v, list) is False and isinstance(v, dict) is False:
                yield counter, {"{0}_L{1}".format(k, counter): v}
    elif isinstance(data, list):
        counter += 1
        for list_item in data:
            for li2 in dic_recurse(list_item, fields, counter, ''):
                yield li2
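On Python 3.3 or later, the nested "for ... yield" loops can be written with yield from, which delegates to a sub-generator. The following is only a sketch under a few assumptions, not a drop-in replacement: it drops the source_field parameter (which does not affect the output of the solution above), it increments the depth counter only for dictionaries, and the sample document and field list are made up, loosely modelled on the data in the question.

```python
def dic_recurse(data, fields, counter=0):
    """Yield (depth, {"<key>_L<depth>": value}) for each wanted scalar field."""
    if isinstance(data, dict):
        counter += 1  # one level deeper for every nested dictionary
        for k, v in data.items():
            if isinstance(v, (list, dict)):
                # Delegate to the recursive generator and re-yield its items.
                yield from dic_recurse(v, fields, counter)
            elif k in fields:
                yield counter, {"{0}_L{1}".format(k, counter): v}
    elif isinstance(data, list):
        for item in data:
            yield from dic_recurse(item, fields, counter)

# Hypothetical sample document, much smaller than the real data.
sample = {
    "_id": "doc1",
    "modules": [
        {
            "_id": "FF",
            "name": "Foundations",
            "statuses": [{"personId": "p1", "status": "selected"}],
        },
    ],
}
fields = ["_id", "name", "personId", "status"]

results = list(dic_recurse(sample, fields))
for row in results:
    print(row)
```

On Python 3.7 or later, where dictionaries keep insertion order, this prints one tuple per matching field, starting with (1, {'_id_L1': 'doc1'}) and ending with (3, {'status_L3': 'selected'}).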