Python: create list of nested dicts
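I'm using BeautifulSoup to pull data out of XML and put it into a list of dictionaries, but it isn't working as expected: the same dictionary just gets added to the list over and over. How do I append the right dictionary to the list at the right stage of the nested for loops?
The printed list should look like this: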
[OrderedDict([('name', 'dogs'), ('type', 'housed'), ('value', '123')]),
 OrderedDict([('name', 'cats'), ('type', 'wild'), ('value', '456')]),
 OrderedDict([('name', 'mice'), ('type', 'housed'), ('value', '789')])]
Would it be better to store this in a dictionary rather than a list?
Here is the XML:
<window>
    <window class="Obj" name="ray" type="housed">
        <animal name="dogs" value="123" />
        <species name="sdogs" value="s123" />
    </window>
    <window class="Obj" name="james" type="wild">
        <animal name="cats" type="wild" value="456" />
        <species name="scats" type="swild" value="s456" />
    </window>
    <window class="Obj" name="bob" type="housed">
        <animal name="mice" value="789" />
        <species name="smice" value="s789" />
    </window>
</window>
And the following code (apologies if it contains a few errors, I can correct them; this is a sample from a larger program):
import sys
import pprint
from bs4 import BeautifulSoup as bs
from collections import OrderedDict

soup = bs(open("test.xml"), "lxml")
dicty = OrderedDict()
listy = []
Objs = soup.findAll('window', {"class": "Obj"})
#print(Objs)
for Obj in Objs:
    Objarr = OrderedDict() #### move this down
    #I want to add data to the array here:
    #print(Obj)
    for child in Obj.children:
        Objarr.update({"namesss" : Obj['name']})
        if child.name is not None:
            if child.name == "species":
                print(Obj['name'])
                print(child['value'])
                #Also, adding data to the array here:
                Objarr.update({"name" : Obj['name']})
                Objarr.update({"type" : Obj['type']})
                Objarr.update({"value": child['name']})
        listy.append(Objarr) #### dedent this
pprint.pprint(listy)
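You are updating one dictionary and appending it to the list, so you end up reusing the same dictionary again and again. Create a new dictionary before the child loop starts, and append it after that loop finishes rather than inside it.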
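The effect is easy to reproduce without any XML. Here is a minimal sketch in plain Python (the names `names`, `shared`, `fresh` and `record` are made up for illustration): appending one dict object repeatedly stores several references to the same object, while creating the dict inside the loop stores independent ones.

```python
from collections import OrderedDict

names = ["dogs", "cats", "mice"]

# Buggy pattern: one dict created once, then mutated and appended on every pass.
shared = []
record = OrderedDict()
for name in names:
    record.update({"name": name})
    shared.append(record)      # another reference to the same object

# Fixed pattern: a new dict per iteration.
fresh = []
for name in names:
    record = OrderedDict()     # new object each time
    record.update({"name": name})
    fresh.append(record)

print([r["name"] for r in shared])  # every entry shows the last value written
print([r["name"] for r in fresh])   # each entry keeps its own value
```

In the first list all three entries are the same object, so they always show whatever was written last; in the second list each entry is its own dict.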
Something like this, I guess:
import sys
import pprint
from bs4 import BeautifulSoup as bs
from collections import OrderedDict

soup = bs(open("my.xml"), "lxml")
dicty = OrderedDict()
listy = []
Objs = soup.findAll('window', {"class": "Obj"})
#print(Objs)
for Obj in Objs:
    Objarr = OrderedDict() #### move this down ####
    #I want to add data to the array here:
    for child in Obj.children:
        if child.name is not None:
            if child.name == "variable":
                #Also, adding data to the array here:
                Objarr.update({"name" : Obj['text']})
                Objarr.update({"type" : " matrix"})
                Objarr.update({"value": child['name']})
    listy.append(Objarr) #### dedent this ####
pprint.pprint(listy)
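For comparison, here is a self-contained sketch of the same pattern that produces the list shown in the question. It is a stand-in under stated assumptions, not the code above: it uses the standard library's `xml.etree.ElementTree` instead of BeautifulSoup so that it runs without extra packages, it inlines a cleaned-up copy of the sample XML (commas between attributes removed so that it is well-formed), and it assumes that `name` and `value` come from the `<animal>` child while `type` comes from the enclosing `<window>`.

```python
import pprint
import xml.etree.ElementTree as ET
from collections import OrderedDict

# Inline, well-formed copy of the sample data (assumption: commas removed).
XML = """
<window>
  <window class="Obj" name="ray" type="housed">
    <animal name="dogs" value="123" />
    <species name="sdogs" value="s123" />
  </window>
  <window class="Obj" name="james" type="wild">
    <animal name="cats" type="wild" value="456" />
    <species name="scats" type="swild" value="s456" />
  </window>
  <window class="Obj" name="bob" type="housed">
    <animal name="mice" value="789" />
    <species name="smice" value="s789" />
  </window>
</window>
"""

root = ET.fromstring(XML.strip())
listy = []
for obj in root.findall("window"):      # direct <window> children only
    if obj.get("class") != "Obj":
        continue
    animal = obj.find("animal")
    if animal is None:                  # skip windows without an <animal>
        continue
    record = OrderedDict()              # fresh dict for every window
    record["name"] = animal.get("name")
    record["type"] = obj.get("type")
    record["value"] = animal.get("value")
    listy.append(record)                # appended once per window

pprint.pprint(listy)
```

On the list-versus-dict question: a list keeps document order and tolerates duplicate names. If lookups by name matter more, the same records could be keyed with something like `{r["name"]: r for r in listy}`.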
Look at the following to see what your objs contain:
>>> soup = bs(open("my_xml.xml"),"lxml")
>>>
>>> objs = soup.findAll('window',{"class":"Obj"})
>>>
>>> for obj in objs:
...     for child in obj.children:
...         print(child)
...
<animal name="dogs" type="housed" value="123"></animal>
<animal name="cats" type="wild" value="456"></animal>
<animal name="mice" type="housed" value="789"></animal>
<window>
</window>
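This means that, among the children of each element of objs, the first one is a \n, the last one is <window>\n</window>, and a \n sits between every two elements as a separator.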
To fix this, convert the listiterator (obj.children) into a normal list with list(obj.children), then slice it with start 1, end -2, step 2, that is, list(obj.children)[1:-2:2].
In this case, the output is:
>>> for obj in objs:
...     for child in list(obj.children)[1:-2:2]:
...         print(child)
...
<animal name="dogs" type="housed" value="123"></animal>
<animal name="cats" type="wild" value="456"></animal>
<animal name="mice" type="housed" value="789"></animal>
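To see why `[1:-2:2]` picks out the right entries, here is a small sketch that uses plain strings as stand-ins for the child nodes. The layout of `children` is an assumption based on the description above: a leading newline, elements separated by newlines, and a trailing `<window>` element. Because the slice depends on that exact layout, the sketch also shows a position-independent filter.

```python
# Stand-in for list(obj.children): strings instead of BeautifulSoup nodes.
children = ["\n", "<animal/>", "\n", "<window>\n</window>"]

# start=1 skips the leading "\n", stop=-2 ends before the last two entries,
# step=2 skips the "\n" separators in between.
sliced = children[1:-2:2]

# Alternative that does not depend on position: drop whitespace-only entries,
# then drop the trailing <window> stand-in explicitly.
filtered = [c for c in children if c.strip() and not c.startswith("<window")]

print(sliced)
print(filtered)
```

In the BeautifulSoup loop itself, the counterpart of the whitespace filter is the `child.name is not None` check already used in the code earlier in this thread, since text nodes have a `name` of `None`.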