How to merge dictionary entries that share the same key?
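I want to merge the results of the code below under the keys Action and Thriller, so that only two keys are shown: {'Action': [list of movies], 'Thriller': [list of movies]}. New code, for example using lxml or BeautifulSoup, is also welcome.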
import xml.etree.ElementTree as ET
from collections import defaultdict
tree = ET.parse('movies.xml')
root = tree.getroot()
d = {}
for child in root:
    #print( child.attrib.values())
    for movie in root.findall("./genre/decade/movie[@title]"):
        #print(movie.attrib)
        #print (list(movie.attrib.values())[1])
        d[child.attrib.values()]=list(movie.attrib.values())[1]
d
{dict_values(['Action']): 'Indiana Jones: The raiders of the lost Ark',
dict_values(['Action']): 'THE KARATE KID',
dict_values(['Action']): 'Back 2 the Future',
dict_values(['Action']): 'X-Men',
dict_values(['Action']): 'Batman Returns',
dict_values(['Action']): 'Reservoir Dogs',
dict_values(['Action']): 'ALIEN',
dict_values(['Action']): "Ferris Bueller's Day Off",
dict_values(['Action']): 'American Psycho',
dict_values(['Thriller']): 'Indiana Jones: The raiders of the lost Ark',
dict_values(['Thriller']): 'THE KARATE KID',
dict_values(['Thriller']): 'Back 2 the Future',
dict_values(['Thriller']): 'X-Men',
dict_values(['Thriller']): 'Batman Returns',
dict_values(['Thriller']): 'Reservoir Dogs',
dict_values(['Thriller']): 'ALIEN',
dict_values(['Thriller']): "Ferris Bueller's Day Off",
dict_values(['Thriller']): 'American Psycho'}
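My XML comes from DataCamp, which provides a tutorial on scraping. Below is the XML; I saved it in a local folder and named it movies (movies.xml in the code above).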
<?xml version="1.0" encoding="UTF-8" ?>
<collection>
    <genre category="Action">
        <decade years="1980s">
            <movie favorite="True" title="Indiana Jones: The raiders of the lost Ark">
                <format multiple="No">DVD</format>
                <year>1981</year>
                <rating>PG</rating>
                <description>
                'Archaeologist and adventurer Indiana Jones
                is hired by the U.S. government to find the Ark of the
                Covenant before the Nazis.'
                </description>
            </movie>
            <movie favorite="True" title="THE KARATE KID">
                <format multiple="Yes">DVD,Online</format>
                <year>1984</year>
                <rating>PG</rating>
                <description>None provided.</description>
            </movie>
            <movie favorite="False" title="Back 2 the Future">
                <format multiple="False">Blu-ray</format>
                <year>1985</year>
                <rating>PG</rating>
                <description>Marty McFly</description>
            </movie>
        </decade>
        <decade years="1990s">
            <movie favorite="False" title="X-Men">
                <format multiple="Yes">dvd, digital</format>
                <year>2000</year>
                <rating>PG-13</rating>
                <description>Two mutants come to a private academy for their kind whose resident superhero team must
                oppose a terrorist organization with similar powers.</description>
            </movie>
            <movie favorite="True" title="Batman Returns">
                <format multiple="No">VHS</format>
                <year>1992</year>
                <rating>PG13</rating>
                <description>NA.</description>
            </movie>
            <movie favorite="False" title="Reservoir Dogs">
                <format multiple="No">Online</format>
                <year>1992</year>
                <rating>R</rating>
                <description>WhAtEvER I Want!!!?!</description>
            </movie>
        </decade>
    </genre>
    <genre category="Thriller">
        <decade years="1970s">
            <movie favorite="False" title="ALIEN">
                <format multiple="Yes">DVD</format>
                <year>1979</year>
                <rating>R</rating>
                <description>"""""""""</description>
            </movie>
        </decade>
        <decade years="1980s">
            <movie favorite="True" title="Ferris Bueller's Day Off">
                <format multiple="No">DVD</format>
                <year>1986</year>
                <rating>PG13</rating>
                <description>Funny movie about a funny guy</description>
            </movie>
            <movie favorite="FALSE" title="American Psycho">
                <format multiple="No">blue-ray</format>
                <year>2000</year>
                <rating>Unrated</rating>
                <description>psychopathic Bateman</description>
            </movie>
        </decade>
    </genre>
</collection>
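Your code fetches the data just fine; the problem is in how you parse it. On a dictionary, .values() returns a view of the values, which you can store in a list if you need to. In this case you want the value from the dictionary itself, so simply select it by key: child.attrib['category']. Once you know that, all that remains is to update the dictionary. Here we use defaultdict, which returns an empty list the first time a key is encountered, so that we can append the movie titles to it.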
import xml.etree.ElementTree as ET
from collections import defaultdict
tree = ET.parse('movies.xml')
root = tree.getroot()
d = defaultdict(list)
for child in root:
    for movie in root.findall("./genre/decade/movie[@title]"):
        d[child.attrib['category']].append(movie.attrib['title'])
>>> d
defaultdict(list,
{'Action': ['Indiana Jones: The raiders of the lost Ark',
'THE KARATE KID',
'Back 2 the Future',
'X-Men',
'Batman Returns',
'Reservoir Dogs',
'ALIEN',
"Ferris Bueller's Day Off",
'American Psycho'],
'Thriller': ['Indiana Jones: The raiders of the lost Ark',
'THE KARATE KID',
'Back 2 the Future',
'X-Men',
'Batman Returns',
'Reservoir Dogs',
'ALIEN',
"Ferris Bueller's Day Off",
'American Psycho']})
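Note that the output above lists all nine titles under both keys, because the inner loop searches from root rather than from the current genre element. If the goal is for each genre to list only its own movies, a minimal sketch could search relative to each genre instead. The inline SAMPLE_XML string and the helper name movies_by_genre are assumptions made for illustration (a trimmed stand-in for movies.xml with the same element layout), not part of the original answer.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed stand-in for movies.xml (assumption: same genre/decade/movie layout).
SAMPLE_XML = """
<collection>
  <genre category="Action">
    <decade years="1980s">
      <movie favorite="True" title="THE KARATE KID"/>
      <movie favorite="False" title="Back 2 the Future"/>
    </decade>
    <decade years="1990s">
      <movie favorite="True" title="Batman Returns"/>
    </decade>
  </genre>
  <genre category="Thriller">
    <decade years="1970s">
      <movie favorite="False" title="ALIEN"/>
    </decade>
  </genre>
</collection>
"""


def movies_by_genre(root):
    """Map each genre's category to the titles nested under that genre."""
    grouped = defaultdict(list)
    for genre in root.findall("genre"):
        # Search relative to this genre element, not the whole document.
        for movie in genre.findall("./decade/movie[@title]"):
            grouped[genre.attrib["category"]].append(movie.attrib["title"])
    # Convert to a plain dict so a missing key raises KeyError again.
    return dict(grouped)


root = ET.fromstring(SAMPLE_XML)
result = movies_by_genre(root)
print(result)
```

With the real file, the same function should work on ET.parse('movies.xml').getroot(), assuming the layout shown in the question.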
If you only want, say, 'Action', you can select it just like an ordinary dictionary key.
d['Action']
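As a side note on the .values() point in the answer, the sketch below illustrates why the question's dictionary ended up with one entry per movie instead of one per genre. Each call to .values() creates a new view object, and view objects are compared and hashed by identity, so two views over equal data are still two different keys. The names attrib_a, attrib_b, by_view and by_value are made up for this illustration.

```python
# Two separate attribute dictionaries with identical content,
# standing in for child.attrib on different loop iterations.
attrib_a = {"category": "Action"}
attrib_b = {"category": "Action"}

# Keying by the view: every .values() call yields a distinct object,
# so the entries never merge.
by_view = {}
by_view[attrib_a.values()] = "first"
by_view[attrib_b.values()] = "second"
view_key_count = len(by_view)

# Keying by the actual string value: equal strings are the same key,
# so the second assignment overwrites the first.
by_value = {}
by_value[attrib_a["category"]] = "first"
by_value[attrib_b["category"]] = "second"
value_key_count = len(by_value)

print(view_key_count, value_key_count)
```

Here by_view ends up with two keys and by_value with one, which is why selecting child.attrib['category'] is what makes the titles collect under a single genre key.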