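How to merge dictionary entries that share the same key?
I want to merge the output of the code below under the keys Action and Thriller, so that only 2 keys are shown: {'Action': [list of movies], 'Thriller': [list of movies]}. New code is also welcome, for example using lxml or BeautifulSoup.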
import xml.etree.ElementTree as ET
from collections import defaultdict
tree = ET.parse('movies.xml')
root = tree.getroot()
d = {}
for child in root:
    #print( child.attrib.values())
    for movie in root.findall("./genre/decade/movie[@title]"):
        #print(movie.attrib)
        #print (list(movie.attrib.values())[1])
        d[child.attrib.values()]=list(movie.attrib.values())[1]
d
{dict_values(['Action']): 'Indiana Jones: The raiders of the lost Ark',
dict_values(['Action']): 'THE KARATE KID',
dict_values(['Action']): 'Back 2 the Future',
dict_values(['Action']): 'X-Men',
dict_values(['Action']): 'Batman Returns',
dict_values(['Action']): 'Reservoir Dogs',
dict_values(['Action']): 'ALIEN',
dict_values(['Action']): "Ferris Bueller's Day Off",
dict_values(['Action']): 'American Psycho',
dict_values(['Thriller']): 'Indiana Jones: The raiders of the lost Ark',
dict_values(['Thriller']): 'THE KARATE KID',
dict_values(['Thriller']): 'Back 2 the Future',
dict_values(['Thriller']): 'X-Men',
dict_values(['Thriller']): 'Batman Returns',
dict_values(['Thriller']): 'Reservoir Dogs',
dict_values(['Thriller']): 'ALIEN',
dict_values(['Thriller']): "Ferris Bueller's Day Off",
dict_values(['Thriller']): 'American Psycho'}
My XML comes from DataCamp, which provides a tutorial on scraping. The XML is below; I saved it in a local folder and named it movies.
<?xml version="1.0" encoding="UTF-8" ?>
<collection>
    <genre category="Action">
        <decade years="1980s">
            <movie favorite="True" title="Indiana Jones: The raiders of the lost Ark">
                <format multiple="No">DVD</format>
                <year>1981</year>
                <rating>PG</rating>
                <description>
                'Archaeologist and adventurer Indiana Jones
                is hired by the U.S. government to find the Ark of the
                Covenant before the Nazis.'
                </description>
            </movie>
            <movie favorite="True" title="THE KARATE KID">
                <format multiple="Yes">DVD,Online</format>
                <year>1984</year>
                <rating>PG</rating>
                <description>None provided.</description>
            </movie>
            <movie favorite="False" title="Back 2 the Future">
                <format multiple="False">Blu-ray</format>
                <year>1985</year>
                <rating>PG</rating>
                <description>Marty McFly</description>
            </movie>
        </decade>
        <decade years="1990s">
            <movie favorite="False" title="X-Men">
                <format multiple="Yes">dvd, digital</format>
                <year>2000</year>
                <rating>PG-13</rating>
                <description>Two mutants come to a private academy for their kind whose resident superhero team must
                oppose a terrorist organization with similar powers.</description>
            </movie>
            <movie favorite="True" title="Batman Returns">
                <format multiple="No">VHS</format>
                <year>1992</year>
                <rating>PG13</rating>
                <description>NA.</description>
            </movie>
            <movie favorite="False" title="Reservoir Dogs">
                <format multiple="No">Online</format>
                <year>1992</year>
                <rating>R</rating>
                <description>WhAtEvER I Want!!!?!</description>
            </movie>
        </decade>
    </genre>
    <genre category="Thriller">
        <decade years="1970s">
            <movie favorite="False" title="ALIEN">
                <format multiple="Yes">DVD</format>
                <year>1979</year>
                <rating>R</rating>
                <description>"""""""""</description>
            </movie>
        </decade>
        <decade years="1980s">
            <movie favorite="True" title="Ferris Bueller's Day Off">
                <format multiple="No">DVD</format>
                <year>1986</year>
                <rating>PG13</rating>
                <description>Funny movie about a funny guy</description>
            </movie>
            <movie favorite="FALSE" title="American Psycho">
                <format multiple="No">blue-ray</format>
                <year>2000</year>
                <rating>Unrated</rating>
                <description>psychopathic Bateman</description>
            </movie>
        </decade>
    </genre>
</collection>
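Your code fetches the data just fine; the problem is how you parse it. On a dictionary, .values() returns a view of the values, which you can store in a list if you need to. In this case you want the value itself, so simply select it by key: child.attrib['category']. Once you know that, all that is left is to update the dictionary. Here we use defaultdict, which returns an empty list the first time a key is encountered, so that we can append the movie title to it.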
import xml.etree.ElementTree as ET
from collections import defaultdict
tree = ET.parse('movies.xml')
root = tree.getroot()
d = defaultdict(list)
for child in root:
    # Search from the current genre element (child), not from root;
    # otherwise every genre collects the titles of all movies in the file.
    for movie in child.findall("./decade/movie[@title]"):
        d[child.attrib['category']].append(movie.attrib['title'])
>>> d
defaultdict(list,
            {'Action': ['Indiana Jones: The raiders of the lost Ark',
              'THE KARATE KID',
              'Back 2 the Future',
              'X-Men',
              'Batman Returns',
              'Reservoir Dogs'],
             'Thriller': ['ALIEN',
              "Ferris Bueller's Day Off",
              'American Psycho']})
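The point about .values() can be checked in isolation. The following is only a minimal sketch: the `attrib` dictionary is a hypothetical stand-in for `child.attrib`, and the titles are placeholders. It suggests why the dictionary in the question never merged anything: each call to .values() creates a new view object, and views are hashed by identity, whereas indexing by 'category' yields a plain string that can serve as a shared key.

```python
# Hypothetical stand-in for child.attrib of a <genre> element.
attrib = {"category": "Action"}

view_a = attrib.values()
view_b = attrib.values()

# Each call returns a distinct view object. dict_values objects hash by
# identity, so two views of the same dict act as two different keys.
d = {}
d[view_a] = "first title"
d[view_b] = "second title"
print(len(d))  # two separate entries, nothing was merged

# Indexing by key gives the plain string, which works as a shared key.
key = attrib["category"]
merged = {}
merged.setdefault(key, []).append("first title")
merged.setdefault(key, []).append("second title")
print(merged)
```

Here setdefault plays the same role as defaultdict(list): it supplies an empty list the first time the key is seen.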
If you only want, say, 'Action', you can select it like a normal dictionary key.
d['Action']
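For reference, here is a self-contained sketch of the same grouping with a plain dict, assuming the goal is to list each title only under the genre element that contains it. The `SAMPLE` string is a trimmed, illustrative excerpt in the shape of movies.xml, and `titles_by_genre` is a hypothetical helper name, not part of any library; with the real file you would get `root` from ET.parse('movies.xml').getroot() instead.

```python
import xml.etree.ElementTree as ET

# Trimmed sample in the same shape as movies.xml (illustrative only).
SAMPLE = """
<collection>
  <genre category="Action">
    <decade years="1980s">
      <movie favorite="True" title="THE KARATE KID"/>
      <movie favorite="False" title="Back 2 the Future"/>
    </decade>
    <decade years="1990s">
      <movie favorite="True" title="Batman Returns"/>
    </decade>
  </genre>
  <genre category="Thriller">
    <decade years="1970s">
      <movie favorite="False" title="ALIEN"/>
    </decade>
  </genre>
</collection>
"""


def titles_by_genre(root):
    """Map each genre's category to the titles of the movies inside it."""
    result = {}
    for genre in root.findall("genre"):
        # Search relative to this genre element, not the whole tree.
        movies = genre.findall("./decade/movie[@title]")
        result[genre.attrib["category"]] = [m.attrib["title"] for m in movies]
    return result


root = ET.fromstring(SAMPLE)
by_genre = titles_by_genre(root)
print(by_genre)
```

Note that this version assigns one list per genre element, so if two genre elements shared the same category the later one would overwrite the earlier one; appending through defaultdict(list) avoids that.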