This is the second part of my previous question (Parsing xml file using Python3 and BeautifulSoup).
I'm wondering how to parse the following lists, given their different XML structures. I also need to tell the lists (or 'poll titles') apart within the single XML file. I can search for the 'results' element, but that element is present in 3 separate lists in the file.
The code below extracts the data for the first poll. The 'numplayers=True' argument distinguishes that list from the other two, whose 'results' elements have no attributes.
for result in soup.find_all('results', numplayers=True):
    numplayers = result['numplayers']
    best = result.find('result', {'value': 'Best'})['numvotes']
    recommended = result.find('result', {'value': 'Recommended'})['numvotes']
    not_recommended = result.find('result', {'value': 'Not Recommended'})['numvotes']
    print(numplayers, best, recommended, not_recommended)
I can't seem to figure out how to write something similar to this code for the following two xml lists. Thank you.
<poll title="Language Dependence" name="language_dependence" totalvotes="32">
<results>
<result value="No necessary in-game text" numvotes="32" level="1"/>
<result value="Some necessary text - easily memorized or small crib sheet" numvotes="0" level="2"/>
<result value="Moderate in-game text - needs crib sheet or paste ups" numvotes="0" level="3"/>
<result value="Extensive use of text - massive conversion needed to be playable" numvotes="0" level="4"/>
<result value="Unplayable in another language" numvotes="0" level="5"/>
</results>
</poll>
<poll title="User Suggested Player Age" name="suggested_playerage" totalvotes="32">
<results>
<result value="2" numvotes="0"/>
<result value="3" numvotes="0"/>
<result value="4" numvotes="0"/>
<result value="5" numvotes="1"/>
<result value="6" numvotes="6"/>
<result value="8" numvotes="15"/>
<result value="10" numvotes="10"/>
<result value="12" numvotes="0"/>
<result value="14" numvotes="0"/>
<result value="16" numvotes="0"/>
<result value="18" numvotes="0"/>
<result value="21 and up" numvotes="0"/>
</results>
</poll>
Here's what I think should work for the language dependence list, but it doesn't.
for result in soup.find_all('result', level=True):
    level = result['level']
    None = result.find('result', {'level': '1'})['numvotes']
    Some = result.find('result', {'level': '2'})['numvotes']
    Mod = result.find('result', {'level': '3'})['numvotes']
    Ext = result.find('result', {'level': '4'})['numvotes']
    Unp = result.find('result', {'level': '5'})['numvotes']
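You need a separate condition for each poll: select the 'poll' element by its 'name' attribute first, then loop over its 'result' children. Your attempt fails because each 'result' element has no nested 'result' to find, and because 'None' is a reserved keyword that cannot be used as a variable name. See the code below.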
from bs4 import BeautifulSoup
xml = """<poll title="Language Dependence" name="language_dependence" totalvotes="32">
<results>
<result value="No necessary in-game text" numvotes="32" level="1"/>
<result value="Some necessary text - easily memorized or small crib sheet" numvotes="0" level="2"/>
<result value="Moderate in-game text - needs crib sheet or paste ups" numvotes="0" level="3"/>
<result value="Extensive use of text - massive conversion needed to be playable" numvotes="0" level="4"/>
<result value="Unplayable in another language" numvotes="0" level="5"/>
</results>
</poll>
<poll title="User Suggested Player Age" name="suggested_playerage" totalvotes="32">
<results>
<result value="2" numvotes="0"/>
<result value="3" numvotes="0"/>
<result value="4" numvotes="0"/>
<result value="5" numvotes="1"/>
<result value="6" numvotes="6"/>
<result value="8" numvotes="15"/>
<result value="10" numvotes="10"/>
<result value="12" numvotes="0"/>
<result value="14" numvotes="0"/>
<result value="16" numvotes="0"/>
<result value="18" numvotes="0"/>
<result value="21 and up" numvotes="0"/>
</results>
</poll>"""
soup = BeautifulSoup(xml, 'lxml')

# Language Dependence poll: each result carries value, numvotes and level
for i in soup.find_all('poll', {'name': 'language_dependence'})[0].find_all('result'):
    value = i['value']
    numvotes = i['numvotes']
    level = i['level']
    print('Value:', value, '\n', 'Numvotes:', numvotes, '\n', 'Level:', level)

print('--------------------------------------------')

# Suggested Player Age poll: results have no level attribute
for i in soup.find_all('poll', {'name': 'suggested_playerage'})[0].find_all('result'):
    value = i['value']
    numvotes = i['numvotes']
    print('Value:', value, '\n', 'Numvotes:', numvotes)
Output
Value: No necessary in-game text
Numvotes: 32
Level: 1
Value: Some necessary text - easily memorized or small crib sheet
Numvotes: 0
Level: 2
Value: Moderate in-game text - needs crib sheet or paste ups
Numvotes: 0
Level: 3
Value: Extensive use of text - massive conversion needed to be playable
Numvotes: 0
Level: 4
Value: Unplayable in another language
Numvotes: 0
Level: 5
--------------------------------------------
Value: 2
Numvotes: 0
Value: 3
Numvotes: 0
Value: 4
Numvotes: 0
Value: 5
Numvotes: 1
Value: 6
Numvotes: 6
Value: 8
Numvotes: 15
Value: 10
Numvotes: 10
Value: 12
Numvotes: 0
Value: 14
Numvotes: 0
Value: 16
Numvotes: 0
Value: 18
Numvotes: 0
Value: 21 and up
Numvotes: 0
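If you would rather collect the votes than print them, the same "select the poll by name, then loop over its results" idea can be wrapped in a small function. This is only a minimal sketch: `poll_votes` is a hypothetical helper name, the XML is an abridged copy of the polls above, and it uses Python's built-in 'html.parser' so that it runs without lxml installed (the 'lxml' parser from the answer should behave the same way here).

```python
from bs4 import BeautifulSoup

# Abridged sample of the two polls from the question (not the full data)
xml = """<poll title="Language Dependence" name="language_dependence" totalvotes="32">
<results>
<result value="No necessary in-game text" numvotes="32" level="1"/>
<result value="Unplayable in another language" numvotes="0" level="5"/>
</results>
</poll>
<poll title="User Suggested Player Age" name="suggested_playerage" totalvotes="32">
<results>
<result value="8" numvotes="15"/>
<result value="10" numvotes="10"/>
</results>
</poll>"""

# Assumption: the built-in parser is used so no extra dependency is needed
soup = BeautifulSoup(xml, "html.parser")


def poll_votes(soup, poll_name):
    """Return {value: numvotes} for the poll whose name attribute matches.

    Hypothetical helper for illustration; returns an empty dict when
    no poll with that name exists.
    """
    poll = soup.find("poll", {"name": poll_name})
    if poll is None:
        return {}
    # Only look at result elements inside this poll, not the whole document
    return {r["value"]: int(r["numvotes"]) for r in poll.find_all("result")}


language = poll_votes(soup, "language_dependence")
ages = poll_votes(soup, "suggested_playerage")
print(language)
print(ages)
```

Because the lookup starts from the 'poll' element, it does not matter that the 'results' elements in these two polls carry no attributes of their own.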