Create a CSV table with name and ID from local HTML file using Python

Question

I'm a newbie practicing Python. I want to extract names and IDs from a local HTML file and save them as a table in a CSV file. The HTML looks like this:

<td>
  <a href="https:............" data_id="45498" class="roster_user_name 
......
<span name="Clarence Alan" src="
</a>
    
</td>

<td>
  
    88889999
  
</td>

My code to get the list of names:

all_urls = [a['name']
            for a in soup('span')
            if a.has_attr('name')]

good_urls = list(set(all_urls))
print(len(good_urls))
good_urls

I don't know how to extract the ID ('88889999') and combine the names and IDs into a two-column table.

I am very new to Python. Thank you to anyone who answers.

Answer 1

I asked whether the HTML has <tr> tags, and your reply shows that the number of <tr> tags equals the number of entries you want to scrape.

Using BeautifulSoup, you can loop through all the <tr> tags and extract the required information from each one.

Example (replace the first argument to BeautifulSoup with your HTML string):

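from bs4 import BeautifulSoup

# Replace the placeholder string with the contents of your HTML file
soup = BeautifulSoup('<html> </html>', 'html.parser')
for row in soup.find_all('tr'):
    cells = row.find_all('td')
    if len(cells) < 2:
        continue  # skip header or malformed rows
    name = cells[0].text.strip()    # text of the first cell
    number = cells[1].text.strip()  # text of the second cell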

This loops through all the rows and gets the name and number from each one.

Then you could use the csv library to store the data.
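
In the HTML from the question, the name sits in the name attribute of a <span> rather than in the cell text, so reading .text from the first cell may come back empty. Below is a minimal sketch of one way to handle that, assuming every row has the two-cell layout shown in the question. The sample markup (including the <table>/<tr> wrapper and the "#" href) and the function name extract_rows are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample modelled on the snippet in the question
SAMPLE_HTML = """
<table>
  <tr>
    <td>
      <a href="#" data_id="45498" class="roster_user_name">
        <span name="Clarence Alan"></span>
      </a>
    </td>
    <td>
      88889999
    </td>
  </tr>
</table>
"""

def extract_rows(html):
    """Return a list of (name, id) tuples, one per usable <tr>."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) < 2:
            continue  # not a data row
        # attrs= is needed because 'name' as a keyword means the tag name
        span = cells[0].find("span", attrs={"name": True})
        if span is None:
            continue  # no name available in this row
        # get_text(strip=True) drops the whitespace around the ID
        rows.append((span["name"], cells[1].get_text(strip=True)))
    return rows

print(extract_rows(SAMPLE_HTML))
```

Collecting the pairs row by row keeps each name next to its own ID, which a set() of names on its own would not do.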

Example

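import csv

# 'w' overwrites the file; with 'a+' the header row would be appended again on every run
with open('file.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["COL1", "COL2"])  # header row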
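
To get the two-column table, the header is followed by one writerow (or a single writerows) call for the scraped pairs. This is a rough sketch using only the standard library; the rows list, the column names, the output file name, and the helper write_table are placeholders rather than anything from the question.

```python
import csv
import os
import tempfile

def write_table(rows, path):
    """Write a header plus one line per (name, id) pair."""
    # newline='' lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "id"])  # header row
        writer.writerows(rows)           # one CSV line per tuple

# Hypothetical data standing in for the scraped pairs
rows = [("Clarence Alan", "88889999")]
out_path = os.path.join(tempfile.gettempdir(), "roster_example.csv")
write_table(rows, out_path)

# Read the file back to check what was written
with open(out_path, newline="", encoding="utf-8") as f:
    written = list(csv.reader(f))
print(written)
```

Note that csv.reader returns every field as a string, so the ID comes back as '88889999' rather than as a number.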
