Write data in a column using CSV module in python
I wrote Beautiful Soup code that produces output in the following format:
Quatermass 2
Ghostbusters
Life of Brian
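I now want to write this to a CSV file. However, the only csv writing function I am familiar with is writerow. When I use it, it writes only the last title_content object I scraped, i.e. Life of Brian.
My Python code is: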
from bs4 import BeautifulSoup
import requests
import re
import csv
html = ['table.html']
with open("table.html", "r") as f:
    contents = f.read()
outputfilename = 'row_writer.csv'
print(outputfilename)
outputfile = open(outputfilename, 'w') # 'w' = open the file for writing in text mode
writer = csv.writer(outputfile)
writer.writerow(['Title'])
soup = BeautifulSoup(contents, "lxml")
for name in soup.find_all("td", {"width": "41%"}, string=re.compile(r'^(?!Title$)')):
    title_content = ((name).get_text())
    print(title_content)
writer.writerow([title_content])
Can someone help me write all of my content into a single CSV column?
The HTML:
<table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
<tr>
<td width='41%' align='left'>Title</td>
<table width='99%' border='0' cellpadding='1' class="normal">
<tr>
<td width='41%' align='left'><strong>Quatermass 2</strong></td>
<table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
<tr>
<td width='41%' align='left'>Title</td>
<table width='99%' border='0' cellpadding='1' class="normal">
<tr>
<td width='41%' align='left'><strong>Ghostbusters</strong></td>
<table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
<tr>
<td width='41%' align='left'>Title</td>
<table width='99%' border='0' cellpadding='1' class="normal">
<tr>
<td width='41%' align='left'><strong>Life of Brian</strong></td>
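The titles all appear to be wrapped in a <strong> tag. So what you can do is build a list of the text inside those tags, then convert it into a table and write it to a file with pandas. You could also do this with the csv writer, but I like using pandas because it lets you manipulate the table as needed.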
html = ''' <table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
<tr>
<td width='41%' align='left'>Title</td>
<table width='99%' border='0' cellpadding='1' class="normal">
<tr>
<td width='41%' align='left'><strong>Quatermass 2</strong></td>
<table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
<tr>
<td width='41%' align='left'>Title</td>
<table width='99%' border='0' cellpadding='1' class="normal">
<tr>
<td width='41%' align='left'><strong>Ghostbusters</strong></td>
<table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
<tr>
<td width='41%' align='left'>Title</td>
<table width='99%' border='0' cellpadding='1' class="normal">
<tr>
<td width='41%' align='left'><strong>Life of Brian</strong></td>'''
import pandas as pd
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
titles = soup.find_all('strong')
titles_list = [each.text for each in titles ]
df = pd.DataFrame(titles_list, columns=['Title'])
df.to_csv('output.csv', index=False)
Output:
print (df)
           Title
0   Quatermass 2
1   Ghostbusters
2  Life of Brian
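The answer notes that the csv writer works as well. A minimal sketch of that variant follows, assuming the titles have already been collected into a list. The hard-coded `titles` list and the helper name `write_title_column` are illustrative stand-ins, not part of the original code. The key point is that `writerow` is called once per title inside the loop, so every title gets its own row in the single `Title` column.

```python
import csv

# Stand-in for the strings scraped with BeautifulSoup in the question
# (hypothetical hard-coded data so the sketch runs on its own).
titles = ["Quatermass 2", "Ghostbusters", "Life of Brian"]


def write_title_column(titles, path):
    """Write a 'Title' header followed by one title per row."""
    # newline='' lets the csv module manage line endings itself;
    # the with-block closes (and flushes) the file when done.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Title"])
        for title in titles:
            # Called inside the loop: one row per scraped title.
            writer.writerow([title])


write_title_column(titles, "row_writer.csv")
```

If a call to `writerow` sits after the loop instead, it sees only the value left over from the final iteration, which would match the symptom described in the question.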