
Write data in a column using CSV module in python

I have written Beautiful Soup code that produces output in the following format:

Quatermass 2
Ghostbusters
Life of Brian

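I now want to write this to a CSV file. However, the only csv writing function I am familiar with is writerow. When I use it, it only writes the last "title_content" object I scraped, i.e. Life of Brian.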

My Python code is:

from bs4 import BeautifulSoup
import requests
import re
import csv

html = ['table.html']

with open("table.html", "r") as f:
    contents = f.read()

outputfilename = 'row_writer.csv'
print(outputfilename)

outputfile = open(outputfilename, 'w', newline='')   # 'w' opens the file for writing in text mode; newline='' is what the csv module expects
writer = csv.writer(outputfile)
writer.writerow(['Title'])

soup = BeautifulSoup(contents, "lxml")
for name in soup.find_all("td", {"width": "41%"}, string=re.compile(r'^(?!Title$)')):
    title_content = ((name).get_text())
    print(title_content)

writer.writerow([title_content])

Can someone help me write all of the titles into a single CSV column?

The HTML:

    <table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
        <tr>
            <td width='41%' align='left'>Title</td>
                <table width='99%' border='0' cellpadding='1' class="normal">
        <tr>
            <td width='41%' align='left'><strong>Quatermass 2</strong></td>

    <table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
        <tr>
            <td width='41%' align='left'>Title</td>
                <table width='99%' border='0' cellpadding='1' class="normal">
        <tr>
            <td width='41%' align='left'><strong>Ghostbusters</strong></td>

    <table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
        <tr>
            <td width='41%' align='left'>Title</td>
                <table width='99%' border='0' cellpadding='1' class="normal">
        <tr>
            <td width='41%' align='left'><strong>Life of Brian</strong></td>

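The titles all appear to be wrapped in <strong> tags, so one option is to build a list of the text inside those tags and convert it into a table that pandas writes to the file. You could also do this with the csv writer, but I prefer pandas because it lets you manipulate the table however you need.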

html = ''' <table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
        <tr>
            <td width='41%' align='left'>Title</td>
                <table width='99%' border='0' cellpadding='1' class="normal">
        <tr>
            <td width='41%' align='left'><strong>Quatermass 2</strong></td>

    <table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
        <tr>
            <td width='41%' align='left'>Title</td>
                <table width='99%' border='0' cellpadding='1' class="normal">
        <tr>
            <td width='41%' align='left'><strong>Ghostbusters</strong></td>

    <table width='100%' border='0' cellpadding='0' class='blackbg textheadtitle'>
        <tr>
            <td width='41%' align='left'>Title</td>
                <table width='99%' border='0' cellpadding='1' class="normal">
        <tr>
            <td width='41%' align='left'><strong>Life of Brian</strong></td>'''




import pandas as pd
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
titles = soup.find_all('strong')

titles_list = [each.text for each in titles ]

df = pd.DataFrame(titles_list, columns=['Title'])

df.to_csv('output.csv', index=False)

Output:

print (df)
           Title
0   Quatermass 2
1   Ghostbusters
2  Life of Brian
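
For anyone who would rather stay with the csv module, here is a minimal sketch of the writer-based route mentioned above. The likely reason only the last title ended up in the file is that writerow was called once, after the loop had finished, so title_content held only the final value. The titles list and the write_titles helper below are hypothetical stand-ins for the scraped values and are not part of the original code.

```python
import csv

# Hypothetical stand-in for the scraped titles; in the question these
# values come from the BeautifulSoup loop.
titles = ["Quatermass 2", "Ghostbusters", "Life of Brian"]


def write_titles(titles, path):
    """Write a 'Title' header followed by one row per title."""
    # newline='' lets the csv module control the line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Title"])
        for title in titles:
            # writerow runs inside the loop, once per title, so every
            # title gets its own row in the single column.
            writer.writerow([title])


write_titles(titles, "row_writer.csv")
```

In the question's code the equivalent change would be to indent the writerow call so that it sits inside the for loop. Using a with block also closes the file, which makes sure the rows are flushed to disk.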

