CSV writer only writes the first line to the file

Question

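I have patent data in XML that I want to store in a CSV file. My code runs through every iteration and finds the invention name, date, country, and patent number, but I run into a problem when I try to write the results to a CSV file.

The XML data looks like this (one section out of many):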

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE us-patent-grant SYSTEM "us-patent-grant-v42-2006-08-23.dtd" [ ]>
<us-patent-grant lang="EN" dtd-version="v4.2 2006-08-23" file="USD0584026-20090106.XML" status="PRODUCTION" id="us-patent-grant" country="US" date-produced="20081222" date-publ="20090106">
<us-bibliographic-data-grant>
<publication-reference>
<document-id>
<country>US</country>
<doc-number>D0584026</doc-number>
<kind>S1</kind>
<date>20090106</date>
</document-id>
</publication-reference>

The code I use to write these rows one at a time is:

for xml_string in separated_xml(infile): # Calls the output of the separated and read file to parse the data
    soup = BeautifulSoup(xml_string, "lxml")     # BeautifulSoup parses the data strings where the XML is converted to Unicode
    pub_ref = soup.findAll("publication-reference") # Beginning parsing at every instance of a publication
    lst = []  # Creating empty list to append into

    for info in pub_ref:  # Looping over all instances of publication

# The final loop finds every instance of invention name, patent number, date, and country to print and append into

        with open('./output.csv', 'wb') as f:
            writer = csv.writer(f, dialect = 'excel')

            for inv_name, pat_num, date_num, country in zip(soup.findAll("invention-title"), soup.findAll("doc-number"), soup.findAll("date"), soup.findAll("country")):
            #print(inv_name.text, pat_num.text, date_num.text, country.text)
            #lst.append((inv_name.text, pat_num.text, date_num.text, country.text))
                writer.writerow([inv_name.text, pat_num.text, date_num.text, country.text])

In the end, the output in my .csv file looks like this:

"Content addressable information encapsulation, representation, and transfer",07475432,20090106,US

I'm not sure where the problem is. I know I'm still new to Python, but can anyone spot the issue?

Answer 1

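The problem is this line: with open('./output.csv', 'wb') as f:

If you want to write all rows to a single file, use append mode (a). Opening with wb truncates the file every time it is opened, so you only end up with the last row written.

Read more about file modes here: https://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files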
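To make the difference between the two modes concrete, here is a minimal sketch. It assumes Python 3 (text modes 'w' and 'a' with newline='' rather than the Python 2 'wb' used in the question), and the rows, file names, and helper functions are hypothetical stand-ins for the parsed patent data:

```python
import csv
import os
import tempfile

# Hypothetical rows standing in for the parsed patent fields.
rows = [
    ["Title A", "0001", "20090106", "US"],
    ["Title B", "0002", "20090106", "US"],
    ["Title C", "0003", "20090106", "US"],
]


def write_reopening(path, rows, mode):
    """Reopen the file once per row, as the loop in the question does."""
    for row in rows:
        with open(path, mode, newline="") as f:
            csv.writer(f, dialect="excel").writerow(row)


def count_rows(path):
    """Count the CSV rows currently stored in the file."""
    with open(path, newline="") as f:
        return sum(1 for _ in csv.reader(f))


tmp_dir = tempfile.mkdtemp()
w_path = os.path.join(tmp_dir, "overwrite.csv")
a_path = os.path.join(tmp_dir, "append.csv")

write_reopening(w_path, rows, "w")  # each open truncates the file
write_reopening(a_path, rows, "a")  # each open adds to the end

w_count = count_rows(w_path)
a_count = count_rows(a_path)
print(w_count, a_count)  # prints: 1 3
```

Keep in mind that append mode also keeps rows from earlier runs of the script, so the output file has to be deleted or truncated first if a fresh file is wanted.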

Answer 2

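You are opening the file in overwrite mode ('wb') inside the loop, so on each iteration you erase whatever was written before. The correct approach is to open the file outside the loop: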

...
with open('./output.csv', 'wb') as f:
    writer = csv.writer(f, dialect = 'excel')

    for info in pub_ref:  # Looping over all instances of publication

        # The final loop finds every instance of invention name, patent number, date, and country to print and append into
        for inv_name, pat_num, date_num, country in zip(soup.findAll("invention-title"), soup.findAll("doc-number"), soup.findAll("date"), soup.findAll("country")):
            ...
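The same idea can be sketched as a complete, runnable script. This is not the asker's code: it assumes Python 3, uses the standard library's xml.etree.ElementTree in place of BeautifulSoup so that it has no third-party dependencies, and the two trimmed records, the extract_row and write_rows helpers, and the temporary output path are all hypothetical.

```python
import csv
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed records standing in for separated_xml(infile).
xml_strings = [
    "<us-patent-grant><publication-reference><document-id>"
    "<country>US</country><doc-number>D0584026</doc-number>"
    "<date>20090106</date></document-id></publication-reference>"
    "<invention-title>Example title one</invention-title></us-patent-grant>",
    "<us-patent-grant><publication-reference><document-id>"
    "<country>US</country><doc-number>D0584027</doc-number>"
    "<date>20090106</date></document-id></publication-reference>"
    "<invention-title>Example title two</invention-title></us-patent-grant>",
]


def extract_row(xml_string):
    """Pull the four fields of interest out of one patent record."""
    root = ET.fromstring(xml_string)
    doc_id = root.find("publication-reference/document-id")
    return [
        root.findtext(".//invention-title"),
        doc_id.findtext("doc-number"),
        doc_id.findtext("date"),
        doc_id.findtext("country"),
    ]


def write_rows(path, xml_strings):
    """Open the file once, then write one row per record inside the loop."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, dialect="excel")
        for xml_string in xml_strings:
            writer.writerow(extract_row(xml_string))


out_path = os.path.join(tempfile.mkdtemp(), "output.csv")
write_rows(out_path, xml_strings)

with open(out_path, newline="") as f:
    written = list(csv.reader(f))
print(written)
```

Because the file is opened exactly once, before any loop starts, 'w' mode is safe here: the file is truncated a single time and every record then adds a row to it.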
