
How to append data to an existing Excel file based on the input?

I am currently writing a Scrapy web crawler that extracts data from a site's pages and appends it to an existing Excel file (".tmp.xlsx"). The file comes with prepopulated column headers: "name", "country", "state", "zip code", "address", "phone number". Most of the sites I will be scraping won't have data for every column. Some may only have data for "country", "state", "zip code" and "phone number". I need help setting up my pipelines.py so that rows are appended to the file according to whichever fields I get from the site I'm crawling.

One option (which may not be what you are looking for) is to simply append the data to a CSV file, using Scrapy's built-in CsvItemExporter. Then, in the close_spider method, convert it to an Excel file (using, e.g., pandas).
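A minimal sketch of that idea as a pipeline, under a few assumptions: the class name and the file paths are hypothetical, and it uses the standard library's csv.DictWriter instead of CsvItemExporter so the CSV part runs without Scrapy. Missing columns are written as empty cells, and keys that are not in the header list are dropped. Note that the final step rewrites tmp.xlsx from the CSV rather than appending rows to it, and it needs pandas and openpyxl installed.

```python
import csv

# Column order of the prepopulated workbook, taken from the question.
HEADERS = ["name", "country", "state", "zip code", "address", "phone number"]


class CsvThenExcelPipeline:
    """Hypothetical pipeline: collect items in a CSV, convert to Excel at the end."""

    csv_path = "tmp.csv"    # assumed intermediate file
    xlsx_path = "tmp.xlsx"  # assumed output file

    def open_spider(self, spider):
        self.file = open(self.csv_path, "w", newline="", encoding="utf-8")
        # restval fills columns the site did not provide;
        # extrasaction="ignore" drops keys that are not in HEADERS.
        self.writer = csv.DictWriter(
            self.file, fieldnames=HEADERS, restval="", extrasaction="ignore"
        )
        self.writer.writeheader()

    def process_item(self, item, spider):
        # dict(item) works for plain dicts and for scrapy.Item objects.
        self.writer.writerow(dict(item))
        return item

    def close_spider(self, spider):
        self.file.close()
        # Imported lazily so the CSV part works without pandas installed.
        import pandas as pd

        # dtype=str keeps leading zeros in zip codes and phone numbers.
        pd.read_csv(self.csv_path, dtype=str).to_excel(self.xlsx_path, index=False)
```

To try it, the class would be registered under ITEM_PIPELINES in settings.py, with the module path adjusted to your own project.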

This code may help you. Put this in settings.py:

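# settings.py
FEED_FORMAT = "csv"   # export format
FEED_URI = "tmp.csv"  # output path

# spider file: import pandas at the top
import pandas as pd

# ...and add this method at the end of the spider class
    def closed(self, reason):
        df = pd.read_csv("tmp.csv")
        df.to_excel("tmp.xlsx", index=False)  # index=False omits the index column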
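In newer Scrapy releases, FEED_FORMAT and FEED_URI are deprecated in favour of a single FEEDS setting. A possible equivalent is sketched below. The "fields" list is an addition that was not in the answer above: it fixes the column order to the headers from the question, so items that lack some fields should still line up under the right columns, with the missing ones left empty.

```python
# settings.py (sketch, assuming a Scrapy version that supports FEEDS)
FEEDS = {
    "tmp.csv": {
        "format": "csv",
        # Column order taken from the question's prepopulated headers.
        "fields": ["name", "country", "state", "zip code", "address", "phone number"],
    },
}
```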

If you need any more help, do not hesitate to ask.

