How to format an Excel file using Python?

I have a script that scrapes data from a list of websites using the BeautifulSoup package and saves it to an Excel file using the pandas and xlsxwriter packages.

I want to be able to format the Excel file as needed, for example by setting the width of the columns.

But when I run the script, it crashes and displays the error below.

AttributeError: 'NoneType' object has no attribute 'write'


import pandas as pd

import requests
from bs4 import BeautifulSoup
import xlsxwriter

def scrap_website():
    url_list = ["https://www.bayt.com/en/international/jobs/executive-chef-jobs/"]
    joineddd = []
    for url in url_list:
        soup = BeautifulSoup(requests.get(url).content,"lxml")
        links = []
        for a in soup.select("h2.m0.t-regular a"):
            if a['href'] not in links:
                links.append("https://www.bayt.com"+ a['href'])
        for link in links:
            s = BeautifulSoup(requests.get(link).content, "lxml") 
            ### update Start ###
            alldd = dict()
            alldd['link'] = link
            dd_div = [i for i in s.select("div[class='card-content is-spaced'] div") 
                    if ('<dd>' in str(i) ) and ( "<dt>" in str(i))]

            for div in dd_div:
                k = div.select_one('dt').get_text(';', True)
                v = div.select_one('dd').get_text(';', True)
                alldd[k] = v
            ### update End  ###
            joineddd.append(alldd)  # collect this job's fields; otherwise the DataFrame stays empty

# result
        df = pd.DataFrame(joineddd)
        df_to_excel = df.to_excel(r"F:\\AIenv\web_scrapping\\jobDesc.xlsx", index = False, header=True)
        workbook = xlsxwriter.Workbook(df_to_excel)
        worksheet = workbook.add_worksheet()
        worksheet.set_column(0, 0,50)


Where is the error, and how do I fix it?

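  1. DataFrame.to_excel() returns None, so df_to_excel is None and xlsxwriter.Workbook() never receives a file name. That is why you get the error message. Keep the file name in its own variable instead: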
# save excel file
excel_file_name = r"jobDesc.xlsx"
df.to_excel(excel_file_name, index = False, header=True)

# pass the file name (not the return value of to_excel) to xlsxwriter;
# note that this creates a new workbook rather than opening the saved one
workbook = xlsxwriter.Workbook(excel_file_name)
  2. XlsxWriter cannot modify an existing file; it only creates new ones. There is a way to work around this, but it is not recommended. I recommend the openpyxl package instead. For reference, see the question "xlsxwriter: is there a way to open an existing worksheet in my workbook?"
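
A minimal sketch of the openpyxl route mentioned above, assuming openpyxl is installed: write the file with pandas first, then reopen it and set the column width. The sample data, the temporary file location, and the helper name `widen_first_column` are placeholders for illustration and are not part of the original script.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook


def widen_first_column(path, width=50):
    """Reopen an existing .xlsx file and set the width of column A."""
    workbook = load_workbook(path)
    worksheet = workbook.active
    worksheet.column_dimensions["A"].width = width
    workbook.save(path)


# Placeholder rows standing in for the scraped data.
df = pd.DataFrame({"link": ["https://example.com/job/1"], "Job Role": ["Executive Chef"]})

# Placeholder location; the question writes to a fixed path instead.
excel_file_name = os.path.join(tempfile.mkdtemp(), "jobDesc.xlsx")

# to_excel() writes the file and returns None, so keep the path in a variable.
result = df.to_excel(excel_file_name, index=False, header=True, engine="openpyxl")

# Reopen the saved file and widen column A.
widen_first_column(excel_file_name, width=50)
```

Because the file is saved before it is reopened, this costs one extra read and write, which is usually negligible for a sheet of scraped job listings.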

To access and format the Excel workbook or worksheet created by to_excel() you need to create an ExcelWriter object first. Something like this:

import pandas as pd

# Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1', index=False, header=True)

# Get the xlsxwriter objects from the dataframe writer object.
workbook  = writer.book
worksheet = writer.sheets['Sheet1']

# Set the column width.
worksheet.set_column(0, 0, 50)

# Close the Pandas Excel writer and output the Excel file.
writer.close()


See Working with Python Pandas and XlsxWriter for more details.
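
The same ExcelWriter approach could be applied to the scraped data roughly as follows. This is only a sketch: the helper name `save_with_column_widths`, the `padding` parameter, the sample rows, and the temporary output path are assumptions made for illustration. It sizes each column to its longest value instead of using a fixed width of 50.

```python
import os
import tempfile

import pandas as pd


def save_with_column_widths(df, path, sheet_name="Sheet1", padding=2):
    """Write df to path with XlsxWriter and size each column to its longest value."""
    widths = {}
    # The context manager closes the writer on exit, which saves the file.
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False, header=True)
        worksheet = writer.sheets[sheet_name]
        for col_idx, col_name in enumerate(df.columns):
            # Longest text in the column, including the header itself.
            longest = max([len(str(col_name))] + [len(str(v)) for v in df[col_name]])
            widths[col_name] = longest + padding
            worksheet.set_column(col_idx, col_idx, widths[col_name])
    return widths


# Placeholder rows standing in for the list of scraped dictionaries.
df = pd.DataFrame({"link": ["https://example.com/job/1"], "Job Role": ["Executive Chef"]})
path = os.path.join(tempfile.mkdtemp(), "jobDesc.xlsx")
widths = save_with_column_widths(df, path)
```

Using the writer as a context manager means the file is saved even if you forget the explicit close call at the end.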

