
How to save a new sheet in an existing excel file, using Pandas?

I want to use Excel files to store data processed with Python. My problem is that I can't add sheets to an existing Excel file. Here is some sample code that reproduces the issue:

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)

x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)

writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df1.to_excel(writer, sheet_name = 'x1')
df2.to_excel(writer, sheet_name = 'x2')
writer.close()  # close() also saves the file

This code saves two DataFrames to two sheets, named "x1" and "x2" respectively. If I create two new DataFrames and use the same code to add two new sheets, 'x3' and 'x4', the original data is lost.

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)

x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)

writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df3.to_excel(writer, sheet_name = 'x3')
df4.to_excel(writer, sheet_name = 'x4')
writer.close()  # close() also saves the file

I want an Excel file with four sheets: 'x1', 'x2', 'x3', 'x4'. I know that 'xlsxwriter' is not the only "engine"; there is also 'openpyxl'. I have also seen that other people have already written about this issue, but I still can't understand how to do it.

Here is some code taken from this link:

import pandas
from openpyxl import load_workbook

book = load_workbook('Masterfile.xlsx')
writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') 
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])

writer.save()

They say that it works, but it is hard to figure out how. I don't understand what "ws.title", "ws", and "dict" are in this context.

What is the best way to save "x1" and "x2", then close the file, open it again and add "x3" and "x4"?

Thank you. I believe a complete example could be useful for anyone else who has the same issue:

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)

x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)

writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df1.to_excel(writer, sheet_name = 'x1')
df2.to_excel(writer, sheet_name = 'x2')
writer.close()  # close() also saves the file

Here I generate an Excel file. From my understanding, it does not really matter whether it is generated via the "xlsxwriter" or the "openpyxl" engine.

When I want to write without losing the original data, then:

import pandas as pd
import numpy as np
from openpyxl import load_workbook

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

book = load_workbook(path)
writer = pd.ExcelWriter(path, engine = 'openpyxl')
writer.book = book

x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)

x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)

df3.to_excel(writer, sheet_name = 'x3')
df4.to_excel(writer, sheet_name = 'x4')
writer.close()  # close() also saves the file

This code does the job!

In the example you shared, you load the existing file into book and set writer.book to book. In the line writer.sheets = dict((ws.title, ws) for ws in book.worksheets) you access each sheet in the workbook as ws. The sheet title is then ws.title, so you create a dictionary of {sheet_title: sheet} key-value pairs. This dictionary is then assigned to writer.sheets. Essentially, these steps just load the existing data from 'Masterfile.xlsx' and populate your writer with it.
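
As a small illustration of that dictionary, the sketch below builds an in-memory openpyxl workbook and applies the same title-to-worksheet mapping. The sheet names x1 and x2 are only placeholders, and no file is read or written.

```python
from openpyxl import Workbook

# Build a throwaway workbook in memory; Workbook() starts with one sheet.
book = Workbook()
book.active.title = "x1"
book.create_sheet("x2")

# Same idea as dict((ws.title, ws) for ws in book.worksheets)
sheets = {ws.title: ws for ws in book.worksheets}

print(list(sheets))                # the sheet titles, in workbook order
print(sheets["x2"] is book["x2"])  # each value is the worksheet object itself
```

So "ws" is one worksheet object, "ws.title" is its name, and "dict" collects the pairs so a sheet can be looked up by name.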

Now let's say you already have a file with x1 and x2 as sheets. You can use the example code to load the file and then do something like this to add x3 and x4:

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"
writer = pd.ExcelWriter(path, engine='openpyxl')
df3.to_excel(writer, sheet_name='x3', index=False)
df4.to_excel(writer, sheet_name='x4', index=False)
writer.close()  # close() also saves the file

That should do what you are looking for.

A simple example of writing several DataFrames to Excel at once, and of adding data as a new sheet to an Excel file that has already been written (and closed).

The first time you write to the Excel file (writing "df1" and "df2" to "1st_sheet" and "2nd_sheet"):

import pandas as pd 
from openpyxl import load_workbook

df1 = pd.DataFrame([[1],[1]], columns=['a'])
df2 = pd.DataFrame([[2],[2]], columns=['b'])
df3 = pd.DataFrame([[3],[3]], columns=['c'])

excel_dir = "my/excel/dir"

with pd.ExcelWriter(excel_dir, engine='xlsxwriter') as writer:
    df1.to_excel(writer, sheet_name='1st_sheet')
    df2.to_excel(writer, sheet_name='2nd_sheet')
    # no explicit save needed: the with block saves and closes on exit

After you have closed the Excel file, suppose you want to add data to the same file but on another sheet, say "df3" to a sheet named "3rd_sheet":

book = load_workbook(excel_dir)
with pd.ExcelWriter(excel_dir, engine='openpyxl') as writer:
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)    

    ## Your dataframe to append. 
    df3.to_excel(writer, sheet_name='3rd_sheet')  # saved when the with block exits

Note that the Excel format must not be xls; use xlsx instead.

To create a new file:

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)
with pd.ExcelWriter('sample.xlsx') as writer:  
    df1.to_excel(writer, sheet_name='x1')

To append to the file, use the argument mode='a' in pd.ExcelWriter.

x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)
with pd.ExcelWriter('sample.xlsx', engine='openpyxl', mode='a') as writer:  
    df2.to_excel(writer, sheet_name='x2')

The default is mode='w'. See the documentation.
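
Putting the two steps together, here is a minimal round-trip sketch. It assumes a reasonably recent pandas with openpyxl installed, and it writes to a temporary directory (a hypothetical location chosen only so the example cleans up easily) rather than to sample.xlsx.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical temporary location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sample.xlsx")

# First run: mode defaults to 'w', so the file is created from scratch.
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    for name in ("x1", "x2"):
        pd.DataFrame(np.random.randn(5, 2)).to_excel(writer, sheet_name=name)

# Later run: mode='a' keeps the existing sheets and adds new ones.
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    for name in ("x3", "x4"):
        pd.DataFrame(np.random.randn(5, 2)).to_excel(writer, sheet_name=name)

# Check which sheets the file now contains.
with pd.ExcelFile(path, engine="openpyxl") as xlsx:
    sheet_names = xlsx.sheet_names
print(sheet_names)
```

Append mode is not available with the xlsxwriter engine, which is why openpyxl is named explicitly in the second writer.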

I would strongly recommend you work directly with openpyxl, since it now supports Pandas DataFrames.

This allows you to concentrate on the relevant Excel and Pandas code.

You can do it without ExcelWriter, using the tools in openpyxl. This also makes adding fonts to the new sheet much easier, via openpyxl.styles.

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

#Location of original excel sheet
fileLocation =r'C:\workspace\data.xlsx'

#Location of new file which can be the same as original file
writeLocation=r'C:\workspace\dataNew.xlsx'

data = {'Name':['Tom','Paul','Jeremy'],'Age':[32,43,34],'Salary':[20000,34000,32000]}

#The dataframe you want to add
df = pd.DataFrame(data)

#Load existing sheet as it is
book = load_workbook(fileLocation)
#create a new sheet
sheet = book.create_sheet("Sheet Name")

#Load dataframe into new sheet
for row in dataframe_to_rows(df, index=False, header=True):
    sheet.append(row)

#Save the modified excel at desired location    
book.save(writeLocation)
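
The font remark can be sketched as follows. This is an illustration under assumptions, not part of the answer above: it uses an in-memory workbook instead of load_workbook(fileLocation), a made-up sheet name "Styled", and it simply makes the header row bold.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Font
from openpyxl.utils.dataframe import dataframe_to_rows

df = pd.DataFrame({"Name": ["Tom", "Paul"], "Age": [32, 43]})

book = Workbook()                     # in-memory stand-in for load_workbook(...)
sheet = book.create_sheet("Styled")   # hypothetical sheet name

# Copy the DataFrame into the sheet, header first, one row at a time.
for row in dataframe_to_rows(df, index=False, header=True):
    sheet.append(row)

# Row 1 holds the header written above; make every cell in it bold.
for cell in sheet[1]:
    cell.font = Font(bold=True)

print([cell.value for cell in sheet[1]])
```

Calling book.save(...) afterwards would persist both the data and the styling, as in the code above.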

You can read the existing sheets you are interested in, for example 'x1' and 'x2', into memory and write them back before adding new sheets (keep in mind that sheets in a file and sheets in memory are two different things; if you don't read them, they will be lost). This approach uses 'xlsxwriter' only, no openpyxl involved.

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

# begin <== read selected sheets and write them back
df1 = pd.read_excel(path, sheet_name='x1', index_col=0) # or sheet_name=0
df2 = pd.read_excel(path, sheet_name='x2', index_col=0) # or sheet_name=1
writer = pd.ExcelWriter(path, engine='xlsxwriter')
df1.to_excel(writer, sheet_name='x1')
df2.to_excel(writer, sheet_name='x2')
# end ==>

# now create more new sheets
x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)

x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)

df3.to_excel(writer, sheet_name='x3')
df4.to_excel(writer, sheet_name='x4')
writer.close()  # close() also saves the file

If you want to preserve all existing sheets, you can replace the code above between begin and end with:

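# read all existing sheets into memory first, then write them back
# (opening the writer in write mode may truncate the file, so read before that)
xlsx = pd.ExcelFile(path)
frames = {sheet: xlsx.parse(sheet_name=sheet, index_col=0) for sheet in xlsx.sheet_names}
xlsx.close()
writer = pd.ExcelWriter(path, engine='xlsxwriter')
for sheet, df in frames.items():
    df.to_excel(writer, sheet_name=sheet)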

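
A compact variant of the same read-then-rewrite idea, as a sketch: pd.read_excel with sheet_name=None returns a dict of all sheets keyed by name. The temporary path and the 5x2 random frames are assumptions made only to keep the example self-contained, and the writer engine is left at the pandas default.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical temporary file standing in for PhD_data.xlsx.
path = os.path.join(tempfile.mkdtemp(), "PhD_data.xlsx")

# Set up a file that already holds x1 and x2.
original = {name: pd.DataFrame(np.random.randn(5, 2)) for name in ("x1", "x2")}
with pd.ExcelWriter(path) as writer:
    for name, df in original.items():
        df.to_excel(writer, sheet_name=name)

# Read every existing sheet into memory before reopening the file for writing.
existing = pd.read_excel(path, sheet_name=None, index_col=0)

# Rewrite the file: the old sheets first, then the new ones.
new = {name: pd.DataFrame(np.random.randn(5, 2)) for name in ("x3", "x4")}
with pd.ExcelWriter(path) as writer:
    for name, df in {**existing, **new}.items():
        df.to_excel(writer, sheet_name=name)

result = pd.read_excel(path, sheet_name=None, index_col=0)
print(list(result))
```

Only cell values survive this round trip; formatting, formulas and charts in the original sheets are not carried over, because the file is rebuilt from DataFrames.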
Another fairly simple way to go about this is to write a function like this:

import logging

import pandas as pd
from openpyxl import load_workbook

def _write_frame_to_new_sheet(path_to_file=None, sheet_name='sheet', data_frame=None):
    book = None
    try:
        book = load_workbook(path_to_file)
    except Exception:
        logging.debug('Creating new workbook at %s', path_to_file)
    with pd.ExcelWriter(path_to_file, engine='openpyxl') as writer:
        if book is not None:
            writer.book = book
        data_frame.to_excel(writer, sheet_name, index=False)

The idea here is to load the workbook at path_to_file if it exists and then append data_frame as a new sheet named sheet_name. If the workbook does not exist, it is created. It seems that neither openpyxl nor xlsxwriter appends, so, as in the example by @Stefano above, you really have to load and then rewrite in order to append.

#This program is to read from excel workbook to fetch only the URL domain names and write to the existing excel workbook in a different sheet..
#Developer - Nilesh K
import pandas as pd
from openpyxl import load_workbook #for writing to the existing workbook

df = pd.read_excel("urlsearch_test.xlsx")

#You can use the below for the relative path.
# r"C:\Users\xyz\Desktop\Python\

domains = [] #Collects the extracted domain names

#begin
#Loop over every row and cut the URL prefix out of the TEXT column. You can have your own logic here.
for index, row in df.iterrows():
    try:
        text = row['TEXT'] #string to search (named text to avoid shadowing the built-in str)
        start = text.index('http') #position where the URL starts
        end = text.index('/', text.index('/') + 2) #first slash after the "//", i.e. the end of the domain
        domains.append(text[start:end]) #keep only the scheme and domain name

    #Error handling to skip the error rows and continue.
    except ValueError:
        print('Error!')
print(domains)
domains = list(dict.fromkeys(domains)) #Keep distinct values, you can comment this line to get all the values
df1 = pd.DataFrame(domains, columns=['URL']) #Create dataframe using the list
#end

#Write using openpyxl so it can be written to same workbook
book = load_workbook('urlsearch_test.xlsx')
writer = pd.ExcelWriter('urlsearch_test.xlsx',engine = 'openpyxl')
writer.book = book
df1.to_excel(writer,sheet_name = 'Sheet3')
writer.close() #close() also saves the file

#The below can be used to write to a different workbook without using openpyxl
#df1.to_excel(r"C:\Users\xyz\Desktop\Python\urlsearch1_test.xlsx",index=False,sheet_name='sheet1')
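
Where the comment says "You can have your own logic here", one possible alternative is sketched below. It is an assumption rather than the author's method: it uses the standard library's urllib.parse.urlsplit instead of manual index arithmetic, and extract_domains is a hypothetical helper name. It keeps the first URL found in each text and skips texts without one.

```python
import re
from urllib.parse import urlsplit


def extract_domains(texts):
    """Return the distinct scheme://host prefixes of the first URL in each text."""
    seen = {}  # dict keys keep insertion order and drop duplicates
    for text in texts:
        match = re.search(r"https?://\S+", text)
        if match is None:
            continue  # no URL in this row, skip it
        parts = urlsplit(match.group(0))
        seen[f"{parts.scheme}://{parts.netloc}"] = None
    return list(seen)


domains = extract_domains([
    "see https://example.com/a/b for details",
    "no link here",
    "mirror at http://example.org/x and https://example.com/c",
])
print(domains)
```

Unlike the slicing approach, this also accepts a URL with no path after the host; the resulting list can be passed to pd.DataFrame(domains, columns=['URL']) as before.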

Every time you want to save a Pandas DataFrame to an Excel file, you can call this function:

import os

import pandas as pd

def save_excel_sheet(df, filepath, sheetname, index=False):
    # Create file if it does not exist
    if not os.path.exists(filepath):
        df.to_excel(filepath, sheet_name=sheetname, index=index)

    # Otherwise, add a sheet. Overwrite if there exists one with the same name.
    else:
        with pd.ExcelWriter(filepath, engine='openpyxl', if_sheet_exists='replace', mode='a') as writer:
            df.to_excel(writer, sheet_name=sheetname, index=index)
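
To see what if_sheet_exists='replace' does, here is a small self-contained sketch. It assumes pandas 1.3 or later with openpyxl installed; the temporary path, the sheet names "a" and "b", and the column "v" are made up for the illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical temporary file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

# Create the file with a first version of sheet "a".
pd.DataFrame({"v": [1, 2]}).to_excel(path, sheet_name="a", index=False)

# Add sheet "b", then write "a" again: the old "a" is replaced, not duplicated.
for name, values in (("b", [9]), ("a", [3, 4])):
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="replace") as writer:
        pd.DataFrame({"v": values}).to_excel(writer, sheet_name=name, index=False)

sheets = pd.read_excel(path, sheet_name=None)
print(sorted(sheets))
print(sheets["a"]["v"].tolist())
```

Without if_sheet_exists, appending a sheet whose name already exists raises an error by default, so the argument matters whenever the function may be called twice with the same sheet name.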

If you want to add an empty sheet:

xw = pd.ExcelWriter(file_path, engine='xlsxwriter')
pd.DataFrame().to_excel(xw, sheet_name='sheet11')

To get a handle on that empty sheet:

sheet = xw.sheets['sheet11']

The following solution worked for me:

import os

import pandas as pd

df = pd.DataFrame({"A":[1,2], "B":[3,4]})

path = "./..../..../.../test.xlsx"

if os.path.isfile(path):
  # xlsxwriter cannot append, so name the openpyxl engine explicitly
  with pd.ExcelWriter(path, engine='openpyxl', mode='a') as writer:
    df.to_excel(writer, sheet_name= "sheet_2")
else:
  with pd.ExcelWriter(path) as writer:
    df.to_excel(writer, sheet_name= "sheet_1")

import pandas as pd
import openpyxl

writer = pd.ExcelWriter('test.xlsx', engine='openpyxl')
data_df.to_excel(writer, sheet_name='sheet_name')
writer.close()  # close() also saves the file


