How can I open an Excel file in Python?
How do I open an Excel file for reading in Python?
I've opened text files, for example sometextfile.txt, with the reading command. How do I do that for an Excel file?
Edit:
In newer versions of pandas, you can pass the sheet name as a parameter.
import pandas as pd

file_name = "path/to/file.xlsx"  # placeholder: path to file + file name
sheet = 0  # placeholder: sheet name or sheet number or list of sheet numbers and names
df = pd.read_excel(io=file_name, sheet_name=sheet)
print(df.head(5))  # print first 5 rows of the dataframe
Check the docs for examples on how to pass sheet_name:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.html
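One detail worth a small sketch: pd.read_excel returns a single DataFrame when sheet_name is one name or number, and a dict of DataFrames keyed by sheet when sheet_name is a list or None. The helper below, as_sheet_dict, is a hypothetical name (not part of pandas) that normalizes both cases so later code can always loop over sheets. The pandas call is shown only in comments and assumes an Excel engine such as openpyxl is installed.

```python
def as_sheet_dict(result, default_key=0):
    """Return read_excel output as a dict of {sheet: DataFrame}.

    `result` is whatever pd.read_excel returned; `default_key` is an
    assumed label used when only a single sheet was read.
    """
    if isinstance(result, dict):
        # sheet_name was a list or None: already keyed by sheet
        return dict(result)
    # sheet_name was a single name or number: wrap the lone DataFrame
    return {default_key: result}


# Intended use (requires pandas and an Excel engine such as openpyxl):
#   import pandas as pd
#   sheets = as_sheet_dict(pd.read_excel(file_name, sheet_name=None))
#   for name, df in sheets.items():
#       print(name, df.shape)
```

With sheet_name=None every sheet in the workbook is loaded, so this is best kept for small files.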
Old version:
You can use the pandas package as well.
When you are working with an Excel file with multiple sheets, you can use:
import pandas as pd
xl = pd.ExcelFile(path + filename)
print(xl.sheet_names)  # e.g. ['Sheet1', 'Sheet2', 'Sheet3']
df = xl.parse("Sheet1")
df.head()
df.head() will show the first 5 rows of your Excel file.
If you're working with an Excel file with a single sheet, you can simply use:
import pandas as pd
df = pd.read_excel(path + filename)
print(df.head())
Try the xlrd library.
[Edit] - from what I can see from your comment, something like the snippet below might do the trick. I'm assuming here that you're just searching one column for the word 'john', but you could add more or make this into a more generic function.
from xlrd import open_workbook

book = open_workbook('simple.xls', on_demand=True)
for name in book.sheet_names():
    if name.endswith('2'):
        sheet = book.sheet_by_name(name)

        # Attempt to find a matching row (search the first column for 'john')
        rowIndex = -1
        for i, cell in enumerate(sheet.col(0)):
            # str() guards against numeric cells, which are not searchable
            if 'john' in str(cell.value):
                rowIndex = i
                break

        # If we found the row, print it
        if rowIndex != -1:
            cells = sheet.row(rowIndex)
            for cell in cells:
                print(cell.value)

        book.unload_sheet(name)
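The "more generic function" mentioned above could look roughly like the sketch below. find_row is a hypothetical helper, not part of xlrd, and it works on plain lists of cell values so it can be tried without a workbook. With xlrd you could build those lists from sheet.row_values(i) for each i in range(sheet.nrows).

```python
def find_row(rows, needle, column=0):
    """Return the index of the first row whose `column` cell contains `needle`, or -1."""
    for index, row in enumerate(rows):
        # Skip rows that are too short; str() makes numeric cells searchable
        if column < len(row) and needle in str(row[column]):
            return index
    return -1


# Plain lists stand in for sheet rows here; with xlrd you could pass
# [sheet.row_values(i) for i in range(sheet.nrows)] instead.
rows = [["id", "name"], [1.0, "mary"], [2.0, "john smith"]]
match = find_row(rows, "john", column=1)
print(rows[match] if match != -1 else "not found")
```

Returning -1 for "no match" mirrors the rowIndex convention used in the snippet above.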
This isn't as straightforward as opening a plain text file and will require some sort of external module, since nothing is built in to do this. Here are some options:
http://www.python-excel.org/
If possible, you may want to consider exporting the Excel spreadsheet as a CSV file and then using the built-in Python csv module to read it:
http://docs.python.org/library/csv.html
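A minimal sketch of the CSV route, assuming the spreadsheet has already been exported and that its first row holds the column headers. read_csv_rows is a hypothetical helper name; csv.DictReader is the standard-library class doing the work.

```python
import csv


def read_csv_rows(path):
    """Read a CSV file exported from Excel into a list of dicts keyed by header."""
    # utf-8-sig drops the byte-order mark that Excel's "CSV UTF-8" export adds;
    # newline="" lets the csv module handle line endings itself
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))


# Intended use, with a placeholder file name:
#   for row in read_csv_rows("exported.csv"):
#       print(row)
```

Every value comes back as a string, so numeric columns need an explicit int() or float() conversion.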
There's the openpyxl package:
>>> from openpyxl import load_workbook
>>> wb2 = load_workbook('test.xlsx')
>>> print(wb2.sheetnames)
['Sheet2', 'New Title', 'Sheet1']
>>> worksheet1 = wb2['Sheet1']  # one way to load a worksheet
>>> worksheet2 = wb2.get_sheet_by_name('Sheet2')  # another way (deprecated in newer openpyxl; prefer wb2['Sheet2'])
>>> print(worksheet1['D18'].value)
3
>>> for row in worksheet1.iter_rows():
...     print(row[0].value)
You can use the xlpython package, which requires only xlrd. Find it at https://pypi.python.org/pypi/xlpython and its documentation at https://github.com/morfat/xlpython
This may help:
This creates a node that takes a 2D list (a list of lists) and pushes the items into the Excel spreadsheet. Make sure the IN[]s are present or it will throw an exception.
This is a rewrite of the Revit Excel Dynamo node for Excel 2013, as the default prepackaged node kept breaking. I also have a similar read node. The Excel syntax in Python is touchy.
thnx @CodingNinja - updated : )
### Export Excel - intended to replace malfunctioning excel node
import clr
clr.AddReferenceByName('Microsoft.Office.Interop.Excel, Version=15.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c')
## AddReferenceGUID("{00020813-0000-0000-C000-000000000046}") ''Excel C:\Program Files\Microsoft Office\Office15\EXCEL.EXE
## Need to verify interop for version 2015 is 15 and node attachment for it.
from Microsoft.Office.Interop import *  ## Excel
################################ Initialize FP and Sheet ID
## Same functionality as the excel node
strFileName = IN[0]  ## Filename
sheetName = IN[1]  ## Sheet
RowOffset = IN[2]  ## Row offset
ColOffset = IN[3]  ## Column offset
Data = IN[4]  ## Data
Overwrite = IN[5]  ## Check for auto-overwrite
XLVisible = False  # IN[6]  ## XL visible for operation or not?
RowOffset = 0
if IN[2] > 0:
    RowOffset = IN[2]  ## Row offset
ColOffset = 0
if IN[3] > 0:
    ColOffset = IN[3]  ## Column offset
if IN[6] != False:
    XLVisible = True  # IN[6]  ## XL visible for operation or not?
################################ Initialize FP and Sheet ID
xlCellTypeLastCell = 11  ##### define special cells value constant
################################
xls = Excel.ApplicationClass()  #### Connect with application
xls.Visible = XLVisible  ## VISIBLE YES/NO
xls.DisplayAlerts = False  ### Alerts
import os.path
if os.path.isfile(strFileName):
    wb = xls.Workbooks.Open(strFileName, False)  #### Open the file
else:
    wb = xls.Workbooks.add()  #### Create a new workbook (note the call parentheses)
    wb.SaveAs(strFileName)
wb.application.visible = XLVisible  #### Show Excel
try:
    ws = wb.Worksheets(sheetName)  #### Get the sheet in the WB base
except:
    ws = wb.sheets.add()  #### If it doesn't exist, add it. Use () for object method
    ws.Name = sheetName
#################################
# lastRow for iterating rows
lastRow = ws.UsedRange.SpecialCells(xlCellTypeLastCell).Row
# lastCol for iterating columns
lastCol = ws.UsedRange.SpecialCells(xlCellTypeLastCell).Column
#######################################################################
out = []  ### MESSAGE GATHERING
c = 0
r = 0
val = ""
if Overwrite == False:  #### Look ahead for non-empty cells to throw error
    for r, row in enumerate(Data):  #### BASE 0 ## each row of data enumerated in the 2D array
        for c, col in enumerate(row):  #### BASE 0 ## each column in each row is a cell with data
            ## Check the target cell in the sheet, not the incoming data item
            if ws.Cells[r+1+RowOffset, c+1+ColOffset].Value2 not in (None, ""):
                OUT = "ERROR- Cannot overwrite"
                raise ValueError("ERROR- Cannot overwrite")
                ## out.append(Data[0])  ## append message for error
############################################################################
for r, row in enumerate(Data):  #### BASE 0 ## each row of data enumerated in the 2D array
    for c, col in enumerate(row):  #### BASE 0 ## each column in each row is a cell with data
        ws.Cells[r+1+RowOffset, c+1+ColOffset].Value2 = col.__str__()
## run macro disabled for debugging excel macro
## xls.Application.Run("Align_data_and_Highlight_Issues")
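The index arithmetic in the node above (0-based positions in the 2D list mapped to 1-based Excel cells, shifted by the row and column offsets) can be tried without Excel. The sketch below is plain Python rather than the Dynamo node itself. plan_writes and check_no_overwrite are hypothetical names, and the existing dict is an assumed stand-in for the cells already filled in the sheet.

```python
def plan_writes(data, row_offset=0, col_offset=0):
    """Yield (row, column, text) for each item of a 2D list, using 1-based Excel indices."""
    for r, row in enumerate(data):
        for c, value in enumerate(row):
            yield r + 1 + row_offset, c + 1 + col_offset, str(value)


def check_no_overwrite(writes, existing):
    """Raise ValueError if any planned write targets a non-empty cell.

    `existing` is a hypothetical {(row, column): value} snapshot of the sheet.
    """
    for row, column, _ in writes:
        if existing.get((row, column)) not in (None, ""):
            raise ValueError("ERROR- Cannot overwrite cell (%d, %d)" % (row, column))


# Two rows of data written one row below a header that occupies cell (1, 1)
writes = list(plan_writes([["a", 1], ["b", 2]], row_offset=1))
check_no_overwrite(writes, existing={(1, 1): "header"})
print(writes)
```

Separating the planning step from the COM calls makes the offset logic easy to check before touching a real workbook.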
This code worked for me with Python 3.5.2. It opens and saves an Excel file. I am currently working on how to save data into the file, but this is the code:
import csv
# Python 3's csv module expects a text-mode file opened with newline=""
excel = csv.writer(open("file1.csv", "w", newline=""))
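For the part about saving data into the file, a minimal Python 3 sketch follows. write_rows is a hypothetical helper name; csv.writer and its writerows method are from the standard library. The with block makes sure the file is closed, and so flushed to disk, before Excel opens it.

```python
import csv


def write_rows(path, rows):
    """Write an iterable of rows to `path` as CSV that Excel can open."""
    # newline="" lets the csv module control line endings (avoids blank rows on Windows)
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


# Intended use, with a placeholder file name:
#   write_rows("file1.csv", [["name", "age"], ["john", 30]])
```

Note that the result is a CSV file, not a native .xlsx workbook; Excel opens it, but formatting and multiple sheets are not available.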
import pandas as pd
import os
files = os.listdir('path/to/files/directory/')
desiredFile = files[i]  # i is the index of the file you want in the listing
filePath = 'path/to/files/directory/%s'
Ofile = filePath % desiredFile
xls_import = pd.read_csv(Ofile)
Now you can use the power of pandas DataFrames!
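Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.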