Read one cell from each sheet of an Excel workbook and compile into a list (pandas/python)
I have an Excel workbook with many sheets. I want to pull the contents of cell B4 from each sheet and put those values into a list, not a dictionary. I'm also pulling information from other parts of each sheet, but I don't think I can do both in one call, so my code looks like this:
sheets = pd.read_excel(f, sheet_name=None, index_col=None, usecols="D:G", skiprows=14, nrows=8)
cells = pd.read_excel(f, sheet_name=None, index_col=None, header=None, usecols="B", skiprows=5, nrows=0)
I then tried to convert cells into a DataFrame (because each of the items in sheets is a DataFrame):
cells2 = pd.DataFrame.from_dict(cells)
and I get this error:
ValueError: If using all scalar values, you must pass an index
I still don't have the whole picture of your data, but here's an approach.
For the first example I used a workbook with only one sheet, so I didn't pass sheet_name=None.
import pandas as pd

cells = pd.read_excel('example.xlsx', header=None, index_col=None, usecols="B", skiprows=5)
cells = cells.rename(columns={1:"B"}) # rename column for readability
print(cells)
Output
B
0 1
1 4
2 2
3 10
4 1
5 4
6 3
It looks like you want to put a Series into a list, so we select the Series by its column name, 'B':
cells['B'].values.tolist()
Output
[1, 4, 2, 10, 1, 4, 3]
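The same conversion can be tried without a workbook. This is a minimal sketch, assuming an in-memory DataFrame that stands in for what read_excel returned (the values are copied from the output above). Series.tolist() is used here as a shorter equivalent of .values.tolist():

```python
import pandas as pd

# Stand-in for the DataFrame read from example.xlsx
# (values copied from the printed output, not read from a real file).
cells = pd.DataFrame({"B": [1, 4, 2, 10, 1, 4, 3]})

# Select the column as a Series, then convert it to a plain Python list.
column_as_list = cells["B"].tolist()
print(column_as_list)
```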
Now let's put B4 from every sheet into a list.
sheets = pd.read_excel('example_3sheets.xlsx',sheet_name=None,header=None, index_col=None, usecols="B", skiprows=3)
print(sheets)
Output
{'sheet1': 1
0 19
1 12
2 15, 'sheet2': 1
0 11
1 17
2 13, 'sheet3': 1
0 20
1 15
2 15}
That is actually three key-value pairs, like three boxes with a name on each. And what is in each box? A DataFrame! That's why you got the error: the dict's values are DataFrames, not list-like content.
You can select the first DataFrame by:
sheets['sheet1']
Output
1
0 19
1 12
2 15
And check the type:
type(sheets['sheet1'])
Output
pandas.core.frame.DataFrame
Iterate over every DataFrame:
for key in sheets:
    print(sheets[key])
Output
1
0 19
1 12
2 15
1
0 11
1 17
2 13
1
0 20
1 15
2 15
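If the sheet names are needed alongside the data, dict.items() gives both at once. This is a sketch, assuming a hand-built dict that mimics the three-sheet result above rather than a real workbook:

```python
import pandas as pd

# Hand-built stand-in for read_excel(..., sheet_name=None):
# a dict mapping sheet name -> DataFrame. The single column is
# labelled 1, as in the printed output.
sheets = {
    "sheet1": pd.DataFrame({1: [19, 12, 15]}),
    "sheet2": pd.DataFrame({1: [11, 17, 13]}),
    "sheet3": pd.DataFrame({1: [20, 15, 15]}),
}

# items() yields (sheet name, DataFrame) pairs in insertion order.
for name, df in sheets.items():
    print(name, df.shape)
```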
Because of skiprows=3, B4 is the first element of every DataFrame, so we can select it with iloc:
b4_in_every_sheet = []
for key in sheets:
    b4_in_every_sheet.append(sheets[key].iloc[0, 0])
print(b4_in_every_sheet)
Output
[19, 11, 20]
Another way, with a list comprehension:
b4_in_every_sheet_2 = [sheets[key].iloc[0,0] for key in sheets]
print(b4_in_every_sheet_2)
Output
[19, 11, 20]
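The row position passed to iloc depends on skiprows. Here is a small sketch of that arithmetic, assuming header=None so that no row is consumed as a header. The helper name row_position is hypothetical, not part of pandas, and the dict is a hand-built stand-in for the three-sheet result:

```python
import pandas as pd

def row_position(excel_row: int, skiprows: int) -> int:
    """Return the 0-based iloc row for a 1-based Excel row (assumes header=None)."""
    position = excel_row - 1 - skiprows
    if position < 0:
        raise ValueError(f"row {excel_row} is skipped when skiprows={skiprows}")
    return position

# Hand-built stand-in for read_excel(..., sheet_name=None, header=None,
# usecols="B", skiprows=3): each frame starts at Excel row 4.
sheets = {
    "sheet1": pd.DataFrame({1: [19, 12, 15]}),
    "sheet2": pd.DataFrame({1: [11, 17, 13]}),
    "sheet3": pd.DataFrame({1: [20, 15, 15]}),
}

pos = row_position(4, skiprows=3)  # B4 is the first row that was read
b4_values = [df.iloc[pos, 0] for df in sheets.values()]
print(b4_values)
```

With skiprows=5, as in the question, row 4 is already skipped, so row_position(4, skiprows=5) raises ValueError. That may be worth checking before reading B4 from the result.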
Hope it helps.