How to extract values in a specific row range from Excel with Python
[Image of the Excel sheet]
I want to print all row values corresponding to a specific month in Excel using Python. Please look at the picture.
If I pass "2021.10" to a function, I want the values ["2021.10.03", "2021.10.03", "2021.10.03", "2021.10.03", "2021.10.03", None, "2021.10.15", "2021.10.15", None] to be stored in a list.
Put simply, if I pass the value "2021.10" to the function, I would like to extract the values from rows 5 to 13 of column A.
What should I do?
I'm currently reading the sheet with openpyxl.
import openpyxl as oxl

load_excel = oxl.load_workbook('C:/Users/Homework.xlsx', data_only=True)
load_sheet = load_excel['Sheet']
This extracts the rows you want:
# dataframe from the excel file
             A     B  C
0   2021.09.23     E  1
1   2021.09.23     A  1
2   2021.09.23     E  1
3         None  None  3
4   2021.10.03     A  1
5   2021.10.03     A  2
6   2021.10.03     B  2
7   2021.10.03     E  1
8   2021.10.03     A  1
9         None  None  7
10  2021.10.15     A  2
11  2021.10.15     B  3
12        None  None  5
13  2021.11.03     C  2
14  2021.11.03     B  1
15  2021.11.03     F  2
import pandas as pd

def extract(df, value):
    df = df.reset_index(drop=True)  # make index labels equal row positions
    # True where column A starts with the prefix; empty cells count as non-matches
    mask = df['A'].str.startswith(value, na=False)
    if not mask.any():
        print([])
        return
    hits = mask[mask].index
    first_index, last_index = hits[0], hits[-1]  # first and last 2021.10 rows
    # extend the range over the empty cells right after the last match
    # (pandas reads empty Excel cells as NaN)
    end = last_index + 1
    while end < len(df) and pd.isna(df.at[end, 'A']):
        end += 1
    values = df['A'].iloc[first_index:end]
    # show missing cells as None
    print([None if pd.isna(v) else v for v in values])

df = pd.read_excel('C:/Users/Homework.xlsx')  # I read the xlsx file with pandas instead of `openpyxl`
extract(df, '2021.10')

The function prints:
['2021.10.03', '2021.10.03', '2021.10.03', '2021.10.03', '2021.10.03', None, '2021.10.15', '2021.10.15', None]
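One caveat: the prefix test above assumes column A holds text such as "2021.10.03". If the cells are stored as real Excel dates, both openpyxl and pandas return datetime objects, and a string prefix test does not apply to them. A minimal sketch of a helper that handles both cases follows; the names `month_prefix` and `matches_month` are hypothetical, not part of any library.

```python
from datetime import date, datetime

def month_prefix(value):
    """Return a 'YYYY.MM' key for a cell value, or None for an empty cell."""
    if value is None:
        return None
    if isinstance(value, (datetime, date)):
        # real Excel date cell: format it the same way as the text cells
        return value.strftime('%Y.%m')
    # text cell such as '2021.10.03': the first 7 characters are 'YYYY.MM'
    return str(value)[:7]

def matches_month(value, month):
    """True if the cell value belongs to the given 'YYYY.MM' month."""
    return month_prefix(value) == month
```

With this, `matches_month(cell_value, '2021.10')` could replace a direct `startswith` check when the cell type is not known in advance.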
You can declare a list first and then write a loop that traverses the rows whose values you need. On each iteration, append the cell value to the list. When the loop finishes, the list holds the extracted values.
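values = []
for row in range(5, 14):  # rows 5 to 13, as in the question
    # column=1 is column A; load_sheet is the worksheet loaded in the question
    values.append(load_sheet.cell(row=row, column=1).value)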
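The loop above needs the row numbers to be known in advance. As a rough sketch, the range can instead be found from the month prefix by scanning the column values once. The function name `extract_month` and the `column_a` list are hypothetical; the list stands in for the sheet's column A (header excluded) and assumes the dates are stored as text.

```python
def extract_month(values, month):
    """Return the values from the first to the last cell starting with
    `month`, plus any empty cells directly after the last match."""
    hits = [i for i, v in enumerate(values)
            if isinstance(v, str) and v.startswith(month)]
    if not hits:
        return []
    end = hits[-1] + 1
    # include the empty cells that directly follow the last match
    while end < len(values) and values[end] is None:
        end += 1
    return values[hits[0]:end]

# Stand-in for column A of the sheet, based on the data shown in the question
column_a = (['2021.09.23'] * 3 + [None]
            + ['2021.10.03'] * 5 + [None]
            + ['2021.10.15'] * 2 + [None]
            + ['2021.11.03'] * 3)

result = extract_month(column_a, '2021.10')
print(result)
```

With openpyxl, the input list could be built from the worksheet with `[cell.value for cell in load_sheet['A']]`, assuming `load_sheet` is the worksheet loaded in the question.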