Selecting specific Excel rows for analysis in pandas/IPython?
This question is probably quite elementary, but I am stuck, so I would appreciate any help. Is there a way to extract data for analysis from an Excel file by selecting specific row numbers? For example, if I have an Excel file with 30 rows, how can I add up the values of rows 5, 10, 21, and 27?
So far I have only learned how to select adjacent ranges with iloc, like this:
import pandas as pd
df = pd.read_excel("example.xlsx")
df.iloc[1:5]
If this is not possible in pandas, I would appreciate advice on how to copy selected rows from a spreadsheet into a new spreadsheet via openpyxl, so that I could then load the new worksheet into pandas.
You can do it like this, passing a list of indices:
df.iloc[[4,9,20,26]].sum()
Mind that Python uses 0-based indexing, so each index is one below the desired row number.
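As a minimal sketch of the approach above, the following builds a small stand-in DataFrame instead of reading a file and converts 1-based row numbers to 0-based positions before summing. The column name `data` and its values 1 to 30 are assumptions for illustration, not the asker's actual spreadsheet.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: 30 rows, one column named "data".
df = pd.DataFrame({"data": range(1, 31)})

# Row numbers as a person would count them (1-based).
row_numbers = [5, 10, 21, 27]

# iloc is 0-based, so subtract 1 from each row number.
positions = [n - 1 for n in row_numbers]

selected = df.iloc[positions]   # DataFrame containing only the chosen rows
column_sums = selected.sum()    # Series with one sum per column
total = int(column_sums["data"])
print(total)
```

If the row numbers are read from Excel's own row gutter and the sheet has a header row, keep in mind that read_excel treats the first sheet row as the header by default, so the offset would be 2 rather than 1.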
import pandas as pd
df = pd.read_excel("example.xlsx")
# Select by position (0-based) so the result does not depend on the index labels
sum(df["data"].iloc[i - 1] for i in [5, 10, 21, 27])
My df:
data
0 1
1 2
2 3
3 4
4 5
...
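For a frame shaped like the listing above, two vectorised alternatives to the generator expression might look like the sketch below. The listing is truncated, so the values past row 4 are an assumption here (1 to 30), and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical frame mirroring the listing above; values past row 4 are assumed.
df = pd.DataFrame({"data": range(1, 31)})

rows = [5, 10, 21, 27]  # 1-based row numbers

# Option 1: positional lookup on a single column, summed in one call.
total_iloc = df["data"].iloc[[r - 1 for r in rows]].sum()

# Option 2: relabel the index to start at 1 so labels match the row numbers,
# then select by label with loc.
df_one_based = df.copy()
df_one_based.index = df_one_based.index + 1
total_loc = df_one_based.loc[rows, "data"].sum()

print(total_iloc, total_loc)
```

Option 2 can be convenient when the same row numbers are used repeatedly, since no per-lookup subtraction is needed after the index has been shifted.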