Selecting specific Excel rows for analysis in pandas/ipython?

This question is probably quite elementary, but I am totally stuck here, so I would appreciate any help: is there a way to extract data for analysis from an Excel file by selecting specific row numbers? For example, if I have an Excel file with 30 rows, how do I add up the values of rows 5, 10, 21 and 27?

I only managed to learn how to select adjacent ranges with iloc, like this:

import pandas as pd

df = pd.read_excel("example.xlsx")

df.iloc[1:5]

If this is not possible in pandas, I would appreciate advice on how to copy selected rows from a spreadsheet into a new spreadsheet via openpyxl; then I could just load the new worksheet into pandas.

You can do it like so, passing a list of indices:

df.iloc[[4,9,20,26]].sum()

Mind that Python uses 0-based indexing, so each of these indices is one less than the desired row number.
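To make the off-by-one conversion explicit, here is a minimal sketch that builds the positions from the 1-based row numbers instead of typing them by hand. The in-memory DataFrame and the names `rows` and `positions` are only illustrative stand-ins for the 30-row file in the question. It assumes the row numbers count data rows; by default `pd.read_excel` treats the first sheet row as the header, so if you count rows as the spreadsheet shows them, you would likely need to subtract one more.

```python
import pandas as pd

# Stand-in for the 30-row file from the question (hypothetical data:
# the value in each row equals its 1-based row number).
df = pd.DataFrame({"data": range(1, 31)})

# 1-based row numbers, as in the question.
rows = [5, 10, 21, 27]

# iloc is position-based and 0-indexed, so shift every row number down by one.
positions = [r - 1 for r in rows]

# Select the non-adjacent rows and sum each column.
total = df.iloc[positions].sum()
print(total)
```

With this stand-in data the selected values are 5, 10, 21 and 27, so the `data` column sums to 63.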

import pandas as pd

df = pd.read_excel("example.xlsx")
# Row numbers are 1-based, index labels start at 0, hence i - 1.
sum(df["data"][i - 1] for i in [5, 10, 21, 27])

My df:

    data
0      1
1      2
2      3
3      4
4      5
...
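One caveat worth spelling out: indexing the `data` column with a plain integer looks values up by index label, not by position. That gives the same answer as iloc only while the DataFrame still has its default 0, 1, 2, ... index. The sketch below uses a made-up 30-row frame, and a re-sorted copy named `reordered`, purely to illustrate the difference.

```python
import pandas as pd

# Hypothetical stand-in data: the value in each row equals its 1-based row number.
df = pd.DataFrame({"data": range(1, 31)})
positions = [r - 1 for r in [5, 10, 21, 27]]

# Label-based lookup, as in the generator expression above.
label_total = sum(df["data"][p] for p in positions)

# After re-sorting, index labels no longer match row positions.
reordered = df.sort_values("data", ascending=False)

# Still label-based: finds the same original rows.
label_total_reordered = sum(reordered["data"][p] for p in positions)

# Position-based: takes the 5th, 10th, 21st and 27th rows as currently ordered.
positional_total_reordered = reordered["data"].iloc[positions].sum()

print(label_total, label_total_reordered, positional_total_reordered)
```

If "row 5" should always mean the fifth row as it currently appears, iloc is the safer choice; if it should follow the original row through sorting or filtering, the label lookup does that.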
