pandas dataframe: len(df) is not equal to number of iterations in df.iterrows()

I have a dataframe where I want to print each row to a different file. When the dataframe consists of, e.g., only 50 rows, len(df) will print 50, and iterating over the rows of the dataframe like

for index, row in df.iterrows():
    print(index)

will print the index from 0 to 49.

However, if my dataframe contains more than 50'000 rows, len(df) and the number of iterations over df.iterrows() appear to differ significantly. For example, len(df) will say 50'554, while the printed index goes up to over 400'000.

How can this be? What am I missing here?

First, as @EdChum noted in the comments, your question's title refers to iterrows, but the example you give refers to iteritems, which iterates over columns, the direction orthogonal to the one len measures. I assume you meant iterrows (as in the title).
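To make the difference between the two directions concrete, here is a minimal sketch on a made-up two-column frame (the data and variable names are assumptions for illustration only). It uses items(), the current spelling of column iteration; as far as I know, iteritems was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical frame: 3 rows, 2 columns.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# items() walks the columns: one (column name, Series) pair per column.
column_names = [name for name, column in df.items()]

# iterrows() walks the rows: one (index label, Series) pair per row.
row_labels = [label for label, row in df.iterrows()]

print(column_names)  # one entry per column
print(row_labels)    # one entry per row, so len(row_labels) == len(df)
```

Only the row direction is comparable with len(df), which counts rows.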

Note that a DataFrame's index need not be a running index (0, 1, 2, ...), irrespective of the size of the DataFrame. For example:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2, 3, 4]}, index=[2, 4, 5, 1000])
>>> for index, row in df.iterrows():
...     print(index)
...
2
4
5
1000

So the number of iterations still equals len(df); what you are printing is the index label, not a counter. Presumably, your long DataFrame was created differently, or underwent some manipulation that affected the index.
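As an illustration of how such an index can arise, here is a small sketch on made-up data (the frame and the filter condition are assumptions, not taken from the question). Filtering keeps the original labels, so the largest label can exceed the row count, while the number of iterations still matches len. If a running index is wanted on the frame itself, reset_index(drop=True) rebuilds one.

```python
import pandas as pd

# Hypothetical data: 10 rows with the default running index 0..9.
df = pd.DataFrame({"a": range(10)})

# Boolean filtering keeps the original labels of the surviving rows.
filtered = df[df["a"] % 3 == 0]
labels = [index for index, row in filtered.iterrows()]

# The iteration count always equals len(), whatever the labels are.
iterations = sum(1 for _ in filtered.iterrows())

# reset_index(drop=True) discards the old labels and builds a fresh 0..n-1 index.
renumbered = filtered.reset_index(drop=True)
new_labels = list(renumbered.index)

print(len(filtered), iterations, labels)
print(new_labels)
```

Here the filtered frame has 4 rows, yet its largest label is 9, which mirrors the 50'554 versus 400'000 situation on a small scale.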

If you really must iterate with a running index, you can use Python's enumerate:

>>> for index, row in enumerate(df.iterrows()):
...     print(index)
...
0
1
2
3

(Note that, in this case, row is itself a tuple of the form (index label, Series), because enumerate wraps each pair that iterrows yields.)
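If both the running counter and the row data are needed, the tuple can be unpacked directly in the loop header. This is a minimal sketch reusing the four-row example frame from above; the names position, label and pairs are my own, chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4]}, index=[2, 4, 5, 1000])

pairs = []
for position, (label, row) in enumerate(df.iterrows()):
    # position is the running counter 0..n-1; label is the DataFrame's own index value.
    pairs.append((position, label, int(row['a'])))

print(pairs)
```

The running counter could then serve as, say, a per-row file number, while label still identifies the row in the original DataFrame.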
