
Python: pandas Data Frame and meaning of {Series}0 in debugger

I am using pandas in Python 2.7 and read a csv file like this:

import pandas as pd

df = pd.read_csv("test_file.csv")

df has a column titled 'rating' and a column titled 'review'. I do some manipulations on df, for example:

df3 = df[df['rating'] != 3]

Now if I look at df['review'] and df3['review'] in a debugger, I see this information:

df['review'] = {Series}0
df3['review'] = {Series}1

Also, if I want to see the first element of df['review'], I use:

df['review'][0]

which is fine, but if I do the same for df3, I get this error:

df3['review'][0]
{KeyError}0L

However, it looks like I can do this:

df3['review'][1]

Can someone please explain the difference?

Indexing with an integer on a Series doesn't work like indexing a list. When the Series has an integer index, the integer is treated as a label, not a position. In particular, df['review'][0] doesn't get the first element of the "review" column; it gets the element whose index label is 0:

In [4]: s = pd.Series(['a', 'b', 'c', 'd'], index=[1, 0, 2, 3])

In [5]: s
Out[5]:
1    a
0    b
2    c
3    d
dtype: object

In [6]: s[0]
Out[6]: 'b'

Presumably, in generating df3 you dropped the row with index 0, so that label no longer exists and the lookup raises a KeyError. If you actually want to get the first element regardless of the index, use iloc:

In [7]: s.iloc[0]
Out[7]: 'a'
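
To tie this back to the question, here is a minimal sketch of the same situation with a DataFrame. The data is made up for illustration (it stands in for test_file.csv, whose contents are not shown), and the variable names other than df and df3 are hypothetical. It assumes the first row has rating 3, so the filter drops index label 0.

```python
import pandas as pd

# Hypothetical data standing in for test_file.csv
df = pd.DataFrame({
    "rating": [3, 5, 1, 4],
    "review": ["okay", "great", "poor", "good"],
})

# Boolean filtering keeps the original index labels of the surviving rows
df3 = df[df["rating"] != 3]
remaining_labels = list(df3.index)  # label 0 is gone

# Label-based lookup of 0 now fails because that label was filtered out
try:
    df3["review"][0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Position-based lookup works regardless of the labels
first_by_position = df3["review"].iloc[0]

# Alternative: renumber the index from 0 so labels match positions again
df3_reset = df3.reset_index(drop=True)
first_after_reset = df3_reset["review"][0]

print(remaining_labels, lookup_failed, first_by_position, first_after_reset)
```

Whether to use iloc or reset_index(drop=True) depends on whether the original row labels still matter later on; iloc leaves them untouched, while resetting discards them.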


 