Python: pandas DataFrame and the meaning of {Series} 0 in the debugger

Question

I am using pandas in Python 2.7 and read a CSV file like this:

import pandas as pd

df = pd.read_csv("test_file.csv")

df has a column titled 'rating' and a column titled 'review'. I do some manipulations on df, for example:

df3 = df[df['rating'] != 3]

Now if I look at df['review'] and df3['review'] in a debugger, I see this information:

df['review'] = {Series}0
df3['review'] = {Series}1

Also, if I want to see the first element of df['review'], I use:

df['review'][0]

which is fine, but if I do the same for df3, I get this error:

df3['review'][0]
{KeyError}0L

However, it looks like I can do this:

df3['review'][1]

Can someone please explain the difference?

Answer 1

Indexing a Series with an integer doesn't work like indexing a list. In particular, df['review'][0] doesn't get the first element of the "review" column; it gets the element whose index label is 0:

In [4]: s = pd.Series(['a', 'b', 'c', 'd'], index=[1, 0, 2, 3])

In [5]: s
Out[5]:
1    a
0    b
2    c
3    d
dtype: object

In [6]: s[0]
Out[6]: 'b'

Presumably, in generating df3 you dropped the row with index 0. If you actually want to get the first element regardless of the index, use iloc:

In [7]: s.iloc[0]
Out[7]: 'a'
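
To tie this back to the question, here is a minimal sketch of the same situation. The data is made up to stand in for test_file.csv (the assumption being that the first row has rating 3), and reset_index is shown as one possible alternative to iloc, not something the question itself uses:

```python
import pandas as pd

# Hypothetical data standing in for test_file.csv; the first row has rating 3.
df = pd.DataFrame({
    "rating": [3, 5, 1, 3, 4],
    "review": ["ok", "great", "bad", "meh", "good"],
})

# Boolean filtering keeps the original index labels, so label 0 is dropped here.
df3 = df[df["rating"] != 3]
labels = list(df3.index)  # the surviving labels no longer start at 0

# Label-based lookup: 0 is no longer a label, so this raises KeyError.
try:
    df3["review"].loc[0]
    missing_label = False
except KeyError:
    missing_label = True

# Position-based lookup works regardless of the labels.
first_by_position = df3["review"].iloc[0]

# Alternatively, renumber the rows so labels and positions line up again.
df3_reset = df3.reset_index(drop=True)
first_after_reset = df3_reset["review"][0]
```

With this data, df3 keeps the labels 1, 2 and 4, which would also explain why df3['review'][1] works in the question: 1 is still a valid label there.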
